eLife. 2019; 8: e49662.
Published online 2019 Aug 16. doi: 10.7554/eLife.49662
PMCID: PMC6733595
PMID: 31418692

Apicomplexan-like parasites are polyphyletic and widely but selectively dependent on cryptic plastid organelles

John McCutcheon, Reviewing Editor and Detlef Weigel, Senior Editor
John McCutcheon, University of Montana;

Abstract

The phylum Apicomplexa comprises human pathogens such as Plasmodium but is also an under-explored hotspot of evolutionary diversity central to understanding the origins of parasitism and non-photosynthetic plastids. We generated single-cell transcriptomes for all major apicomplexan groups lacking large-scale sequence data. Phylogenetic analysis reveals that apicomplexan-like parasites are polyphyletic and their similar morphologies emerged convergently at least three times. Gregarines and eugregarines are monophyletic, against most expectations, and rhytidocystids and Eleutheroschizon are sister lineages to medically important taxa. Although previously unrecognized, plastids in deep-branching apicomplexans are common, and they contain some of the most divergent and AT-rich genomes ever found. In eugregarines, however, plastids are either abnormally reduced or absent, thus increasing known plastid losses in eukaryotes from two to four. Environmental sequences of ten novel plastid lineages and structural innovations in plastid proteins confirm that plastids in apicomplexans and their relatives are widespread and share a common, photosynthetic origin.

Research organism: Other

eLife digest

Microscopic parasites known collectively as apicomplexans are responsible for several infectious diseases in humans including malaria and toxoplasmosis. The cells of the malaria parasite and many other apicomplexans contain compartments known as cryptic chloroplasts that produce molecules the parasites need to survive. Cryptic chloroplasts are similar to the chloroplasts found in plant cells, but unlike plants the compartments in apicomplexans are unable to harvest energy from sunlight.

Since the cells of humans and other animals do not contain chloroplasts, cryptic chloroplasts are a potential target for new drugs to treat diseases caused by apicomplexans. However, it remains unclear how widespread cryptic chloroplasts are in these parasites, largely because few apicomplexans have been successfully grown in the laboratory.

To address this question, Janouškovec et al. used an approach called single-cell transcriptomics to study ten different apicomplexans. This provided new data about the genetic make-up of each parasite that the team analysed to find out how they are related to one another. The analysis revealed that, unexpectedly, apicomplexan parasites do not share a close common ancestor and are therefore not a natural grouping from an evolutionary perspective. Instead, their similar physical appearances and lifestyles evolved independently on at least three separate occasions.

Further analysis demonstrated that cryptic chloroplasts are common in apicomplexan parasites, including in lineages where they were not previously known to exist. However, at least three lineages of apicomplexans have independently lost their cryptic chloroplasts.

The findings of Janouškovec et al. shed new light on the importance of chloroplasts in the evolution of life and may help develop new treatments for diseases caused by apicomplexan parasites. Several drugs targeting the cryptic chloroplasts in malaria parasites are currently in clinical trials, and this work suggests that these drugs may also have the potential to be used against other apicomplexan parasites in the future.

Introduction

The phylum Apicomplexa is a major group of protistan parasites important in animal disease globally (we will use the name Apicomplexa hereinafter for the clade of parasites sensu stricto; a synonym of the taxon Sporozoa Leuckart, 1879; Adl et al., 2019). The group includes the human pathogens Plasmodium (haemosporidians), Toxoplasma (eucoccidians), Babesia (piroplasms), and Cryptosporidium (cryptosporidians), whose cell biology and genomes have been extensively studied. Conversely, most apicomplexans from invertebrates such as eugregarines, archigregarines, blastogregarines, protococcidians, and agamococcidians lack detailed genomic information, in part because they cannot be cultured in laboratory conditions. Because these uncultured groups are also deep-branching lineages with unresolved relationships to the medically important taxa (Leander and Ramey, 2006; Rueckert et al., 2011; Simdyanov et al., 2018; Simdyanov et al., 2017), the lack of their genomes and cell biology data hinders our understanding of apicomplexan evolution and the origin of parasitism itself. This, in turn, limits insights into infection mechanisms across the group: the structural and molecular make-up of machineries for host cell invasion such as the apical complex, pellicle, and glideosome, and how these relate to parasite life cycles, host preferences, and habitats.

Apicomplexans are evolutionarily derived from mixotrophic algae, a realization that first came with sequencing a plastid genome in Plasmodium and localizing it into a cryptic, non-photosynthetic organelle called ‘the apicoplast’ (Gardner et al., 1991; McFadden et al., 1996; Wilson et al., 1996). The apicoplast is a four-membrane plastid (a broader term we will use hereinafter to describe the organelle in both parasitic and free-living organisms), which is derived from a secondary endosymbiont. Where exactly this plastid endosymbiont came from was settled by data from two newly discovered photosynthetic relatives of Apicomplexa, Chromera velia and Vitrella brassicaformis (Janouskovec et al., 2010; Moore et al., 2008; Oborník et al., 2012). The photosynthetic plastids in Chromera and Vitrella are also surrounded by four membranes and they share conspicuous similarities with both the apicomplexan plastid and the plastid in peridinin-pigmented dinoflagellates, pointing to their common origin (Janouskovec et al., 2010). Chromera and Vitrella belong to a monophyletic group, called the chrompodellids, with heterotrophic colpodellids Alphamonas edax, Voromonas pontica, and 'Colpodella' angusta, which also retain non-photosynthetic plastids (Gile and Slamovits, 2014; Janouškovec et al., 2015). Plastids are nevertheless absent in Cryptosporidium (Abrahamsen et al., 2004), and they have never been recorded in other deep-branching apicomplexans (Toso and Omoto, 2007). Lack of plastids in these groups once fueled alternative ideas about independent gains of plastids in apicomplexans, dinoflagellates, and even Chromera (Bodył, 2005; Bodył et al., 2009). However, it also brings into question the metabolic importance of the plastid for the apicomplexan cell and its implications for plastid maintenance and loss in eukaryotes more broadly (Cavalier-Smith, 2013; Janouškovec et al., 2015).

Here, we sought to understand the phylogeny and plastid evolution of Apicomplexa by filling in major gaps in their sequence data. We used individual cells of parasites to generate transcriptomes from all major apicomplexan groups that currently lack laboratory cultures and large-scale transcriptomic or genomic data. By resolving their phylogeny, we observe that parasites with apicomplexan-like morphologies are polyphyletic and originated at least three times independently. We also show that gregarines and eugregarines are monophyletic, and blastogregarines are related to archigregarines, highlighting the importance of several traits uniquely shared among them. Many deep-branching apicomplexans contain plastidial metabolism and divergent, AT-rich plastid genomes, but eugregarines lost plastids at least twice independently. Phylogeny of 16S ribosomal RNA genes and structural novelties in plastid proteins demonstrate that plastids are widespread and ancestral in the group.

Results

First multiprotein dataset including all major apicomplexan lineages

We generated 13 transcriptomes for 10 parasites, representing six deep apicomplexan lineages for which sequence data were scarce: protococcidians, agamococcidians, blastogregarines, archigregarines, eugregarines (three different superfamilies), and incertae sedis species (Supplementary file 1). Between 1 and 80 cells per species were isolated from the intestines of marine annelids, molluscs, and barnacles. The parasite cells were washed and preserved for RNA extraction (Materials and methods). Transcriptomes were sequenced from amplified cDNA by paired-end Illumina HiSeq and assembled in Trinity. To resolve deep apicomplexan relationships, we modified a published dataset of slow-evolving nucleus-encoded markers (Derelle et al., 2016) by including broad, representative sampling of genomes and transcriptomes of apicomplexans and related taxa (Supplementary file 2). This produced a phylogenetic matrix of 296 protein markers, each of which was individually verified for orthology by maximum likelihood phylogenies; this allowed us to unambiguously identify paralogous and contaminant sequences (Materials and methods). Our apicomplexan transcriptomes typically contained a single ortholog per gene; two exceptions to this were suggestive of cryptic species among the collected cells. In Rhytidocystis sp. 1, the most complete of multiple isoforms was selected, whereas two distinct sequence variants were present in Siedleckia nematoides (Figure 1—figure supplement 1A) and eventually merged into a single taxonomic unit (Figure 1A). Three other taxa were merged in the final dataset due to poor sequence representation: Ascogregarina, and two unidentified parasites of hexapods (Borner and Burmester, 2017) (Figure 1A and Figure 1—figure supplement 1A). All three are members of the superfamily Actinocephaloidea based on a consensus of protein and ribosomal RNA gene (rDNA) phylogenies (Materials and methods). The final phylogenetic matrix contained 50 species, 99908 amino acid positions, and relatively little (10.6%) missing information (Figure 1—source data 1).
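The proportion of missing information quoted for a phylogenetic matrix can be illustrated with a short calculation. The following is a minimal sketch, not the procedure used in the study: it assumes the matrix is an aligned FASTA file and that gap (`-`), `?` and `X` characters all count as missing, and the function names and toy alignment are hypothetical.

```python
# Minimal sketch: fraction of missing data in an aligned FASTA matrix.
# Assumption: '-', '?' and 'X' are treated as missing characters.
MISSING = set("-?X")


def parse_fasta(text):
    """Return a dict mapping each sequence name to its sequence."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def missing_fraction(alignment):
    """Overall fraction of missing cells in the taxon-by-site matrix."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    total = sum(len(seq) for seq in alignment.values())
    missing = sum(ch in MISSING for seq in alignment.values() for ch in seq.upper())
    return missing / total


def per_taxon_missing(alignment):
    """Fraction of missing sites for each taxon separately."""
    return {
        name: sum(ch in MISSING for ch in seq.upper()) / len(seq)
        for name, seq in alignment.items()
    }


# Hypothetical two-taxon toy alignment (not data from the study).
toy = """>taxonA
MKV-LL??AG
>taxonB
MKVALLTTAG
"""
aln = parse_fasta(toy)
print(missing_fraction(aln))   # 3 missing cells out of 20
print(per_taxon_missing(aln))
```

The per-taxon values correspond to the kind of percentages shown in parentheses behind species names in Figure 1A, under the same assumptions about what counts as missing.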

Figure 1.
Multiprotein phylogeny of apicomplexans and related taxa.

(A) Maximum likelihood tree (IQ-TREE) of apicomplexans and their relatives based on 296 concatenated protein markers. Species newly sequenced in this study are in bold. Values at branches correspond to UFBoot2 supports (1000 replicates, LG+G4+F+C60+PMSF model), non-parametric bootstraps (100 replicates, LG+G4+F+C60+PMSF model), and Bayesian posterior probabilities (PhyloBayes, consensus of 10 independent runs, CAT+GTR+G4 model). Black dots indicate 100/100/1 support. Actinocephaloids and Siedleckia nematoides are hybrid taxa (* symbol) composed from sequences of three parasites and two distant sequence variants, respectively (see Figure 1—figure supplement 1). Values in parentheses behind species names show % of missing data in the phylogenetic matrix. Sequence sources and the phylogenetic matrix are found in Supplementary file 2 and Figure 1—source data 1, respectively. Single quotation marks indicate potentially problematic taxonomic assignments (formal group names are in Figure 1—figure supplement 1). (B) Light micrographs of some species studied, left to right: Digyalum oweni, Symbiont X, Selenidium pygospionis, Ancora sagittata, Polyrhabdina sp., with the anterior end facing up.

Figure 1—source data 1.

Phylogenetic matrix used in Figure 1A analysis in FASTA format.

Figure 1—figure supplement 1.

Multiprotein phylogenies of apicomplexans and related taxa.

(A) Maximum likelihood tree (IQ-TREE) of apicomplexans and their relatives (54 species) based on 296 concatenated protein markers (99948 sites, 14.4% missing data). Values at branches correspond to UFBoot2 supports, black dots indicate 100 support (1000 replicates, LG+G4+F+C60+PMSF model). Sequences of three actinocephaloids and of the two distinct sequence variants in Siedleckia nematoides were merged in the Figure 1A dataset, as shown. Values in parentheses behind species names show % of missing data in the final phylogenetic matrix. Single quotation marks indicate potentially problematic taxonomic assignments. Sequence sources are in Supplementary file 2. (B) Maximum likelihood phylogenies on datasets that excluded the two fastest evolving taxa, Gregarina and Cephaloidophora, either individually or together (IQ-TREE). UFBoot2 supports are shown (LG+G4+F+C60+PMSF model); note that all trees were topologically congruent.

Monophyly of gregarines and eugregarines, and polyphyly of apicomplexan parasites

Maximum likelihood analysis with the LG+C60+F+G4+PMSF model in IQ-TREE (non-parametric and UFBoot2 supports) and PhyloBayes analysis with the CAT+GTR model (ten independent runs) produced congruent topologies that were fully resolved at most internal branches (Figure 1A). The protococcidian Eleutheroschizon duboscqi was a sister of eucoccidians, and rhytidocystids branched as basal coccidiomorphs, the group containing most medically important apicomplexans. The blastogregarine Siedleckia was strongly related to the archigregarine Selenidium pygospionis. Gregarine apicomplexans (sensu lato including blastogregarines but excluding Digyalum; see below) and eugregarines were both monophyletic, in contrast to most interpretations based on ribosomal RNA genes (rDNA) (Cavalier-Smith, 2014; Leander, 2008; Rueckert et al., 2011). To test relationships between major apicomplexan lineages, we analyzed datasets in which the two longest branches in the tree, Cephaloidophora and Gregarina, were excluded either individually or both together (LG+C60+F+G4+PMSF model with UFBoot2 supports). The resulting trees had congruent topologies with all internal branches fully supported except for two eugregarine subclades and the position of cryptosporidians (Figure 1—figure supplement 1B). Similarly, seven statistical tests on 105 tree topologies representing all possible relationships between coccidiomorphs, cryptosporidians, eugregarines, archigregarines, and blastogregarines rejected all alternative topologies at p=0.01 except those differing in the placement of cryptosporidians (Supplementary file 3). The relationship of cryptosporidians therefore requires additional support, although their sister position to gregarines, which was unambiguously recovered in all trees including the ten PhyloBayes runs, is a preferred hypothesis. Two apicomplexan-like parasites branched outside the main apicomplexan clade (Figure 1A,B). 
Digyalum oweni, a formally described archigregarine, was fully resolved as a sister lineage to all apicomplexans and chrompodellids. A previously undescribed parasitic symbiont of the annelid Scoloplos with apicomplexan-like traits, named ‘Symbiont X’, was specifically related to Chromera velia. Thus, apicomplexan parasites in the traditional sense are polyphyletic (see Discussion). We keep using the name ‘Apicomplexa’ for the clade of parasites sensu stricto hereinafter (Figure 1A).
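The figure of 105 topologies for five lineages is consistent with the standard count of bifurcating labelled trees. The sketch below checks this arithmetic; whether the tested set was enumerated as rooted trees on five lineages or as unrooted trees on five lineages plus an outgroup is an assumption here, and both readings give the same number.

```python
# Minimal sketch: counting bifurcating labelled tree topologies.
# Rooted trees on n taxa: (2n - 3)!!; unrooted trees on n taxa: (2n - 5)!!


def double_factorial(n):
    """Product n * (n - 2) * (n - 4) * ... down to 1 or 2."""
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result


def rooted_binary_trees(n_taxa):
    """Number of rooted, bifurcating, labelled trees on n_taxa tips."""
    if n_taxa < 2:
        return 1
    return double_factorial(2 * n_taxa - 3)


def unrooted_binary_trees(n_taxa):
    """Number of unrooted, bifurcating, labelled trees on n_taxa tips."""
    if n_taxa < 3:
        return 1
    return double_factorial(2 * n_taxa - 5)


lineages = [
    "coccidiomorphs",
    "cryptosporidians",
    "eugregarines",
    "archigregarines",
    "blastogregarines",
]
print(rooted_binary_trees(len(lineages)))        # 7 * 5 * 3 * 1 = 105
print(unrooted_binary_trees(len(lineages) + 1))  # same count with one outgroup added
```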

Plastids in Digyalum and deep-branching apicomplexans, and their multiple losses

We next explored the existence of plastids in gregarines, Digyalum, and other parasites, where none have been known. Searching their sequence data with sequences of known plastid-localized proteins (Janouškovec et al., 2015; Ralph et al., 2004; Seeber and Soldati-Favre, 2010) revealed a broad presence of plastidial pathways (Materials and methods; Supplementary file 4). Digyalum, Selenidium, Siedleckia, rhytidocystids, and Eleutheroschizon all contain near-complete plastidial pathways for the biosynthesis of isoprenoid precursors, heme, and fatty acids, together with a ferredoxin redox system, iron-sulfur cluster synthesis, and plastid genomes (Figure 2A). The eugregarine Lankesteria unusually contains fatty acid biosynthesis as the only plastidial pathway, whereas Symbiont X contains only the isoprenoid pathway, similar to piroplasms (Lizundia et al., 2009). Both Lankesteria and Symbiont X appear to lack plastid genomes (Figure 2A). The distribution of signature plastid genes used as controls, which are involved in polypeptide import, folding, and DNA replication in the plastid (cpn60, sDer-1, PREX), matches the presence of plastid metabolism and genomes (Figure 2A). Maximum likelihood phylogenies of all individual proteins allowed us to readily distinguish the apicomplexan sequences from bacteria and other contaminants in the datasets (Materials and methods). In most phylogenies, the apicomplexan sequences cluster with algal plastid forms, confirming that they came from the plastid endosymbiont rather than the eukaryotic host. The phylogeny is different in several genes that are either derived by horizontal gene transfer from bacteria or in fact localize outside of the plastid in Plasmodium (in heme biosynthesis; see below). N-terminal regions of plastid sequences from our new transcriptomes often carry signal peptides typical for targeting to the plastid (the others have incomplete N-termini or lack targeting signatures by default, such as most triose phosphate translocators).
The signal peptides are often followed by transit peptide-like regions, although these are more difficult to predict computationally (Supplementary file 5). In Digyalum, transit peptides have a net positive charge similar to other plastid leaders (they are low in acidic and high in basic residues; Figure 2—figure supplement 1A). Digyalum transit peptides are compositionally similar to transit peptides in Plasmodium, and likewise lack the phenylalanine motif at the first position after the signal peptide cleavage site (Figure 2—figure supplement 1B) (Patron and Waller, 2007). Predicted localizations for plastidial proteins and mitochondrial ALAS correspond closely to experimental evidence in Plasmodium and Toxoplasma (Figure 2A). The only exceptions to this pattern are the last three enzymes in heme biosynthesis, which are predicted to be plastidial in some parasites (see Discussion). The reconstructed plastid pathways also reflect known dependencies between their modules (Figure 2B). Iron-sulphur cluster assembly and ferredoxin system are widely required as co-factors for isoprenoid and fatty acid synthesis, whereas heme biosynthesis can be lost independently of other modules – likely the case in Lankesteria and Symbiont X. The synthesis of 3-phosphoglycerate (GAPDH-II and PGK-II) is present selectively. SufA was not identified in the Chromera genome. Pyruvate dehydrogenase and fatty and lipoic acid synthesis protein sequences in Digyalum are notably divergent, including three that are more closely related to bacterial than plastid sequences (Figure 2A). The cytosolic mevalonate pathway for isoprenoid precursor synthesis is absent in all species, but the mitochondrial cysteine desulphurase (IscS) and cytosolic fatty acid synthase (FASI) and elongase (ELO) pathways are present, as expected (Materials and methods) (Dellibovi-Ragheb et al., 2013; Ramakrishnan et al., 2012; Zhu et al., 2004). 
No plastid genes were identified in five eugregarine lineages other than Lankesteria, including in the draft genome of Gregarina (Figure 2A; see Discussion about the PREX fragment in Ascogregarina). This result is suggestive of at least two losses of plastids in eugregarines, which were independent of the one in Cryptosporidium (Discussion).
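The compositional argument about transit peptides (few acidic and many basic residues, and no phenylalanine at the first position) can be illustrated with a small calculation. This is a minimal sketch and not the analysis pipeline of the study: the residue classes and the 14-residue window follow the caption of Figure 2—figure supplement 1, while the function names, the crude "basic minus acidic" charge measure, and the example peptide are hypothetical.

```python
# Minimal sketch: amino acid class composition of a putative transit peptide.
# Residue classes follow Figure 2-figure supplement 1 of the source.
CLASSES = {
    "acidic": set("DE"),
    "basic": set("HKR"),
    "polar_uncharged": set("CNQSTWY"),
    "hydrophobic": set("AFGILMPV"),
}


def class_proportions(peptide, window=14):
    """Proportion of each residue class in the first `window` residues."""
    region = peptide.upper()[:window]
    if not region:
        raise ValueError("empty peptide")
    return {
        name: sum(res in members for res in region) / len(region)
        for name, members in CLASSES.items()
    }


def crude_net_charge(peptide, window=14):
    """Basic minus acidic residue count (assumption: H counted as basic)."""
    region = peptide.upper()[:window]
    basic = sum(res in CLASSES["basic"] for res in region)
    acidic = sum(res in CLASSES["acidic"] for res in region)
    return basic - acidic


def starts_with_phenylalanine(peptide):
    """True if the first residue after the signal peptide cleavage site is F."""
    return peptide.upper().startswith("F")


# Hypothetical sequence downstream of a predicted signal peptide cleavage site.
toy_transit = "SKRNLKFSRKTLSIYNNE"
print(class_proportions(toy_transit))
print(crude_net_charge(toy_transit))           # 5 basic, 0 acidic in the window
print(starts_with_phenylalanine(toy_transit))  # False for this toy peptide
```

Counting histidine as basic mirrors the classes used in the figure; a pH-aware charge estimate would treat it differently.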

Figure 2.
Core plastid metabolism in apicomplexans and their relatives.

(A) Presence of genes and pathway modules (top; abbreviations in Supplementary file 4) in representative genomes (G) and transcriptomes (left). Each gene (box) is color-coded as to its evolutionary origin, as determined by a maximum likelihood phylogeny (plastid-encoded rps, rpl, rpo, trn genes were not analyzed). Empty boxes indicate gene absence in completed genomes and blank spaces indicate absence in transcriptomes. Intracellular localization of corresponding proteins is shown by a circle inside the box and summarizes known experimental data (Supplementary file 4) or de novo prediction in silico by SignalP v4.1 (Supplementary file 5); it is missing in proteins with incomplete N-termini. Note that only some enzymes of the heme pathway (HEM) are localized in the plastid (*1) and that signal peptides in FAS:ER and ELO were not predicted (*2). (B) Dependence network of plastid protein modules for the biosynthesis of key metabolites (isoprenoid precursors IPP and DMAP, fatty acids and heme), which underlie dependency on the plastid organelle in Apicomplexa. Colored regions contain modules specific to one pathway: fatty acid (pink), isoprenoid precursor (pale green) and heme biosynthesis (yellow). Interactions are reconstructed from the literature and substrates are shown near arrows (PYR = pyruvate; AcCoA = acetyl coenzyme A; Lip:E2 = lipoylation on PDHE2; C8:ACP = octanoyl:acyl carrier protein; [FE-S] = iron sulphur cluster; PEP = phosphoenolpyruvate; e- = electron reductive power; GA3p = glyceraldehyde-3-phosphate; 3PGA = 3-phosphoglycerate; ALA = δ-aminolevulinic acid; ? = uncertainty).

Figure 2—figure supplement 1.

Transit peptides in Digyalum plastid proteins have positive charge but lack a conserved phenylalanine in the first position.

(A) Amino acid composition of plastid transit peptides in Digyalum in comparison with downstream sequences of mature plastid proteins and transit peptides in other complex plastids as reported by Patron and Waller (2007) (only the first 14 residues were analyzed to enable consistent comparison). Average proportions of individual amino acids and their classes based on charge (colored bars) are shown: acidic (D + E) in red, basic (H + K + R) in blue, polar uncharged (C + N + Q + S + T + W + Y) in green and hydrophobic (A + F + G + I + L + M + P + V) in black. Numbers of proteins analyzed are indicated in square brackets behind species/group names. (B) WebLogo3 diagram of the relative composition of the first 20 amino acids in 41 target peptides of Digyalum oweni. Letter size indicates the probability of an amino acid occurring at a specific position, weighted to account for the relative abundance of each amino acid's natural occurrence and for unrepresentative results arising from a small sample size.

Figure 2—figure supplement 2.

Maximum likelihood phylogeny of the triose phosphate translocator, TPT (LG+F+R7 model in IQ-TREE with 10000 UFBoot2 replicates; ≥80 are shown; black dots indicate full support).

Apicomplexans, chrompodellids and Digyalum are shown on orange background and other eukaryotes on gray background. Selected characterized proteins are highlighted in bold.

Apicomplexan plastids are widespread, and their genomes are highly divergent

The discovery of plastids with genomes in deep-branching apicomplexans raises the question of whether they correspond to the undescribed apicomplexan-related lineages (ARLs) as defined by plastidial 16S rDNA (Janouškovec et al., 2012). Here, we discovered ten novel ARLs among environmental 16S rDNAs in GenBank and clustered 16S rDNAs obtained from the VAMPS database (Huse et al., 2014) by a phylogenetic sorting approach similar to the one used previously (Janouškovec et al., 2012) (Materials and methods). We also discontinue the use of two recently described ARLs, ARL-X and ARL-XI (Mathur et al., 2018), which we find to be members of the Vitrella clade (ARL-I; Figure 3A) and bacterial contaminants, respectively (Materials and methods). This brings the total number of unidentified ARLs to 15, in addition to three ARLs represented by Chromera, Vitrella, and corallicolid clades (see Supplementary file 6 for reference sequences for all ARLs). A global maximum likelihood phylogeny of 16S rDNAs of bacteria, plastids and ARLs (including representative sequences of known ARLs and all GenBank and VAMPS centroid sequences of novel ARLs) readily illustrates that many plastids in apicomplexans and their relatives are yet to be discovered (Figure 3A). It also shows that some new ARLs (ARL-XV, ARL-XVI, ARL-XVII) are comparatively diversified and abundant.

Figure 3.
Plastid genomes in apicomplexans and their relatives are widespread and highly divergent.

(A) Maximum likelihood phylogeny of plastid-encoded 16S ribosomal RNA genes (rDNA) reveals 10 novel apicomplexan-related lineages (ARL-IX and ARL-XII to ARL-XX; Supplementary file 6). Tree was computed with the best-fit TVMe+R5 model in IQ-TREE with UFBoot2 supports at branches (10000 replicates; ≥80 are shown; black dots indicate 100 support). Environmental sequences (dark blue) are derived from GenBank or VAMPS (97% identity cluster centroids; numbers of reads are shown in square brackets where > 1). Plastid 16S rDNA transcripts of newly sequenced species are shown in red. Note that sequences in the tree vary greatly in their AT content and substitution rates, which can induce a misleading topology; deep relationships in the tree should therefore be interpreted with caution. The fast-evolving sequences of peridinin dinoflagellates were not included. (B) Extremely high AT content in rhytidocystid plastid genomes (arrows). AT content of representative species from part A and Balanophora laxiflora and B. reflexa parasitic plants (Su et al., 2019), all abbreviated to first three letters, is shown for 16S rDNA and plastid genomes. Plastid genomes in the newly sequenced species are only partially reconstructed from transcripts (red color; see Supplementary file 7). Altered genetic codes are indicated. (C) Euler diagram of plastid genome contents in apicomplexans, dinoflagellates (ancestral gene sets for each), Digyalum, Chromera, and Vitrella. Genes on the green background are associated solely with photosynthesis. Small RNA genes are not shown.

Although ARLs have now more than doubled in number, none of them corresponds to the plastid 16S rDNAs of Digyalum, Eleutheroschizon, Siedleckia, Selenidium or Rhytidocystis. Instead, the five genera have some of the fastest-evolving and most AT-rich 16S rDNAs of all apicomplexans, and they cluster with compositionally similar sequences of the more distantly related hematozoans (compare Figure 1A and Figure 3A). Such artificial grouping of highly divergent sequences is a well-known phylogenetic artifact, and deep-level relationships in the tree thus ought to be interpreted with caution. The 16S rDNAs of Digyalum, Eleutheroschizon, Siedleckia, Selenidium and Rhytidocystis were recovered among a set of AT-rich transcriptomic contigs, which encode other genes typical of apicomplexan plastid genomes (Supplementary file 7; Materials and methods). In Rhytidocystis sp. 1 and sp. 2, the AT content of 16S rDNAs (84% and 86%, respectively) and plastidial transcripts (88% and 91%, respectively) is among the highest of all plastid genomes described to date (Figure 3B) (Su et al., 2019). Digyalum, Siedleckia, Selenidium, and Rhytidocystis sp. 1 use a non-canonical genetic code in which UGA encodes tryptophan; UGA is absent from the fragmentary plastid DNA of Rhytidocystis sp. 2 and serves as a STOP codon in Eleutheroschizon (Figure 3B). The Digyalum plastid encodes six genes never identified in apicomplexan plastids and lacks nine genes they do contain; only one gene (rps18) was relocated to the nucleus in parallel in both lineages (Figure 3C).
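The two quantities discussed here, AT content and the reassignment of UGA from STOP to tryptophan, can be made concrete with a short example. This is a minimal sketch under stated assumptions: the codon table is the standard genetic code with an optional UGA-to-W switch, and the function names and the toy sequence are hypothetical rather than taken from the study.

```python
# Minimal sketch: AT content and translation with UGA read as tryptophan.
from itertools import product

# Standard genetic code in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def at_content(seq):
    """Fraction of A and T (or U) among unambiguous nucleotides."""
    seq = seq.upper().replace("U", "T")
    called = [base for base in seq if base in "ACGT"]
    if not called:
        raise ValueError("no unambiguous nucleotides in sequence")
    return sum(base in "AT" for base in called) / len(called)


def translate(seq, uga_is_trp=False):
    """Translate frame 1; '*' marks STOP, 'X' marks unrecognized codons."""
    seq = seq.upper().replace("U", "T")
    table = dict(STANDARD)
    if uga_is_trp:
        table["TGA"] = "W"  # non-canonical code: UGA encodes tryptophan
    protein = []
    for i in range(0, len(seq) - 2, 3):
        protein.append(table.get(seq[i:i + 3], "X"))
    return "".join(protein)


# Hypothetical AT-rich toy sequence with an internal TGA codon.
toy = "ATGAAATGATTATAA"
print(round(at_content(toy), 3))          # 13 of 15 bases are A or T
print(translate(toy))                     # TGA read as STOP
print(translate(toy, uga_is_trp=True))    # TGA read as tryptophan
```

Under the standard code the toy reading frame is interrupted at the internal TGA, whereas under the alternative code it continues to the terminal TAA, which is the kind of contrast that indicates a reassigned codon in a plastid transcript.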

Horizontal transfer, fusion and fission events in plastid-associated genes

To test whether the plastids in Digyalum, chrompodellids and apicomplexans are likely derived from a common source, we searched for shared innovations in their plastid genes. We observed that the unusual plastid DNA replication and repair complex (PREX) (Seow et al., 2005) is found not only in apicomplexans and chrompodellids (Janouškovec et al., 2015) but also in Digyalum (Figure 2A and Figure 4). The PREX protein contains N-terminal primase and helicase domains, which are homologous to the mitochondrial primase-helicase Twinkle and fused with an Aquifex-type exonuclease-polymerase downstream. Phylogeny of the polymerase unit confirms that it was acquired from an unknown bacterial source related to Aquifex before the Digyalum-apicomplexan split (Figure 4A,C). The gene subsequently fused with the Twinkle gene and the product became targeted to the plastid (the N-terminus of PREX is incomplete in most species including Digyalum but contains a signal peptide in Chromera; Supplementary file 5). Another unusual fusion took place in 4-hydroxy-3-methylbut-2-enyl diphosphate reductase (IspH), the last enzyme in plastid isoprenoid precursor biosynthesis. A canonical plastid ispH gene of cyanobacterial origin, which is also present in dinoflagellates and Perkinsus, was replaced by a Chlamydiae-like variant in the ancestor of Digyalum and Apicomplexa and fused with the plastid gene for sedoheptulose-1,7-bisphosphatase form 3 (SBP3) (Figure 4B,C). The fusion is absent in Toxoplasma, Plasmodium and piroplasms, which lack SBP3, and has been interpreted as a derived characteristic in Chromera (Petersen et al., 2014); we find instead that it is broadly distributed in apicomplexans and their relatives.

Figure 4.
Innovations in plastid and cytosolic genes define early evolution of apicomplexans and their relatives.

(A) Maximum likelihood phylogeny of the exonuclease/polymerase subunit of the plastid replication and repair complex, PREX (LG+R6 model). (B) Maximum likelihood phylogeny of 4-hydroxy-3-methylbut-2-enyl diphosphate reductase, IspH (LG+I+G4 model). Trees were derived from protein sequences in IQ-TREE and have UFBoot2 supports at branches (10000 replicates; ≥80 are shown; black dots indicate 100 support). Apicomplexans, chrompodellids and Digyalum are highlighted in orange, and other eukaryotes in gray. Characterized enzymes are highlighted in bold. Fusion ispH genes with SBP3 at the N-terminus are shown in white boxes. (C) Predicted gain, loss, fusion and fission events in the evolution of five plastid-associated and two cytosolic genes. Genes are shown by boxes (jagged edges indicate truncated genes) and are color-coded by origin as in Figure 2A (plastid endosymbiont = green, eukaryotic host = red, bacteria = purple, absent in genome data = white, absence of evidence = blank area). Outgroup (*) shows the state in the closest relevant comparator: dinoflagellates (IspH, HemH, MDH/LDH, GS-I) or other algae (PREX, SBP3, RpoC2, FabG). Note that the ferrochelatase (HemH) is mitochondrial in Plasmodium but probably plastidial in Chromera and some apicomplexans. Abbreviations: POP = plant organellar DNA polymerase, Twinkle = mitochondrial primase/helicase, SBP3 = sedoheptulose-1,7-bisphosphatase form 3, MDH/LDH = malate/lactate dehydrogenase (other abbreviations in Supplementary file 3).

Figure 4—figure supplement 1.

The region of the split in the apicomplexan plastid-encoded rpoC2.

Plastid genome sequences and their three-frame amino acid translations are shown for representative apicomplexan species (snapshots from Artemis 17.0.1). Coding sequences are highlighted by horizontal rectangles. STOP codons are shown by # (=TAA), + (=TAG), and * (=TGA) symbols. In some taxa, TGA encodes tryptophan (W). Note that the split occurs in a non-conserved part of rpoC2, so the corresponding region cannot be shown for a non-apicomplexan outgroup; rpoC2 genes in chrompodellids, Digyalum and all other plastids nevertheless lack the split.

Figure 4—figure supplement 2.

Maximum likelihood phylogeny of the ferrochelatase, HemH (LG+R8 model in IQ-TREE with 10000 UFBoot2 replicates; ≥80 are shown).

Apicomplexans, chrompodellids and Digyalum are shown on orange background and other eukaryotes on gray background. Selected characterized enzymes are highlighted in bold.

Figure 4—figure supplement 3.

Maximum likelihood phylogeny of the beta-ketoacyl-acyl carrier protein reductase, FabG (LG+R6 model in IQ-TREE with 10000 UFBoot2 replicates; ≥80 are shown).

Apicomplexans, chrompodellids and Digyalum are shown on orange background and other eukaryotes on gray background. Selected characterized enzymes are highlighted in bold.

Another unique evolutionary event in apicomplexan plastids is a fission in a non-conserved region of the plastid-encoded rpoC2 gene. The unprecedented split was once interpreted as a read-through frame shift (in Plasmodium) or read-through STOP codons (in Toxoplasma and Eimeria) in a continuous rpoC2 (Cai et al., 2003; Wilson et al., 1996). We instead find that all apicomplexan rpoC2 genes are split within the same region, including those in the deep-branching Selenidium and Siedleckia (Figure 4C and Figure 4—figure supplement 1). The two rpoC2 moieties are found in different reading frames in most species, but two different in-frame STOP codons (UAA and UAG) separate them in Toxoplasma. Both frame shifting and STOP codon read-through would be required for continuous rpoC2 translation in Eleutheroschizon (Figure 4—figure supplement 1). Such variable gene arrangements are incongruent with the expression of the apicomplexan rpoC2 as a single protein. Messenger RNA editing likewise does not correct the rpoC2 reading frame in Plasmodium (Nisbet et al., 2016). Indeed, the downstream rpoC2 moiety almost invariably possesses an ATG start codon near the split site, allowing it to be translated independently (Figure 4—figure supplement 1). The evidence altogether points to a fission event in rpoC2, which is absent in all other plastids, including those of Digyalum and chrompodellids, and thus represents a defining ancestral characteristic of the apicomplexan plastid. Two additional plastid-associated proteins in apicomplexan and chrompodellid plastids derive from horizontally acquired genes.
The first is Alphaproteobacteria-like ferrochelatase (HemH), which is localized to mitochondria in Plasmodium but likely targeted to plastids in Chromera and possibly in some apicomplexans (see Discussion) (Koreny et al., 2011; Sato and Wilson, 2003; Varadharajan et al., 2004), and the other is Verrucomicrobia-like beta-ketoacyl-acyl carrier protein reductase (FabG) (Janouškovec et al., 2015). Digyalum encodes the same alphaproteobacterial HemH but it has an unusual FabG, which is related to other bacteria (Figure 4C, Figure 4—figure supplement 2 and Figure 4—figure supplement 3). Finally, genes of two well-known cytosolic proteins (Huang et al., 2004; Zhu and Keithly, 2002) were also acquired from bacteria in the Digyalum-Apicomplexa ancestor: lactate/malate dehydrogenase and glutamine synthase type I (Figure 4C).

Discussion

Convergent evolution of apicomplexan-like morphology

Generating transcriptomes from uncultured apicomplexans across their evolutionary diversity provides the first comprehensive insights into relationships between major apicomplexan groups. Two species with apicomplexan-like morphology either described as apicomplexans (Digyalum) or yet unclassified (Symbiont X) are not members of Apicomplexa sensu stricto. This shows that apicomplexan-like parasites are polyphyletic and evolved at least three times independently. Specifically, large trophont stages attached to intestines of marine invertebrates by specialized apical structures are products of convergent evolution. The trophonts of Digyalum, for example, parasitize the gut epithelium of Littorina snails and their attachment structure contains an apical complex with a protruded polar ring, which provides a gateway for rhoptry-mediated secretion – a combination of traits typical for gregarine apicomplexans (Dyson et al., 1994; Dyson et al., 1993). Symbiont X is known only from light microscopy data but would be also readily classified as an apicomplexan based on crude characteristics: it parasitizes the gut of Scoloplos armiger polychaetes being attached to the host epithelium by its apical end (unpublished data). Tracing the evolution of such parasite characteristics, however, indicates that the basis for convergence lies in evolution acting on similar preconditions. Apical complexes with rhoptries, micronemes and pseudoconoids are present in free-living relatives of apicomplexans, such as in the predatory Colpodella and Psammosa and photosynthetic Chromera, and in more distantly related parasites such as Perkinsus and Parvilucifera (Foissner and Foissner, 1984; Norén et al., 1999; Oborník et al., 2011; Okamoto and Keeling, 2014; Perkins, 1996). Such broad distribution points to a single origin of the apical complex in the ancestor of apicomplexans and dinoflagellates in a non-parasitic context (Figure 1A). 
Because the apical complex mediates secretion often associated with cell-to-cell interactions, it is likely an important precondition in multiple origins of parasitism in both apicomplexans and dinoflagellates. In similar host environments the structure may have also promoted convergent parasite morphologies. The use of the apical complex in extracellular attachment and secretion in the gut epithelium of animal hosts, for example, may have triggered convergent expansion in the cell size of gregarine, Digyalum and Symbiont X trophonts. Unsurprisingly, convergent similarities in the three lineages are accompanied by considerable differences in detailed morphology: Digyalum and Symbiont X do not glide or twist but pulsate, and the detailed ultrastructure of their apical complex and pellicle (Dyson et al., 1994; Dyson et al., 1993) (unpublished data) is distinct from that in gregarines (Kováčiková et al., 2017; Paskerova et al., 2018; Valigurová et al., 2017). Similar divergence characterizes their molecular make-ups: apicomplexans are well-known auxotrophs for purines, but Digyalum contains a pathway for their synthesis (data not shown) and, although both lineages lost photosynthesis, their plastid genomes have been reduced in different ways (Figure 3C). The convergent morphologies of Digyalum, Symbiont X and apicomplexans are therefore rather superficial similarities, but the possibility that they were driven by the presence of shared ancestral traits (such as the apical complex) in similar host habitats highlights the importance of preconditions in the origin of parasites (Janouskovec and Keeling, 2016).

Relationships and morphological transitions in early apicomplexan evolution

Key findings of multiprotein phylogenies are that eugregarines and gregarines are unequivocally monophyletic (Figure 1 and Figure 1—figure supplement 1). This broadly supports traditional morphological classifications (Grassé, 1953) in contrast to many recent proposals based on rDNA phylogenies, which regard both groups as polyphyletic (Cavalier-Smith, 2013). Indeed, protein sequences allow for building larger phylogenetic matrices and have more even substitution rates than rDNAs, some of which are notoriously divergent and phylogenetically unstable. Although the sampling of eugregarine diversity is incomplete, our phylogeny contains six of their seven main lineages at the superfamily level (Simdyanov et al., 2017) – one of them being Polyrhabdina, which is apparently not a lecudinoid (Figure 1A and unpublished data). The eugregarine monophyly provides the first unambiguous phylogenetic support for their two candidate synapomorphies: the ultrastructure of the epimerite, and the ultrastructure of epicytic crests (Simdyanov et al., 2017). It also partially resolves the little understood relationships between eugregarine superfamilies (Cavalier-Smith, 2014; Simdyanov et al., 2017; Simdyanov et al., 2015). The blastogregarine Siedleckia groups strongly with the archigregarine Selenidium. The sampling of both groups is incomplete, especially in the archigregarines, but they do share a combination of characteristics that are rare or absent in other apicomplexans, namely active bending and twisting movement in trophozoites, pellicular folds running along the body length in many species, and one or more layers of longitudinal microtubules underlying the pellicle (Schrével et al., 2016; Simdyanov et al., 2018). Both lineages also feed by myzocytosis, which is known in some relatives of apicomplexans (Foissner and Foissner, 1984; Mylnikov and Mylnikova, 2008) but not in eugregarines, cryptosporidians or coccidiomorphs. 
Because the blastogregarine ultrastructure and life cycle share additional similarities with coccidians (Simdyanov et al., 2018), confirming their phylogenetic position is important for reconstructing character evolution in Apicomplexa as a whole. The monophyly of gregarines sensu lato in our trees (Figure 1 and Figure 1—figure supplement 1) provides phylogenetic support for a tentative synapomorphy of this group, the forming of a gametocyst during the life cycle (Simdyanov et al., 2017), although this characteristic is absent in blastogregarines. The position of cryptosporidians is still not fully resolved, but the group is consistently recovered next to gregarines (Figure 1 and Figure 1—figure supplement 1). Eleutheroschizon is clearly important to understanding the origin of eucoccidians (Figure 1A) (Valigurová et al., 2015), although its classification as a protococcidian is apparently incorrect (unpublished data). Two Rhytidocystis species are positioned as a basal group in the class Coccidiomorpha (Figure 1A), which supports traditional views on their classification closer to coccidians (Levine, 1979; Porchet Hennere, 1972) rather than to gregarines (Cavalier-Smith, 2014). The bigger group that includes rhytidocystids and related parasites of annelids and molluscs (Kristmundsson et al., 2011) therefore provides a stepping stone for understanding the evolution of coccidiomorphs, including the majority of medically important apicomplexans.

Widespread maintenance but multiple losses of plastids in apicomplexa

Evidence for endosymbiotic genes with plastid targeting signals and plastid genomes shows that plastids are common in deep-branching apicomplexans, despite not having been previously recognized (Figure 2, Figure 3 and Supplementary file 5). Their plastid metabolic networks are similar to those in Toxoplasma and Plasmodium and generally serve to produce three essential metabolites: isoprenoid precursors for the synthesis of carotenoids, ubiquinone and prenyls; heme as a protein co-factor; and fatty acids as constituents of cellular lipids (Imlay and Odom, 2014; Ramakrishnan et al., 2012; van Dooren et al., 2012). Heme biosynthesis is partially localized in the cytosol and mitochondria, and fatty acids can be extended by non-plastidial elongases (Figure 2A) (Ramakrishnan et al., 2012; Zhu et al., 2004), but de novo biosynthesis of all three metabolites outside of the plastid appears to be absent. This creates dependence on the plastid for the production of isoprenoid precursors, heme, and fatty acids, which can be overcome only by metabolite uptake from the environment. Salvage of host isoprenoids, heme, and fatty acids is common in parasites, but obtaining them in sufficient amounts across the life cycle is apparently challenging. Most parasites retain all three plastidial pathways, and only two lineages have lost plastid organelles altogether (Figure 5): cryptosporidians (Abrahamsen et al., 2004), and the dinoflagellates Hematodinium and Amoebophrya (Gornik et al., 2015; John et al., 2019). In free-living organisms, plastids are always retained, perhaps because replacing their metabolism by salvage is even more difficult (Janouškovec et al., 2017). Most consistently required is the biosynthesis of isoprenoid precursors (Figure 5), which is the only indispensable plastidial pathway in piroplasms, the Plasmodium blood stage, and possibly in Symbiont X (Figure 2A) (Lizundia et al., 2009; Yeh and DeRisi, 2011). 
An unprecedented situation exists in the eugregarine Lankesteria, where fatty acid biosynthesis appears to be the only retained plastidial anabolic pathway (the transcriptome is fragmentary but enzymes for the biosynthesis of isoprenoid precursors and heme are absent). Interestingly, the existence of plastids in Selenidium and Rhytidocystis is also consistent with ultrastructural reports of multimembrane organelles in some of their species, which correspond to non-photosynthetic plastids in other apicomplexans by size, appearance, and their position near the nucleus (Figure 5) (Leander and Ramey, 2006; Porchet Hennere, 1972; Schrével, 1971; Schrével et al., 2016; Wakeman et al., 2014). Five eugregarines lack any evidence for plastid presence; these include Gregarina, for which a draft genome is available (Figure 2A). Since Lankesteria is phylogenetically nested among them, eugregarines must have lost plastids at least twice (Figure 5). Ascogregarina contains a fragment of PREX (Figure 2A and Figure 4A), but because the species lacks evidence for other endosymbiont-derived genes we conservatively consider that it encodes an ex-plastidial protein or is a contaminant. Our results indicate that the two plastid losses in eugregarines are independent of the one in cryptosporidians (Figure 1A), unlike previous assumptions (Cavalier-Smith, 2014; Toso and Omoto, 2007). This increases total known losses of plastid organelles in apicomplexans to three and in all eukaryotes to four (Figure 5).

Figure 5.

Plastid evolution in apicomplexans and their relatives.

Plastid-related characteristics (A–D), origin of parasitism, and predicted losses of photosynthesis, plastid genomes, and plastid organelles were mapped on the updated phylogeny. The phylogeny is fully resolved except one branch where more support is needed (*1). Predicted core plastid anabolic capabilities, presence of plastid genomes, and transmission electron microscopy evidence (TEM) for multimembrane organelles corresponding to plastids by their size, appearance, and position within cells are shown on the right.

Plastids in apicomplexans and their relatives were inherited vertically from a common ancestor

Plastids in Chromera and Vitrella share several unique characteristics with the apicomplexan plastid (Janouskovec et al., 2010) but their common origin was doubted in the past (Bodył et al., 2009) and other early endosymbiotic events could underlie plastid evolution in the broader group including dinoflagellates (Waller and Kořený, 2017). Testing whether plastids in apicomplexans, chrompodellids and Digyalum derive from the same endosymbiont is therefore relevant to understanding plastid evolution and frequency of plastid losses in general. We find three lines of evidence that support a common origin. Firstly, phylogenies of individual plastid genes repeatedly group Digyalum with apicomplexans and chrompodellids, and their topologies either reiterate nuclear phylogenies directly or are unresolved without consistently supporting alternatives (Figure 4A,B, Figure 4—figure supplement 2 and Figure 4—figure supplement 3). Secondly, unidentified lineages that carry plastid-encoded 16S rDNAs related to apicomplexans and chrompodellids (ARLs) are growing in number and diversity (Figure 3A). Some are apicomplexans (Janouškovec et al., 2013; Kwong et al., 2019) and others may be free-living, but they altogether support the idea that plastids in the lineage are widespread (Janouškovec et al., 2012). Finally, unique evolutionary innovations in the plastidial PREX, ispH, and hemH link together the plastids in apicomplexans, chrompodellids, and Digyalum (bacterial fabG and split rpoC2 further link plastids in some of them). It is difficult to see how endosymbiosis would move plastids in or out of the lineage without changing the distribution of these genes. Altogether, the evidence suggests that plastids in apicomplexans, chrompodellids, and Digyalum were inherited vertically from a common ancestor, are widely distributed in the group, and are most likely retained by default, particularly in free-living representatives (Janouškovec et al., 2015; Janouskovec et al., 2010).

Metabolic evolution in the plastid

Core plastid metabolism in Digyalum, Symbiont X, and apicomplexans has been remarkably conserved across long time scales and reveals only a few outstanding variations. In Plasmodium falciparum, triose phosphate sugars are imported by two triose phosphate translocators (TPTs) residing in the outermost (PfoTPT) and innermost (PfiTPT) plastid membranes, respectively (Mullin et al., 2006). Toxoplasma contains only one TPT phylogenetically corresponding to PfoTPT. Orthologs of the PfoTPT lack N-terminal targeting signatures and are ubiquitous in apicomplexans, chrompodellids, and Digyalum (Figure 2—figure supplement 2). PfiTPT orthologs are less common (Gile and Slamovits, 2014), but they possess N-terminal signal peptides in Chromera, Vitrella, and Rhytidocystis sp. 2 that are compatible with their targeting to the inner plastid membrane (Figure 2A and Figure 2—figure supplement 2). The split between the two forms therefore likely predated Apicomplexa and the loss of the PfiTPT form in the lineage leading to Toxoplasma is apparently a derived evolutionary state. One enzyme that processes the TPT substrate dihydroxyacetone phosphate (DHAP) is triose phosphate isomerase (TPI-II), and the failure to identify TPI-II in piroplasms once led to the proposition that their plastids import glyceraldehyde-3-phosphate, the TPI-II product (Fleige et al., 2010). We find that TPI-II, although highly divergent in piroplasms, is present in all apicomplexans with plastids, and it frequently possesses N-terminal signal peptides for plastid targeting (Figure 2A). This suggests that the import and conversion of DHAP is conserved across apicomplexan plastids.

Analysis of heme biosynthesis enzymes suggests that the pathway consistently starts in the mitochondrion, as in heterotrophic eukaryotes – the algal C5-pathway must have been lost prior to the Digyalum-Apicomplexa divergence. Delta-aminolevulinic acid is likely imported to the plastid and processed to coproporphyrinogen III (Ralph et al., 2004; Sato et al., 2004). The last three enzymes in some apicomplexans sequenced by us are unexpectedly predicted to be also plastidial, similarly to Chromera and Vitrella (Koreny et al., 2011). In Plasmodium, and perhaps also in Toxoplasma and Digyalum, the last three heme biosynthesis enzymes localize in the cytosol and mitochondria, where heme is most needed (Varadharajan et al., 2004). It would be interesting to explore these contrasting localization predictions experimentally, including the possibility that some enzymes are dually targeted, or their isoforms (where present; Figure 2A) are differentially targeted to different cellular compartments.

Highly divergent plastid genomes and their multiple losses in chrompodellids

Unlike metabolism in the plastid, plastid genome structure shows unexpected variations across Apicomplexa. Plastid genomes in some of the newly sequenced taxa, as partially reconstructed from transcripts, have unusually AT-rich and fast-evolving sequences (Figure 3A). Completing the fragmentary plastid genome of Rhytidocystis species 2 (Supplementary file 7) is of particular interest because it has the most AT-rich 16S rDNA, and potentially the most AT-rich genome, ever recorded among plastids (Figure 3B) (Su et al., 2019). Plastid genome reduction in Digyalum and Apicomplexa from their common ancestor is primarily explained by the loss of photosynthesis (Figure 3C). This was accompanied by relatively modest transfer of genes to the nucleus, which involved remarkably different gene sets in the two lineages (only rps18 was relocated in both in parallel).

Underlying the very existence of plastid DNA in Plasmodium and Toxoplasma is sufB, which encodes one of only two broadly conserved, plastid-encoded proteins with function other than transcribing and translating the genome itself. Of the newly sequenced species with plastids, only Symbiont X lacks any evidence for the sufB gene or plastid genome (Figure 3A). It would be expected that Symbiont X sufB is encoded in the nucleus and was relocated there in its common ancestor with Chromera, Voromonas, Colpodella, and Alphamonas (Janouškovec et al., 2015), thus allowing heterotrophs in this lineage to lose plastid genomes at least three times independently (Figure 5).

Summary and future directions

We provide a strongly resolved phylogeny and large-scale sequence data for major apicomplexan groups, but the sampling of Apicomplexa is far from complete. Expanding the phylogenetic dataset with new species and genes will allow for testing key conclusions of our study (e.g., monophyly of gregarines and eugregarines) and understanding the relationships of taxa that remain poorly sampled (archigregarines, blastogregarines, protococcidians) or lack genome-level sequences (corralicolids, adeleid and aggregatid coccidians, and other incertae sedis taxa). Parasites with characteristics similar to apicomplexans could provide a study system for morphological and molecular convergence and insights into the transition from free-living species to obligate symbionts. Plastids in apicomplexans and their relatives are apparently ancestral and widespread, and more are likely to be discovered. The plastid function in apicomplexans, chrompodellids and Digyalum rarely includes photosynthesis, but it always involves synthesis of one or more indispensable metabolites (Figure 5). Losses of plastid organelles and their genomes are infrequent but did occur several times in the broader group, and are likely to provide an unparalleled model for understanding factors that mediate plastid maintenance and loss in eukaryotes as a whole.

Interestingly, two key conclusions of our study are independently reinforced by a bioRxiv preprint (Mathur et al., 2019), which describes a complementary set of parasite transcriptomes from the same group. Firstly, the manuscript reports two parasites with apicomplexan-like morphologies that likewise branch outside Apicomplexa. The early-branching Platyproteum, a representative of ‘squirmids’ (Cavalier-Smith, 2014), is the sister group of Digyalum, as has been apparent to us from 18S rDNA phylogenies (unpublished data). Piridium then represents a sister taxon to Vitrella and therefore a fourth independent emergence of apicomplexan-like parasitism in the lineage. Secondly, plastids appear to be consistently absent in most eugregarine superfamilies except for the Lecudinoidea (Lankesteria in our study and Lecudina and Pterospora in the other study), where fatty acid biosynthesis is the only core plastidial pathway. The presence of two additional eugregarine superfamilies in our trees (Ancoroidea and Polyrhabdina) points to an extra case of plastid loss (Figure 5), but the relationships of eugregarine superfamilies which are present in both studies are fully congruent. Integrating sequence datasets from both studies will be a first step in creating a phylogenetic framework for apicomplexan evolution. This framework will likely be useful in illuminating steps in the emergence of parasitism and in predicting cell biology of less known parasites by the methods of comparative genomics.

Materials and methods

Key resources table

Reagent type (species) or resource | Designation | Source or reference | Identifiers | Additional information
Commercial assay or kit | RNAqueous-Micro Total RNA Isolation Kit | ThermoFisher | cat no. AM1931 |
Commercial assay or kit | SMART-Seq v4 Ultra Low Input RNA Kit | Takara | cat no. 634888 |
Software, algorithm | MAFFT v7.402 | Katoh and Standley, 2013 | RRID:SCR_011811 |
Software, algorithm | IQ-TREE v1.6.5 | Nguyen et al., 2015 | RRID:SCR_017254 |
Software, algorithm | PhyloBayes v4.1c (and MPI v1.7b) | Lartillot et al., 2009 | RRID:SCR_006402 |

Parasite sampling, transcriptome sequencing and assembly

Parasite cells (1 to approximately 70 individuals) were isolated from marine annelid, mollusc and barnacle hosts collected at the White Sea Biological Station of Moscow State University during August of 2016 (Supplementary file 1). Cells were hand-picked by using a glass micropipette (30% ethanol was used to detach the cells of Digyalum and Eleutheroschizon), washed 1 to 4 times in clean seawater and transferred into a clean tube in a final volume of 2–5 µl of seawater (excess seawater was removed if necessary after a brief spin). Care was taken to avoid contamination by animal host cells and gut contents when isolating parasite cells, but we were not able to fully prevent it (host contaminants were observed in the sequence data but they were clearly distinguishable from the parasites; see below). A volume of 100 µl of Lysis buffer (RNAqueous-Micro Total RNA Isolation Kit; Ambion/ThermoFisher, cat no. AM1931) was added into the tube with parasite cells and the samples were stored at −80 °C for several weeks. Total RNA was extracted from samples by RNAqueous-Micro Total RNA Isolation Kit according to the manufacturer's instructions but without the DNase I digest step. In Digyalum and Eleutheroschizon, RNA from two independent cell isolations was combined before further processing (Supplementary file 1). Cells of Digyalum oweni WS3-2017 were stored in RNALater and used for reverse-transcription without extracting RNA. RNA was reverse-transcribed by using SMART-Seq v4 Ultra Low Input RNA Kit (Takara; cat no. 634888), however with 20–30 amplification cycles (optimization was done as in SMARTer Pico PCR cDNA Synthesis Kit). Indexed TruSeq libraries were built at Edinburgh Genomics and sequenced as paired-end 150 bp reads in two multiplexed lanes on the Illumina HiSeq 4000 machine. For the Digyalum oweni WS3-2017 sample, paired-end 150 bp HiSeq 4000 reads were produced independently of other samples.
Demultiplexed reads were processed by cutadapt v1.8.3 to remove low quality nucleotides (-q 20 setting), polyA tails, SMARTer adapters and leftover Illumina barcodes (on 3' ends of shorter sequence fragments). Reads of minimum length of 100 nucleotides were assembled in Trinity v2.4.0 by using the default settings. Predicted proteomes were generated as six-frame translations of transcriptomic contigs. Analysis of assembled reads revealed minor cross-contamination between samples due to adapter swapping, but contaminants were well distinguishable by read coverage. Raw sequence reads and our assemblies were deposited in NCBI Bioprojects PRJNA557242 and PRJNA556465. Photographs of studied parasites (Figure 1B) were made using Leica DM 2500 microscopes equipped with DIC optics and a digital camera (Leica, Germany) at the White Sea Biological Station of Moscow State University and the Marine Biological Station of St. Petersburg State University.
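The six-frame translation step mentioned above can be illustrated with a minimal Python sketch. It is not the authors' script: the function names are hypothetical, the standard genetic code is assumed for nuclear transcripts (plastid transcripts in some taxa read TGA as tryptophan, which this sketch does not handle), and codons containing ambiguous bases are written as X.

```python
from itertools import product

# Standard genetic code (NCBI table 1) in TCAG order; an assumption of this
# sketch, since the source does not state which code was applied.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons only; unknown codons become X."""
    codons = (seq[i:i + 3] for i in range(0, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def six_frame_translate(seq):
    """Translate a contig in three forward (+) and three reverse (-) frames."""
    seq = seq.upper()
    frames = {}
    for strand, template in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(template[offset:])
    return frames


if __name__ == "__main__":
    for frame, protein in six_frame_translate("ATGGCCTGATTT").items():
        print(frame, protein)
```

Keeping all six frames, stop symbols included, defers the choice of reading frame to the later similarity searches, which is one reason such translations are convenient for fragmentary single-cell transcriptomes.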

Single protein sequence alignments for nuclear phylogeny

We built our dataset from 339 protein sequence alignments previously used in stramenopile phylogenies and tested for orthology (Derelle et al., 2016). Firstly, a slow-evolving stramenopile sequence query for each alignment (mostly the Phytophthora sequence) was used to retrieve five best BLASTP hits (e-value cutoff of 1e-5) and select the closest ortholog in translated alveolate, stramenopile, and rhizarian genomes (Supplementary file 2). Secondly, a new alveolate protein sequence query (primarily from Vitrella) was used in the same way to expand the dataset with our newly generated and existing transcriptomes. Closest hits from animal, fungal, microsporidian and bodonid genomes were also included to distinguish sequences derived from animal host and other contaminants in the samples. Orthologous sequences were identified by multiple rounds of alignment clean-up as guided by Maximum Likelihood phylogenies. Initial rounds of phylogenies were focused on removing out-paralogs and contaminants by using default alignment in MAFFT v7.402 (Katoh and Standley, 2013), alignment trimming in BMGE v1.12 (Criscuolo and Gribaldo, 2010) with the -b 3 -g 0.4 setting and phylogeny in FastTree v2.1.10 (Price et al., 2010) with the -lg -gamma setting. Later rounds were focused on resolving difficult cases of paralogy, horizontal gene transfer, and the orthology of fast-evolving sequences: localpair (L-INS-i) alignment in MAFFT; -b 4 -g 0.4 settings in BMGE; and IQ-TREE v1.6.5 (Nguyen et al., 2015) phylogeny with the LG+I+G4+F model and 1000 UFBoot2 supports.
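The hit-retrieval rule described above (up to five best BLASTP hits per query at an e-value cutoff of 1e-5) can be sketched as follows. This is an illustration under stated assumptions rather than the authors' pipeline: it assumes BLAST tabular output with the default twelve columns (e-value in column 11, bit score in column 12), ranks hits by bit score, and the function name top_hits is hypothetical.

```python
from collections import defaultdict


def top_hits(tabular_lines, max_hits=5, evalue_cutoff=1e-5):
    """Keep up to max_hits best subject hits per query from tabular BLAST rows.

    Assumes tab-separated rows with the default twelve columns, where
    column 11 is the e-value and column 12 the bit score.
    """
    hits = defaultdict(list)
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank and comment lines
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue <= evalue_cutoff:
            hits[query].append((subject, evalue, bitscore))
    # Rank by descending bit score, breaking ties by ascending e-value.
    return {
        query: sorted(rows, key=lambda row: (-row[2], row[1]))[:max_hits]
        for query, rows in hits.items()
    }
```

Retaining several hits per species, rather than only the single best one, leaves room for the later phylogeny-guided rounds to tell orthologs apart from paralogs and contaminants.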
Of the original 339 single protein sequence alignments (see Supplementary file 1 in Derelle et al., 2016), 43 were excluded in the process: 10 were absent in apicomplexans and dinoflagellates (alignments #72, #290, #291, #292, #294, #295, #300, #301, #304 and #315) and 33 were sparsely sampled or strongly incongruent with known organismal relationships (alignments #29, #57, #70, #73, #82, #149, #151, #155, #157, #159, #163, #164, #167, #171, #183, #184, #191, #193, #194, #198, #241, #246, #260, #265, #270, #283, #285, #289, #302, #303, #305, #330, #331). In the remaining 296 alignments, paralogs were reduced to the most slowly evolving sequence with non-conflicting phylogenetic placement (else both were removed) and multiple gene isoforms were reduced to the most complete sequence. Cross-contaminant sequences were identified based on read coverage and excluded. Divergent regions of sequences resulting from frame shift errors or imperfect gene models were removed from alignments manually. Adjacent protein sequence fragments that were apparently derived from the same gene fragmented by wrong genome annotation were merged.
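The bookkeeping of excluded alignments can be checked with a few lines of Python. The alignment numbers below are copied from the text; the variable names are only illustrative.

```python
# Alignment identifiers as listed in the text (numbering of Derelle et al., 2016).
TOTAL_ALIGNMENTS = 339

ABSENT_IN_TARGET_GROUPS = {72, 290, 291, 292, 294, 295, 300, 301, 304, 315}

SPARSE_OR_INCONGRUENT = {
    29, 57, 70, 73, 82, 149, 151, 155, 157, 159, 163, 164, 167, 171,
    183, 184, 191, 193, 194, 198, 241, 246, 260, 265, 270, 283, 285,
    289, 302, 303, 305, 330, 331,
}

# The two exclusion lists do not overlap, so their union is the total excluded.
excluded = ABSENT_IN_TARGET_GROUPS | SPARSE_OR_INCONGRUENT
retained = TOTAL_ALIGNMENTS - len(excluded)

if __name__ == "__main__":
    print(len(ABSENT_IN_TARGET_GROUPS), len(SPARSE_OR_INCONGRUENT), retained)
```

The counts agree with the text: 10 and 33 alignments excluded, 43 in total, leaving 296.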

Multiprotein phylogeny and topology tests

The final 296 protein sequence datasets were realigned by the localpair algorithm in MAFFT, trimmed by using the -b 4 -g 0.4 settings in BMGE and concatenated in Scafos v1.25 (Roure et al., 2007). During the latter step, sequences derived from different strains of the same species (in Colpodella angusta, Sarcocystis neurona, Voromonas pontica, and unidentified actinocephaloid parasites of Helicoverpa armigera and Helicoverpa assulta) were merged into single operational taxonomic units (Supplementary file 2). The initial phylogenetic matrix contained 54 species, 99948 sites and 14.4% missing data (Figure 1—figure supplement 1A). To further reduce missing data we merged sequences of two Oxyrrhis marina strains and of two distinct variants found in Siedleckia nematoides transcriptomes; the latter are possibly derived from different cryptic species (variant one was preferentially identified and/or found to be more complete in the transcriptome of the WS1 strain; variant two in the transcriptome of the WS2 strain). We also merged sequences of three representatives of the superfamily Actinocephaloidea with low sequence presence: Ascogregarina and two unidentified parasites of the insects Teleopsis and Helicoverpa, which contaminate the host transcriptomes (Borner and Burmester, 2017). The 18S rDNA of the Teleopsis parasite branches among other actinocephaloids (data not shown) and both unidentified parasites group with Ascogregarina in the multiprotein phylogeny (Figure 1—figure supplement 1A). Merging all these taxa produced the main phylogenetic matrix with 50 species, 99908 positions and 10.6% missing data (Figure 1—source data 1). Three additional matrices were created by excluding all sequences of Gregarina, Cephaloidophora, or both species (Figure 1—figure supplement 1B).
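How merging partial sequences into one operational taxonomic unit lowers missing data can be sketched with a toy example. This does not reproduce the behavior of Scafos: the rule used here (take the first non-missing residue at each aligned site) and the treatment of '-', '?' and 'X' as missing are assumptions made for illustration, and both function names are hypothetical.

```python
MISSING = set("-?X")  # characters counted as missing data in this sketch


def merge_sequences(*aligned):
    """Merge aligned sequences into one, taking the first non-missing
    residue at each site (an illustrative rule, not the Scafos algorithm)."""
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    merged = []
    for column in zip(*aligned):
        residue = next((c for c in column if c not in MISSING), "-")
        merged.append(residue)
    return "".join(merged)


def missing_fraction(matrix):
    """Fraction of missing characters in a {taxon: aligned sequence} matrix."""
    total = sum(len(seq) for seq in matrix.values())
    missing = sum(c in MISSING for seq in matrix.values() for c in seq)
    return missing / total


if __name__ == "__main__":
    before = {"strain_1": "MK--L", "strain_2": "-KAV-", "other": "MRAVL"}
    after = {
        "merged": merge_sequences(before["strain_1"], before["strain_2"]),
        "other": before["other"],
    }
    print(missing_fraction(before), missing_fraction(after))
```

In the toy matrix the two partial strains complement each other, so the merged unit has no gaps; real merges also have to deal with conflicting residues, which this sketch resolves arbitrarily in favor of the first sequence.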
Final matrices were first analyzed in IQ-TREE by using the LG+I+G4+F model and the best tree was used as a guide tree in a more thorough analysis with the LG+G4+F+C60+PMSF model: 1000 UFBoot2 replicates were computed for all datasets (Figure 1A and Figure 1—figure supplement 1) and 100 non-parametric bootstrap replicates were computed for the main matrix (Figure 1A). Seven statistical tests of 105 tree topologies corresponding to all possible relationships between coccidiomorphs, cryptosporidians, eugregarines, archigregarines, and blastogregarines in Figure 1A were calculated in IQ-TREE v1.6.5 with the LG+I+G4+F model and 10000 replicates (Supplementary file 3). The main phylogenetic matrix (Figure 1A) was also analyzed by 10 independent PhyloBayes runs (either standard version 4.1c or MPI version 1.7b) (Lartillot et al., 2009) with the GTR+CAT model and constant sites removed (-dc setting). Each chain was run for 1000 cycles, of which the initial 250 were discarded and the remaining (10 × 750) were combined to compute a consensus tree (maxdiff = 0.23, meandiff = 0.00057).
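The count of 105 tested topologies matches the number of rooted bifurcating trees for five groups, (2n − 3)!! with n = 5, and the pooled PhyloBayes sample is 10 × 750 cycles. Reading the 105 trees as rooted arrangements of the five groups is our interpretation of the text, and the function names below are hypothetical.

```python
def rooted_binary_topologies(n_taxa):
    """Number of rooted bifurcating topologies for n labelled taxa:
    (2n - 3)!! = 1 * 3 * 5 * ... * (2n - 3)."""
    if n_taxa < 1:
        raise ValueError("need at least one taxon")
    count = 1
    for factor in range(3, 2 * n_taxa - 2, 2):  # 3, 5, ..., 2n - 3
        count *= factor
    return count


def pooled_samples(n_chains, cycles_per_chain, burn_in):
    """Number of cycles left after discarding burn-in from every chain."""
    return n_chains * (cycles_per_chain - burn_in)


if __name__ == "__main__":
    print(rooted_binary_topologies(5))    # five major apicomplexan groups
    print(pooled_samples(10, 1000, 250))  # ten chains, 250 cycles burn-in
```

Because the number of topologies grows as a double factorial, exhaustive testing of this kind is practical for five groups (105 trees) but quickly becomes infeasible as more groups are left unconstrained.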

Analysis of plastidial metabolism and horizontally acquired genes

We next searched for orthologs of apicomplexan plastid proteins, with a focus on core pathways (Figure 2A), in transcriptomes and genomes of apicomplexans and their relatives (including Perkinsus and dinoflagellates) by using an approach similar to that used for the nuclear genes above. The five best BLASTP hits at an e-value threshold of 1e-5 were retrieved for each species by using a comparatively slowly evolving query sequence (typically from Vitrella or Chromera). These hits were included in plastid protein alignments created previously (Janouškovec et al., 2017; Janouškovec et al., 2015), or in new alignments created here by retrieving representative outgroup sequences from GenBank and the local database by BLASTP searches with the same query sequences (Supplementary file 4). This process included non-plastidial genes of the heme biosynthesis pathway and three genes representing controls for non-plastidial pathways: iscS (mitochondrial iron-sulfur biosynthesis), enoyl reductase domain (FASI, cytosolic fatty acid synthesis), and ELO (endoplasmic reticulum-localized fatty acid elongation). Alignments were reduced by an iterative process analogous to that used for nuclear genes. Final phylogenetic matrices were prepared by localpair alignment in MAFFT and -b 4 -g 0.4 trimming in BMGE, and analyzed in IQ-TREE by using the built-in ModelFinder to select the best model with the preset LG substitution matrix (for example, the LG+F+R7 model for TPT in Figure 2—figure supplement 2). This approach allowed us to distinguish true apicomplexan sequences from contaminants in our transcriptomes and confirmed their origins to be in the plastid endosymbiont (grouping with other plastidial sequences) or eukaryotic host (grouping with eukaryotic cytosolic or mitochondrial sequences; Figure 2A).
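The hit-retrieval step described above (five best BLASTP hits at an e-value threshold of 1e-5) can be sketched in Python as follows. This is a minimal illustration rather than the pipeline used in the study: it assumes that hits were exported in the standard 12-column BLAST tabular format, ranks hits by e-value and then bit score (the ranking criterion is not stated in the text), and uses hypothetical contig names in the example rows.

```python
from collections import defaultdict


def best_hits(tabular_lines, max_evalue=1e-5, per_query=5):
    """Return up to `per_query` best subject IDs for each query.

    Assumes 12-column BLAST tabular rows: e-value in column 11, bit score in column 12.
    """
    best = defaultdict(dict)  # query -> {subject: (evalue, -bitscore)}
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue > max_evalue:
            continue
        score = (evalue, -bitscore)
        # Keep only the best-scoring alignment for each subject sequence.
        if subject not in best[query] or score < best[query][subject]:
            best[query][subject] = score
    return {query: sorted(subjects, key=subjects.get)[:per_query]
            for query, subjects in best.items()}


# Hypothetical rows for one query; contigC fails the e-value threshold.
EXAMPLE_ROWS = [
    "query1\tcontigA\t45.0\t200\t100\t2\t1\t200\t1\t200\t1e-50\t180",
    "query1\tcontigB\t40.0\t190\t110\t3\t1\t190\t1\t190\t1e-30\t120",
    "query1\tcontigC\t25.0\t80\t60\t1\t1\t80\t1\t80\t1e-3\t35",
    "query1\tcontigD\t38.0\t180\t105\t2\t1\t180\t1\t180\t1e-20\t90",
    "query1\tcontigE\t33.0\t150\t95\t2\t1\t150\t1\t150\t1e-10\t60",
    "query1\tcontigF\t31.0\t140\t92\t2\t1\t140\t1\t140\t1e-8\t55",
    "query1\tcontigG\t30.0\t130\t90\t2\t1\t130\t1\t130\t1e-6\t50",
    "query1\tcontigA\t29.0\t90\t62\t1\t210\t300\t210\t300\t1e-7\t52",
]
TOP_HITS = best_hits(EXAMPLE_ROWS)
print(TOP_HITS["query1"])
```

Running the function once per species database would give the per-species hit lists that are then added to the alignments.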
Protein phylogenies of six horizontally acquired genes (Figure 4A,B, Figure 4—figure supplement 2 and Figure 4—figure supplement 3) were built by expanding previous datasets (Janouškovec et al., 2015) and computed by using the same method as plastid phylogenies above.

Analysis of leader sequences in plastid proteins

Proteins of core plastidial pathways (Figure 2A) derived from our transcriptomes and those of Plasmodium falciparum, Babesia microti and Toxoplasma gondii were scanned for the presence of identifiable N-terminal signal peptides in SignalP v4.1 (Petersen et al., 2011). All methionines downstream of the predicted protein start were also tested for targeting signals. Positive sequences were checked for the presence of N-terminal extensions; truncated proteins were filtered out as false positives. The remaining signal peptide-positive proteins were recorded in Figure 2A and further checked for identifiable transit peptides in ChloroP 1.1 (Emanuelsson et al., 1999). The SignalP and ChloroP statistics are listed in Supplementary file 5. Apicomplexan proteins that have a known experimental localization (as described in the primary literature and Apiloc3) or that were classified as high-confidence plastid-localized proteins in Plasmodium falciparum by BioID (Boucher et al., 2018) were recorded in Figure 2A and are listed in Supplementary file 4. To characterize transit peptides in Digyalum oweni, we expanded its identified plastid protein set. Additional plastidial proteins in the WS1+two isolates were searched for by BLASTP with apicomplexan queries, comprising known Toxoplasma and Plasmodium plastid proteins primarily involved in transcript and protein processing (e.g., stromal processing peptidase, clpP chaperone, histone-like protein HU, tRNA-Met ligase). Matched sequences were verified by reverse BLASTP searches on the KEGG and NCBI BLAST websites (aided by distance tree phylogenies at the latter site). Pooling positive hits with those identified previously (Figure 2A) and filtering for sequences with an N-terminal extension carrying a signal peptide (as identified by SignalP v4.1) produced 41 plastid proteins.
The first 14 transit peptide residues downstream of the signal peptide cleavage site (an approach to allow comparison with results in Patron and Waller, 2007) were merged and analyzed together for their amino acid composition by the ‘Protein Stats’ script at the Sequence Manipulation Suite website. Composition of the 41 mature proteins (without transit peptides) was also analyzed; because transit peptide cleavage sites are difficult to predict in silico, the initial 50 residues were removed arbitrarily (Figure 2—figure supplement 1A). The amino acid frequency at the first 20 positions across the 41 Digyalum transit peptides was analyzed at the WebLogo3 website (Figure 2—figure supplement 1B).
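Testing every downstream methionine as an alternative start amounts to generating a series of N-terminally truncated candidates and submitting each to the signal peptide predictor. A minimal Python sketch of that candidate-generation step is shown below. The function name, the `min_length` parameter and the toy sequence are hypothetical, and the prediction step itself (SignalP) is not reproduced.

```python
def methionine_candidates(protein, min_length=1):
    """Return (1-based start position, subsequence) for every Met in `protein`.

    `min_length` is a hypothetical convenience cut-off for skipping very short
    candidates; it is not a parameter described in the text.
    """
    protein = protein.upper().rstrip("*")  # drop a trailing stop symbol if present
    candidates = []
    for index, residue in enumerate(protein):
        if residue == "M" and len(protein) - index >= min_length:
            candidates.append((index + 1, protein[index:]))
    return candidates


# Toy sequence with methionines at positions 1, 5 and 8.
TOY_CANDIDATES = methionine_candidates("MKTLMAAMRS")
for start, candidate in TOY_CANDIDATES:
    print(start, candidate)
```

Each candidate would be scored separately, and a positive prediction at an internal methionine suggests that the annotated start lies upstream of the true one.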
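The pooled composition analysis of the first 14 residues downstream of the signal peptide cleavage site can be sketched in Python as follows. This stands in for the ‘Protein Stats’ web script and is not that tool's code. The input format (sequence plus signal peptide length) and the toy sequences are assumptions made for illustration.

```python
from collections import Counter


def pooled_composition(proteins, window=14):
    """Pool the `window` residues after each signal peptide and return frequencies.

    `proteins` is an iterable of (sequence, signal_peptide_length) pairs, where the
    length marks the predicted cleavage site (an assumed input format).
    """
    pooled = Counter()
    for sequence, sp_length in proteins:
        pooled.update(sequence[sp_length:sp_length + window].upper())
    total = sum(pooled.values())
    if total == 0:
        return {}
    return {residue: count / total for residue, count in pooled.items()}


# Toy data: the signal peptides are 4 and 2 residues long, respectively.
TOY_PROTEINS = [
    ("AAAA" + "KKKKKKKSSSSSSS" + "GGGG", 4),
    ("MM" + "N" * 14 + "T", 2),
]
TOY_FREQUENCIES = pooled_composition(TOY_PROTEINS)
print(TOY_FREQUENCIES)
```

Pooling before counting weights every protein equally per residue position, which matches the merged analysis described above.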

Plastid genome analysis and 16S rDNA phylogeny

Plastid transcripts were identified by three search strategies: BLASTN searches with plastidial 16S rDNA; BLASTP searches with plastid-encoded protein sequences of apicomplexans or chrompodellids; and searches for high-AT contigs (typically contigs > 70% AT and 1–3 kb in length). Hits from the first search were examined in a 16S rDNA tree. Hits from the latter two searches were combined and compared by reverse BLASTX searches against the NCBI nr database to determine whether they match genes in apicomplexan, chrompodellid and algal plastid genomes. Positive hits were limited to representative, non-redundant contigs of 1 kb or longer (Supplementary file 7). Identification of individual genes was based on NCBI BLAST searches and tRNAscan-SE, and it was further aided by Artemis 17.0.1 and the MFannot website. AT content was examined in 16S rDNAs and in whole plastid genomes or combined plastid transcripts by the ‘DNA Stats’ script on the Sequence Manipulation Suite website (Figure 3B and Supplementary file 7). The phylogeny of 16S rDNA was based on an earlier dataset with representatives of ARL-I to ARL-VIII clades (Janouškovec et al., 2012). We also requested sequences of ARL-X and ARL-XI as reported in Mathur et al. (2018) from the authors of the study. Of the 34 sequences of clustered centroids we received, only one centroid was named ‘ARL-XI’ and this was a Pelagibacter-like bacterial contaminant; we therefore consider ARL-XI to be invalid. The remaining 33 centroids were named ‘ARL-X’: nine of them were bacterial contaminants and the remaining 24 formed two non-overlapping groups of 3 and 21 sequences. We included in our phylogeny all three sequences from the first group and four slowly evolving sequences from the second group, but they all branched within ARL-I (Figure 3A), which leads us to synonymize ARL-X with ARL-I. Finally, we included in the phylogeny 10 new ARLs identified by phylogenetic sorting of the VAMPS database (Huse et al., 2014).
Briefly, we obtained all VAMPS reads annotated as ‘Organelle’ or ‘Unknown’, selected those 350 bp or longer, and clustered each group in USEARCH at 97% identity. Centroids were individually classified by maximum likelihood phylogenies in a dataset containing plastids, bacteria and mitochondria by an approach used previously (Janouškovec et al., 2012). The initial round of phylogenies was computed by using FastTree 2 (-lg -gamma setting) and later rounds by using IQ-TREE (LG+I+G4+F model; additional rounds with modified taxon sampling were used for sequences that were difficult to classify). Candidate ARL sequences were used to retrieve additional ARLs from centroid databases by BLASTN searches (i.e., those that had been misplaced by FastTree 2). The apicomplexan affiliation of all ARL sequences was verified in the Figure 1A dataset. Representative sequences for all known and novel ARLs are listed in Supplementary file 6. The final 16S rDNA phylogeny was based on a localpair MAFFT alignment trimmed in BMGE (-h 0.4 -g 0.65 settings) and computed by using IQ-TREE with ModelFinder selection of the best-fit model and 10000 UFBoot2 replicates (Figure 3A).
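The high-AT contig search described at the start of this section (typically contigs > 70% AT and 1–3 kb in length) can be sketched as a simple filter. The Python example below is illustrative only: the thresholds are taken from the text, whereas the function names, the handling of ambiguous bases and the toy contigs are assumptions.

```python
def at_fraction(sequence):
    """Fraction of A+T among unambiguous bases (N and other symbols are ignored)."""
    sequence = sequence.upper()
    at = sequence.count("A") + sequence.count("T")
    gc = sequence.count("G") + sequence.count("C")
    return at / (at + gc) if (at + gc) else 0.0


def high_at_contigs(contigs, min_at=0.70, min_len=1000, max_len=3000):
    """Return names of contigs whose AT fraction exceeds `min_at` within the length range.

    `contigs` maps contig names to nucleotide sequences (an assumed input format).
    """
    return [name for name, sequence in contigs.items()
            if min_len <= len(sequence) <= max_len and at_fraction(sequence) > min_at]


# Toy contigs (hypothetical names): only the first satisfies both criteria.
TOY_CONTIGS = {
    "contig_rich": "AT" * 480 + "GC" * 120,    # 1200 bp, 80% AT
    "contig_poor": "AT" * 360 + "GC" * 240,    # 1200 bp, 60% AT
    "contig_short": "AT" * 225 + "GC" * 25,    # 500 bp, 90% AT
    "contig_long": "AT" * 1800 + "GC" * 200,   # 4000 bp, 90% AT
}
print(high_at_contigs(TOY_CONTIGS))
```

Contigs passing such a filter are only candidates; as described above, they were confirmed by BLASTX comparison against known plastid genes.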

Acknowledgements

We thank Ross F Waller and the reviewers and editors at eLife for providing helpful comments and suggestions on the draft manuscript. We thank the staff of the White Sea Biological Station of Lomonosov Moscow State University and the Marine Biological Station of St. Petersburg State University for assistance during field sampling. The authors acknowledge the use of the UCL Myriad High Performance Computing Facility (Myriad@UCL), and associated support services, in the completion of this work. Digyalum oweni WS3 transcriptome sequencing was funded by the Russian Science Foundation (grant no. 18-14-00123).

Funding Statement

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Contributor Information

John McCutcheon, University of Montana.

Detlef Weigel, Max Planck Institute for Developmental Biology, Germany.

Funding Information

This paper was supported by the following grants:

  • University College London Excellence Research Fellowship to Jan Janouškovec.
  • Russian Foundation for Basic Research 18-04-00324 to Gita G Paskerova, Timur G Simdyanov, Tatiana S Miroliubova.
  • Russian Science Foundation 18-14-00123 to Vladimir V Aleoshin.
  • Saint Petersburg State University 1.42.1099.2016 to Gita G Paskerova.
  • Saint Petersburg State University 1.42.723.2017 to Gita G Paskerova.

Additional information

Competing interests

No competing interests declared.

Author contributions

Conceptualization, Data curation, Formal analysis, Supervision, Funding acquisition, Validation, Investigation, Visualization, Methodology, Writing—original draft, Project administration, Writing—review and editing.

Funding acquisition, Investigation, Methodology, Writing—review and editing.

Investigation, Methodology, Writing—review and editing.

Formal analysis, Methodology, Writing—review and editing.

Formal analysis, Visualization, Writing—review and editing.

Supervision, Funding acquisition, Methodology, Writing—review and editing.

Funding acquisition, Investigation, Methodology, Writing—review and editing.

Additional files

Supplementary file 1.

Information about parasites collected (single quotation marks indicate problematic affiliations; asterisk denotes a species complex).

Supplementary file 2.

Sources of sequence datasets used in this study (single quotation marks indicate arbitrarily assigned strain name).

Supplementary file 3.

Results of seven tree topology tests as computed in IQ-TREE (of 105 tested topologies those not rejected at p=0.001 in at least one test are listed; those not rejected at p=0.01 are highlighted in bold).

Supplementary file 4.

Names, abbreviations, accessions and localizations of plastidial proteins and modules (plastid-localized = green; mitochondrion-localized = blue; endoplasmic reticulum = red).

Supplementary file 5.

Prediction statistics for signal (SignalP v4.1: noTM, Dmaxcut = 0.45) and transit peptides (ChloroP v1.1).

Supplementary file 6.

Reference 16S rDNA sequences for apicomplexan related lineages (ARLs), Digyalum, and newly sequenced apicomplexans.

Supplementary file 7.

Plastid transcriptomic contigs in species newly sequenced in this study.

Transparent reporting form

Data availability

Sequence data have been deposited in NCBI under the Bioproject accessions PRJNA557242 and PRJNA556465. Sources of data for individual analyses are provided in Supplemental Tables S1 to S7.

The following datasets were generated:

Janouskovec J, Paskerova GG, Simdyanov TG. 2019. Transcriptomes of apicomplexan parasites. NCBI Bioproject. PRJNA557242

Miroliubova TS, Mikhailov KV, Aleoshin VV. 2019. Transcriptome of Digyalum oweni. NCBI Bioproject. PRJNA556465

References

  • Abrahamsen MS, Templeton TJ, Enomoto S, Abrahante JE, Zhu G, Lancto CA, Deng M, Liu C, Widmer G, Tzipori S, Buck GA, Xu P, Bankier AT, Dear PH, Konfortov BA, Spriggs HF, Iyer L, Anantharaman V, Aravind L, Kapur V. Complete genome sequence of the apicomplexan, Cryptosporidium parvum. Science. 2004;304:441–445. doi: 10.1126/science.1094786.
  • Adl SM, Bass D, Lane CE, Lukeš J, Schoch CL, Smirnov A, Agatha S, Berney C, Brown MW, Burki F, Cárdenas P, Čepička I, Chistyakova L, Del Campo J, Dunthorn M, Edvardsen B, Eglit Y, Guillou L, Hampl V, Heiss AA, Hoppenrath M, James TY, Karnkowska A, Karpov S, Kim E, Kolisko M, Kudryavtsev A, Lahr DJG, Lara E, Le Gall L, Lynn DH, Mann DG, Massana R, Mitchell EAD, Morrow C, Park JS, Pawlowski JW, Powell MJ, Richter DJ, Rueckert S, Shadwick L, Shimano S, Spiegel FW, Torruella G, Youssef N, Zlatogursky V, Zhang Q. Revisions to the classification, nomenclature, and diversity of eukaryotes. The Journal of Eukaryotic Microbiology. 2019;66:4–119. doi: 10.1111/jeu.12691.
  • Bodył A. Do plastid-related characters support the chromalveolate hypothesis? Journal of Phycology. 2005;41:712–719. doi: 10.1111/j.1529-8817.2005.00091.x.
  • Bodył A, Stiller JW, Mackiewicz P. Chromalveolate plastids: direct descent or multiple endosymbioses? Trends in Ecology & Evolution. 2009;24:119–121. doi: 10.1016/j.tree.2008.11.003.
  • Borner J, Burmester T. Parasite infection of public databases: a data mining approach to identify apicomplexan contaminations in animal genome and transcriptome assemblies. BMC Genomics. 2017;18:100. doi: 10.1186/s12864-017-3504-1.
  • Boucher MJ, Ghosh S, Zhang L, Lal A, Jang SW, Ju A, Zhang S, Wang X, Ralph SA, Zou J, Elias JE, Yeh E. Integrative proteomics and bioinformatic prediction enable a high-confidence apicoplast proteome in malaria parasites. PLOS Biology. 2018;16:e2005895. doi: 10.1371/journal.pbio.2005895.
  • Cai X, Fuller AL, McDougald LR, Zhu G. Apicoplast genome of the coccidian Eimeria tenella. Gene. 2003;321:39–46. doi: 10.1016/j.gene.2003.08.008.
  • Cavalier-Smith T. Symbiogenesis: mechanisms, evolutionary consequences, and systematic implications. Annual Review of Ecology, Evolution, and Systematics. 2013;44:145–172. doi: 10.1146/annurev-ecolsys-110411-160320.
  • Cavalier-Smith T. Gregarine site-heterogeneous 18S rDNA trees, revision of gregarine higher classification, and the evolutionary diversification of Sporozoa. European Journal of Protistology. 2014;50:472–495. doi: 10.1016/j.ejop.2014.07.002.
  • Criscuolo A, Gribaldo S. BMGE (Block mapping and gathering with entropy): a new software for selection of phylogenetic informative regions from multiple sequence alignments. BMC Evolutionary Biology. 2010;10:210. doi: 10.1186/1471-2148-10-210.
  • Dellibovi-Ragheb TA, Gisselberg JE, Prigge ST. Parasites FeS up: iron-sulfur cluster biogenesis in eukaryotic pathogens. PLOS Pathogens. 2013;9:e1003227. doi: 10.1371/journal.ppat.1003227.
  • Derelle R, López-García P, Timpano H, Moreira D. A phylogenomic framework to study the diversity and evolution of stramenopiles (=heterokonts). Molecular Biology and Evolution. 2016;33:2890–2898. doi: 10.1093/molbev/msw168.
  • Dyson J, Grahame J, Evennett PJ. The mucron of the gregarine Digyalum oweni (Protozoa: Apicomplexa), parasitic in Littorina species (Mollusca: Gastropoda). Journal of Natural History. 1993;27:557–564.
  • Dyson J, Grahame J, Evennett PJ. The apical complex of the gregarine Digyalum oweni (Protozoa: Apicomplexa). Journal of Natural History. 1994;28:1–7.
  • Emanuelsson O, Nielsen H, von Heijne G. ChloroP, a neural network-based method for predicting chloroplast transit peptides and their cleavage sites. Protein Science. 1999;8:978–984. doi: 10.1110/ps.8.5.978.
  • Fleige T, Limenitakis J, Soldati-Favre D. Apicoplast: keep it or leave it. Microbes and Infection. 2010;12:253–262. doi: 10.1016/j.micinf.2009.12.010.
  • Foissner W, Foissner I. First record of a ectoparasitic flagellate on ciliates: an ultrastructural investigation of the morphology and the mode of attachment of Spiromonas gonderi nov.spec. (Zoomastigophora, Spiromonadidae) invading the pellicle of ciliates of the genus Colpoda (Ciliophora, Colpodidae). Protistologica. 1984;20:635–648.
  • Gardner MJ, Williamson DH, Wilson RJ. A circular DNA in malaria parasites encodes an RNA polymerase like that of prokaryotes and chloroplasts. Molecular and Biochemical Parasitology. 1991;44:115–123. doi: 10.1016/0166-6851(91)90227-W.
  • Gile GH, Slamovits CH. Transcriptomic analysis reveals evidence for a cryptic plastid in the colpodellid Voromonas pontica, a close relative of chromerids and apicomplexan parasites. PLOS ONE. 2014;9:e96258. doi: 10.1371/journal.pone.0096258.
  • Gornik SG, Febrimarsa, Cassin AM, MacRae JI, Ramaprasad A, Rchiad Z, McConville MJ, Bacic A, McFadden GI, Pain A, Waller RF. Endosymbiosis undone by stepwise elimination of the plastid in a parasitic dinoflagellate. PNAS. 2015;112:5767–5772. doi: 10.1073/pnas.1423400112.
  • Grassé P-P. Traité De Zoologie. Anatomie, Systématique, Biologie. Paris: Masson: AAAS; 1953. Classe des Grégarinomorphes; pp. 550–690.
  • Huang J, Mullapudi N, Lancto CA, Scott M, Abrahamsen MS, Kissinger JC. Phylogenomic evidence supports past endosymbiosis, intracellular and horizontal gene transfer in Cryptosporidium parvum. Genome Biology. 2004;5:R88. doi: 10.1186/gb-2004-5-11-r88.
  • Huse SM, Mark Welch DB, Voorhis A, Shipunova A, Morrison HG, Eren AM, Sogin ML. VAMPS: a website for visualization and analysis of microbial population structures. BMC Bioinformatics. 2014;15:41. doi: 10.1186/1471-2105-15-41.
  • Imlay L, Odom AR. Isoprenoid metabolism in apicomplexan parasites. Current Clinical Microbiology Reports. 2014;1:37–50. doi: 10.1007/s40588-014-0006-7.
  • Janouskovec J, Horák A, Oborník M, Lukes J, Keeling PJ. A common red algal origin of the apicomplexan, dinoflagellate, and heterokont plastids. PNAS. 2010;107:10949–10954. doi: 10.1073/pnas.1003335107.
  • Janouškovec J, Horák A, Barott KL, Rohwer FL, Keeling PJ. Global analysis of plastid diversity reveals apicomplexan-related lineages in coral reefs. Current Biology. 2012;22:R518–R519. doi: 10.1016/j.cub.2012.04.047.
  • Janouškovec J, Horák A, Barott KL, Rohwer FL, Keeling PJ. Environmental distribution of coral-associated relatives of apicomplexan parasites. The ISME Journal. 2013;7:444–447. doi: 10.1038/ismej.2012.129.
  • Janouškovec J, Tikhonenkov DV, Burki F, Howe AT, Kolísko M, Mylnikov AP, Keeling PJ. Factors mediating plastid dependency and the origins of parasitism in apicomplexans and their close relatives. PNAS. 2015;112:10200–10207. doi: 10.1073/pnas.1423790112.
  • Janouškovec J, Gavelis GS, Burki F, Dinh D, Bachvaroff TR, Gornik SG, Bright KJ, Imanian B, Strom SL, Delwiche CF, Waller RF, Fensome RA, Leander BS, Rohwer FL, Saldarriaga JF. Major transitions in dinoflagellate evolution unveiled by phylotranscriptomics. PNAS. 2017;114:E171–E180. doi: 10.1073/pnas.1614842114.
  • Janouskovec J, Keeling PJ. Evolution: causality and the origin of parasitism. Current Biology. 2016;26:R174–R177. doi: 10.1016/j.cub.2015.12.057.
  • John U, Lu Y, Wohlrab S, Groth M, Janouškovec J, Kohli GS, Mark FC, Bickmeyer U, Farhat S, Felder M, Frickenhaus S, Guillou L, Keeling PJ, Moustafa A, Porcel BM, Valentin K, Glöckner G. An aerobic eukaryotic parasite with functional mitochondria that likely lacks a mitochondrial genome. Science Advances. 2019;5:eaav1110. doi: 10.1126/sciadv.aav1110.
  • Katoh K, Standley DM. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Molecular Biology and Evolution. 2013;30:772–780. doi: 10.1093/molbev/mst010.
  • Koreny L, Sobotka R, Janouskovec J, Keeling PJ, Oborník M. Tetrapyrrole synthesis of photosynthetic chromerids is likely homologous to the unusual pathway of apicomplexan parasites. The Plant Cell. 2011;23:3454–3462. doi: 10.1105/tpc.111.089102.
  • Kováčiková M, Simdyanov TG, Diakin A, Valigurová A. Structures related to attachment and motility in the marine eugregarine Cephaloidophora cf. communis (Apicomplexa). European Journal of Protistology. 2017;59:1–13. doi: 10.1016/j.ejop.2017.02.006.
  • Kristmundsson Á, Helgason S, Bambir SH, Eydal M, Freeman MA. Margolisiella islandica sp. nov. (Apicomplexa: Eimeridae) infecting Iceland scallop Chlamys islandica (Müller, 1776) in Icelandic waters. Journal of Invertebrate Pathology. 2011;108:139–146. doi: 10.1016/j.jip.2011.08.001.
  • Kwong WK, Del Campo J, Mathur V, Vermeij MJA, Keeling PJ. A widespread coral-infecting apicomplexan with chlorophyll biosynthesis genes. Nature. 2019;568:103. doi: 10.1038/s41586-019-1072-z.
  • Lartillot N, Lepage T, Blanquart S. PhyloBayes 3: a bayesian software package for phylogenetic reconstruction and molecular dating. Bioinformatics. 2009;25:2286–2288. doi: 10.1093/bioinformatics/btp368.
  • Leander BS. Marine gregarines: evolutionary prelude to the apicomplexan radiation? Trends in Parasitology. 2008;24:60–67. doi: 10.1016/j.pt.2007.11.005.
  • Leander BS, Ramey PA. Cellular identity of a novel small subunit rDNA sequence clade of apicomplexans: description of the marine parasite Rhytidocystis polygordiae n. sp. (host: Polygordius sp., Polychaeta). The Journal of Eukaryotic Microbiology. 2006;53:280–291. doi: 10.1111/j.1550-7408.2006.00109.x.
  • Levine ND. Agamococcidiorida ord. n. and Rhytidocystidae fam. n. for the coccidian genus Rhytidocystis Henneguy, 1907. The Journal of Protozoology. 1979;26:167–168. doi: 10.1111/j.1550-7408.1979.tb02756.x.
  • Lizundia R, Werling D, Langsley G, Ralph SA. Theileria apicoplast as a target for chemotherapy. Antimicrobial Agents and Chemotherapy. 2009;53:1213–1217. doi: 10.1128/AAC.00126-08.
  • Mathur V, del Campo J, Kolisko M, Keeling PJ. Global diversity and distribution of close relatives of apicomplexan parasites. Environmental Microbiology. 2018;20:2824–2833.
  • Mathur V, Kolisko M, Hehenberger E, Irwin NA, Leander BS, Kristmundsson Á, Freeman MA, Keeling PJ. Multiple independent origins of apicomplexan-like parasites. bioRxiv. 2019. doi: 10.1101/636183.
  • McFadden GI, Reith ME, Munholland J, Lang-Unnasch N. Plastid in human parasites. Nature. 1996;381:482. doi: 10.1038/381482a0.
  • Moore RB, Oborník M, Janouskovec J, Chrudimský T, Vancová M, Green DH, Wright SW, Davies NW, Bolch CJ, Heimann K, Slapeta J, Hoegh-Guldberg O, Logsdon JM, Carter DA. A photosynthetic alveolate closely related to apicomplexan parasites. Nature. 2008;451:959–963. doi: 10.1038/nature06635.
  • Mullin KA, Lim L, Ralph SA, Spurck TP, Handman E, McFadden GI. Membrane transporters in the relict plastid of malaria parasites. PNAS. 2006;103:9572–9577. doi: 10.1073/pnas.0602293103.
  • Mylnikov AP, Mylnikova ZM. Feeding spectra and pseudoconoid structure in predatory alveolate flagellates. Inland Water Biology. 2008;1:210–216. doi: 10.1134/S1995082908030036.
  • Nguyen LT, Schmidt HA, von Haeseler A, Minh BQ. IQ-TREE: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. Molecular Biology and Evolution. 2015;32:268–274. doi: 10.1093/molbev/msu300.
  • Nisbet RER, Kurniawan DP, Bowers HD, Howe CJ. Transcripts in the Plasmodium apicoplast undergo cleavage at tRNAs and editing, and include antisense sequences. Protist. 2016;167:377–388. doi: 10.1016/j.protis.2016.06.003.
  • Norén F, Moestrup Ø, Rehnstam-Holm A-S. Parvilucifera infectans Norén et Moestrup gen. et sp. nov. (Perkinsozoa phylum nov.): a parasitic flagellate capable of killing toxic microalgae. European Journal of Protistology. 1999;35:233–254. doi: 10.1016/S0932-4739(99)80001-7.
  • Oborník M, Vancová M, Lai DH, Janouškovec J, Keeling PJ, Lukeš J. Morphology and ultrastructure of multiple life cycle stages of the photosynthetic relative of apicomplexa, Chromera velia. Protist. 2011;162:115–130. doi: 10.1016/j.protis.2010.02.004.
  • Oborník M, Modrý D, Lukeš M, Cernotíková-Stříbrná E, Cihlář J, Tesařová M, Kotabová E, Vancová M, Prášil O, Lukeš J. Morphology, ultrastructure and life cycle of Vitrella brassicaformis n. sp., n. gen., a novel chromerid from the great barrier reef. Protist. 2012;163:306–323. doi: 10.1016/j.protis.2011.09.001.
  • Okamoto N, Keeling PJ. The 3D structure of the apical complex and association with the flagellar apparatus revealed by serial TEM tomography in Psammosa pacifica, a distant relative of the Apicomplexa. PLOS ONE. 2014;9:e84653. doi: 10.1371/journal.pone.0084653.
  • Paskerova GG, Miroliubova TS, Diakin A, Kováčiková M, Valigurová A, Guillou L, Aleoshin VV, Simdyanov TG. Fine structure and molecular phylogenetic position of two marine gregarines, Selenidium pygospionis sp. n. and S. pherusae sp. n., with notes on the phylogeny of Archigregarinida (Apicomplexa). Protist. 2018;169:826–852. doi: 10.1016/j.protis.2018.06.004.
  • Patron NJ, Waller RF. Transit peptide diversity and divergence: a global analysis of plastid targeting signals. BioEssays. 2007;29:1048–1058. doi: 10.1002/bies.20638.
  • Perkins FO. The structure of Perkinsus marinus (Mackin, Owen and Collier, 1950) Levine, 1978 with comments on taxonomy and phylogeny of Perkinsus spp. Journal of Shellfish Research. 1996;15:67–87.
  • Petersen TN, Brunak S, von Heijne G, Nielsen H. SignalP 4.0: discriminating signal peptides from transmembrane regions. Nature Methods. 2011;8:785–786. doi: 10.1038/nmeth.1701.
  • Petersen J, Ludewig AK, Michael V, Bunk B, Jarek M, Baurain D, Brinkmann H. Chromera velia, endosymbioses and the rhodoplex hypothesis – plastid evolution in cryptophytes, alveolates, stramenopiles, and haptophytes (CASH lineages). Genome Biology and Evolution. 2014;6:666–684. doi: 10.1093/gbe/evu043.
  • Porchet Hennere E. Observations en microscopie photonique et électronique sur la sporogenèse de Dehornia (1) sthenelais (n. gen., sp. n.), sporozoaire parasite de l’annelide polychete Sthenelais boa (Aphroditides). Protistologica. 1972;8:245–255.
  • Price MN, Dehal PS, Arkin AP. FastTree 2 – approximately maximum-likelihood trees for large alignments. PLOS ONE. 2010;5:e9490. doi: 10.1371/journal.pone.0009490.
  • Ralph SA, van Dooren GG, Waller RF, Crawford MJ, Fraunholz MJ, Foth BJ, Tonkin CJ, Roos DS, McFadden GI. Tropical infectious diseases: metabolic maps and functions of the Plasmodium falciparum apicoplast. Nature Reviews. Microbiology. 2004;2:203–216. doi: 10.1038/nrmicro843.
  • Ramakrishnan S, Docampo MD, Macrae JI, Pujol FM, Brooks CF, van Dooren GG, Hiltunen JK, Kastaniotis AJ, McConville MJ, Striepen B. Apicoplast and endoplasmic reticulum cooperate in fatty acid biosynthesis in apicomplexan parasite Toxoplasma gondii. Journal of Biological Chemistry. 2012;287:4957–4971. doi: 10.1074/jbc.M111.310144.
  • Roure B, Rodriguez-Ezpeleta N, Philippe H. SCaFoS: a tool for selection, concatenation and fusion of sequences for phylogenomics. BMC Evolutionary Biology. 2007;7:S2. doi: 10.1186/1471-2148-7-S1-S2.
  • Rueckert S, Simdyanov TG, Aleoshin VV, Leander BS. Identification of a divergent environmental DNA sequence clade using the phylogeny of gregarine parasites (Apicomplexa) from crustacean hosts. PLOS ONE. 2011;6:e18163. doi: 10.1371/journal.pone.0018163.
  • Sato S, Clough B, Coates L, Wilson RJ. Enzymes for heme biosynthesis are found in both the mitochondrion and plastid of the malaria parasite Plasmodium falciparum. Protist. 2004;155:117–125. doi: 10.1078/1434461000169.
  • Sato S, Wilson RJ. Proteobacteria-like ferrochelatase in the malaria parasite. Current Genetics. 2003;42:292–300. doi: 10.1007/s00294-002-0360-5.
  • Schrével J. Observations biologiques et ultrastructurales sur les Selenidiidae et leurs conséquences sur la systématique des grégarinomorphes. The Journal of Protozoology. 1971;18:448–470. doi: 10.1111/j.1550-7408.1971.tb03355.x.
  • Schrével J, Valigurová A, Prensier G, Chambouvet A, Florent I, Guillou L. Ultrastructure of Selenidium pendula, the type species of archigregarines, and phylogenetic relations to other marine Apicomplexa. Protist. 2016;167:339–368. doi: 10.1016/j.protis.2016.06.001.
  • Seeber F, Soldati-Favre D. Metabolic pathways in the apicoplast of Apicomplexa. In: Jeon KW, editor. International Review of Cell and Molecular Biology. Academic Press; 2010. pp. 161–228.
  • Seow F, Sato S, Janssen CS, Riehle MO, Mukhopadhyay A, Phillips RS, Wilson RJ, Barrett MP. The plastidic DNA replication enzyme complex of Plasmodium falciparum. Molecular and Biochemical Parasitology. 2005;141:145–153. doi: 10.1016/j.molbiopara.2005.02.002.
  • Simdyanov TG, Diakin AY, Aleoshin VV. Ultrastructure and 28S rDNA phylogeny of two gregarines: Cephaloidophora cf. communis and Heliospora cf. longissima with remarks on gregarine morphology and phylogenetic analysis. Acta Protozoologica. 2015;2015:241–262.
  • Simdyanov TG, Guillou L, Diakin AY, Mikhailov KV, Schrével J, Aleoshin VV. A new view on the morphology and phylogeny of eugregarines suggested by the evidence from the gregarine Ancora sagittata (Leuckart, 1860) Labbé, 1899 (Apicomplexa: Eugregarinida). PeerJ. 2017;5:e3354. doi: 10.7717/peerj.3354.
  • Simdyanov TG, Paskerova GG, Valigurová A, Diakin A, Kováčiková M, Schrével J, Guillou L, Dobrovolskij AA, Aleoshin VV. First ultrastructural and molecular phylogenetic evidence from the blastogregarines, an early branching lineage of plesiomorphic Apicomplexa. Protist. 2018;169:697–726. doi: 10.1016/j.protis.2018.04.006.
  • Su HJ, Barkman TJ, Hao W, Jones SS, Naumann J, Skippington E, Wafula EK, Hu JM, Palmer JD, dePamphilis CW. Novel genetic code and record-setting AT-richness in the highly reduced plastid genome of the holoparasitic plant Balanophora. PNAS. 2019:e201816822. doi: 10.1073/pnas.1816822116.
  • Toso MA, Omoto CK. Gregarina niphandrodes may lack both a plastid genome and organelle. The Journal of Eukaryotic Microbiology. 2007;54:66–72. doi: 10.1111/j.1550-7408.2006.00229.x.
  • Valigurová A, Paskerova GG, Diakin A, Kováčiková M, Simdyanov TG. Protococcidian Eleutheroschizon duboscqi, an unusual apicomplexan interconnecting gregarines and cryptosporidia. PLOS ONE. 2015;10:e0125063. doi: 10.1371/journal.pone.0125063.
  • Valigurová A, Vaškovicová N, Diakin A, Paskerova GG, Simdyanov TG, Kováčiková M. Motility in blastogregarines (Apicomplexa): Native and drug-induced organisation of Siedleckia nematoides cytoskeletal elements. PLOS ONE. 2017;12:e0179709. doi: 10.1371/journal.pone.0179709.
  • van Dooren GG, Kennedy AT, McFadden GI. The use and abuse of heme in apicomplexan parasites. Antioxidants & Redox Signaling. 2012;17:634–656. doi: 10.1089/ars.2012.4539.
  • Varadharajan S, Sagar BK, Rangarajan PN, Padmanaban G. Localization of ferrochelatase in Plasmodium falciparum. The Biochemical Journal. 2004;384:429–436. doi: 10.1042/BJ20040952.
  • Wakeman KC, Heintzelman MB, Leander BS. Comparative ultrastructure and molecular phylogeny of Selenidium melongena n. sp. and S. terebellae Ray 1930 demonstrate niche partitioning in marine gregarine parasites (Apicomplexa). Protist. 2014;165:493–511. doi: 10.1016/j.protis.2014.05.007.
  • Waller RF, Kořený L. Plastid complexity in dinoflagellates: A picture of gains, losses, replacements and revisions. In: Hirakawa Y, editor. Advances in Botanical Research, Secondary Endosymbioses. Academic Press; 2017. pp. 105–143.
  • Wilson RJ, Denny PW, Preiser PR, Rangachari K, Roberts K, Roy A, Whyte A, Strath M, Moore DJ, Moore PW, Williamson DH. Complete gene map of the plastid-like DNA of the malaria parasite Plasmodium falciparum. Journal of Molecular Biology. 1996;261:155–172. doi: 10.1006/jmbi.1996.0449. [PubMed] [CrossRef] [Google Scholar]
  • Yeh E, DeRisi JL. Chemical rescue of malaria parasites lacking an apicoplast defines organelle function in blood-stage Plasmodium falciparum. PLOS Biology. 2011;9:e1001138. doi: 10.1371/journal.pbio.1001138. [PMC free article] [PubMed] [CrossRef] [Google Scholar]
  • Zhu G, Li Y, Cai X, Millership JJ, Marchewka MJ, Keithly JS. Expression and functional characterization of a giant type I fatty acid synthase (CpFAS1) gene from Cryptosporidium parvum. Molecular and Biochemical Parasitology. 2004;134:127–135. doi: 10.1016/j.molbiopara.2003.11.011. [PubMed] [CrossRef] [Google Scholar]
  • Zhu G, Keithly JS. Alpha-proteobacterial relationship of apicomplexan lactate and malate dehydrogenases. The Journal of Eukaryotic Microbiology. 2002;49:255–261. doi: 10.1111/j.1550-7408.2002.tb00532.x. [PubMed] [CrossRef] [Google Scholar]
2019; 8: e49662.
Published online 2019 Aug 16. doi: 10.7554/eLife.49662.028

Decision letter

John McCutcheon, Reviewing Editor, Christopher Howe, Reviewer, and Geoff McFadden, Reviewer
John McCutcheon, University of Montana.

In the interests of transparency, eLife includes the editorial decision letter and accompanying author responses. A lightly edited version of the letter sent to the authors after peer review is shown, indicating the most substantive concerns; minor comments are not usually included.

Thank you for submitting your article "Convergent origins of parasitism, new phylogenetic relationships and complex distribution of plastids in Apicomplexa" for consideration by eLife. Your article has been reviewed by three peer reviewers, and the evaluation has been overseen by John McCutcheon as the Reviewing Editor and Detlef Weigel as the Senior Editor. The following individuals involved in the review of your submission have agreed to reveal their identity: Christopher Howe (Reviewer #1); Geoff McFadden (Reviewer #3).

The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.

Summary:

The phylum Apicomplexa includes many medically important and biologically fascinating organisms. However, only a few species of medical importance have been studied with genomics, leaving unanswered many important questions about the evolution of their unusual non-photosynthetic plastids and their sometimes parasitic lifestyles. Using transcriptome sequencing from apicomplexan cells isolated from various marine invertebrates, the authors resolved some of the long-standing mysteries of Apicomplexa. They show that parasitism likely evolved not once but many times in Apicomplexa, and they show that plastid loss is likely more common than originally appreciated. The evolution of this important group of organisms is now much clearer.

Essential revisions:

There are five essential revisions that need to be made.

1) The difficulty of reconstructing the deeper regions of the phylogenies (because of biased sequence compositions and rates of evolution) is mentioned and dealt with thoroughly but needs to be made clearer in the text for readers who are not experts in phylogenetic reconstruction.

2) The potential problems associated with bacterial contamination in transcriptomes should be discussed more clearly in the main text. The reviewers thought that some reporting on gene-level support for each key gene not being the result of contamination would be useful. If most genes follow a certain pattern (for example, if all are found on the plastids of close relatives), then the results could be summarized in a sentence or two in the Results. If the results are more variable, then perhaps a table would be more appropriate. We appreciate that some detail on this is provided in the Materials and methods, and the main line of evidence seems to be that the transcripts group with plastid-genome-encoded sequences from other species where genomes are available, but this needs to be clearly articulated in the appropriate Results section. For example, were analyses performed in which all of the sequences corresponding to the previously-reported ARLs were included, or just centroids? More detail on these analyses might help the reader to understand the issue more clearly.

3) The existence of a related bioRxiv preprint (http://dx.doi.org/10.1101/636183) which reports many of the same findings needs to be acknowledged and discussed. The reviewers and editors feel that this would strengthen the paper and is the fair thing to do. The authors needn't worry about eLife rejecting their manuscript because of this bioRxiv preprint (see this article on eLife's policy on scooping: https://elifesciences.org/articles/30076).

4) The terminology related to Apicomplexa, apicomplexans, plastids, apicoplasts, etc. needs to be clarified and made consistent throughout the manuscript. It is clear that parasitism has arisen multiple times in this lineage. That's interesting. What is murky is whether the apical complex has arisen independently, which is what some of the language in this paper hints at (e.g., the Abstract; the last paragraph of the Introduction; and the subsection “Monophyly of gregarines and eugregarines, and polyphyly of apicomplexan parasites”). Where does this paper stand on parallel origins of the apical complex? Given that we have some structures in dinoflagellates that we think are homologues, when does a case of parallel evolution from a common ancestral structure become identifiable? Bat and bird wings are a nice example. Both arose independently from a common ancestral structure, namely the vertebrate forelimb. They are still homologous, but modification to aid flight arose convergently. Are the authors of this submission arguing that the apical complex arose independently multiple times, or are they simply saying that parasitism arose multiple times amongst the lineage that has Phylum Apicomplexa at its tip? There is a really big difference of course. The problem is that select lineages could also have lost or degraded apical complexes. We certainly see many parasites with stages that don't develop an apical complex. It seems that the authors are saying (at least in the title) that the apical complex arose once, and that parasitism has arisen multiple times amongst a lineage possessed of an apical complex, but the conclusion from the text is less clear.

The authors have chosen not to use the term apicoplast. Given its wide usage in the literature, a mention of the term early on, and justification for using the more general term plastid to embrace the homologous organelles in those relatives without an apicoplast complex such as the dinozoa, would be appropriate in the Introduction.

Another language issue is whether or not the authors are using formal or informal names for taxa. Happily, they adopt the systematics of Adl et al., 2018, which is widely accepted in the protistology field. The problems encountered in this submission were mainly when the informal name was capitalised (e.g. the tree in Figure 1). This is confusing and should be avoided as it hints at the name being a formal taxon. It is therefore preferable to avoid using informal taxon names at the start of sentences. Clear usage would also include the taxon rank (e.g. Phylum Apicomplexa) for formal taxon names, but lower case for informal categories (e.g. apicomplexans). It is also important to be consistent. For instance, in Figure 1 the authors use the term Cryptosporidians (the capital C should go) but in the text they frequently use the term cryptosporidia. Choose one informal name, and stick to it throughout.

5) The title needs to be made more precise and exciting.

Contributor Information

John McCutcheon, University of Montana.

Christopher Howe, University of Cambridge, United Kingdom.

Geoff McFadden, University of Melbourne, Australia.

2019; 8: e49662.
Published online 2019 Aug 16. doi: 10.7554/eLife.49662.029

Author response

Essential revisions:

There are five essential revisions that need to be made.

1) The difficulty of reconstructing the deeper regions of the phylogenies (because of biased sequence compositions and rates of evolution) is mentioned and dealt with thoroughly but needs to be made clearer in the text for readers who are not experts in phylogenetic reconstruction.

This is a good suggestion. To address it, we extended the description of the plastid 16S rDNA phylogeny (the bias affects only the analysis in Figure 3A). We clarified that the deep-level topology needs to be interpreted with caution and explained that unrelated but compositionally similar sequences can cluster together artificially. Although this bias does not affect any of our interpretations, it is better that readers are made aware of it (we ask them to compare the trees in Figures 3A and 1A). The relevant sentence in the Figure 3 legend was also reworded to clarify that “sequences in the tree vary greatly in their AT content and substitution rates, which can induce a misleading topology – deep relationships in the tree should therefore be interpreted with caution”.

2) The potential problems associated with bacterial contamination in transcriptomes should be discussed more clearly in the main text. The reviewers thought that some reporting on gene-level support for each key gene not being the result of contamination would be useful. If most genes follow a certain pattern (for example, if all are found on the plastids of close relatives), then the results could be summarized in a sentence or two in the Results. If the results are more variable, then perhaps a table would be more appropriate. We appreciate that some detail on this is provided in the Materials and methods, and the main line of evidence seems to be that the transcripts group with plastid-genome-encoded sequences from other species where genomes are available, but this needs to be clearly articulated in the appropriate Results section. For example, were analyses performed in which all of the sequences corresponding to the previously-reported ARLs were included, or just centroids? More detail on these analyses might help the reader to understand the issue more clearly.

We resolved these issues in the following ways. In the nuclear phylogenies section of the Results, we highlighted that single gene orthologs were identified by “maximum likelihood phylogenies – this allowed us to unambiguously identify paralogous and contaminant sequences (Materials and methods)”. In the plastid phylogeny section, we clarified how contaminants were identified and how the origin of individual genes was assigned, as requested: “Maximum likelihood phylogenies of all individual proteins allowed us to readily distinguish the apicomplexan sequences from bacteria and other contaminants in datasets (Materials and methods). In most phylogenies, the apicomplexan sequences form a cluster that is related to algal plastidial forms (confirming an origin in the plastid endosymbiont rather than eukaryotic host). The phylogenetic pattern is only different in genes that are derived by horizontal gene transfer from bacteria or that in fact localize outside of the plastid in Plasmodium (heme biosynthesis; see below).” Finally, a description of the global ARL phylogeny with regard to the selection of environmental sequences was added, as requested. We included it in the Results, where we briefly describe the tree and explain that it includes “newly identified VAMPS centroids and representative sequences of known ARLs”.

3) The existence of a related bioRxiv preprint (http://dx.doi.org/10.1101/636183) which reports many of the same findings needs to be acknowledged and discussed. The reviewers and editors feel that this would strengthen the paper and is the fair thing to do. The authors needn't worry about eLife rejecting their manuscript because of this bioRxiv preprint (see this article on eLife's policy on scooping: https://elifesciences.org/articles/30076).

We agree. This is an exciting report that reaches some similar conclusions based on a fully complementary set of apicomplexan transcriptomes. We referenced the preprint in the last paragraph of the Discussion, entitled “Summary and future directions”. We briefly compared the principal findings of the two studies and the agreement of their conclusions in two main areas: the polyphyly of Apicomplexa and multiple plastid losses in eugregarines. Importantly, while the two studies do not fully overlap, they do not fundamentally disagree on any particulars. We note that integrating the two datasets will provide a strong framework for understanding apicomplexan evolution.

4) The terminology related to Apicomplexa, apicomplexans, plastids, apicoplasts, etc. needs to be clarified and made consistent throughout the manuscript. It is clear that parasitism has arisen multiple times in this lineage. That's interesting. What is murky is whether the apical complex has arisen independently, which is what some of the language in this paper hints at (e.g., the Abstract; the last paragraph of the Introduction; and the subsection “Monophyly of gregarines and eugregarines, and polyphyly of apicomplexan parasites”). Where does this paper stand on parallel origins of the apical complex? Given that we have some structures in dinoflagellates that we think are homologues, when does a case of parallel evolution from a common ancestral structure become identifiable? Bat and bird wings are a nice example. Both arose independently from a common ancestral structure, namely the vertebrate forelimb. They are still homologous, but modification to aid flight arose convergently. Are the authors of this submission arguing that the apical complex arose independently multiple times, or are they simply saying that parasitism arose multiple times amongst the lineage that has Phylum Apicomplexa at its tip? There is a really big difference of course. The problem is that select lineages could also have lost or degraded apical complexes. We certainly see many parasites with stages that don't develop an apical complex. It seems that the authors are saying (at least in the title) that the apical complex arose once, and that parasitism has arisen multiple times amongst a lineage possessed of an apical complex, but the conclusion from the text is less clear.

This is an important point that has now been clarified in the text. The evidence is clear that apicomplexan parasites are polyphyletic, but this makes no assumption about single vs. multiple origins of their apical complex organelles. The reviewer understood us correctly that the apical complex originated only once, but this had not been stated explicitly. We have now expanded the text and references in the corresponding section of the Discussion, which now states explicitly that the “distribution points to a single, early origin of the apical complex in the ancestor of apicomplexans and dinoflagellates, in a non-parasitic context”. We then link the early presence of the apical complex to the emergence of convergent parasite morphologies, as outlined previously, so this new information fits in well.

The authors have chosen not to use the term apicoplast. Given its wide usage in the literature, a mention of the term early on, and justification for using the more general term plastid to embrace the homologous organelles in those relatives without an apicoplast complex such as the dinozoa, would be appropriate in the Introduction.

Yes, we have now introduced the term in the Introduction. We explain that “The apicoplast is a four-membrane plastid (a broader term we will use hereinafter to describe the organelle in both parasitic and free-living organisms)”.

Another language issue is whether or not the authors are using formal or informal names for taxa. Happily, they adopt the systematics of Adl et al., 2018, which is widely accepted in the protistology field. The problems encountered in this submission were mainly when the informal name was capitalised (e.g. the tree in Figure 1). This is confusing and should be avoided as it hints at the name being a formal taxon. It is therefore preferable to avoid using informal taxon names at the start of sentences. Clear usage would also include the taxon rank (e.g. Phylum Apicomplexa) for formal taxon names, but lower case for informal categories (e.g. apicomplexans). It is also important to be consistent. For instance, in Figure 1 the authors use the term Cryptosporidians (the capital C should go) but in the text they frequently use the term cryptosporidia. Choose one informal name, and stick to it throughout.

We agree. Given the complex taxonomic history of the group, different formal names or ranks are still being used for Apicomplexa and its subgroups (e.g., phylum vs. subphylum for Apicomplexa; note that Adl et al., 2019, move away from taxonomic ranks altogether). Since taxonomy is not the focus here, we have adopted informal names more widely in the text and in Figure 1A. We kept formal taxonomic group names in Figure 1—figure supplement 1 for comparison, and we point to this in the Figure 1 legend. The one formal name that we keep using more widely is “Apicomplexa”, which we now introduce as a phylum in the Introduction. We avoided informal names at the beginning of sentences and used a single form (“cryptosporidians”) throughout the manuscript, as suggested.

5) The title needs to be made more precise and exciting.

We rephrased the title in an active voice, which makes it more interesting and explicit at the same time: it highlights two key discoveries, which the reviewers also pointed out in their summary.


Articles from eLife are provided here courtesy of eLife Sciences Publications, Ltd
