Introduction
When organisms evolve a parasitic life strategy, this often results in gene losses and size reduction of the genomes (Tamas et al. 2001; Sakharkar et al. 2007; Lee and Marx, 2012). This is mainly because parasites utilize metabolites and proteins from their hosts, making some metabolic machineries redundant (Kemen et al. 2011). However, different host–parasite combinations and life history traits will cause the loss of different genes and pathways, and in some cases parasites might even acquire or evolve new metabolic pathways to complement those of their hosts (Pombert et al. 2012).
In most organisms, thiamine (vitamin B1) acts as an essential co-factor for several enzymes involved in carbohydrate and amino acid metabolism. Fungi, bacteria and plants can synthesize thiamine de novo, whereas most other organisms must salvage it from external sources.
De novo synthesis of thiamine requires enzymes that produce two essential moieties [4-amino-2-methyl-5-diphosphomethylpyrimidine and 4-methyl-5-(2-phosphoethyl)-thiazole], as well as an enzyme that combines these moieties into thiamine monophosphate (Helliwell et al. 2013, Fig. 1). In the primate malaria parasites Plasmodium falciparum, Plasmodium vivax and Plasmodium knowlesi, the genes coding for these three key enzymes (hydroxyethylthiazole kinase ThiM, hydroxymethylpyrimidine kinase ThiD and thiamine-phosphate diphosphorylase ThiE) have been found (Frech and Chen, 2011). Further, in P. falciparum, the proteins coded by the genes ThiM, ThiD and ThiE have been shown experimentally to exhibit the expected enzymatic functions in the thiamine pathway (Wrenger et al. 2006). However, these enzymes have been shown to be absent in the rodent malaria parasites (Plasmodium berghei, Plasmodium chabaudi and Plasmodium yoelii) (Frech and Chen, 2011). Likewise, in the sequenced genomes of apicomplexan parasites more distantly related to malaria parasites (i.e. Theileria, Babesia, Toxoplasma spp., Neospora spp., Eimeria spp. and Cryptosporidia spp.), none of the enzymes required for this de novo synthesis have been found (Müller and Kappes, 2007; Shanmugasundram et al. 2013). Although thiamine is essential for survival (Helliwell et al. 2013), loss of this pathway, which leads to thiamine auxotrophy (i.e. dependence on an external source of the vitamin), has occurred repeatedly in the course of evolution in eukaryotes, suggesting a trade-off between the cost of synthesizing the vitamin and the probability of salvaging it externally.
The absence of the pathway in all apicomplexan parasites except Plasmodium parasites infecting primates has led to the speculation that the essential genes for the pathway in primate Plasmodium were acquired by horizontal gene transfer from a bacterium (Frech and Chen, 2011).
Malaria parasites are not exclusive to rodents and primates, however, as many related Plasmodium species exist in lizards, bats and birds (Valkiunas, 2005; Martinsen et al. 2008; Schaer et al. 2013). The presence of the thiamine pathway in primate malaria species and its absence in rodent malaria parasites raises questions about its origin and possible evolutionary losses. Is the functional de novo thiamine synthesis pathway exclusive to Plasmodium parasites infecting primates? Or is thiamine synthesis a common trait in Plasmodium parasites that has, for unknown reasons, been lost in rodent malaria parasites? In order to gain insights into the evolutionary origin of the thiamine pathway in Plasmodium parasites, we searched for orthologous thiamine genes in two transcriptomes of the avian malaria parasites Plasmodium ashfordi and Plasmodium relictum, and in the genomes of the avian parasites Plasmodium gallinaceum and Haemoproteus tartakovskyi. The existence of this pathway in the avian Plasmodium and Haemoproteus parasites could indicate that de novo thiamine synthesis is the ancestral state of Plasmodium parasites and that its absence in the rodent malaria species is a secondary loss. To get a more detailed picture of evolutionary gains and losses of the thiamine pathway in the malaria parasite clade, we extended the search for thiamine genes to several genomes of recently sequenced species of primate and rodent Plasmodium.
Method
Sequences of the single-exon genes that encode the three key enzymes necessary for de novo synthesis of thiamine in P. falciparum, hydroxyethylthiazole kinase (PFL1920c, EC 2.7.1.50), hydroxymethylpyrimidine kinase (PFE1030c, EC 2.7.1.49) and thiamine-phosphate diphosphorylase (PFF0680c, EC 2.5.1.3), were downloaded from GenBank. They were then used to search for orthologous genes in the two avian Plasmodium transcriptomes, P. ashfordi (Videvall et al. 2017) and P. relictum, and in the genome of H. tartakovskyi (Bensch et al. 2016). Because the P. relictum transcriptome is currently unpublished, retrieved contigs were verified against the genome of P. relictum (Boehme et al. 2016, www.plasmodb.org) in order to validate that the contigs had assembled correctly. Searches were performed using local BLAST with the software Geneious ver. 6.1. Parameters for the local BLAST searches were set to: scoring (match mismatch): 2 −3, seed length: 18, word size: 11, gap costs (open extend): 5 2. The contig with the lowest E-value was kept and used in subsequent analyses. The obtained sequences were translated and used in a BLAST search against the NCBI non-redundant protein database (BLASTP: BLOSUM62, word size 3, exp. threshold 10, gap costs 11 1. BLASTN: exp. threshold 10, word size 11, match mismatch scores 2 −3, gap costs 5 2) to ensure that they were true orthologues.
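The rule of keeping the contig with the lowest E-value can be sketched in a few lines of Python. This is only an illustration: the searches in this study were run inside Geneious, so the 12-column tabular layout (the default column order of BLAST+ `-outfmt 6`), the function name `best_hits`, and the contig names and E-values in the example are all assumptions made for the sketch.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6). Using this file
# format is an assumption; the study ran BLAST through Geneious.
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def best_hits(tabular_text):
    """Return {query: (subject, evalue)}, keeping the lowest E-value per query."""
    best = {}
    reader = csv.reader(io.StringIO(tabular_text), delimiter="\t")
    for row in reader:
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        record = dict(zip(COLUMNS, row))
        evalue = float(record["evalue"])
        current = best.get(record["qseqid"])
        if current is None or evalue < current[1]:
            best[record["qseqid"]] = (record["sseqid"], evalue)
    return best


# Hypothetical hits: contig names and E-values are invented for illustration.
example = (
    "PFL1920c\tcontig_A\t78.1\t900\t150\t5\t1\t900\t10\t909\t1e-50\t350\n"
    "PFL1920c\tcontig_B\t70.2\t120\t30\t2\t50\t170\t5\t125\t1e-05\t60\n"
)
hits = best_hits(example)
```

Here `hits["PFL1920c"]` holds the subject and E-value of the retained contig; the second-best hit is discarded, mirroring the selection step described above.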
The genes from P. falciparum were further used to search the published mammalian malaria genomes of Plasmodium malariae, Plasmodium coatneyi, Plasmodium inui, Plasmodium fragile, Plasmodium gaboni, Plasmodium reichenowi and Plasmodium vinckei, retrieved from Plasmodb.org, release 34 (Aurrecoechea et al. 2009). The genes obtained from P. relictum were used to search the genome of P. gallinaceum in GeneDB.org (Logan-Klumpler et al. 2013).
Protein alignments were conducted using the MUSCLE-alignment algorithm, with 100 iterations and the default settings, as implemented in Geneious ver. 6.1.6.
Gene phylogenies of the obtained protein sequences for each of the genes were constructed using a maximum-likelihood method, and each tree was resampled using 500 bootstrap iterations as implemented in MEGA7 (Kumar et al. 2016). The substitution model was set to JTT + G + F (discrete γ distribution with five rate categories) for the genes ThiD and ThiE and to LG + G (discrete γ distribution with five rate categories) for the gene ThiM. Models were selected based on the lowest Bayesian information criterion scores after running the model test as implemented in MEGA7 (Kumar et al. 2016).
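Model selection by the Bayesian information criterion (BIC) follows the standard definition BIC = k ln(n) − 2 ln(L), where k is the number of free parameters, n the number of alignment sites and L the maximized likelihood; the model with the lowest score is preferred. The sketch below only illustrates that rule. The log-likelihoods, parameter counts and alignment length are hypothetical placeholders, not values from this study, and MEGA7 carries out this step internally.

```python
import math


def bic(log_likelihood, n_params, n_sites):
    """Standard BIC: k * ln(n) - 2 * ln(L); lower is better."""
    return n_params * math.log(n_sites) - 2.0 * log_likelihood


# Hypothetical (log-likelihood, number of parameters) per candidate model.
candidates = {
    "JTT+G+F": (-5210.0, 45),
    "LG+G": (-5225.0, 26),
    "WAG+G": (-5260.0, 26),
}
n_sites = 300  # hypothetical alignment length

scores = {name: bic(lnl, k, n_sites) for name, (lnl, k) in candidates.items()}
best_model = min(scores, key=scores.get)
```

With these placeholder numbers the extra parameters of the +F model are not repaid by its higher likelihood, so the simpler model receives the lowest BIC.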
Results
The orthologous genes encoding the three key enzymes involved in de novo synthesis of thiamine (hydroxyethylthiazole kinase, hydroxymethylpyrimidine kinase and thiamine-phosphate diphosphorylase) were found in all three newly sequenced unannotated avian haemosporidians P. relictum, P. ashfordi and H. tartakovskyi (GenBank nr: KP784838, KP784836, KP78483, KP784835, KP784833, KP784837, KP784841, KP784839, KP784841), and in the genome of the annotated avian parasite P. gallinaceum (GeneDB GeneID: PGAL8A_00461300, PGAL8A_00067800, PGAL8A_00142600). Local BLAST searches of the three avian parasites P. relictum, P. ashfordi and H. tartakovskyi yielded highly significant matches (E-values < 9.3 × 10⁻³¹) (Table 1). The second best hits had considerably higher E-values (ranging between 1 × 10⁻⁷ and 1 × 10⁻², Appendix 1). Gene annotations of the retrieved sequences were confirmed with BLAST searches against GenBank, which yielded highly significant E-values both for the nucleotide sequences (between 1 × 10⁻³² and < 1 × 10⁻¹⁰⁰) and for the translated sequences (1 × 10⁻⁴⁶ to 1 × 10⁻¹⁶⁴), all matching the expected annotated genes (Appendices 2 and 3). These results strongly indicate that the orthologous thiamine genes have been identified correctly in P. ashfordi, P. relictum and H. tartakovskyi. Within the genomes of P. malariae, P. coatneyi, P. inui, P. fragile, P. gaboni and P. reichenowi, all three genes were located, and the orthology was confirmed with conserved protein alignments (Supplementary Fig. S1a–c) and phylogenetic clustering with the previously annotated genes (Fig. 2). However, none of the thiamine genes were found in the genome of the newly sequenced rodent malaria parasite P. vinckei, which, together with P. yoelii, P. berghei and P. chabaudi (Frech and Chen, 2011), brings the total to four rodent malaria species found to lack these genes.
All protein sequence alignments showed large regions with highly conserved blocks of amino acids (Supplementary Fig. S1a–c) as well as an overall high level of sequence similarity, further strengthening the case that we had retrieved the orthologues of the genes coding for the enzymes in the thiamine biosynthesis pathway.
In no case did the retrieved orthologous genes contain stop codons within the proposed exons. The transcriptomes of P. relictum and P. ashfordi were derived from the erythrocytic phase of infection (for methods see Videvall et al. 2015), and the presence of expressed thiamine genes during this stage demonstrates that the genes are not only present in these genomes, but are indeed activated and transcribed during this part of the infection cycle. The full length of the genes was obtained for P. relictum and H. tartakovskyi, whereas the ThiD and ThiE transcripts from P. ashfordi were slightly shorter. The shorter transcripts are most likely an artefact of insufficient sequence coverage of the P. ashfordi transcriptome, which has resulted in a slightly lower mean and median length of the transcripts in comparison with P. falciparum (Videvall et al. 2017).
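A check for premature stop codons of the kind reported above can be sketched as follows. The sketch assumes the standard genetic code and an in-frame coding sequence starting at its first base; the function names `translate` and `internal_stops` and the two short test sequences are hypothetical and are not taken from the study.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(seq):
    """Translate an in-frame DNA/RNA sequence; unknown codons become 'X'."""
    seq = seq.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, usable, 3))


def internal_stops(seq):
    """Return 0-based codon positions of stop codons before the final codon."""
    protein = translate(seq)
    return [i for i, aa in enumerate(protein[:-1]) if aa == "*"]


# Hypothetical toy sequences, not real gene fragments.
clean = "ATGAAATGA"       # M K *  -> only the terminal stop
broken = "ATGTAAAAATGA"   # M * K * -> one premature stop at codon 1
```

An empty list from `internal_stops` corresponds to an open reading frame that is uninterrupted up to its terminal codon.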
Discussion
In apicomplexan parasites, the metabolic pathway for de novo synthesis of thiamine is not exclusive to malaria parasites infecting primates (Table 2, Fig. 3). This conclusion is based on our finding of the key enzymes in the thiamine pathway being expressed in two avian Plasmodium species and their gene orthologues present in the genomes of a third avian Plasmodium species and a parasite in the sister genus Haemoproteus.
Species in bold represent findings presented in this study, whereas non-bold species represent previously reported findings from Frech and Chen (2011).
a Represents a degenerated gene fragment of ThiM found in P. yoelii.
There are several competing hypotheses regarding the phylogeny of Plasmodium species, and in particular the evolutionary relationship between P. falciparum and avian Plasmodium species (Perkins, 2014). However, there is a strong consensus that the rodent malaria parasites form a single monophyletic clade (Martinsen et al. 2008; Pick et al. 2011; Bensch et al. 2016; Borner et al. 2016). Further, recent phylogenetic analyses based on genome sequencing of H. tartakovskyi show that Haemoproteus is a sister taxon to a monophyletic clade of all Plasmodium species (Bensch et al. 2016). Thus, our findings indicate that the thiamine pathway is ancestral to the whole group of Plasmodium, as the genes are found in the sister genus Haemoproteus as well. In the genomes of rodent malaria parasites, a remnant of these genes can be seen in P. yoelii, which has one gene relict of the hydroxyethylthiazole kinase (ThiM) (Frech and Chen, 2011). The reason why rodent Plasmodium species have lost the genes involved in the thiamine synthesis pathway remains unknown. The four rodent malaria parasites that have been sequenced are all derived from isolates kept for a number of generations in laboratory mice fed on grain, which is rich in thiamine. Loss of the thiamine genes due to such artificial selection is possible but unlikely, as it would require parallel independent losses in all four species. A more parsimonious explanation is that the loss happened prior to the radiation of the rodent malaria parasites.
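The parsimony argument can be made concrete with a small Fitch-style count of the minimum number of state changes on a tree. The topology below is a simplified, hypothetical arrangement that only encodes the two relationships stated above (rodent parasites monophyletic, Haemoproteus sister to Plasmodium); the placement of the avian and primate species relative to each other is an arbitrary assumption, and Fitch parsimony treats gains and losses as equally costly.

```python
def fitch(node, states):
    """Return (candidate ancestral states, minimum number of changes)."""
    if isinstance(node, str):  # leaf: its observed state, no changes
        return {states[node]}, 0
    left, right = node
    left_set, left_cost = fitch(left, states)
    right_set, right_cost = fitch(right, states)
    shared = left_set & right_set
    if shared:  # children agree on at least one state: no extra change
        return shared, left_cost + right_cost
    return left_set | right_set, left_cost + right_cost + 1


# Hypothetical binary topology written as nested tuples.
tree = ("H_tartakovskyi",
        (("P_relictum", "P_falciparum"),
         (("P_berghei", "P_yoelii"), ("P_chabaudi", "P_vinckei"))))

# 1 = thiamine genes present, 0 = absent (as reported in the text).
pathway_present = {
    "H_tartakovskyi": 1, "P_relictum": 1, "P_falciparum": 1,
    "P_berghei": 0, "P_yoelii": 0, "P_chabaudi": 0, "P_vinckei": 0,
}

root_states, min_changes = fitch(tree, pathway_present)
```

On this assumed tree a single change, placed on the branch leading to the rodent clade, is enough to explain the observed pattern, compared with four changes if each rodent species had lost the genes independently.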
Haemoproteus and avian Plasmodium are extremely species-rich genera of pathogens (Bensch et al. 2009), with numerous and mainly unstudied host–pathogen combinations. These parasites might therefore prove to be suitable systems for investigating repeated gene losses in order to find common evolutionary denominators for when this essential pathway is lost. It would be of particular interest to investigate the presence of the thiamine pathway in avian malaria species that are specialists on granivorous and insectivorous hosts, to elucidate whether gene losses can be associated with the hosts’ food preferences.
Frech and Chen (2011) speculated that the three key enzymes for thiamine synthesis in primate malaria parasites originated via horizontal gene transfer from a bacterium. They found that these genes show sequence similarity to genes of Clostridium spp., and that their location next to each other on the same strand in the Clostridium genome enables the formation of a potential operon (Frech and Chen, 2011). If these genes have been acquired through horizontal gene transfer, then this event must have happened prior to the split of the genera Haemoproteus and Plasmodium (Fig. 3), thus being ancestral to all Plasmodium species. On the other hand, we cannot exclude that these genes constitute the ancestral state of all apicomplexans, followed by repeated losses throughout the phylogeny but being kept within the clade of the haemosporidian species (Fig. 3).
The presence of the de novo pathway of thiamine across primate Plasmodium species has made it a potential target for antimalarial drug research (Chan et al. 2013). As this metabolic pathway does not exist in the most important model organisms in malaria research, rodent Plasmodium, it has been difficult to explore the potential for using the thiamine pathway in drug development. In the early days of malaria research, birds were frequently used as model organisms and served in some of the most fundamental breakthroughs in malaria research and drug development (von Wasielewski, 1904–08; McGhee et al. 1977; Cox, 2010). The presence of the thiamine pathway in avian malaria species highlights the value of studying birds as a complementary model system in the search for antimalarial drug targets.
Supplementary material
The supplementary material for this article can be found at https://doi.org/10.1017/S0031182017002219.
Authors’ contributions
OH conceived the study, performed the analysis and wrote the first draft of the manuscript. EV performed the bioinformatic work on P. ashfordi and P. relictum. SB performed the bioinformatic work on H. tartakovskyi. All authors participated in interpreting the data, contributed to the writing and approved the final manuscript. GenBank accession numbers of the ThiD, ThiM and ThiE genes from P. ashfordi, P. relictum and H. tartakovskyi have been submitted by the authors and are to be made public upon publication.
Funding
This research was funded by the Swedish Research Council (VR) and the Crafoord Foundation.
Competing interests
The authors declare that they have no financial or non-financial competing interests, and no competing interests from commercial organizations.
Availability of data and material
All data used in the study have been published in public databases and are available from the authors on reasonable request.
Appendix 1
Appendix 2
Appendix 3
BLASTn results of retrieved genes in the transcriptomes and genome of Plasmodium relictum (SGS1), Plasmodium ashfordi (GRW2) and Haemoproteus tartakovskyi (Ht).
GRW2_hydroxyethylthiazole kinase (KP784835)
Plasmodium falciparum 3D7 hydroxyethylthiazole kinase, putative (PFL1920c) mRNA, complete cds, XM_001350754.1, e = 8 × 10⁻¹⁴⁷
GRW2_hydroxylmethylpyrimidine kinase KP784833
Plasmodium falciparum 3D7 phosphomethylpyrimidine kinase, putative (PFE1030c) mRNA, complete cds, XM_001351727.1, e = 3 × 10⁻¹⁰¹
GRW2_Thiamine-phosphate dihydrophosphorylase KP784837
Plasmodium falciparum 3D7 thiamine-phosphate pyrophosphorylase, putative (PFF0680c) mRNA, complete cds, XM_961034.2, e = 9 × 10⁻⁷¹
Ht_hydroxyethylthiazole kinase KP784841
Plasmodium falciparum 3D7 hydroxyethylthiazole kinase, putative (PFL1920c) mRNA, complete cds, XM_001350754.1, e = 3 × 10⁻⁹⁶
Ht_hydroxylmethylpyrimidine kinase KP784839
Plasmodium falciparum 3D7 phosphomethylpyrimidine kinase, putative (PFE1030c) mRNA, complete cds, XM_001351727.1, e = 1 × 10⁻⁴⁵
Ht_Thiamine-phosphate dihydrophosphorylase KP784841
Plasmodium falciparum 3D7 thiamine-phosphate pyrophosphorylase, putative (PFF0680c) mRNA, complete cds, XM_961034.2, e = 2 × 10⁻³²
SGS1_hydroxyethylthiazole kinase KP784838
Plasmodium falciparum 3D7 hydroxyethylthiazole kinase, putative (PFL1920c) mRNA, complete cds, XM_001350754.1, e = 0.0
SGS1_hydroxylmethylpyrimidine kinase KP784836
Plasmodium falciparum 3D7 phosphomethylpyrimidine kinase, putative (PFE1030c) mRNA, complete cds, XM_001351727.1, e = 2 × 10⁻¹⁷⁹
SGS1_Thiamine-phosphate dihydrophosphorylase KP78483
Plasmodium falciparum 3D7 thiamine-phosphate pyrophosphorylase, putative (PFF0680c) mRNA, complete cds, XM_961034.2, e = 8 × 10⁻¹³⁷