INTRODUCTION
“Parasitologists have more chances to make mistakes than most other biologists, and they probably do so.”
- Harold W. Manter, 1969
The practice of systematic biology involves two main components. The first is taxonomy, which aims to understand the diversity of living organisms and involves the description of species and the classification of organisms into higher groups. The second is phylogenetic analysis, or the inference of evolutionary relationships, with which it is possible to understand the processes that generate biological diversity (Whitehead, 1990). The two components are complementary: phylogenetics can often be used to inform taxonomy, in that the detection of monophyletic groups within a phylogeny supports hypotheses of classification. Parasitology has a rich history in the former component, taxonomy, but phylogenetic studies have not been nearly as common. For example, in 2009 the journal Systematic Parasitology published only a single article that contained a phylogenetic analysis, yet over 50 taxonomic papers appeared in the same journal that year. However, in order to understand the evolutionary history of a group of organisms and to conduct accurate comparisons of traits or properties between organisms, a reliable phylogenetic tree must be in hand. The importance of parasites in medical and agricultural settings, and the need to understand their origins and patterns of evolution, call for a comprehensive and accurate understanding of their histories. In what follows, we attempt to give an overview of some of the challenges of doing systematic work on parasites. These include issues that affect the morphological study of parasites, as well as the caveats that attach to molecular data, which, although rich in information and free of the problems that affect morphological studies, bring difficulties of their own.
This is not meant to be a synthetic review of the morphological and molecular studies of parasite systematics; that would take a volume of its own. Rather, we hope that the reader will come away with a tidy summary of these issues, which we have illustrated wherever possible with examples from various parasite groups.
The very first ‘phylogenetic trees’ for parasites, and for other organisms, began as simple, mostly hand-drawn sketches depicting suggestions of how various taxa might be related. Quite often, these phylogenies used the hosts as the predominant organizing principle. For example, one of the first phylogenies in the literature of the malaria parasites (Mattingly, 1983) shows the parasites divided into bird, mammal and squamate reptile ‘branches’ with Plasmodium as a terminal taxon in all three. The cladistic revolution of the second half of the 20th century provided a straightforward method to produce a reliable tree: specimens are collected and examined, with data collected for a set of homologous characters (characters shared through common ancestry). These data, in the form of either discontinuous characters, such as the presence or absence of a particular structure, or continuous characters, such as morphometric information, are then used to produce a matrix. A phylogenetic algorithm is applied to this matrix to produce a diagram in the form of a bifurcating tree that displays the cladogenesis, or branching, of species over time.
Now, in the 21st century, we are witnessing the increasing utility of phylogenetics in various fields. Phylogenetic trees have become integral to studies in developmental evolution, biogeography, life history evolution, and conservation biology, and their popularity in other fields is growing. Entire fields built around them, such as community phylogenetics and phylogeography, are growing as well. This revolution has been fueled, in part, by the increasing ease with which we are able to incorporate molecular characters, in the form of DNA sequence data, into phylogenetic analysis. Also, improvements in the computational speed of programs for statistical phylogenetic methods such as maximum likelihood and Bayesian analysis have made them increasingly popular and powerful.
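The matrix-to-tree step described above can be illustrated with a minimal sketch. It assumes a hypothetical presence/absence matrix for four made-up taxa (A to D) and uses Fitch parsimony, one of several possible optimality criteria, to count the character-state changes each candidate tree requires. The taxon names, the data, and the function name `fitch_length` are illustrative assumptions and are not drawn from any study cited here.

```python
# Hypothetical presence/absence matrix (illustrative only, not real data):
# one row per taxon, one column per homologous character.
matrix = {
    "A": [1, 0, 0],
    "B": [1, 1, 0],
    "C": [0, 1, 1],
    "D": [0, 0, 1],
}


def fitch_length(tree, matrix):
    """Minimum number of state changes (Fitch parsimony) on a rooted
    binary tree written as nested 2-tuples of taxon names."""
    n_chars = len(next(iter(matrix.values())))
    total = 0
    for i in range(n_chars):
        changes = 0

        def state_set(node):
            nonlocal changes
            if isinstance(node, str):          # a tip: its observed state
                return {matrix[node][i]}
            left, right = (state_set(child) for child in node)
            common = left & right
            if common:                         # children agree: no change needed
                return common
            changes += 1                       # children disagree: one change
            return left | right

        state_set(tree)
        total += changes
    return total


# The three possible groupings of four taxa, rooted arbitrarily for scoring.
candidates = [
    (("A", "B"), ("C", "D")),
    (("A", "C"), ("B", "D")),
    (("A", "D"), ("B", "C")),
]
best = min(candidates, key=lambda t: fitch_length(t, matrix))
```

With this toy matrix the grouping of A with B and C with D requires the fewest changes and would be preferred under parsimony. Real analyses search far larger tree spaces with dedicated software; the sketch is meant only to show how a character matrix is turned into a score for a tree.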
WHY MORPHOLOGY CAN FAIL
Paucity of morphological characters
Parasites are often cited as “Cheshire cats” of the animal kingdom: organisms whose body plans are sometimes reduced to little more than nutrient-absorbing egg factories, with losses of appendages and many other morphological features. The classic example is that of Myxozoa, unicellular parasites that are now known to be a highly reduced version of some metazoan, though whether it is a cnidarian or a bilaterian remains somewhat up for debate (Evans et al. 2010). This generalization is not entirely accurate, however. Parasitic nematodes, for example, are no less complex than their free-living relatives (Poulin, 2007). Nevertheless, loss of certain characters, small size, and stasis in other morphological features in parasites can present a real challenge in terms of acquiring a sizeable data matrix for phylogenetic analysis, without which character conflict and polytomies are almost certain to occur. For example, Brooks's (1979) study of the Proterodiplostomidae (Trematoda: 18 genera) consisted of 14 characters in the data matrix, Carmichael's (1984) analysis of the Schistosomatidae (Trematoda: 14 genera) contained only 24 characters, and Morand and Müller-Graf's (2000) re-analysis of Carmichael (1984), which involved re-coding and splitting his data, still yielded only 37 characters. Not surprisingly, all of these studies produced topologies that show a lack of resolution in one or more parts of the tree. Larger data sets, such as Hoberg et al.'s (1997) matrix of 49 morphological characters for the eucestode orders, do sometimes produce well-resolved topologies; in this case, a perfectly resolved, but also perfectly pectinate, tree.
Unicellular parasites, of course, present an even bigger challenge for constructing morphological character matrices in that these individual cells are often incredibly difficult to classify morphologically. For parasites within the order Haemosporidia, which contains the malaria parasites and closely related genera, ‘morphology’ is sometimes little more than measurements of the length and width of the cells, with occasional data on the area of the parasite. Ultrastructural data from electron microscopy have provided some morphological characters for several workers. However, owing to the time and technique necessary for proper parasite preparation, the limited availability of equipment, and the sensitivity of the parasite cells themselves, these data have primarily been collected only for organisms that can be cultured in the laboratory, either in vitro or in vivo.
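A back-of-envelope calculation shows why such small matrices struggle. A fully resolved rooted tree of n terminal taxa contains n − 2 clades besides the group of all taxa, and a binary character with no homoplasy can support at most one clade. The sketch below applies that bound to the matrix sizes quoted above. It assumes, hypothetically, that each genus was a single terminal and that every character was scored as binary; the original studies may have used multistate characters or different terminals, so the numbers are illustrative only. The function names are invented for this example.

```python
def clades_needed(n_taxa):
    """Clades (excluding the all-taxa group) in a fully resolved rooted tree."""
    return max(n_taxa - 2, 0)


def resolution_shortfall(n_taxa, n_binary_chars):
    """Clades that must remain unsupported even if every binary character
    were perfectly congruent with the tree (a best-case bound)."""
    return max(clades_needed(n_taxa) - n_binary_chars, 0)


# (terminal taxa, characters) as quoted in the text; treating genera as
# terminals and characters as binary is an assumption of this sketch.
studies = {
    "Proterodiplostomidae": (18, 14),
    "Schistosomatidae": (14, 24),
    "Schistosomatidae, re-coded": (14, 37),
}

for name, (taxa, chars) in studies.items():
    print(name, clades_needed(taxa), resolution_shortfall(taxa, chars))
```

Under these assumptions the first matrix could not resolve every node even in the best case, whereas the other two clear the bound. A lack of resolution in matrices that clear it would then point to character conflict rather than to character number alone.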
The paucity of morphological characters in parasites makes them especially prone to harbouring cryptic diversity. Perkins (2000) examined a malaria parasite of Caribbean lizards that had been described as a single taxon that shifted between its host's erythrocytes and leucocytes. Molecular data from the parasites’ mitochondrial cytochrome b genes, however, showed a complete segregation in the genotypes from infections observed in red blood cells versus white blood cells, suggesting the occurrence of reproductive isolation. Cryptic species have also been observed after molecular analysis in other malaria parasites, including those that infect birds and humans (e.g. Win et al. 2004; Martinsen et al. 2006), as well as in diplomonads (e.g. Monis, 1999), acanthocephalans (e.g. Steinauer et al. 2007; Martínez-Aquino et al. 2009), trematodes (e.g. Donald et al. 2004; Leung et al. 2009; Hayward, 2010), nematodes (e.g. Li et al. 2005), cestodes (e.g. Marques et al. 2007; Lavikainen et al. 2008), and trypanosomes (e.g. Sehgal et al. 2001). Indeed, with the increasing application of molecular analysis to different parasite groups, the occurrence of cryptic diversity has become a very common and well-supported hypothesis (de León and Nadler, 2010).
A number of such molecular studies suggest that we have severely underestimated species diversity in some parasite groups (e.g. Bensch et al. 2000; Locke et al. 2010; Poulin, 2011a). These results indicate that molecular analysis may be necessary to better understand the distribution of parasite diversity amongst hosts, as well as to reveal the range of host use and degree of host specialization that cannot fully be captured by morphological study alone.
Fixation issues
For many blood parasites, including the trypanosomes, haemogregarines, and piroplasmids, morphological characters are typically gleaned by light microscopy of stained thin blood smears. These samples may not be prepared in uniform conditions in the field, resulting in variation in parasite size and shape. The fixation process, whereby blood cells are air-dried on a glass microscope slide and then dehydrated with alcohol, can distort the parasites. Differences in the staining procedure, including the techniques and reagents used, can also produce variation in the parasites. In addition, environmental conditions surrounding slide preparation, such as temperature and humidity, may induce differences in parasite shape, colour and form.
For helminths, a number of factors affect the ability to recover morphological characters, including the source of the host (e.g. market, freshly caught, iced), the degree of care with which specimens are removed, how the worms are relaxed, the method of preservation, and how the specimens are mounted. Cribb and Bray (2010) detail how parasitologists have varied in the techniques used to flatten or compress trematodes on slides, with some using simply the weight of the cover glass and others developing more elaborate techniques of applying slight pressure to the worms. Clearly, this variation can make it difficult to produce consistent specimens between labs or even amongst different researchers. Preparation methods are not consistent between groups of parasitic organisms, such that the techniques that yield good specimens in one group may not be suitable for another group (Bullock, 1969). Removing specimens from hosts that have already been fixed (i.e. in situ fixation of parasites, as opposed to their proper removal from recently deceased hosts prior to fixation) can seriously distort the size and shape of parasite structures, and species descriptions based on any material obtained in this manner may differ in morphological characteristics from those based on fresh material (Criscione and Font, 2001).
Host-induced variation
Populations of organisms in the natural world are expected to show some degree of variation; after all, it is this variation that is the fodder for evolutionary change. However, the challenge for systematic biologists is to determine, first, to what degree that variation transcends boundaries for what they hope to establish as species and, second, to tease apart variation that may have been induced by the environment and which does not have a heritable, genetic basis, so that true autapomorphies can be found. Parasites can present special challenges in this regard, particularly when their ‘environments’ consist of different host individuals and different host species as they pass through their life cycles. Indeed, there are numerous examples of parasites that exhibit different morphologies when found in different host environments. Within a single host species, variation in parasite size due to variation in host body condition is well documented in the literature: the larger the host or the better its condition, the better the nutritional resources for the parasite and the larger it may grow. Read and Rothman (1957) observed, for example, that the size (length) of the cestode Hymenolepis diminuta was greater in rat hosts that were either older or that had a better quality of carbohydrates in their diet. Other factors that can also influence the size of parasites include the total number of parasites per host, a host's previous exposure to the parasite, and the presence of other parasite species within the same host (Haley, 1962).
Variation in parasite size and other morphological features has also been noted for a wide range of parasites found to infect different host species, with a few examples provided here. Bruce et al. (1961) experimentally infected thirteen different species of mammals from the Washington, DC area and from Florida with Schistosoma mansoni and observed a large degree of variation in the size of the worms as well as variation in infection sites amongst the various host species. Variation has also been noted for the malaria parasite Plasmodium floridense, a generalist parasite capable of infecting a diversity of lizard species. Jordan (1975) and Jordan and Friend (1971) showed that in Georgia, where the parasite infects Anolis carolinensis and Sceloporus undulatus, the parasite produced a different number of merozoites per meront in each of the two host species, a character that has often been used for taxonomic purposes in defining species of Plasmodium.
Study of the monogenean genus Gyrodactylus, a speciose group of ectoparasites of fish, also reveals the plastic nature of parasite morphology and the heavy influence of the host environment. The primary character used to differentiate species for these fish parasites has been the size and shape of the haptoral hard parts. Dmitrieva and Dimitrov (2002), however, found that both size and shape of components of these haptors varied depending on the temperature of the water, geographic location and host species. Although these authors conclude that such variation may not hamper accurate taxonomic identification if large enough series are examined and the locale and the host are known, it is easy to see how naïve investigators or inaccurate labeling could result in errors. And, though this is a single group, it raises the possibility of plasticity in other helminths due to the environmental conditions experienced during the phases of growth and development.
Further examples of host-dependent variation in parasite morphology demonstrate that intermediate or vector hosts can also play a large role in the host environment of a parasite. One of the classic examples of host-induced morphology is that of Echinococcus granulosus in Australia, where the type of intermediate host that the parasite uses (ungulate versus macropod) alters the morphology of the hooks that can be observed in the adult worms (Hobbs et al. 1990). Blasco-Costa et al. (2010), upon examination of digenean species of the genus Saccocoelium found to infect mullets in the Mediterranean, suggest a strong influence of host type on parasite morphology. Although molecular data corroborate some of the morphologically determined species of Saccocoelium, the authors also observed discordance between morphological and molecular variation. A number of quite distinct morphotypes of parasites displayed no genetic variation upon examination of both 28S and ITS2 sequences, suggesting an environmental cause of changes in parasite morphology (Blasco-Costa et al. 2010). In addition, the authors noted that divergent clades in a phylogeny corresponded not to differences in the definitive fish host species, as the parasites were found to infect the same fish species; rather, the divergence was correlated with changes in intermediate snail host groups. Similarly, Martinsen et al. (2008) found that the blood parasites of birds that identify to the traditionally defined genus Haemoproteus in fact comprise two quite distinct clades upon phylogenetic analysis, with each clade corresponding to a different insect vector. Again, these examples point to the benefit of incorporating molecular data into the study of parasite diversity.
Convergence
The general phenomenon of convergence in parasites is frequently discussed and was recently deftly reviewed by Poulin (2011b), who outlines the six “universal strategies” for animal parasites. Convergence can also be a problem for the collection of morphological data for parasite systematics. Wiens et al. (2003) suggested that convergence is expected when species invade and become adapted to similar selective environments, and that these species evolve a shared phenotype as a result of these selection pressures, as opposed to common ancestry. Certainly, the internal or external environments of many host species may be functionally similar for the parasites that inhabit them, and adaptation to these common environments may be expected to sometimes engender convergent morphologies among divergent parasite species. For example, in a comparative study of morphological characters versus molecular data for the braconid wasps, Quicke and Belshaw (1999) discovered that the incongruence observed in the topologies obtained from the two sources of data was the result of convergence in morphologies to an endoparasitic lifestyle in those species. Detailed study of the Parahaemoproteus parasites of birds suggests a similar scenario (Martinsen, unpublished observations). In a multi-gene phylogenetic analysis of over a dozen species of parasites distributed across a number of bird families, parasites infecting closely related bird species shared identical morphology upon microscopic examination of several key diagnostic morphological characters. Many of these parasites ascribed to the same Parahaemoproteus species were quite divergent, falling into different parts of a molecular phylogeny, which suggests a convergence in morphology based on host cell environment.
However, it is unclear whether this variation in morphology is adaptive or not or whether it may be a developmental response to the internal conditions within the host cell.
BUT, MOLECULES COME WITH THEIR OWN PROBLEMS
Molecular characters in the form of DNA sequences offer numerous advantages over morphological data. In addition to the scenarios described above, the sheer abundance of characters is an important reason for choosing to incorporate molecular data into a phylogenetic analysis. Evolutionary models can also be more easily incorporated into molecular systematic algorithms, which allows for more sophisticated statistical analyses of phylogenetic trees, such as with maximum likelihood and Bayesian methods. The use of molecular data for studying the systematics of parasites is rapidly accelerating and will undoubtedly continue to do so. However, the molecular phylogenetic study of parasites presents its own set of problems and caveats.
Isolation of parasite DNA
The isolation of DNA from specimens presents, in itself, the first challenge to the molecular systematic analysis of parasites. For unicellular parasites, the problem may simply be a matter of procuring enough quality material, though recently developed whole-genome amplification methods are allowing for better recovery of sequences (even whole genomes) from single cells (Palinauskas et al. 2009; Wang and Bodovitz, 2010). Unicellular parasites that live within host cells can present additional challenges in that separation from host cells may be nearly impossible. In this case, sequences can be recovered only via the use of taxon-specific primers. Past methods of fixation of specimens also hamper efforts to obtain genetic data, which can pose challenges to revisionary systematic work when fresh samples of some isolates or even host species may no longer be obtainable. Many parasites are fragile, if not also single-celled, which inevitably results in difficulty in isolating quality DNA, as the various procedures needed to prepare parasites for morphological examination also result in DNA degradation. Although genetic data have been successfully obtained from parasite taxa from fixed and stained blood smears (Beadell et al. 2006), this has been the case only when the gene fragments to be amplified are extremely short and the slides were not additionally preserved with balsam and cover slips. Helminth samples are frequently vouchered as either formalin-fixed or mounted specimens, so obtaining high quality DNA is troublesome in these cases as well.
Fixation methods such as DESS (dimethyl sulfoxide, disodium EDTA, and NaCl) attempt to strike a balance between not distorting specimens for morphological studies and yielding usable macromolecules for molecular studies, but even this method is not without limitations and remains a compromise for both (Naem et al. 2010). What further complicates this entire issue is that it is extremely difficult, if not impossible, to have a specimen that serves simultaneously as both a morphological voucher and one that can be used for molecular studies, as the preparation for the latter will often be destructive of the former. In systematic studies of vertebrates, a small piece of tissue can be removed, and in those of insects, one leg (or even, sometimes, the whole body; Gilbert et al. 2007) can be used for extraction of DNA without greatly compromising the specimen. For many other invertebrate groups, including parasites, this is more challenging. For single-celled organisms, it becomes nearly impossible, although individual extracellular parasite stages have been successfully used for both a morphological specimen and a DNA sequence (e.g. Dolnik et al. 2009).
Traditional molecular markers
In the early days of molecular systematics, the most popular gene used was the small subunit ribosomal gene, or the 18S rRNA gene. In most eukaryotes, the ribosomal RNA genes have two features that make them attractive for molecular work: they exist in multiple copies (sometimes hundreds) in the genome, so there is abundant template to amplify; and they contain both highly conserved regions, which allow for relatively simple design of PCR primers, and variable regions, which contain phylogenetic information. In the 1990s, a multitude of systematic studies were produced based on these markers, many of which came with unexpected results. An often-cited surprise from the use of these genes was the grouping of the virulent human malaria parasite, Plasmodium falciparum, with a species infecting chickens, Plasmodium gallinaceum (Waters et al. 1991). However, it soon became known that the 18S genes of malaria parasites not only exist as a small number of distinct, single-locus copies that are not evolving concertedly (Rogers et al. 1995), but that these copies are differentially expressed during the life cycle of the parasite and thus are subjected to different forces of selection. Subsequent studies using other genes and/or taxa did not support the grouping of P. falciparum with avian malaria parasites (Qari et al. 1996; Escalante et al. 1998; Perkins and Schall, 2002; Martinsen et al. 2008).
The issue of the paralogy of rRNA genes is not generally of concern in phyla other than Apicomplexa, and certainly, for the reasons mentioned above, they formed the basis of virtually all of the early molecular systematic studies of various parasite groups, including those on ciliates (e.g. Wright and Lynn, 1995), myxosporeans (e.g. Eszterbauer, 2004), and platyhelminths (e.g. Campos et al. 1998; Mariaux, 1998). But there are still caveats to be heeded when using these data, particularly the potential difficulty and thus ambiguity in their alignment. Indeed, it was a study of apicomplexan 18S sequences that first showed that discrepancies in sequence alignment affected the topology of the tree far more than the phylogenetic algorithm used to analyze the data (Morrison and Ellis, 1997).
Other molecular phylogenetic studies of parasites have incorporated the internal transcribed spacer regions (ITS1 and ITS2), which flank the rRNA genes, as the markers of choice. These fragments share many of the same advantages as described above for the ribosomal RNA markers, but are generally thought to show more rapid rates of evolution due to decreased selective constraints. However, these loci should also be used with caution, as intra-individual variation has been observed in many different kinds of organisms including cestodes (Kralova-Hromadova et al. 2010), trematodes (Van Herwerden et al. 1998), protists (Gondim et al. 2004), and parasitic arthropods and vectors (Rich et al. 1997; Leo and Barker, 2002; Bezzhonova and Goryacheva, 2008).
The other popular choices for markers for molecular systematic studies are the mitochondrial genes, such as cytochrome b, cytochrome oxidase I and the mitochondrial rRNA genes, 16S and 12S. These offer molecular systematists many of the same advantages as the nuclear rRNA genes, such as high copy number and conserved primer sites, but many of these genes also evolve at a higher rate than rDNA and thus can be more useful for discriminating closely related species. In addition, these genes are maternally inherited and thus haploid by nature, and as such are valued for molecular phylogenetic analysis due to their lack of recombination. Systematists who work on metazoan parasites have used a variety of mitochondrial loci (e.g. Brant and Orti, 2003; Ketmaier et al. 2007). The gene cytochrome oxidase subunit I (coxI) in particular, the preferred marker in ‘DNA barcoding’ (see below), has been employed in many studies across a diversity of parasite taxa (e.g. Hu et al. 2005; Steinauer et al. 2007; Ferri et al. 2009). Among the advantages of this marker is the occurrence of highly conserved primer sites across disparate metazoan groups (e.g. Folmer et al. 1994; Simon et al. 1994, 2006), but these conserved regions are not shared among many parasites. For example, the design of universal primers within Platyhelminthes is inhibited by an absence of conserved mitochondrial regions, necessitating a more laborious process of primer development at lower taxonomic levels (Moszczynska et al. 2009).
Additionally, for many species of apicomplexan parasites, including Plasmodium, Theileria, and Babesia, the mitochondrial genomes are extremely reduced and can contain just three protein-coding genes and only fragmented pieces of the 12S and 16S ribosomal genes (Hikosaka et al. 2010). Although these genes have been widely used for systematics of the malaria parasites (Escalante and Ayala, 1994; Perkins and Schall, 2002; Ricklefs and Fallon, 2002; Martinsen et al. 2008; Perkins, 2008), their use has not easily been extended to other genera within this phylum or to broader systematic study of the great diversity of apicomplexans. Another cautionary note is that introgression (owing to historical hybridization between species), incomplete lineage sorting, and duplication of mitochondrial loci that translocate to the nuclear genome (i.e. “numts”) may produce highly supported but erroneous gene tree estimations (e.g. Zietara et al. 2010; for a general review see Maddison, 1997; Funk and Omland, 2003; Wiens and Penkrot, 2002). As a result, inclusion of both mitochondrial and nuclear loci is strongly encouraged for phylogenetic analyses.
Genomes: goldmines…or not?
Contemporary molecular systematic analyses now strive to incorporate a large number of loci, particularly those encoded by the nuclear genome. Finding nuclear markers other than the ribosomal RNA genes has not been easy for most parasite groups. The complete genomes of several species of both unicellular and multicellular parasites have now been sequenced and so might be thought of as a practically limitless resource for new molecular markers, as these resources can facilitate the discovery of potential primer sites. However, in the vast majority of cases, the species for which genomic data exist represent a small proportion of the diversity of the group (such as in Plasmodium), or may be unusual species for the group (as with Schistosoma mansoni), and in all cases, because of their important roles in disease, have been under intense selective pressures due to the use of therapeutics and remedial treatments. For parasite groups that are not of major medical or veterinary importance (e.g. Oxyurida), genome data are even scarcer. As a result, in many cases, whole genome resources have not been the goldmine that they were hoped to be for workers interested in the broad systematic study of parasite groups. As genome sequencing technologies advance and the cost of obtaining these types of data decreases, better taxonomic sampling of parasites will certainly become possible. The problem of obtaining a sufficient quantity of parasite genetic material, however, may remain the key limiting factor.
OTHER ISSUES WITH MOLECULAR MARKERS IN THE STUDY OF PARASITES
DNA sequences as diagnostic characters
A controversial topic in recent years has been the use of DNA characters in species delimitation, which is sometimes called ‘DNA taxonomy’ or ‘DNA barcoding’ (though the latter term may have nothing to do with taxonomy). Parasites have been poster children for the incorporation of molecular information into species descriptions, partly due to the very issues described above, but also because so many parasites exhibit complex life cycles and/or multiple forms. For example, the species description of a single species of Plasmodium should, ideally, entail morphological observations of the trophozoite, gametocyte, and schizont stages in the blood of the vertebrate host, as well as examination of any exo-erythrocytic stages in the vertebrate and then all stages present in the insect vector. In practice, this has rarely been the case and in most situations would be extremely difficult. The vectors for almost every species of Plasmodium that infect lizard hosts remain unknown (Schall, 1996) and most descriptions have relied solely on the stages present in the blood. However, due to seasonality or intensity levels of the infections observed, very often only gametocytes are likely to be observed. Thus, as we have argued, a combination of morphological and molecular data used in species descriptions is probably the best compromise (Perkins and Austin, 2009). Unique molecular synapomorphies will allow future workers who may find parasites in other host species, or who may be sampling potential vectors, to identify candidate species that are already known. These same principles can also apply to other groups of parasites where linking various life stages that are found in different hosts (e.g. tapeworms, trematodes) or pairing of males and females that show different morphologies is necessary.
These combined approaches can also be very useful when the morphology of a parasite changes drastically during its ontogenetic development. For instance, a study of tetraphyllidean tapeworms infecting marine mammals used a parallel approach where molecular data were paired with morphological observations of larvae (merocercoids), which, because they show progressive degeneration of the apical organ during ontogenetic development, can produce morphologically different specimens depending on when in their development they are sampled (Agustí et al. 2005). In another example, Locke et al. (2010) sequenced barcoding loci for a large number of metacercariae of diplostomid trematodes. These metacercariae cannot be identified to species using morphological criteria, but molecular data allowed them to make species designations and test explicit hypotheses about specificity among their fish hosts.
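The idea of a diagnostic molecular character can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions, not the procedure used in any of the studies cited above: it assumes the sequences are already aligned to equal length and already assigned to species, and the function name `diagnostic_sites`, the specimen labels, and the toy sequences are all hypothetical. A site is reported as diagnostic for a species when every specimen of that species shares one base there and no specimen of any other species carries it.

```python
from collections import defaultdict


def diagnostic_sites(alignment, species_of):
    """Return {species: [(position, base), ...]} for fixed, unique bases.

    alignment  -- dict of specimen name -> aligned sequence (equal lengths)
    species_of -- dict of specimen name -> species label
    Positions are reported 1-based.
    """
    # Group the aligned sequences by the species they were assigned to.
    by_species = defaultdict(list)
    for name, seq in alignment.items():
        by_species[species_of[name]].append(seq)

    length = len(next(iter(alignment.values())))
    result = {sp: [] for sp in by_species}
    for pos in range(length):
        # Set of bases observed in each species at this alignment column.
        states = {sp: {s[pos] for s in seqs} for sp, seqs in by_species.items()}
        for sp, bases in states.items():
            if len(bases) != 1:
                continue  # polymorphic within the species: not diagnostic
            base = next(iter(bases))
            if base in "-N":
                continue  # ignore gaps and ambiguous calls
            # Bases seen at this column in every other species.
            others = set().union(*(b for o, b in states.items() if o != sp))
            if base not in others:
                result[sp].append((pos + 1, base))
    return result


# Toy data (invented for illustration only).
alignment = {
    "A_1": "ACGTACGT",
    "A_2": "ACGTACGT",
    "B_1": "ACGAACGT",
    "B_2": "ACGAACGC",
    "C_1": "TCGTACGT",
}
species_of = {"A_1": "A", "A_2": "A", "B_1": "B", "B_2": "B", "C_1": "C"}

sites = diagnostic_sites(alignment, species_of)
print(sites)
```

In this toy data set, species B has a fixed, unique base at position 4, whereas its variable position 8 is rejected. Species C is represented by a single sequence, so every one of its sites is trivially "fixed"; any diagnostic site reported for it should be treated as provisional until more specimens are sequenced.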
Using molecular data to detect and describe diversity of parasites
It has become commonplace in many studies to screen hosts primarily by means of a molecular-based method (PCR) aimed at picking up minuscule amounts of parasite DNA in relation to host DNA. For the malaria parasites of birds, an infection presenting just one parasite cell per 100 000 host cells is detectable by PCR (Fallon and Ricklefs, 2008). Such molecular technology has allowed for increased sensitivity in the screening of infections, as well as decreased time involved in the screening of large numbers of samples. While these molecular screening methods offer advantages to the study of parasites, they also present their own set of limitations. Numerous studies have demonstrated that certain parasite infections are not detected using PCR diagnostics, likely the result of sequence variation amongst parasite taxa and the inability of a universal primer pair (a short sequence fragment) to align with all parasites within a particular parasite group (Richard et al. 2002). Another limitation of PCR diagnosis is the high occurrence of mixed infections. Within a mixed infection, if a particular parasite exists in greater numbers or has a gene sequence more similar to that of the screening primers, then this parasite will preferentially be amplified, and in most cases, the secondary (or tertiary and so on) infections will not be picked up by PCR. This scenario also holds true for the malaria parasites, which commonly co-occur within their hosts, rendering the PCR diagnostic method unable to determine the true incidence and diversity of parasites within a given host population (Valkiūnas et al. 2006; Szöllősi et al. 2008). Quantitative molecular methods, e.g. qPCR, have been developed for measuring parasitaemia in some systems, but careful controls using ratios of host DNA to parasite DNA are necessary due to inherent variance across samples and extractions (Refardt and Ebert, 2006).
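The primer-mismatch problem described above can be illustrated with a minimal sketch. The Python below is an assumption-laden toy, not a primer-design tool: the primer, the lineage names and the binding-site sequences are invented, and each binding site is assumed to be given in the same orientation and at the same length as the primer. The functions `primer_mismatches` and `three_prime_mismatches` are hypothetical helpers. Mismatches near the 3′ end are counted separately because they are generally considered the most likely to prevent extension by the polymerase.

```python
def primer_mismatches(primer, site):
    """Count mismatched positions between a primer and an equal-length site."""
    if len(primer) != len(site):
        raise ValueError("primer and binding site must have equal length")
    return sum(1 for p, s in zip(primer.upper(), site.upper()) if p != s)


def three_prime_mismatches(primer, site, window=5):
    """Count mismatches within the last `window` bases at the primer's 3' end."""
    return primer_mismatches(primer[-window:], site[-window:])


# Hypothetical 'universal' primer and binding sites from three lineages.
primer = "GGTCAAATGAGTTTCTGG"
binding_sites = {
    "lineage_1": "GGTCAAATGAGTTTCTGG",  # identical to the primer
    "lineage_2": "GGTCAGATGAGTTTCTGG",  # one internal mismatch
    "lineage_3": "GGTCAAATGAGTTTATGA",  # two mismatches near the 3' end
}

# For each lineage: (total mismatches, mismatches in the 3' window).
report = {
    name: (primer_mismatches(primer, site), three_prime_mismatches(primer, site))
    for name, site in binding_sites.items()
}
for name, (total, near_3p) in report.items():
    print(f"{name}: {total} mismatches, {near_3p} near the 3' end")
```

In a mixed infection, a lineage such as the hypothetical `lineage_3` would be expected to amplify poorly relative to `lineage_1`, which is one way a secondary infection can go undetected.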
Characteristic of many parasites is a period during which the parasite enters into a resting or dormant phase within the host organism. The periodicity of such dormancy or subpatency may depend on a number of external factors, including vector transmission opportunities, or on conditions of the host organism itself, such as host immunity. Depending on the method used to detect infection in a host, such as the taking of a blood or faecal sample, the relevant parasite life history stage may or may not be present for any given infection. Barnard and Bair (1986) demonstrated that the malaria parasite stage present in the peripheral blood of birds, that which is seen by light microscopy of blood smears and used to gauge infection status, is highly dependent on the time of year, with this stage only occurring during the warm months when vector hosts are abundant. The same is true for many other parasites. Although infection may last a lifetime, it is only for short periods of time that the parasites are detectable by traditional screening methods.
It is important to add here that relying solely on molecular data to document presence of a parasite can be problematic. Although molecular methods are more sensitive than microscopy or other traditional methods for detecting parasites (Perkins et al. 1998; Aviles et al. 1999; Valkiūnas et al. 2008), they do present the risk of identifying parasites that are not truly infecting the host. For example, vectors may introduce sporozoites of Plasmodium into a host that is not competent for the parasites, and these sporozoites, though they will never establish an infection, can serve as a template for PCR for up to 11 days (Valkiūnas et al. 2009). Sometimes, the parasites do infect blood cells, but cannot complete development, and likewise, these aborted parasites can yield false positives if only molecular screening methods are used (Valkiūnas, 2005).
A COMMON CONCERN: SAMPLING
Morphological and molecular phylogenetic analyses may both suffer profoundly from limited availability of samples. For taxonomic work, it is not uncommon to find in the literature that a species of parasite has been described from a single infected host, and sometimes even from a very small number of specimens. Parasitologists may be plagued by this problem more so than workers who specialize on free-living, larger-bodied organisms. To find parasites, fieldwork frequently entails expensive travel to exotic locales to sample host groups that have not been well studied. Oftentimes, the vertebrate hosts are themselves rare and quite possibly protected, thus limiting the number that can be sampled within any set time period. Furthermore, in some cases, it is impossible to know that parasite taxa have actually been obtained until the field samples have been examined thoroughly back in the lab (Mariaux, 1996). In terms of morphology, these small sample sizes may not permit an adequate examination of the variability present within a species. Thus, if the parasite is encountered again, slight morphological differences may drive investigators to describe it as a novel species. With respect to molecular studies, a small sample size will mean that only a small and very finite quantity of DNA can be obtained and, in terms of analyses, will not allow for the confident determination of molecular synapomorphies, hampering assignment of diagnostic molecular characters.
An important problem in phylogenetic analyses is that of unbalanced taxon sampling, which is also sometimes combined with poor choice of outgroup or inclusion of only a few outgroup taxa. This issue can result in false determination of ancestral versus derived characters, promote long-branch attraction and possibly also result in poor model estimation, depending on the specific data matrix in hand. Even with the issues surrounding paralogy in the rRNA genes of Plasmodium, the example above of the erroneous placement of P. falciparum with avian malaria parasites (Waters et al. 1991) was more likely due to a very small sampling of taxa and the inclusion of a far too-distant outgroup, as Qari et al.'s (1996) analysis of the same gene, but expanded to include slightly more taxa, did not show the same relationships. For obvious reasons, it is not uncommon to have abundant samples of species that are of medical or veterinary importance and to lack species from wild hosts as these species often exist solely as the type specimen or series and nowhere else, such that destructive sampling of the type for molecular studies or electron microscopy is not possible (or might not even yield usable results – see above). For this reason, revisionary systematic work of many parasite groups via the addition of molecular data is hampered if not impossible – likely more so than for free-living organisms.
CONCLUSIONS
Here, we have attempted to give a few examples first of why morphological characters may sometimes be misleading for systematic studies of parasites, but also to couch these criticisms with cautionary tales of scenarios where obtaining and analyzing molecular data may be compromised as well. As stated in the introduction, it is not possible to review this field synthetically and this paper is just one more in a fairly long line of attempts by parasitologists to keep a finger on the pulse of the direction that parasite systematics is headed (cf. Schmidt, 1969; McManus and Bowles, 1996; Monis, 1999; Brooks and Hoberg, 2001). The first of those citations, Schmidt (1969), was an edited volume entitled Problems in Systematics of Parasites. Five parasitologists, each specializing in four different groups, contributed chapters discussing the state of systematics of their groups and outlining some of the problems specific to these taxa. At that time, several developments were promising to change the field: the use of computers, cytogenetics, electron microscopy, numerical taxonomy and the early rudiments of molecular data, which then were mostly limited to protein structures and immunological information. Some of the predictions were rather dire: one cestodologist lamented that it might take 200 years before a new era of taxonomy and systematics would come about (Voge, 1969). Several of the new methods did, in fact, change the field – electron microscopy is now routinely used and molecular data and powerful computers allow for extremely efficient analysis of large data-sets. Other methods (numerical taxonomy) did not change the field, but rather new improvements have arisen in their place.
Then, as now, these scientists bemoaned the need for more students to be trained in taxonomy and wished for a better appreciation of the challenges faced by parasitologists.
So what's a budding parasite systematist to do? Several authors have contributed to the debate over whether morphological or molecular characters, in general, are more important for analyses (e.g. Hillis, 1987; Jenner, 2004; Scotland et al. 2003; Wiens, 2004), but the arguments in favour of a morphological approach have limited applicability to phylogenetics for most parasite groups. A primary argument for a morphological approach is the ability to incorporate fossil data into the analysis; parasites are poorly represented in the fossil record, however. Hillis (1987) cites the relatively low expense of obtaining morphological data, but even this argument may not hold well for parasite taxa where more sophisticated microscopy (i.e. SEM, TEM) may be necessary. In these cases, a molecular approach may be more economical. Neither of these comparisons has taken into account the cost of labour, in the sense of the amount of time needed to obtain, prepare and analyze the data. We feel strongly, however, that a combination of both methods should continue to be part of parasite systematics. We advocate that molecular data are more powerful for the phylogenetic aspects of parasite systematics and that as these methods improve, this will only become more true. Even so, we should not ignore the importance of properly vouchered specimens and the need for diagnostic autapomorphies in species descriptions. Ultimately, the success and reliability of parasite systematics depend on trained workers who are familiar with the organisms that they study, and that intimate knowledge must include an appreciation and careful study of morphology. (Perhaps the large degree to which molecular phylogenetic studies of parasites have mirrored the previously held relationships based on morphology is testimony to the dedication and excellent training that has graced the parasitology community!)
The vast quantity of information that can be found in molecular data should be taken advantage of whenever possible and it likely will not be long before genomics is commonplace in parasite systematics. Although written almost 50 years ago, the following quote by Crites (1962) rings just as true today: “Each specialist owes a debt to past work in his field of endeavour, but he also has an obligation to the future. He has an obligation to make use of the data derived from modern methods.” Surely, more surprises await us, but so do more reliable evolutionary histories.