INTRODUCTION
Nosema spp. are obligate intracellular parasites belonging to Microsporidia, a cosmopolitan phylum derived as a basic branch of the fungi (Keeling and Fast, Reference Keeling and Fast2002; Capella-Gutiérrez et al. Reference Capella-Gutiérrez, Marcet-Houben and Gabaldón2012). Microsporidia are characterized by the production of resistant spores carrying a polar tube, used to transfer the infectious sporal content into target cells, and by their 70S ribosomes with bacterial-sized rRNAs. The genus Nosema comprises more than 150 species that mainly infect invertebrate hosts (Becnel and Andreadis, Reference Becnel, Andreadis, Wittner and Weiss1999). The western honeybee (Apis mellifera) is parasitized by the long-known Nosema apis (Zander, Reference Zander1909) and by the emerging Nosema ceranae. Both are the etiologic agents of nosemosis, one of the most widespread diseases of the honeybee. Nosema ceranae is thought to have jumped host from the Asian (Apis cerana) to the Western honeybee some decades ago and has now become the predominant microsporidian species infecting A. mellifera (Klee et al. Reference Klee, Besana, Genersch, Gisder, Nanetti, Tam, Chinh, Puerta, Ruz, Kryger, Message, Hatjina, Korpela, Fries and Paxton2007; Chen et al. Reference Chen, Evans, Smith and Pettis2008; Martin-Hernandez et al. Reference Martin-Hernandez, Botias, Bailon, Martinez-Salvador, Prieto, Meana and Higes2012). Nosema ceranae has also been reported in other oriental Apis and Bombus species (Plischuk et al. Reference Plischuk, Martín-Hernández, Prieto, Lucía, Botías, Meana, Abrahamovich, Lange and Higes2009; Chaimanee et al. Reference Chaimanee, Warrit and Chantawannakul2010, Reference Chaimanee, Chen, Pettis, Scott Cornman and Chantawannakul2011; Li et al. Reference Li, Chen, Wu, Peng, An, Schmid-Hempel and Schmid-Hempel2012).
The documented A. mellifera colony losses observed for the last 15 years has become an alarming issue for the beekeeping and agricultural fields as well as for conservation biology, especially due to the pollinating activity of the honeybees (Potts et al. Reference Potts, Biesmeijer, Kremen, Neumann, Schweiger and Kunin2010). Although the origin of the honeybee decline has not been clearly identified and is thought to be multicausal, some studies have suggested a role of N. ceranae (Martin-Hernandez et al. Reference Martin-Hernandez, Meana, Prieto, Salvador, Garrido-Bailon and Higes2007; Higes et al. Reference Higes, Martin-Hernandez, Botias, Bailon, Gonzalez-Porto, Barrios, Del Nozal, Bernal, Jimenez, Palencia and Meana2008, Reference Higes, Martín-Hernández and Meana2010). Following the ingestion of spores, the parasites invade and develop within the cytoplasm of the epithelial cells of the adult honeybee midgut. N. ceranae is considered as a major threat to the Western honeybee at both the individual and colony levels (Fries, Reference Fries2010; Higes et al. Reference Higes, Meana, Bartolomé, Botias and Martin-Hernandez2013). It would increase the honeybee mortality, activate the degeneration of the infected cells, induce an energetic stress and inhibit transcripts involved in immunity (Antùnez et al. Reference Antùnez, Martin-Hernandez, Prieto, Meana, Zunino and Higes2009; Martin-Hernandez et al. Reference Martin-Hernandez, Botias, Barrios, Martinez-Salvador, Meana, Mayack and Higes2011; Dussaubat et al. Reference Dussaubat, Brunet, Higes, Colbourne, Lopez, Choi, Martín-Hernández, Botías, Cousin, McDonnell, Bonnet, Belzunces, Moritz, Le Conte and Alaux2012; Chaimanee et al. Reference Chaimanee, Chantawannakul, Chen, Evans and Pettis2012). However the presence of N. ceranae is not systematically associated with honeybee weakening and mortality (Cox-Foster et al. Reference Cox-Foster, Conlan, Holmes, Palacios, Evans, Moran, Quan, Briese, Hornig, Geiser, Martinson, vanEngelsdorp, Kalkstein, Drysdale, Hui, Zhai, Cui, Hutchison, Simons, Egholm, Pettis and Lipkin2007; Invernizzi et al. Reference Invernizzi, Abud, Tomasco, Harriet, Ramallo, Campa, Katz, Gardiol and Mendoza2009; Gisder et al. Reference Gisder, Hedtke, Mockel, Frielitz, Linde and Genersch2010), suggesting modulations in the parasite virulence. Thus, not only the role of N. ceranae in the honeybee decline but also the origin of the variation in its virulence remains elusive. Some authors have suggested that such variation could be related to a polymorphism between N. ceranae isolates (Fries, Reference Fries2010; Higes et al. Reference Higes, Meana, Bartolomé, Botias and Martin-Hernandez2013), meaning that different parasite variants may infect the honeybee, some of them inducing a higher virulence.
Although the genome of the BRL01 strain of N. ceranae has been sequenced (Cornman et al. Reference Cornman, Chen, Schatz, Street, Zhao, Desany, Egholm, Hutchison, Pettis, Lipkin and Evans2009), data are lacking regarding the comparison of the gene content between N. ceranae isolates and the assessment of N. ceranae genetic polymorphism sensu stricto, i.e. its genetically determined pathogenicity. Several studies have sought to evaluate the parasite genetic polymorphism sensu lato, i.e. the nucleotide variation in sequences that appeared relevant for the genotyping of strains in other microsporidia, but failed to discriminate between N. ceranae isolates. The nucleotide diversity of the ribosomal RNA coding region (rDNA) successfully discriminated between isolates of the mammals' microsporidian parasites Encephalitozoon cuniculi, Encephalitozoon hellem and Enterocytozoon bieneusi (Didier et al. Reference Didier, Vossbrinck, Baker, Rogers, Bertucci and Shadduck1995; Mathis et al. Reference Mathis, Tanner, Weber and Deplazes1999; Haro et al. Reference Haro, Del Aguila, Fenoy and Henriques-Gil2003; ten Hove et al. Reference ten Hove, Van Lieshout, Beadsworth, Perez, Spee, Claas and Verweij2009) but not of N. ceranae sampled in A. mellifera (Huang et al. Reference Huang, Bocquet, Lee, Sung, Jiang, Chen and Wang2008; Chen et al. Reference Chen, Evans, Zhou, Boncristiani, Kimura, Xiao, Litkowski and Pettis2009a; Sagastume et al. Reference Sagastume, Del Aguila, Martin-Hernandez, Higes and Henriques-Gil2011). The rDNA diversity observed in N. ceranae and the bumblebees' parasite Nosema bombi (Tay et al. Reference Tay, O'Mahony and Paxton2005; O'Mahony et al. Reference O'Mahony, Tay and Paxton2007; Sagastume et al. Reference Sagastume, Del Aguila, Martin-Hernandez, Higes and Henriques-Gil2011) have suggested that rDNA may escape the concerted evolution maintaining sequence identity between the multiple gene copies present within Nosema genomes and would consequently be inappropriate to phylogenetic analyses (Ironside, Reference Ironside2007; Fries, Reference Fries2010). The nucleotide diversity of virulence-related genes encoding Polar Tube Proteins (PTPs) and Spore Wall Proteins (SWPs) has allowed determining variants of E. cuniculi and E. hellem (Peuvel et al. Reference Peuvel, Delbac, Metenier, Peyret and Vivares2000; Haro et al. Reference Haro, Del Aguila, Fenoy and Henriques-Gil2003; Polonais et al. Reference Polonais, Mazet, Wawrzyniak, Texier, Blot, El Alaoui and Delbac2010). However, neither PTP1 nor PTP3 encoding genes permitted the distinction of N. ceranae isolates (Chaimanee et al. Reference Chaimanee, Chen, Pettis, Scott Cornman and Chantawannakul2011; Hatjina et al. Reference Hatjina, Tsoktouridis, Bouga, Charistos, Evangelou, Avtzis, Meeus, Brunain, Smagghe and de Graaf2011). Since the polymorphism of rDNA and PTP3 genes seems to be a major obstacle for genotyping (Hatjina et al. Reference Hatjina, Tsoktouridis, Bouga, Charistos, Evangelou, Avtzis, Meeus, Brunain, Smagghe and de Graaf2011; Sagastume et al. Reference Sagastume, Del Aguila, Martin-Hernandez, Higes and Henriques-Gil2011) one may wonder if the use of several independent gene markers may resolve parasite taxa.
In order to assess whether N. ceranae isolates can be discriminated through a multilocus approach, we compared the nucleotide and haplotype diversity of ten N. ceranae genes, including the small-subunit ribosomal RNA (SSU-rDNA) gene and nine protein encoding genes, that appeared both within and between parasite populations isolated from four geographically distant honeybees.
MATERIALS AND METHODS
Honeybee sampling
Apis mellifera individuals were collected in France, Morocco, Lebanon and Thailand in 2010 and 2011. The presence of microsporidian parasites has been verified by both the microscopic observation of spores and the amplification of the microsporidian SSU-rDNA using the Nc_D1 (CGACGATGTGATATGAAAATATTAA) and Nc_R1 (TCATTCTCAAACAAAAAACCGTTC) primers. The absence of N. apis was checked using the Na_D1 (GCATGTCTTTGACGTACTATGTAC) and Na_R1 (CGTTTAAAATGTGAAACAACTATG) primers. All primers derived from Martin-Hernandez et al. (Reference Martin-Hernandez, Meana, Prieto, Salvador, Garrido-Bailon and Higes2007). The isolate concept was defined here as the parasite population infecting a single individual, thus only one bee from each location was used for subsequent cloning.
Genetic markers amplification and cloning
Ten genetic markers have been amplified by PCR (Platinum Taq Polymerase High Fidelity, Invitrogen) in a mixture containing 2 mm MgSO4 and 0·4 μ m of each primer (Table 1) in a total volume of 25 μL, and performing annealing steps at 52 °C. PCR products were inserted into the pCRII-TOPO cloning vector (TOPO TA Cloning Kit Dual Promoter, Invitrogen). After transformation (NEB 5-alpha cells, New England Biolabs), several clones were cultured and their plasmids content extracted. Inserted fragments were sequenced (GATC Biotech, Germany) using both the T7 and M13 Reverse flanking primers. To check the absence of experimentally induced sequence variation, a control has been performed using a clonal plasmid preparation carrying the SWP30 marker as an initial PCR template.
Table 1. Marker genes and primers used for genetic variability (a) and RT-PCR (b) analyses
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160712013827-54755-mediumThumb-S0031182013001133_tab1.jpg?pub-status=live)
Statistical evolutionary analyses
To determine variable sites, nucleotide sequences were aligned using ClustalW under the BioEdit editor (Hall, Reference Hall1999), selecting the amplified fragments only, i.e. removing primer sequences. All polymorphic sites were visually checked for quality on electropherograms. Alignments with insertions-deletions (indels) of tandem repeats were manually improved for better parsimony. Diversity, transition to transversion and d N:dS analyses were conducted in MEGA5, removing ambiguous positions in pairwise comparisons (pairwise deletion) allowing to consider indels as polymorphic characters (Nei and Kumar, Reference Nei and Kumar2000; Tamura et al. Reference Tamura, Peterson, Peterson, Stecher, Nei and Kumar2011). The nucleotide diversity has been given by averaging the pairwise p-distances between sequences, i.e. the mean number of base differences per site, for the whole allelic population (π) and for populations within (π w) or between (π b) honeybee isolates. To measure how isolates subpopulations differed, the nucleotide diversity explained by population structure and the magnitude of gene differentiation among subpopulations were given by the fixation index (F ST) and the coefficient of evolutionary differentiation (G ST), respectively. Standard error estimates were obtained by a bootstrap procedure with 10 000 replicates. The overall transition/transversion bias (R s/v) was computed from the Maximum Composite Likelihood estimate of the pattern of nucleotide substitution. The codon-based one-tailed test of selection has been performed by calculating the probability of rejecting the null hypothesis of strict-neutrality (d N = d S) in favour of the alternative hypotheses, d N<dS or d N>dS for purifying or positive selection respectively, using the Li–Wu–Luo method. Statistical significance (P-values <0·05 were considered significant) was computed by a 10 000-replicates bootstrap resampling procedure. All tests were in favour of d N<dS rather than d N>dS.
Other selection neutrality tests, linkage disequilibrium (LD) and recombination analyses were conducted in DnaSP (Librado and Rozas, Reference Librado and Rozas2009; Rozas, Reference Rozas2009) removing all positions containing gaps and missing data (complete deletion). Divergence from the null hypothesis, i.e. from the hypothesis that all mutation are either neutral or strongly deleterious, was estimated computing the Tajima's D (Tajima, Reference Tajima1989) and Ramos-Onsins & Rozas's R 2 (Ramos-Onsins and Rozas, Reference Ramos-Onsins and Rozas2002), based on the nucleotide polymorphism, and the Fu's F s (Fu, Reference Fu1997), based on the haplotype distribution. The LD, i.e. the non-random association of alleles, was estimated by calculating ZZ, which equals the average LD between adjacent polymorphic sites minus the standardized average intragenic LD (Rozas et al. Reference Rozas, Gullaud, Blandin and Aguade2001). Statistical significance (P-values <0·05 were considered significant) was given by the probability of obtaining under the null hypothesis absolute values equal or higher than the observed ones. It was computed by performing coalescent simulation with 10 000 replicates using fixed theta per sequence values and no recombination, no strong difference being observed using a recombination parameter. All significant values were out of the confidence interval comprising 95% of the expected values under the null hypothesis (data not shown), i.e. they were very unlikely under the standard neutral model. The number of pairs of polymorphic sites that exhibit statistically significant LD (P<0·05) performing a two-tailed Fisher's exact test, with and without Bonferroni correction against multiple comparisons biases, is given together with the total amount of considered pairwise comparisons.
Phylogenetic trees were computed under MEGA 5, using the Maximum Likelihood algorithm, with a 10 000-replicates bootstrap procedure, a nucleotidic substitution model using the Tamura 3-parameters method and a Nearest-Neighbour-Interchange heuristic method.
Sequence prediction and modelling
Prediction of potentially secreted proteins was based on the screening for potential N-terminal secretory signal sequences using SignalP v3.0 (Bendtsen et al. Reference Bendtsen, Nielsen, von Heijne and Brunak2004). Peptide repeated motifs in EnP1B and NCER_101600 were manually extracted and schematized using the WebLogo tool (Crooks et al. Reference Crooks, Hon, Chandonia and Brenner2004).
RNA extraction and RT-PCR
Total RNA was extracted from the midgut of one naturally infected honeybee in France (SV total RNA Isolation System Kit, Promega). cDNA was synthesized (ImProm-II™ Reverse Transcription System, Promega) using an oligo-d(T) primer (Invitrogen) and 100 ng of RNA. Subsequent PCR amplifications were performed (GoTaq® Flexi DNA Polymerase, Promega) using the RT-PCR primers listed in Table 1. Negative and positive controls were performed using reactions without the reverse transcriptase and N. ceranae genomic DNA as a template respectively. All controls validated the experiment.
RESULTS
Selection of N. ceranae marker genes
Ten marker genes have been selected within the N. ceranae BRL01 genome (Cornman et al. Reference Cornman, Chen, Schatz, Street, Zhao, Desany, Egholm, Hutchison, Pettis, Lipkin and Evans2009), including the nearly complete rRNA small subunit encoding gene. Four spore wall and polar tube protein encoding genes have been identified and named according to their closest known homologues in Nosema bombycis (Table 1). They encode the polar tube protein PTP2, the spore wall proteins SWP25 and SWP30, and the hypothetical spore wall protein HSWP4 (Wu et al. Reference Wu, Li, Pan, Tan, Hu, Zhou and Xiang2008, Reference Wu, Li, Pan, Zhou and Xiang2009). Except for its first 32 codons, the NCER_100768 gene encodes a protein homologous to the spore wall protein EnP1 of Encephalitozoon cuniculi (Peuvel-Fanget et al. Reference Peuvel-Fanget, Polonais, Brosson, Texier, Kuhn, Peyret, Vivarès and Delbac2006). We propose to shift the start codon from the suggested GTG to the first upstream ATG codon and to name the resulting open reading frame EnP1B, as another gene (NCER_101082) with higher homology with E. cuniculi EnP1 is already present in N. ceranae genome. Lastly, four genes encoding hypothetical proteins carrying tandem peptide repeats and potential N-terminus secretion signal were selected as potential surface and/or structural proteins that could show high variability. RT–PCR experiments (data not shown) proved that all the chosen open reading frames are transcribed in a naturally N. ceranae-infected honeybee, demonstrating that they are true protein-encoding genes that ought to be subjected to evolutionary processes.
Within- vs between-isolate distance of N. ceranae marker genes
Following amplification and cloning, 8–15 clone sequences were obtained for each marker and each isolate and have been deposited in GenBank (Accession Numbers KC680230 to KC680656). The experimental control performed using a clonal matrix did not show any sequence variation in the 15 obtained clones, showing that no error-prone bias was brought throughout the procedure, i.e. any sequence diversity observed ought to depict true allelic variability. Whatever the tested marker, a surprisingly high allele content – mostly due to singletons – has been observed, even for populations isolated from single host individuals (Table 2). In order to compare the genetic variability of the markers, pairwise p-distances between sequences have been averaged to the nucleotide diversities present in the whole allelic population (π), within a single honeybee (π w) and between geographically distant honeybees (π b, Table 2 and Fig. 1). For each marker, the π w and π b were very similar, showing that there is as much variability within a single bee as between geographically distant bees. Even p-distance maxima were very similar within and between isolates (Fig. 1), showing that the most divergent sequences can be found within a single parasitized individual. Moreover, whatever the marker, the minimum p-distances were null both within and between isolates, showing that there are always identical alleles within an isolate but also between distant isolates. Altogether, these data suggest that the parasite populations infecting honeybees were similar in the four locations studied. Indeed, the F ST and G ST indices for population differentiation between isolates were low (Table 2), indicating that the parasite populations from distinct isolates are poorly divergent.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160712013827-91194-mediumThumb-S0031182013001133_fig1g.jpg?pub-status=live)
Fig. 1. Nucleotide diversity of ten N. ceranae genes. The nucleotide diversity is represented as the mean p-distance (i.e. the number of base differences per site) between sequences within an isolate (π w, circles) and between isolates (π b, squares). Standard error estimates obtained by a 10 000-replicates bootstrap procedure are shown by boxes. Lines represent the complete range of pairwise p-distances, showing the minimum and maximum obtained values.
Table 2. Statistics of nucleotide polymorphism, divergence, linkage disequilibrium and recombination in ten gene regions of N. ceranae
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160712013827-09108-mediumThumb-S0031182013001133_tab2.jpg?pub-status=live)
The nucleotide diversity appeared variable between markers, with π ranging from 0·001 to 0·018 for HSWP4 and EnP1B respectively (Fig. 1). However, these values are still too low, and especially much lower than their corresponding π w, to differentiate the isolate populations. Indeed phylogenetic analyses failed to separate isolate-specific taxa: whatever the chosen marker and the evolutionary model, trees had star-like topologies with poorly supported nodes by bootstrap procedures (Fig. 2). Altogether, these data suggested that the markers high nucleotide diversity but low divergence between isolates preclude any multilocus approach for the genotyping of N. ceranae strains.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160712013827-61111-mediumThumb-S0031182013001133_fig2g.jpg?pub-status=live)
Fig. 2. Phylogenetic analysis of the N. ceranae small ribosomal RNA subunit (a) and hypothetical spore wall protein EnP1B (b) encoding genes from four geographically distant locations. The evolutionary history analysis was inferred by the maximum likelihood method using the multiple sequences obtained from four individuals sampled in France, Lebanon, Morocco and Thailand (indicated by triangle, star, square and circle respectively). The first tree was rooted using the close Nosema bombi SSU sequence as an outgroup while the second tree was unrooted since no close homologue of EnP1B was found in databases. The percentage of trees in which the associated taxa clustered together, inferred by a 10 000-replicates bootstrap analysis, is shown next to the branches.
Nature of polymorphic characters
The high transition to transversion bias observed for all markers suggests that they are submitted to evolutionary processes including selection. Indeed all markers exhibited rejection of at least one neutrality test (shown in bold in Table 2). For most protein encoding genes, the codon-based one-tailed test of selection significantly rejected the null hypothesis in favour of the purifying hypothesis (d N<dS). However, four markers also exhibited nonsense mutations, leading to the encoding of non-functional proteins, suggesting that the markers may be submitted to complex evolutionary processes. EnP1B and NCER_101600 showed large indel events corresponding to discrete numbers of base triplets thus not inducing frameshift. For EnP1B, indels were linked to a difference in the repetition of a 7-amino acid motif (Figs 3 and 4). A more complex pattern could be highlighted for the indels observed in NCER_101600, which contained two close but distinct 14- or 16-amino acid repeated motifs (Fig. 4). These indels, up to 444 base pairs (bp), generated a large variation in the length of NCER_101600 amplified products. Although they were not present in all isolates in our dataset, the dominant amplification products at 573, 801 and 951 bp were visible by agarose gel electrophoresis for all isolates (data not shown).
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160712013827-57619-mediumThumb-S0031182013001133_fig3g.jpg?pub-status=live)
Fig. 3. Alignment of the partial EnP1B protein sequences inferred by the sequencing of the corresponding gene in single individuals sampled in France, Lebanon, Morocco and Thailand (as indicated by their initials on the left). The consensus sequence is indicated in the first row. Below, lines stand for identical sequences, letters for amino acid substitutions, dots for synonymous nucleotide substitutions (i.e. without change of the corresponding amino acid), blank areas for deletions. The star indicates a nonsense substitution (i.e. inducing a stop codon). Repeated motifs are separated by columns and have been numbered with Latin letters on their top.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160712013827-57737-mediumThumb-S0031182013001133_fig4g.jpg?pub-status=live)
Fig. 4. Logo of the repeated motifs of EnP1B (a) and NCER_101600 (b) inferred by aligning the 264 and 678 respective peptide sequences extracted from our data. The height of each letter is proportional to the observed frequency of the corresponding amino acid and the overall height of each stack of letters is proportional to the sequence conservation at that position (with 4·32 bits as a maximum sequence conservation).
Among the segregating sites observed in the SSU-rDNA in the present and other deposited data, most were single nucleotide polymorphisms (SNPs), 70·8 and 16·7% being transitions and transversions respectively, but 12·5% of the sites showed indel events. The variable sites have been located on a proposed secondary structure model of the corresponding RNA subunit (Supplementary Fig. S1 – in Online version only). Helices were less submitted to variability than non-helix domains and most of the mutations in helices did not strongly modify the base pairing of the structure, i.e. changing a canonical (or non-canonical) bond to another one, suggesting non-random mutagenic events.
DISCUSSION
The variability of N. ceranae gene sequences
Since one gene itself does not allow assessing the diversity of N. ceranae parasites in A. mellifera, we tested whether a multilocus approach could discriminate between populations of the parasite. This work showed that every gene presents unexpected high nucleotide diversity within a single host individual (π w) but a poor genetic variation between parasite populations isolated from geographically distant honeybees (F ST and G ST, Table 2 and Fig. 1). Thus, while N. ceranae showed high sequence variability, the later did not allow any discrimination between parasite variants, even using concomitantly several genetic markers, implying that any phylogenetic assertions on N. ceranae must be considered with care.
For all the tested markers, most of the alleles observed were singletons (Table 2). This low redundancy suggests that the true allelic richness was certainly underestimated. For instance not all indels of the NCER_101600 marker have been cloned and sequenced in every isolate whereas corresponding bands had been observed in agarose gel electrophoresis. Thus variations in nucleotide diversity between genes (Fig. 1) may be due to their intrinsic variability and allele selection as well as to a lack of data. Concerning the nucleotide diversity of the SSU-rDNA, although it seems variable in microsporidia, similar levels have been observed in the rDNA regions of N. bombycis, Nosema granulosis and Vairimorpha cheracis (π ranging from 0·004 to 0·009; Ironside, Reference Ironside2013) and a heterogeneity in rDNA sequences had also been observed in N. apis and N. bombi (Gatehouse and Malone, Reference Gatehouse and Malone1998; Tay et al. Reference Tay, O'Mahony and Paxton2005; O'Mahony et al. Reference O'Mahony, Tay and Paxton2007), suggesting that high nucleotide diversity might be common in the Nosema/Vairimorpha group.
A lack of N. ceranae population divergence between geographically distinct A. mellifera hosts
The low F ST and G ST indices (Table 2) indicated that no differentiation between N. ceranae populations could be assessed upon the geographic origin of their A. mellifera host. Such lack of divergence had already been observed for the rDNA marker and the genes encoding the polar tube proteins PTP1 and PTP3 (Huang et al. Reference Huang, Bocquet, Lee, Sung, Jiang, Chen and Wang2008; Chen et al. Reference Chen, Evans, Zhou, Boncristiani, Kimura, Xiao, Litkowski and Pettis2009a; Chaimanee et al. Reference Chaimanee, Chen, Pettis, Scott Cornman and Chantawannakul2011; Hatjina et al. Reference Hatjina, Tsoktouridis, Bouga, Charistos, Evangelou, Avtzis, Meeus, Brunain, Smagghe and de Graaf2011; Sagastume et al. Reference Sagastume, Del Aguila, Martin-Hernandez, Higes and Henriques-Gil2011). Our data inferred that this can certainly be extrapolated to any coding genes of N. ceranae although there are no data based on selection-free silent sites and microsatellites to enlarge this to the whole genome sequence. Moreover, our results clearly confirmed the previously suggested hypothesis that geographically distant honeybees may be infected by similar parasite populations (Huang et al. Reference Huang, Bocquet, Lee, Sung, Jiang, Chen and Wang2008, Chen et al. Reference Chen, Evans, Zhou, Boncristiani, Kimura, Xiao, Litkowski and Pettis2009a, Sagastume et al. Reference Sagastume, Del Aguila, Martin-Hernandez, Higes and Henriques-Gil2011).
The significant R 2 and the significantly negative Tajima's D and Fu's F s statistics (Table 2) are characteristic of a population expansion (Ramirez-Soriano et al. Reference Ramirez-Soriano, Ramos-Onsins, Rozas, Calafell and Navarro2008). This is particularly supported by the almost always significantly negative F s (for all markers but NCER_101600) that is thought to be a more sensitive indicator of population expansion than D. The very weak population divergence observed through F ST and G ST (Table 2) and the star-like phylogenetic dendrograms (Fig. 2) suggest that such population expansion would have emerged from a poorly diverse initial population. It has been shown that the nucleotide diversity of the PTP1 gene allowed separating specific taxa of N. ceranae in Apis florae and in Apis dorsata, but no distinction could be found amongst parasites isolated from Apis cerana and A. mellifera (Chaimanee et al. Reference Chaimanee, Chen, Pettis, Scott Cornman and Chantawannakul2011). Thus parasite populations may yet be discriminated relative to their host niche but a very similar population infects the Asian and the European honeybee species. Altogether, these findings support the hypothesis that N. ceranae experienced a recent host-jump from the Asian honeybee A. ceranae to the Western honeybee A. mellifera followed by a rapid spreading in its new host niche (Paxton et al. Reference Paxton, Klee, Korpela and Fries2007; Klee et al. Reference Klee, Besana, Genersch, Gisder, Nanetti, Tam, Chinh, Puerta, Ruz, Kryger, Message, Hatjina, Korpela, Fries and Paxton2007; Chen et al. Reference Chen, Evans, Smith and Pettis2008; Botias et al. Reference Botias, Anderson, Meana, Garrido-Bailon, Martin-Hernandez and Higes2012) that may have been favoured by sustained intercontinental honeybee exchanges (Klee et al. Reference Klee, Besana, Genersch, Gisder, Nanetti, Tam, Chinh, Puerta, Ruz, Kryger, Message, Hatjina, Korpela, Fries and Paxton2007; Mutinelli, Reference Mutinelli2011). Such a host-jump may have constituted a bottleneck-like event, with a strong genetic drift followed by a rapid increase in population size. Such evolutionary history would also be reflected by negative D and F s values and significant R 2 (Fay and Wu, Reference Fay and Wu1999; Ramirez-Soriano et al. Reference Ramirez-Soriano, Ramos-Onsins, Rozas, Calafell and Navarro2008) and by the short and weakly supported branching of the markers’ evolutionary models (Fig. 2). However if most lineages did not survive the bottleneck event, a long branch would be expected between the taxonomic units related to the A. cerana and the A. mellifera hosts. This was not observed by Chaimanee et al. (Reference Chaimanee, Chen, Pettis, Scott Cornman and Chantawannakul2011), suggesting that the parasite populations infecting the former and new hosts may not have been completely isolated yet.
Origin and maintenance of the N. ceranae genetic variation
Mixed infection of one host by several parasite genotypes may affect the host–parasite interactions with benefits to the parasite infectivity and development, as shown for the microsporidian Octosporea bayeri in Daphnia magna (Vizoso and Ebert, Reference Vizoso and Ebert2005). However, in agreement with Sagastume et al. (Reference Sagastume, Del Aguila, Martin-Hernandez, Higes and Henriques-Gil2011), it is hardly conceivable that geographically distant honeybees are systematically infected by a similar non-clonal population of parasites, and that this genotype diversity is maintained throughout the parasite development. Thus the high N. ceranae haplotype diversity or expected heterozygosity Hd (Table 2), with many haplotypes per isolates, would rather be linked to a combination of non-haploid stages, recombination events and/or gene exchange.
The rDNA is present in multiple loci within N. ceranae genome (Cornman et al. Reference Cornman, Chen, Schatz, Street, Zhao, Desany, Egholm, Hutchison, Pettis, Lipkin and Evans2009) and its nucleotide diversity could be related to an intragenomic polymorphism of these gene copies, as observed within a single spore of N. bombi (O'Mahony et al. Reference O'Mahony, Tay and Paxton2007). As the SSU-rDNA marker did not show stronger diversity than protein encoding markers (Fig. 1, Table 2), one may wonder whether all genes would be present in multiple polymorphic copies, questioning the polyploidy of the parasite. The observation of frameshift mutations, due to nonsense SNPs or 1-bp deletions, supports this hypothesis, as it would allow the accumulation of mutations in inactivated genes, leading to pseudogenization while maintaining functional allele copies under selective pressure. The observed Hd is not significantly different to what would be expected for a diploid organism (Table 2). Nosema ceranae is a dikaryotic single cell organism (Chen et al. Reference Chen, Evans, Murphy, Gutell, Zuker, Gundensen-Rindal and Pettis2009b) and, although the ploidy of the two nuclei is not known, N. ceranae could be considered to be at least diploid.
High heterozygosity with low F ST could be interpreted as a clonal and asexual life cycle, with transient stages of di- or polyploidy, resulting in the same allelic population in isolates and leading to incorrect phylogenetic topology assertions (Birky, Reference Birky1996; Haag et al. Reference Haag, Traunecker and Ebert2013). However a clonal mode of reproduction should lead to strong LD whereas the intragenic recombination given as ZZ values was never significant (Table 2). Yet the R m values and the detection of polymorphic site pairs with significant LD suggest that recombinations did occur between alleles. Through meiosis, sex is a source of recombination and gene exchange, but it would have been lost independently in distinct microsporidian clades (Ironside, Reference Ironside2007). Despite the lack of evidence for sexual reproduction in N. ceranae, its genome contains sex-related loci and genes related to the meiotic recombination machinery (Lee et al. Reference Lee, Corradi, Doan, Dietrich, Keeling and Heitman2010a, Reference Lee, Ni, Li, Shertz and Heitmanb) and a potential sexual haplo-diploid cycle of the parasite has been hypothesized (Sagastume et al. Reference Sagastume, Del Aguila, Martin-Hernandez, Higes and Henriques-Gil2011). Only deep molecular studies of clonal cultured populations of N. ceranae could resolve its degree of ploidy, recombination and sexual/asexual cycle.
EnP1B and NCER_101600, which are the two markers showing numerous indels of repeated motifs (Fig. 4), were the most polymorphic markers (π), with the highest isolate differentiation (F ST), and showed more polymorphic site pairs with significant LD and higher R m, i.e. more recombination events are required to explain the observed haplotype diversity (Table 2). These two genes seem to have experienced slightly independent evolutionary history that could be related to their potential involvement in pathogenesis. Indeed the E. cuniculi EnP1 contains repeated peptides related to heparin-binding motifs and is involved in the parasite adhesion to the host cell surface in vitro (Southern et al. Reference Southern, Jolly, Lester and Hayman2007). Polymorphism is a common feature of genes involved in the gene-for-gene relationships between a parasite and its host. Some models have suggested that such polymorphism could be stabilized in parasite populations associated with polycyclic diseases (Tellier and Brown, Reference Tellier and Brown2007), i.e. when several parasite generations develop in a host through autoinfection, which is thought to be the case for microsporidia (Cali and Takvorian, Reference Cali, Takvorian, Wittner and Weiss1999). NCER_101600 may well be involved in a gene-for-gene host–parasite interaction.
CONCLUSION
Taken as a phenotypic variation, differences in a parasite virulence would reflect a non-exclusive combination of its (1) genetic polymorphism sensu stricto, i.e. its genetically determined virulence strategies adapted to certain environmental conditions – including the host niche – and (2) polyphenism, i.e. the expression of alternative virulence strategies depending on environmental cues (Pfennig, Reference Pfennig2001; Schwander and Leimar, Reference Schwander and Leimar2011). The present work showed that despite a relatively high nucleotide diversity, a similar parasite population is found within geographically distant honeybee isolates. Thus the virulence of N. ceranae should rather be related to an alteration of the host–parasite interactions in response to environmental cues, including environmental stressors (Bromenshenk et al. Reference Bromenshenk, Henderson, Wick, Stanford, Zulich, Jabbour, Deshpande, McCubbin, Seccomb, Welch, Williams, Firth, Skowronski, Lehmann, Bilimoria, Gress, Wanner and Cramer2010; Gisder et al. Reference Gisder, Hedtke, Mockel, Frielitz, Linde and Genersch2010) and the honeybee genetic background (Fontbonne et al. Reference Fontbonne, Garnery, Vidau, Aufauvre, Texier, Tchamitchian, El Alaoui, Brunet, Delbac and Biron2013), than to its polymorphism.
Lastly, if the present work showed that N. ceranae populations are poorly divergent in distant isolates, it questions the origin and maintenance of a strong haplotype diversity within a single honeybee host. Our data support the hypothesis of a recent host jump of N. ceranae from the Asian to the European honeybee that would be still followed by a population expansion – and possibly further radiation – in its new host niche, without clear isolation from the A. cerana-infecting parasites. Altogether, these findings suggest that N. ceranae may constitute a valuable model to monitor the evolution of an emerging parasite.
ACKNOWLEDGEMENTS
We are grateful to J. Vandamme, V. Chaimanee and P. Chantawannakul for providing us with infected honeybees.
FINANCIAL SUPPORT
This work was funded by a grant from the Centre National de la Recherche Scientifique (CNRS, EC2CO: Ecosphère continentale et côtière), a European grant for beekeeping (FEAGA-convention 10–31 R/2008-2010) and by the Région Auvergne (Programme Nouveau Chercheur). MR was supported by a grant from Région Auvergne and JA from the Ministère de l'Enseignement Supérieur et de la Recherche.