INTRODUCTION
Schistosomiasis is a major public health problem, mainly in the tropics, with an estimated 200 million individuals infected and 650 million living in endemic areas (http://www.who.int/schistosomiasis/en/). The causative agents are digenetic parasitic flatworms of the genus Schistosoma, which have a complex life cycle involving 2 obligate hosts: a mammalian definitive host (human) and a snail intermediate host. Adult male and female worms, depending on the species, inhabit the vasculature of the urinary plexus or mesenteric venules surrounding the large intestine, where they produce eggs that escape to the environment in urine or feces. In freshwater, free-swimming miracidia hatch from the eggs and, upon finding a suitable snail intermediate host, directly penetrate the snail mantle to initiate an infection (Basch, 1991). Post-penetration, the miracidium sheds its ciliary epidermal plates, during which time a tegumental syncytium covers the entire larval surface as it transforms to the parasitic primary or mother sporocyst stage (Basch and DiConza, 1974). This developmental transition from free-living to parasitic state within the snail host is crucial to successful establishment of infections, and is presumed to involve dramatic physiological changes at both the biochemical and molecular levels. However, to date few studies have focused on the molecular basis of larval transformation or on the identification of genes regulating the asexual development of, and embryogenesis within, subsequent sporocyst generations.
Previously, high-throughput cDNA or oligonucleotide DNA microarrays have been used to compare the expression of thousands of genes in a variety of S. mansoni stages, including adult male and female worms (Hoffmann et al. 2002; Fitzpatrick et al. 2005; Verjovski-Almeida et al. 2007; Waisberg et al. 2007) as well as larval stages, including daughter sporocysts and cercariae (Jolly et al. 2007). In a recent study Vermeire et al. (2006) identified a large number of gene expression changes in S. mansoni miracidia compared to 4-day in vitro cultured sporocysts using a DNA microarray (MA) spotted with oligonucleotide probes corresponding to ESTs or full-length mRNAs derived mainly from adult worms, eggs, and cercariae, with <5% originating from miracidia or sporocyst sequences. Despite this over-representation by cercarial and adult genes, approximately 60% of the array probes, representing individual mRNA transcripts, were expressed in miracidia and/or sporocysts, with a significant number being differentially expressed between these stages (Vermeire et al. 2006). Considerable overlap in transcribed genes thus exists between stages. However, microarrays are limited to analysing only previously identified transcripts. In this way they constitute a ‘closed’ gene expression profiling platform, limited to predetermined or known sets of genes.
Serial Analysis of Gene Expression (SAGE) is a sequence-based gene-expression profiling tool that can be utilized to generate quantitative transcriptional profiles of genes in an organism. In SAGE, a short sequence tag from a unique position of each mRNA molecule is used to uniquely identify the source gene from within the genome (Velculescu et al. 1995). Sequence tags are isolated from an mRNA sample and are linked together to form long concatenated molecules that are cloned and sequenced. The population of tags defines patterns of expression of individual genes. Quantification of all tags provides a relative measure of gene expression (i.e., mRNA abundance). SAGE thus provides both the identity of expressed genes and levels of their expression. SAGE constitutes an ‘open’ platform, providing a rapid and comprehensive approach for elucidation of quantitative gene expression patterns not dependent upon prior availability of transcript information. In addition, the sequences generated can be used to identify previously unknown genes through the application of tag-based reverse transcription-polymerase chain reaction (RT-PCR), i.e., use of tag sequences to design primers for amplifying unknown cDNA sequences, leading to gene identification and elucidation of function.
In preliminary experiments involving gene expression patterns across the entire S. mansoni life cycle, SAGE has demonstrated excellent potential for stage-associated gene profiling (Williams et al. 2007). Recently, SAGE has been used to identify transcriptional changes in adult worms in response to nitric oxide exposure (Messerli et al. 2006). Ojopi et al. (2007) used SAGE to identify expressed transcripts in pooled adult male and female worms, but did not identify any stage-related changes in transcript levels. The SAGE tags used by Ojopi et al. (2007) were only 14 bp long, potentially leading to errors in data analysis, specifically the problem of a single tag matching multiple transcripts.
In the present study, LongSAGE was used to compare gene expression profiles for S. mansoni miracidia, 6-day in vitro cultured primary sporocysts and 20-day in vitro cultured sporocysts to quantitatively assess commonly- and differentially-expressed genes during early larval development of this parasite. LongSAGE is a highly specific quantitative method of gene expression profiling that generates 21 bp tags, of which theoretical modelling predicts that >99·8% match only once to a human-sized genome (Saha et al. 2002). We combined SAGE data with the gene predictions/annotations of the S. mansoni genome (v4.0e) and modelled theoretical 3′ UTR lengths for genes without 3′ EST sequence data to generate the most up-to-date analysis of the S. mansoni larval transcriptome during establishment of intramolluscan infections. The 6- and 20-day time-points were chosen to capture transcriptional changes associated with 2 key stages of early larval development. After 6 days of in vitro culture the miracidia have all transformed and are transitioning from a free-living to a parasitic stage, and at 20 days the in vitro-cultured sporocysts are beginning to form brood chambers to produce the second generation of daughter/secondary sporocysts. In addition, we investigated the effects on gene expression of sporocyst exposure to snail Bge cell products, which are predicted to enhance larval growth/embryogenesis (Yoshino and Laursen, 1995).
MATERIALS AND METHODS
Parasite culture
Schistosoma mansoni (NMRI strain) eggs were recovered from the livers of mice at 7–8 weeks post-infection as described by Yoshino and Laursen (1995). Miracidia were hatched from the eggs in sterile artificial pond water and concentrated on ice in conical polypropylene centrifuge tubes. Miracidia were isolated at 15-min intervals over a 2 h period. Cold-immobilized miracidia were either centrifuged for 1 min at 500 g and immediately harvested for total RNA or were pooled and transferred to a 24-well culture plate to permit transformation into primary sporocysts followed by culturing for 6 or 20 days under normoxic conditions at 26°C in either S. mansoni sporocyst medium (SM; Ivanchenko et al. 1999) or SM previously conditioned with Biomphalaria glabrata embryonic (Bge) cells. Conditioned SM was used in these larval SAGE studies to determine the effect of snail-derived components on sporocyst gene expression. Control (unconditioned) and conditioned media in wells containing sporocysts were changed at 2-day intervals. At 6-day (6d) and 20-day (20d) developmental time-points total RNA from all cultured sporocysts (6d control, 6d conditioned, 20d control and 20d conditioned) was isolated using TRIzol® reagent (Invitrogen, Carlsbad, CA) according to the manufacturer's instructions.
Bge cell culture and production of conditioned media
The Biomphalaria glabrata embryonic (Bge) cell line (ATCC CRL 1494) was used to produce snail cell-conditioned sporocyst medium (SM) for use in sporocyst in vitro culture experiments. Bge cells were maintained in 250-ml tissue culture flasks (Falcon™, BD Biosciences, San Jose, CA) containing Bge medium (Hansen, 1976) supplemented with 10% heat-inactivated fetal bovine serum (cBge), penicillin G (0·06 mg/ml) and streptomycin sulfate (0·05 mg/ml) at 26°C under normal atmospheric conditions. Bge cells were grown to confluence, washed once with snail phosphate-buffered saline (sPBS; Yoshino, 1981) pH 7·2, suspended in sPBS by gentle spraying of buffer to detach them from the flask wall, removed from the flasks, and washed an additional 2 times with sPBS. The cells were then resuspended in SM supplemented with 5% heat-inactivated fetal bovine serum and cultured in 250-ml flasks for 48 h prior to use in sporocyst culture experiments. After the 48 h incubation period these media were considered conditioned (designated ‘conditioned SM’) and were removed from the flasks containing Bge cells, centrifuged for 10 min at 500 g at 4°C to remove cellular debris, and immediately introduced into parasite culture experiments as described above.
Long Serial Analysis of Gene Expression (LongSAGE)
SAGE libraries were constructed using 30 μg of total RNA isolated from miracidia and 6d- and 20d-old in vitro-cultured primary sporocysts with the I-SAGE Long Kit (Invitrogen, Carlsbad, CA) according to the manufacturer's instructions, with the exception of the use of pGEM3Z as the cloning vector. One Shot TOP10 Electrocomp E. coli cells (Invitrogen) were transformed with recombinant pGEM3Z clones containing SAGE concatemers by electroporation using an Eppendorf (Westbury, NY) E. coli electroporation apparatus. Plasmid sequencing templates were prepared from 1·2 ml cultures using alkaline lysis as performed by a RevPrep Orbit robotic workstation (GeneMachines, San Carlos, CA). Sequencing reactions were run on an ABI 3730xl capillary DNA sequencer. Recombinant pGEM3Z clones containing SAGE concatemers were sequenced using only the M13F primer to avoid duplicate sampling of SAGE tags due to overlapping bidirectional sequences from individual clones.
Sequences collected were analysed with software created specifically for Schistosoma SAGE analysis. The SAGE software extracts ditag sequences from the ABI 3730xl results according to the SAGE sequence grammar, parses out individual SAGE tags, and reduces all SAGE tags to a table of unique SAGE tag sequences and their observed frequencies among all of the Schistosoma SAGE libraries. In cases where a ditag sequence was sampled more than once, only 1 representative was used in generating tag frequencies (Emmersen, 2008a). Tags that contained base call ambiguities or bases with PHRED (Ewing et al. 1998; Ewing and Green, 1998) values of less than 10 (10% or greater chance of incorrect base call) were excluded from analyses (Emmersen, 2008b). Additional putative sequencing error was removed by discarding SAGE tag sequences that did not have a perfect sequence match in the set of genome project gene predictions and that did not appear more than once in any of the SAGE libraries. As such, tags appearing at least twice in at least one SAGE library but that did not have a sequence match within the predicted genes were assumed to be from legitimate rare transcripts, from allelic variants, or from un-sequenced regions of the genome and were retained for analysis of differential gene expression.
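The two filtering rules described above can be illustrated with a short sketch. This is not the custom Schistosoma SAGE software, which is not reproduced here; the function names, the data layout (one tag-count table per library) and the `known_tags` set of tags matching predicted genes are assumptions made only for illustration.

```python
from collections import Counter

MIN_PHRED = 10  # quality threshold stated in the text


def tag_passes_quality(tag, quals):
    """Return True if the tag has no ambiguous base calls and every
    base has a PHRED value of at least MIN_PHRED."""
    return set(tag) <= set("ACGT") and min(quals) >= MIN_PHRED


def filter_tags(libraries, known_tags):
    """Remove putative sequencing errors.

    libraries  : dict mapping library name -> Counter of tag -> count
                 (hypothetical layout)
    known_tags : set of tags with a perfect match in the predicted genes

    A tag is discarded only if it has no perfect match AND it never
    appears more than once in any single library.
    """
    all_tags = set().union(*libraries.values())
    kept = set()
    for tag in all_tags:
        max_count = max(lib.get(tag, 0) for lib in libraries.values())
        if tag in known_tags or max_count >= 2:
            kept.add(tag)
    return {
        name: Counter({t: c for t, c in lib.items() if t in kept})
        for name, lib in libraries.items()
    }
```

Note that a tag seen once in each of two libraries is still discarded under this rule, because the criterion is applied per library rather than to the pooled count.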
SAGE tag sequences were mapped to genes predicted by the ongoing genome project (version 4.0e) using custom software created specifically for Schistosoma SAGE. As SAGE tags are generated from the 3′-most NlaIII restriction site of the transcript, we included observed (EST) or theoretical 3′ UTR sequences when assigning tags to genes. Of all the genes predicted by the ongoing genome project (version 4.0e), 5334 had empirical EST data for 3′ UTR lengths. In these cases, the 3′ UTR sequence used for assignment of SAGE tags to genes was that predicted by the ESTs. The 99% confidence of the observed 3′ UTR lengths was determined (1388 bp) and for genes without EST data this length was used to predict 3′ UTR sequences from the genome.
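As a rough illustration of why the 3′ UTR matters for tag assignment, the sketch below extracts the expected sense tag from a transcript sequence (coding sequence plus observed or modelled 3′ UTR). It assumes that the 21 bp LongSAGE tag begins with, and includes, the CATG recognition site of NlaIII; the function name and the handling of sites too close to the 3′ end are illustrative choices, not a description of the custom mapping software.

```python
ANCHOR = "CATG"   # NlaIII recognition site
TAG_LENGTH = 21   # LongSAGE tag length; assumed here to include the CATG


def three_prime_most_tag(transcript):
    """Return the tag starting at the 3'-most CATG of a transcript,
    or None if there is no site or too little sequence follows it."""
    seq = transcript.upper()
    pos = seq.rfind(ANCHOR)  # last (3'-most) occurrence
    if pos == -1 or pos + TAG_LENGTH > len(seq):
        return None
    return seq[pos:pos + TAG_LENGTH]
```

If the modelled 3′ UTR is too short, the true 3′-most site can lie beyond the sequence used and the wrong tag is predicted, which is one reason a generous UTR length is useful for genes without EST data.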
In order to identify potential differentially-expressed genes between stages, tags were assigned an R-value, the log-likelihood ratio statistic of Stekel et al. (2000), which scores tags by their deviation from the null hypothesis of equal frequencies. Higher scores represent a greater deviation from the null hypothesis, while scores close to zero represent near constitutive expression. For this study, an R-value of ⩾4 was used as a conservative measure to denote significant differences in gene expression between compared libraries (Stekel et al. 2000). In addition, Fisher's exact test was used to compare the effects of Bge cell-conditioning on the 6-d and 20-d sporocysts.
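For readers unfamiliar with the R-value, the following sketch shows the log-likelihood ratio in the form commonly given for the Stekel et al. statistic: for a tag with count x_i in library i of total size N_i, R = Σ x_i ln(x_i / (N_i f)), where f = Σ x_i / Σ N_i is the pooled tag frequency. The natural logarithm and the function name are assumptions of this sketch, and it is not the code used in the study.

```python
import math


def r_value(counts, totals):
    """Log-likelihood ratio score for one tag across several libraries.

    counts : observed tag count in each library
    totals : total number of tags sequenced in each library
    Libraries with a zero count contribute nothing to the sum.
    """
    if len(counts) != len(totals):
        raise ValueError("counts and totals must have the same length")
    pooled_freq = sum(counts) / sum(totals)
    return sum(
        x * math.log(x / (n * pooled_freq))
        for x, n in zip(counts, totals)
        if x > 0
    )


# Library sizes reported in this study; the tag counts are hypothetical.
library_sizes = [68450, 68044, 60171, 30684, 52666]
example_counts = [120, 15, 10, 4, 9]
is_differential = r_value(example_counts, library_sizes) >= 4
```

A tag present at the same frequency in every library scores zero, and the score grows as the counts depart from that expectation.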
Quantitative PCR
Total RNA was extracted using TRIzol as described above. Single-stranded cDNA was synthesized using the SuperScript III First-Strand Synthesis System for RT-PCR (Invitrogen, Carlsbad, CA). Quantitative real-time PCR (qPCR) primers were designed using Primer Express software (Applied Biosystems, Foster City, CA). qPCR was performed in a GeneAmp 7300 PCR apparatus in a 96-well format using SYBR green chemistry. Theoretical tag numbers were approximated by setting the transcript with the highest Ct value (Smp_134670.2) to 1 and treating each one-cycle decrease in Ct as a 2-fold increase in transcript abundance. This calculation assumes that amplification efficiency is similar for each primer set.
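The Ct-to-abundance conversion can be written compactly: the transcript with the highest Ct (the least abundant) is assigned a value of 1, and every cycle below that Ct doubles the estimate. The sketch below assumes perfect doubling per cycle for all primer sets; the function name and the Ct values in the comment are hypothetical.

```python
def theoretical_abundance(ct_values):
    """Convert Ct values to relative abundance.

    ct_values : dict mapping gene ID -> Ct
    Returns a dict mapping gene ID -> 2 ** (max Ct - Ct), so the gene
    with the highest Ct is scaled to exactly 1.
    """
    max_ct = max(ct_values.values())
    return {gene: 2.0 ** (max_ct - ct) for gene, ct in ct_values.items()}


# Hypothetical Ct values: a gene 3 cycles below the reference Ct
# is estimated to be 2**3 = 8 times more abundant.
```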
RESULTS
SAGE libraries
Five SAGE libraries were constructed from miracidia, 6-day unconditioned sporocysts, 6-day Bge cell-conditioned sporocysts, 20-day unconditioned sporocysts, and 20-day Bge cell-conditioned sporocysts, resulting in 68 450, 68 044, 60 171, 30 684 and 52 666 sequenced tags, respectively, after removal of sequencing error (Williams et al. 2007). The total number of unique SAGE tag sequences obtained from all 5 libraries was 21 440, including 8180 detecting sense transcription of a single gene, 4544 detecting anti-sense transcription of a single gene, 625 unresolved among possible sense transcription of several genes, 32 unresolved among possible anti-sense transcription of several genes, and 8059 not assigned to a gene. Of the 8059 unassigned tags, 2836 matched the genome but were not associated with any predicted gene or known transcript, 1900 matched the genome in multiple locations, and 3233 did not match the genome at all. Of the 13 185 transcripts predicted by the ongoing genome project, 12 879 (97·7%) contained at least one NlaIII site and are therefore detectable by SAGE. All SAGE data were deposited in the NCBI Gene Expression Omnibus (GEO) database under Accession number GSE9722.
Gene abundance
Of the 21 440 unique tag sequences observed in the 5 SAGE libraries, 30·4% had a frequency of 1, 37·1% a frequency of 2–5, 10·8% a frequency of 6–9, 19·8% a frequency of 10–99, and 1·9% a frequency of ⩾100 at their highest observed abundance. Of the most abundant tags (frequency ⩾100), 253 detected sense transcription of a single gene, 7 detected anti-sense transcription of a single gene, 39 were unresolved among multiple sense and/or anti-sense transcripts, and 113 were not assigned to a gene. For genes assigned both sense and anti-sense SAGE tags, a strong inverse correlation existed between the anti-sense:sense frequency ratio and the sense tag frequency (Fig. 1).
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160709012102-52805-mediumThumb-S0031182009005733_fig1g.jpg?pub-status=live)
Fig. 1. Correlation between anti-sense:sense frequency ratio and the sense tag frequency for genes assigned both sense and anti-sense SAGE tags.
Using Blast2GO (Conesa et al. 2005) we identified level 3 molecular function gene ontology (GO) categories for 99 of the most abundant 150 tags (Fig. 2). The most highly represented functional categories included genes involved in transcription, translation, ion binding and oxidoreductase activity. There were 2147 SAGE tags uniquely expressed in miracidia and 9739 SAGE tags uniquely expressed in sporocysts. A comparison of the level 2 molecular function gene ontology categories between uniquely expressed miracidia and sporocyst transcripts revealed that 2 categories, binding and catalytic activities, dominated gene expression in both miracidia and sporocysts, comprising >80% of the total transcriptome (Table 1). Although the overall pattern of GO categories is similar for unique miracidia and sporocyst transcripts, this similarity may reflect different transcripts performing similar molecular functions in the various stages. GO categories enriched in unique sporocyst transcripts include structural molecules, plus antioxidant and enzyme regulator activities, while chaperone activity was enriched in unique miracidial transcripts. In addition, we identified 911 SAGE tags uniquely expressed in miracidia and 3608 SAGE tags uniquely expressed in sporocysts corresponding to non-predicted and unknown transcripts or SAGE tag sequences not matching the genome.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160709190408-62677-mediumThumb-S0031182009005733_fig2g.jpg?pub-status=live)
Fig. 2. Level 3 molecular function GO categories for the most highly expressed transcripts in the 5 Schistosoma mansoni SAGE libraries. Percentages represent the frequency of each term.
Table 1. Transcripts uniquely expressed in Schistosoma mansoni miracidia (2147 SAGE tags corresponding to 448 transcripts) or sporocysts (9739 SAGE tags corresponding to 2104 transcripts) within various GO categories. Unique unknown SAGE tags are also included.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160709190408-45687-mediumThumb-S0031182009005733_tab1.jpg?pub-status=live)
Differential expression of genes during development in miracidia, 6-day sporocysts and 20-day sporocysts
We identified 432 differentially expressed tags during larval development by comparison of all 5 SAGE libraries (R⩾4). Differentially expressed sense SAGE tags are listed in Table 2, while all tags are shown in supplemental data file 1. Due to the limits of space, only tags with R⩾7 are shown in Table 2. The major classes of genes upregulated in miracidia compared to sporocysts were calcium-binding proteins (SME16, calcineurin, 22·6 kDa tegument-associated antigen and synaptotagmin), heat shock proteins (HSP70, HSP90 and HSP27) and genes involved in cellular energy production (mitochondrial carrier protein, lactate dehydrogenase and phosphoglycerate kinase). The major classes of proteins upregulated in sporocysts compared to miracidia were associated with transcription and translation (elongation factor 1-alpha, polyadenylate binding protein, and many ribosomal proteins). Some heterogeneity may exist in the transcript levels of individual miracidia or sporocysts; however, the use of large populations of parasites more accurately reflects the transcriptomic profiles of the developmental stages sampled.
Table 2. Sense tags differentially expressed (R⩾7) between the 5 libraries
(Values represent the individual tag percentage abundance in a given library.)
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160709190408-92073-mediumThumb-S0031182009005733_tab2.jpg?pub-status=live)
Differential expression of genes between conditioned and unconditioned sporocysts
Using Fisher's Exact test (Fisher, 1922) with a cutoff of P<0·01, we identified 53 tags that were differentially expressed between 6-day unconditioned and 6-day conditioned sporocysts, 22 of which were sense tags (Table 3). Forty-five tags were differentially expressed (P<0·01) between 20-day unconditioned and 20-day conditioned sporocysts, 19 of which were sense tags (Table 3). Fifteen of these differentially expressed genes were in higher abundance in unconditioned media than in conditioned media. Transcripts differentially expressed due to the effects of conditioning include HSP90, thioredoxin reductase, elongation factor 1-alpha, multiple ribosomal proteins, and proteins of unknown function. Seven tags were found to be differentially expressed in both 6-day and 20-day datasets.
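The pairwise comparison of one tag between two libraries can be framed as a 2×2 table (count of the tag versus count of all other tags, in each library). The sketch below is a minimal two-sided Fisher's exact test written with the standard library only; it is an illustration under that framing, not the analysis code used here, and the function names and example counts are hypothetical.

```python
import math


def _log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided P value for the 2x2 table [[a, b], [c, d]].

    a, c : counts of the tag in libraries 1 and 2
    b, d : counts of all other tags in libraries 1 and 2
    Sums the hypergeometric probabilities of all tables that are no
    more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def log_p(x):
        return (_log_comb(row1, x) + _log_comb(row2, col1 - x)
                - _log_comb(total, col1))

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    observed = log_p(a)
    p = sum(
        math.exp(log_p(x))
        for x in range(lowest, highest + 1)
        if log_p(x) <= observed + 1e-7  # small tolerance for ties
    )
    return min(1.0, p)


# Hypothetical tag: 40 copies among 68 044 tags vs 10 among 60 171.
p_value = fisher_exact_two_sided(40, 68044 - 40, 10, 60171 - 10)
significant = p_value < 0.01
```

Working with log-gamma values rather than exact binomial coefficients keeps the calculation fast for library sizes in the tens of thousands.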
Table 3. Differential expression of transcripts due to the effects of conditioning with Bge excretory-secretory products, examined using Fisher's Exact test (P<0·01)
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160709190408-67375-mediumThumb-S0031182009005733_tab3.jpg?pub-status=live)
Quantitative PCR
Since SAGE is quantitative in nature, we selected 3 genes at each of high, medium and low expression levels in the miracidial stage and used real-time qPCR to independently examine their transcript levels within a miracidial cDNA pool. For the genes with the highest tag numbers (Smp_071390, Smp_009760 and Smp_067800), tag number correlated significantly with abundance within the larval cDNA pool as measured by qPCR (Fig. 3). Similar correlations of tag number and transcript abundance by qPCR can be observed for genes with medium and low tag numbers (Fig. 3). These results indicate that tag numbers are predictive of relative transcript abundance.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160709012102-58180-mediumThumb-S0031182009005733_fig3g.jpg?pub-status=live)
Fig. 3. Quantitative PCR examination of high, medium and low expressed transcripts. (A) Theoretical transcript numbers are compared to actual transcript numbers for nine transcripts: Smp_134670.2 (expressed protein of unknown function), Smp_161730 (RNA binding protein), Smp_043220 (expressed protein of unknown function), Smp_002880.1 (ATP synthase alpha), Smp_064950 (calcineurin B), Smp_137710 (putative drug transporter), Smp_071380 (adenylate kinase) and Smp_067800 (fibrillin 2). (B) The correlation between actual transcript levels and theoretical transcript levels.
DISCUSSION
SAGE allows for the dissection of complex processes involving the interaction of multiple genes or gene families, such as stage-specific differentiation or response to external stimuli, on a transcriptome-wide level. The quantitative nature of SAGE also enables one to analyse thousands of transcripts from a given sample simultaneously, allowing for greater coverage of expressed mRNAs and the detection of low abundance transcripts that may be missed using methodologies like DNA microarrays (Velculescu et al. 1995). SAGE has been employed in numerous areas of biological, medical, and pharmaceutical research and has proved to be an excellent tool for comparing gene expression profiles between normal and abnormal cells occurring in a diseased state (e.g., tumors; St-Croix et al. 2000), differentially-treated, experimental cell populations (de Waard et al. 1999) or whole organisms sampled over a period of development (e.g., Drosophila: Lee et al. 2005 or parasites: Palm et al. 2005; Skuce et al. 2005; Kronstad, 2006).
Previously, only 2 studies have undertaken large-scale gene expression analyses of intramolluscan larval schistosomes. Vermeire et al. (2006) investigated miracidia and 4-day in vitro cultured sporocysts using a 7335 feature oligonucleotide microarray and Jolly et al. (2007) identified gene transcript changes between cercariae, daughter sporocysts recovered from infected snails, and adult worms utilizing an 11 998 feature array. We identified a large number of stage-associated transcripts that correlated with those of Vermeire et al. (2006), despite differences in culture conditions and sporocyst age. For example, both studies found SME16, p40 egg antigen, myosin light chain, phosphoenolpyruvate carboxykinase (PEPCK), and secretory glycoprotein kappa-5 to be stage-associated with miracidia. Similarly, calreticulin, polo-like kinase (smPLK), 14-kDa fatty acid binding protein (Sm14; M60895) and a protein similar to insulin induced gene-1 were significantly associated with the sporocyst stage in both studies, demonstrating that similar results may be obtained using both SAGE and microarray methodologies in these stages. Differential gene expression for these transcripts was confirmed by real-time quantitative PCR (Vermeire et al. 2006), which further serves to validate SAGE as a quantitative transcriptomic method when applied to schistosomes. Williams et al. (2007) also demonstrated, using semi-quantitative reverse transcription PCR (RT-PCR), that overall expression levels and changes in expression levels correlate well between SAGE and RT-PCR in stage-specific comparisons.
Based on GO assignments, various functional groups were identified by SAGE as being over-represented in differentially expressed genes, uniquely expressed genes and/or genes having very high expression levels (SAGE tag frequencies). These functional groups are discussed below.
Heat-shock proteins (HSP)/chaperonins
HSPs play important roles in folding, secretion, regulation, assembly, translocation and degradation of other proteins and as such are critical to numerous biochemical and molecular cell processes (Brown et al. 2007). The expression levels of HSP86/90, HSP40/DnaJ, HSP70/BiP, USP (universal stress protein-like) and p40 egg antigen (HSP27) were significantly higher in miracidia than 6- or 20-day sporocysts. HSP90 is an ATP-dependent chaperone involved in the activation and trafficking of proteins (Young et al. 2004). The p40 egg antigen contains 2 alpha-crystallin domains and exhibits high homology to small heat shock proteins from Drosophila (HSP27; Nene et al. 1986). The biological significance of S. mansoni miracidia secreting such high levels of a small heat shock protein homologue remains unclear. However, the secretion of HSPs by parasitic helminths is emerging as a common theme (Nene et al. 1986; Cai et al. 1996; Knudsen et al. 2005; Craig et al. 2006; Cass et al. 2007). Knudsen et al. (2005) identified HSP86/90, HSP70 and HSP60 in cercarial secretions, while p40 and other HSP/chaperone family members are secreted by schistosome eggs (Cai et al. 1996; Cass et al. 2007) or found in the empty eggshells of newly hatched S. japonicum miracidia (Liu et al. 2006).
In a recently completed proteomic study of excretory-secretory proteins (ESP) released by in vitro-cultured S. mansoni miracidia, HSP/chaperonins were identified as a major constituent (Guillou et al. 2007; Wu et al. 2008). In the mammalian host, schistosome-secreted HSPs, like p40, serve as powerful immunogens eliciting production of pro-inflammatory cytokines resulting in extensive tissue fibrosis (Cai et al. 1996). Recently, it has been proposed that HSPs represent important ‘danger signals’ that, upon binding to macrophage/monocyte receptors (e.g., toll-like or scavenger receptors), stimulate release of pro-inflammatory cytokines or chemokines (Binder et al. 2004). Because induction of inflammatory fibrosis and granuloma formation around eggs is required for their efficient excretion from the host, the schistosome parasite appears to manipulate the host immune response to its advantage by increased expression of HSPs and other immunogens (Binder et al. 2004).
The transition from free-living miracidium to parasitic sporocyst is accompanied by morphogenetic and physiological changes (Voge and Seidel, 1972). Upregulation of HSPs may very well represent a stress response to these changes. During larval transformation the shedding, and subsequent degeneration, of ciliary epidermal plates during formation of the sporocyst tegument appears to represent a major source of excreted larval proteins and thus likely represents the source of the abundant HSPs found in larval ESP (Guillou et al. 2007; Wu et al. 2008). Yet to be answered, however, is the question of whether or not HSPs released by the parasite serve to alert or suppress the snail host's immune system.
Calcium-interactive proteins
The divalent cation Ca++ is used as a cellular signal or ionic cofactor involved in diverse metabolic processes, including secretion, metabolism, muscle movement and neuronal function (Bhattacharya et al. 2006). Likewise, molecular interactions with calcium appear to play important roles in several physiological processes that govern miracidial infection of the snail host, especially its initial development to the parasitic sporocyst stage. Host entry, miracidium-sporocyst transformation, muscle movement and larval motility, and enzyme regulation all appear to be calcium-dependent processes (Sponholtz and Short, 1976; Knabe et al. 1982; Noel et al. 2001). For example, calcium chelators and pharmacological calmodulin antagonists have been shown to inhibit schistosome egg hatching and/or miracidial transformation (Katsumata et al. 1988, 1989; Kawamoto et al. 1989). Other studies have shown that calcium mobilization plays a role in cercarial penetration processes, possibly by Ca-regulation of protease activities during infection (Lewert et al. 1966; Fusco et al. 1991) or within penetration glands (Dresden and Edlin, 1975). The finding that excystment of Paragonimus ohirai metacercariae is a Ca++-dependent process (Ikeda, 2004) indicates that the role of calcium as a regulator of larval development may be functionally conserved across trematode species. In the present study, 5 calcium-signalling or binding molecules were found to have stage-associated expression.
Calcineurin B and SME16 were found to be highly expressed in miracidia, while calreticulin and calmodulin were mainly associated with 6-day and/or 20-day-old sporocysts. These results are consistent with the findings of recent gene expression analyses of calponin, SME16, calreticulin and calpain in miracidia of S. japonicum (Liu et al. Reference Liu, Lu, Hu, Wang, Cui, Chi, Yan, Wang, Song, Xu, Wang, Zhang, Zhang, Wang, Xue, Brindley, McManus, Yang, Feng, Chen and Han2006) and SME16 and calreticulin in S. mansoni miracidia (Vermeire et al. Reference Vermeire, Taft, Hoffmann, Fitzpatrick and Yoshino2006; Guillou et al. Reference Guillou, Roger, Mone, Rognon, Grunau, Theron, Mitta, Coustau and Gourbal2007). Calreticulin, a versatile protein typically associated with chaperone activity, also functions as a storage form for calcium and a signalling molecule involved in regulating calcium homeostasis (Gelebart et al. Reference Gelebart, Opas and Michalak2005). Recently, it was shown that a calreticulin-like protein from endoparasitoid wasp venom fluid inhibits haemocyte spreading behaviour and thus prevents encapsulation within its lepidopteran host (Zhang et al. Reference Zhang, Schmidt and Asgari2006). In this light, the finding of a calreticulin in ESP of S. mansoni primary sporocysts (Guillou et al. Reference Guillou, Roger, Mone, Rognon, Grunau, Theron, Mitta, Coustau and Gourbal2007) suggests that it may have a role as a parasite defensive mechanism against haemocyte encapsulation.
Calmodulin is a multifunctional protein involved in the regulation of a variety of cellular processes. In mammalian cells, calmodulin functions in the activation of protein kinases, smooth muscle contraction and calcium channel regulation (Bers and Guo, Reference Bers and Guo2005). Most pertinent to our studies was an earlier finding that the hatching of S. mansoni eggs appears to be a Ca++/calmodulin-dependent process (Katsumata et al. Reference Katsumata, Kohno, Yamaguchi, Hara and Aoki1989). The fact that some HSPs (e.g., HSP90 and HSP70) contain calmodulin-binding domains (Song et al. Reference Song, Song, Na, Kim, Kwon, Park, Pak, Im and Shin2007) implies potential molecular and functional interactions between these distinct molecular groups.
Since larval hatching from eggs is greatly facilitated by active larval motility, the finding of calponin, an actin- and tropomyosin-binding protein that acts as a regulator of smooth muscle contraction and motility (Winder et al. Reference Winder, Allen, Clément-Chomienne and Walsh1998), is consistent with its predicted involvement in hatching or other functions requiring miracidial motility. Synaptotagmin, a Ca++ sensor belonging to a family of membrane-trafficking proteins involved in exocytosis and neurotransmitter release (Yoshihara and Montana, Reference Yoshihara and Montana2004), is also upregulated in miracidia compared to sporocysts. This protein may function as a signalling protein during the miracidial stage.
Two other Ca++-binding protein transcripts, SME16 and a small 8 kDa protein containing predicted EF-hand domains, also show differential expression in miracidia. Both SME16 and a protein with significant homology to the Ca-binding 8 kDa dynein light chain have been identified in ESP from S. mansoni miracidia (Wu et al. Reference Wu, Sabat, Brown, Zhang, Taft, Peterson, Harms and Yoshino2008). It now appears that SME16, previously shown to be a highly-expressed egg protein, originates within the fully-formed miracidium, although its function in larval development or metabolism is still unknown (Moser et al. Reference Moser, Doenhoff and Klinkert1992).
Egg antigens
Transcripts of several abundant proteins found in soluble egg antigen (SEA) (Cass et al. Reference Cass, Johnson, Califf, Xu, Hernandez, Stadecker, Yates and Williams2007) were found to be highly expressed in miracidia (Vermeire et al. Reference Vermeire, Taft, Hoffmann, Fitzpatrick and Yoshino2006). These transcripts included p40 egg antigen, PEPCK, secretory glycoprotein kappa-5, SME16 and several forms of glutathione-S-transferases (GSTs). This finding suggests that these antigens arise from the miracidia within the egg, or possibly from other egg tissues (e.g., von Lichtenberg's membrane) (Neill et al. Reference Neill, Smith, Doughty and Kemp1988) capable of synthesizing and secreting proteins from the egg. Neill et al. (Reference Neill, Smith, Doughty and Kemp1988) showed by electron microscopy that von Lichtenberg's membrane, a structure that forms a thin cellular envelope surrounding the developing miracidium, contains ribosomes, and it was suggested that this tissue may be the source of SEA detected by immune sera and be responsible for promoting granuloma formation around the egg. Recent proteomic analysis of S. mansoni egg excretions and their immunolocalization supports this notion (Cass et al. Reference Cass, Johnson, Califf, Xu, Hernandez, Stadecker, Yates and Williams2007). However, the observation that most ‘egg antigen’ mRNAs are transcribed in high abundance within miracidia suggests that these proteins are synthesized within the miracidium itself, are then released, and finally are exported by some as yet unknown mechanism through the tissues surrounding the larvae and out through the cribriform pores of the parasite's egg shell (Neill et al. Reference Neill, Smith, Doughty and Kemp1988). El Ridi et al. (Reference El Ridi, Velupillai and Harn1996) demonstrated the in situ binding of mouse L-selectin to the surface of egg-encased miracidia, implying transport of intact external protein across von Lichtenberg's membrane, and thus providing a basis for hypothesizing that reverse transport (miracidium-to-egg-shell surface) is plausible. Further characterization of larval S. mansoni excretory-secretory proteins will help to further our understanding of the molecules released from the ciliated larvae during egg development, hatching and penetration of the molluscan intermediate host.
Antioxidants
RNA transcripts for S. mansoni glutathione peroxidase (GPx) were found to be present at high levels in both miracidia and sporocysts, although a greater abundance was observed in miracidial mRNA populations. This observation is consistent with earlier microarray analyses in which GPx mRNA was found to be preferentially expressed in miracidia when compared to sporocysts (Vermeire et al. Reference Vermeire, Taft, Hoffmann, Fitzpatrick and Yoshino2006) and with preliminary analyses of larval transcripts by SAGE (Williams et al. Reference Williams, Sayed, Bernier, Birkeland, Cipriano, Papa, McArthur, Taft, Vermeire and Yoshino2007). However, even though lower than in miracidia, GPx transcript abundance in sporocysts was still much higher (by 10-fold) than that of peroxiredoxins (Prxs), another prominent class of reactive oxygen-scavenging enzymes expressed mainly in sporocysts (Vermeire and Yoshino, Reference Vermeire and Yoshino2007). S. mansoni GPx has been biochemically characterized as a primary lipid hydroperoxide reductant, but the molecule also possesses a hydrogen peroxide metabolizing activity (Mei et al. Reference Mei, Thakur, Schwartz and Lo Verde1996). Another protein potentially involved in the S. mansoni redox pathway is translationally-controlled tumor protein (TCTP), which was the sixteenth highest expressed gene transcript from the 5 larval libraries. Studies of recombinant Brugia malayi TCTP showed that this protein possesses antioxidant activity, can be reduced by thioredoxin, and is upregulated upon host infection (Gnanasekar and Ramaswamy, Reference Gnanasekar and Ramaswamy2007). In yeast, it is upregulated by treatment with H2O2 (Bonnet et al. Reference Bonnet, Petter, Dumont, Picard, Caput and Lenaers2000).
The presence of GPx and TCTP in such high abundance in miracidia and sporocysts suggests that they may protect the developing miracidia within eggs in the definitive host and also provide some measure of protection against oxidative damage in the sporocyst stage. It seems plausible that GPx and TCTP may also aid sporocysts in their defence against reactive oxygen species (ROS) naturally occurring in the haemoglobin-rich plasma environment of the snail host (Hahn et al. Reference Hahn, Bender and Bayne2001a) or produced by snail haemocytes during an encapsulation reaction (Hahn et al. Reference Hahn, Bender and Bayne2001b; Bender et al. Reference Bender, Broderick, Goodall and Bayne2005). Recent studies have shown that Prx expression can be induced in S. mansoni mother sporocysts upon exposure to B. glabrata embryonic cells or exogenous hydrogen peroxide in vitro (Coppin et al. Reference Coppin, Lefebvre, Caby, Cocquerelle, Vicogne, Coustau and Dissous2003; Vermeire and Yoshino, Reference Vermeire and Yoshino2007). Thioredoxin, another redox pathway enzyme found to be moderately expressed in all 5 of our S. mansoni libraries, had highest expression levels in 6- and 20-day sporocysts cultured in snail cell-conditioned medium. GST-26 and GST-28, which also possess antioxidant activities, were more highly expressed in sporocysts than in miracidia.
Effects of cultivation in Biomphalaria glabrata embryonic (Bge) cell-conditioned medium
Co-cultivation of S. mansoni primary sporocysts with the Bge cell line results in the production of embryos of secondary (=daughter) sporocysts by 15 days in culture and fully formed daughter sporocysts from 20 days onward (Yoshino and Laursen, Reference Yoshino and Laursen1995). However, due to the complexity of performing SAGE analysis on a multi-organism dataset, we utilized Bge cell-conditioned medium to investigate the influence of host molecules on sporocyst gene expression and larval development. Earlier experiments (Coppin et al. Reference Coppin, Lefebvre, Caby, Cocquerelle, Vicogne, Coustau and Dissous2003; Vermeire et al. Reference Vermeire, Boyle and Yoshino2004) have demonstrated that factors secreted from Bge cells influence gene expression during in vitro development of S. mansoni sporocysts. For example, in sporocysts, glutaminyl tRNA synthetase (GlnRS) transcripts were shown to be 3-fold higher and T-complex protein 1 subunit gamma (SmTCP-1) transcripts 1·3-fold higher in response to Bge cell-conditioned medium (Coppin et al. Reference Coppin, Lefebvre, Caby, Cocquerelle, Vicogne, Coustau and Dissous2003). Our data indicate that GlnRS transcripts are 2-fold higher in 6-day sporocysts cultured in conditioned medium and 1·3-fold higher in 20-day larvae. Vermeire et al. (Reference Vermeire, Boyle and Yoshino2004) demonstrated that G-alpha subunit 1 (G-alpha1) and SmPLK gene expression was higher in 4-day sporocysts cultured in Bge cell-conditioned medium than in unconditioned medium. In contrast, our data showed that SmTCP-1 and SmPLK were lower in conditioned 6-day sporocysts than in 6-day unconditioned sporocysts. The discrepancies between our data and those of Coppin et al. (Reference Coppin, Lefebvre, Caby, Cocquerelle, Vicogne, Coustau and Dissous2003) and Vermeire et al. (Reference Vermeire, Boyle and Yoshino2004) are likely due to differences in the culture media used and/or the length of time in culture.
Overall, 22 sense tags were found to be differentially expressed between 6-day conditioned and unconditioned sporocysts, of which 14 were upregulated in 6-day conditioned sporocysts compared to 8 transcripts upregulated in unconditioned sporocysts. This compares to 19 sense tags being differentially expressed between 20-day conditioned and unconditioned sporocysts, of which 16 were more highly expressed in unconditioned versus conditioned media. The majority of upregulated genes in both 6- and 20-day sporocysts involved transcriptional and translational processes, possibly reflecting a high degree of mitotic and protein synthetic activity associated with somatic and germinal tissue growth at this stage of larval development. Thirty-two percent of the sense tags differentially expressed between 6-day conditioned and unconditioned sporocysts were higher in unconditioned media, whereas 84% of the sense tags differentially expressed between 20-day conditioned and unconditioned sporocysts were higher in unconditioned media. Differences in gene expression in early versus late developing larvae under different culture conditions suggest that snail cell components may play different roles in regulating gene expression throughout sporocyst development.
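The fold differences discussed in this section compare transcript abundance between SAGE libraries of potentially different sizes. As a minimal sketch of how such a ratio might be derived, the following assumes that raw tag counts are first scaled to tags per million and that a small pseudocount guards against division by zero. The function names, the pseudocount and the counts are illustrative assumptions and are not taken from this study's analysis pipeline.

```python
def tags_per_million(count, library_size):
    """Scale a raw tag count to tags per million for one library."""
    return count * 1_000_000 / library_size


def fold_change(count_a, size_a, count_b, size_b, pseudocount=0.5):
    """Ratio of normalized tag abundance in library A over library B.

    The pseudocount is an assumption of this sketch; it keeps the ratio
    finite when a tag is absent from library B.
    """
    a = tags_per_million(count_a + pseudocount, size_a)
    b = tags_per_million(count_b + pseudocount, size_b)
    return a / b


# Made-up counts for a single tag in two equally sized libraries
# (conditioned versus unconditioned sporocysts); not data from the paper.
ratio = fold_change(40, 100_000, 20, 100_000)
print(f"fold change (conditioned / unconditioned): {ratio:.2f}")
```

Normalizing before taking the ratio matters when the libraries differ in sequencing depth, since the same raw count then represents a different relative abundance.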
Anti-sense transcripts
Unlike traditional oligonucleotide microarrays, SAGE analysis can identify anti-sense transcripts. These transcripts have been previously identified in S. mansoni (Waisberg et al. Reference Waisberg, Lobo, Cerqueira, Passo, Carvalho, Franco and El-Sayed2007) and other parasitic organisms, including Leishmania (Dumas et al. Reference Dumas, Chow, Muller and Papadopoulou2006), Onchocerca volvulus (Erttmann et al. Reference Erttmann, Buttner and Gallin1995), Plasmodium falciparum (Gunasekera et al. Reference Gunasekera, Patankar, Schug, Eisen, Kissinger, Roos and Wirth2004) and Trypanosoma brucei (Liniger et al. Reference Liniger, Bodenmüller, Pays, Gallati and Roditi2001). Anti-sense SAGE tags are highly represented in these parasites, constituting 17% and 21·5% of all tags in Plasmodium and Toxoplasma gondii, respectively (Patankar et al. Reference Patankar, Munasinghe, Shoaibi, Cummings and Wirth2001; Radke et al. Reference Radke, Behnke, Mackey, Radke, Roos and White2005). Moreover, in P. falciparum (Gunasekera et al. Reference Gunasekera, Patankar, Schug, Eisen, Kissinger, Roos and Wirth2004) and other organisms (Farrell and Lukens, Reference Farrell and Lukens1995; Luther et al. Reference Luther, Haase, Hohaus, Beckmann, Reich and Morano1998; Hastings et al. Reference Hastings, Ingle, Lazar and Munroe2000) there exists a significant inverse relationship between anti-sense and sense tag frequencies that has been speculated to be a novel form of post-transcriptional gene regulation. In our study, 35% of the total mapped tags appear to be anti-sense tags, and a strong inverse correlation exists between sense and anti-sense transcription. Gene loci containing higher levels of anti-sense tags contain lower levels of sense transcripts and vice versa, supporting the notion that anti-sense transcription may be a novel form of post-transcriptional regulation.
Three mechanisms of anti-sense-mediated post-transcriptional regulation have been proposed: (a) anti-sense transcripts bind to complementary sense transcripts, targeting them for RNAi-mediated decay; (b) anti-sense transcripts interfere with mRNA elongation; or (c) anti-sense transcripts bind to the sense transcript and interfere with translation (Gunasekera et al. Reference Gunasekera, Patankar, Schug, Eisen, Kissinger, Roos and Wirth2004).
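An inverse relationship between sense and anti-sense tag frequencies of the kind described above can be quantified with a rank correlation across gene loci. The sketch below computes a Spearman coefficient in plain Python. The choice of Spearman's statistic, the function names and the per-locus counts are all assumptions made for illustration, since the text does not state which statistic was used.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-locus tag counts (not data from this study).
sense = [120, 85, 40, 22, 9, 3]
antisense = [2, 5, 11, 30, 44, 70]
rho = spearman(sense, antisense)
print(f"Spearman rho = {rho:.2f}")
```

A coefficient near -1 indicates that loci with many sense tags tend to have few anti-sense tags. A rank-based measure is used here because tag counts are typically highly skewed.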
Conclusions
We have utilized LongSAGE to profile gene expression changes during early larval schistosome development, targeting the transition from free-living miracidium to the snail-parasitic mother sporocyst stage. This study represents the largest and most comprehensive transcriptomic analysis of gene expression changes during the earliest stages of intramolluscan larval S. mansoni development. We identified genes potentially involved in parasite growth and development, including many genes that are expressed in a stage-associated manner, thereby increasing our knowledge of putative regulatory networks in the establishment of molluscan schistosome infections. Although the in vitro culture system employed in this study may not exactly mimic in vivo development, parasites in this in vitro system appear to develop normally (Basch and DiConza, Reference Basch and DiConza1974) and daughter sporocyst production is attainable (Yoshino and Laursen, Reference Yoshino and Laursen1995). Moreover, susceptible strains of B. glabrata, when injected with in vitro-cultured sporocysts, develop fully patent infections (Granath and Yoshino, Reference Granath and Yoshino1984). Also, the miracidia recovered using our isolation procedure likely represent a heterogeneous population with regard to age or maturation synchrony, which may account for variation in individual gene expression between stages, especially for genes with low transcript numbers. However, because our analyses encompass large population sizes, the data on differential gene expression capture a transcriptomic profile of the majority of parasites represented in those populations. In addition, we have identified 4519 SAGE tags uniquely expressed in either the miracidial or sporocyst stage corresponding to non-predicted transcripts, unknown transcripts or unsequenced regions of the genome. These transcripts may represent important developmental processes crucial to the survival of these individual stages. As the S. mansoni genome is further annotated, these results can be updated with additional SAGE tag mappings, thereby identifying these stage-specific transcripts and further elucidating their function.
APPENDIX A. SUPPLEMENTARY DATA
The following additional data files are available with the online version of the paper. Supplemental data file 1 contains all tags differentially expressed (R⩾4) during larval development. Supplemental data file 2 contains all tags differentially expressed (P<0·01) between 6-day conditioned and unconditioned sporocysts, and supplemental data file 3 contains all tags differentially expressed (P<0·01) between 20-day conditioned and unconditioned sporocysts.
This work was supported by NIH R01AI061436-03 (T.P.Y.). Schistosome-infected mice were provided by Fred Lewis (Biomedical Research Institute), through NIH supply contract AI30026. A.G.M., S.R.B., J.B., A.R.P. and M.J.C. were additionally supported by the Marine Biological Laboratory's Program in Global Infectious Diseases, funded by the Ellison Medical Foundation. Computational resources were provided by the Josephine Bay Paul Center for Comparative Molecular Biology and Evolution (Marine Biological Laboratory) through funds provided by the W.M. Keck Foundation and the G. Unger Vetlesen Foundation.