
Transcriptome assembly and polymorphism detection in Silene ciliata (Caryophyllaceae)

Published online by Cambridge University Press:  30 April 2019

Sandra Sacristán-Bajo
Affiliation:
Biodiversity and Conservation Area, Superior School of Experimental Science and Technology (ESCET), Rey Juan Carlos University, Madrid, Spain
Alfredo García-Fernández
Affiliation:
Biodiversity and Conservation Area, Superior School of Experimental Science and Technology (ESCET), Rey Juan Carlos University, Madrid, Spain
Jose M. Iriondo
Affiliation:
Biodiversity and Conservation Area, Superior School of Experimental Science and Technology (ESCET), Rey Juan Carlos University, Madrid, Spain
Carlos Lara-Romero*
Affiliation:
Biodiversity and Conservation Area, Superior School of Experimental Science and Technology (ESCET), Rey Juan Carlos University, Madrid, Spain; Global Change Research Group, Mediterranean Institute of Advanced Studies (CSIC–IUB), Esporles, Mallorca, Spain
*Corresponding author. E-mail: carlos.lara.romero@gmail.com

Abstract

Silene ciliata (Caryophyllaceae) is a key species to test evolutionary hypotheses in a global warming context. The recent advances in Next Generation Sequencing technologies can help in providing clues about climate-mediated local adaptation. In the present study, we analysed the full transcriptome of six individuals of S. ciliata from Central Spain, by aligning it with the transcriptome of S. latifolia. We aimed (a) to identify Single Nucleotide Polymorphisms (SNPs) in the transcriptome of the species, (b) to describe the biological function of the polymorphic genes expressed and (c) to identify loci that may be involved in local adaptation processes at optimal and marginal populations of the species. We identified a total of 147,118 SNPs distributed throughout 12,688 sequences. The number of polymorphic sequences annotated was 8023. One hundred thirty sequences containing polymorphisms strongly associated with optimal and marginal conditions were selected. Gene ontology searches were successful for 118, and many of these were related to responses to stress (n = 19) and abiotic stimulus (n = 16). Genomic data generated provide a starting point for further research on the identification of candidate genes related to local adaptation and other processes in the species.

Type
Short Communication
Copyright
Copyright © NIAB 2019 

Introduction

Silene L. (Caryophyllaceae) is a key plant genus for studying crucial questions interrelating ecology and evolution (Bernasconi et al., 2009). Silene ciliata Pourret is a Mediterranean alpine plant that occurs in alpine pastures (Escudero et al., 2005) protected by the European Habitats Directive (Council Directive 92/43/EEC, 1992). The species is threatened by global warming (Giménez-Benavides et al., 2007, 2018) and has been included in catalogues of threatened species from different countries and regions (Dray, 1985; Fernández et al., 2007; Sanz et al., 2010; Légifrance, 2019). The seedling stage of this species experiences high mortality and is thus subject to strong selective pressure. This pressure may differ qualitatively between the environmental conditions most commonly found across the species' populations (optimal conditions) and those found only at the extreme of the species' ecological range (marginal conditions). Previous studies have shown local adaptation patterns in optimal and marginal populations (Giménez-Benavides et al., 2007, 2018; García-Fernández et al., 2015). Thus, S. ciliata is a key model for evaluating the responses of Mediterranean alpine species to oncoming global warming and for testing evolutionary hypotheses (e.g. García-Fernández et al., 2015; Kyrkou et al., 2015; Lara-Romero et al., 2016). Accordingly, there is great interest in developing genomic resources for the species, which would improve the understanding of the genetics of adaptation in Mediterranean alpine environments.

In this context, we carried out a transcriptomic study of seedlings of S. ciliata, with the following objectives: (1) to identify Single Nucleotide Polymorphisms (SNPs) in the transcriptome of the species, (2) to describe the biological function of the polymorphic genes expressed and (3) to examine diversity patterns of potential adaptive value in optimal and marginal populations of the species and identify loci that may be involved in local adaptation processes.

Experimental

We used the RNeasy Plant Mini Kit (QIAGEN) to extract and isolate RNA from six seedlings grown under controlled conditions, one from each of the six studied populations located in Central Spain. Three populations were located at the high edge and the other three at the low edge of the species' elevational range (Table 1). The high- and low-elevation edges represent optimal and marginal (warmer and drier) environmental conditions for the species, respectively. The quality of RNA was evaluated with a Qubit (Invitrogen, Carlsbad, CA, USA). One sequencing run was carried out on an Illumina platform, generating 100 bp paired-end reads. Trimming was carried out with Trimmomatic (Bolger et al., 2014). Then, the S. ciliata transcriptome was aligned with the genome of Silene latifolia (GenBank reference: GCA_900095335.1) using BWA (Li and Durbin, 2010).

Table 1. Geographic and climatic characteristics of the six studied populations of Silene ciliata in Central Spain and genetic assessment of the sampled seedlings

For each plant, the population of origin, environmental classification, minimum annual temperature (Min T), snowpack accumulation in thaw months (February, March and April), geographical coordinates, percentage of mapping with the reference genome, total number of sequences retained after trimming and estimators of genetic diversity are provided. Trimming configuration after the optimization process was: leading: 5, trailing: 5, sliding window: 4:15, minlen: 50. See the Trimmomatic manual (http://www.usadellab.org/cms/?page=trimmomatic) for further details on the selection of trimming steps and their associated parameters.

HO, observed heterozygosity; Fi, coefficient of inbreeding.

a Nucleotide diversity per site × 10⁻³. Climatic variables were obtained from the ACPI (Atlas Climatic de la Peninsula Ibérica, http://opengis.uab.es/wms/iberia/index.htm). Monthly snowpack was calculated following the methodology proposed by López-Moreno et al. (2007) and it ranged from 0 to 1.
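The trimming configuration reported in the table note corresponds to four Trimmomatic steps. A minimal sketch of how those settings could be rendered as step arguments is given below; the helper name `trimmomatic_steps` is hypothetical, and the input and output file names and the rest of the command line are omitted because they are not reported here.

```python
def trimmomatic_steps(leading, trailing, window_size, window_quality, min_len):
    """Render trimming settings as Trimmomatic step arguments.

    The steps are returned in the order in which they are listed in the
    table note; Trimmomatic applies steps in the order given.
    """
    return [
        f"LEADING:{leading}",                              # drop low-quality bases at the read start
        f"TRAILING:{trailing}",                            # drop low-quality bases at the read end
        f"SLIDINGWINDOW:{window_size}:{window_quality}",   # window size : required mean quality
        f"MINLEN:{min_len}",                               # discard reads shorter than this
    ]


# Values taken from the trimming configuration reported in the table note.
steps = trimmomatic_steps(leading=5, trailing=5, window_size=4,
                          window_quality=15, min_len=50)
print(" ".join(steps))
```

With 100 bp reads, the MINLEN step means that a read is discarded once trimming has removed more than half of its length.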

SNPs were identified using Reads2snp (Gayral et al., 2013) and filtered with VCFtools 4.1 (Danecek et al., 2011). Only biallelic SNPs with no missing data and at least seven reads per genotype were retained to prevent the inclusion of false-positive SNPs (Swarts et al., 2014; Marano et al., 2017). Paralogous and singleton SNPs were then removed. VCFtools was also used to estimate the genetic variation in the whole genome. Blastx (Altschul et al., 1990) and the SWISS-PROT database (Bairoch and Apweiler, 2000) were used to annotate the biological function (i.e. gene ontology terms) of the sequences carrying SNPs. We applied two different measures to detect candidate SNPs with unusually high allele frequency differentiation between elevations. We first calculated allele frequency differences (AFDs) between low and high elevations at the individual allele level (Turner et al., 2010; Stölting et al., 2015). SNPs were considered unusually divergent if AFDs were ≥3 SDs higher than the genome-wide average. Second, following Muller et al. (2011), we computed the dispersion of each allele (m), which is the average elevational distance of an allele copy to the average elevation of all copies of that allele (β). Then, 1000 permutations of allele copies among all studied geographical locations were performed, and m and β were estimated in each permutation. SNPs were considered unusually clustered if m was ≥2 SDs higher than the permutation average. We performed a GO term enrichment analysis using Fisher's exact tests to assess whether sequences containing selected SNPs were enriched in a biological function.
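The two differentiation measures can be illustrated with a short sketch. This is one possible reading of the description above, not the code used in the study: genotypes are assumed to be coded as alternate-allele counts (0, 1 or 2), a permutation is assumed to redistribute the copies of an allele at random among all sampled copy positions, and all function names, elevations and genotypes are hypothetical. The second function returns a signed z-score rather than applying the 2 SD threshold, so that the direction of the deviation stays visible.

```python
import random
import statistics


def allele_frequency_difference(low_genotypes, high_genotypes):
    """Absolute difference in alternate-allele frequency between two groups.

    Genotypes are alternate-allele counts (0, 1 or 2) per diploid individual.
    """
    freq_low = sum(low_genotypes) / (2 * len(low_genotypes))
    freq_high = sum(high_genotypes) / (2 * len(high_genotypes))
    return abs(freq_low - freq_high)


def afd_outliers(afds, n_sd=3.0):
    """Indices of SNPs whose AFD is at least n_sd SDs above the mean AFD."""
    threshold = statistics.mean(afds) + n_sd * statistics.stdev(afds)
    return [i for i, afd in enumerate(afds) if afd >= threshold]


def dispersion(elevations):
    """Return (m, beta) for the elevations of the copies of one allele.

    beta is the mean elevation of the copies; m is the mean absolute
    elevational distance of a copy from beta.
    """
    beta = statistics.mean(elevations)
    m = statistics.mean([abs(e - beta) for e in elevations])
    return m, beta


def dispersion_z_score(copy_elevations, carries_allele, n_perm=1000, seed=0):
    """Signed distance, in SDs, of the observed m from its permutation mean."""
    rng = random.Random(seed)
    observed = [e for e, c in zip(copy_elevations, carries_allele) if c]
    m_obs, _ = dispersion(observed)
    # Assumption: each permutation places the same number of copies at random
    # among all sampled copy positions.
    m_perm = [
        dispersion(rng.sample(copy_elevations, len(observed)))[0]
        for _ in range(n_perm)
    ]
    sd = statistics.stdev(m_perm)
    return 0.0 if sd == 0 else (m_obs - statistics.mean(m_perm)) / sd


# Hypothetical elevations (m a.s.l.): two allele copies per diploid seedling.
site_elevations = [1900, 1950, 2000, 2300, 2350, 2400]
copy_elevations = [e for e in site_elevations for _ in range(2)]
# Hypothetical allele carried only by the three low-elevation seedlings.
carries_allele = [True] * 6 + [False] * 6

afd = allele_frequency_difference([2, 2, 2], [0, 0, 0])
z = dispersion_z_score(copy_elevations, carries_allele)
print(f"AFD = {afd:.2f}, dispersion z-score = {z:.2f}")
```

In this sketch a negative z-score means that the copies of the allele sit closer together in elevation than expected under random placement, and a positive one means that they are more spread out.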

Discussion

Eighty percent of RNA sequences were retained after trimming from an initial average of 30,000,698 ± 3,715,368 sequences. The percentage of sequences mapped against the reference genome ranged between 37.9% and 45% (Table 1). After filtering, we identified 147,118 SNPs distributed throughout 12,688 complete and partial sequences (SNPs per sequence: mean ± SD = 11.6 ± 14.51). Posterior probability per SNP was higher than 0.985 for all SNPs (mean ± SD = 0.9995 ± 0.0011). In total, 8023 polymorphic sequences were annotated. Their most common functions were related to cellular processes, metabolic processes and biological regulation (Fig. 1, Table S1). Annotated sequences were deposited in GenBank (BioProject ID: PRJNA528948). This extensive dataset provides a novel genomic resource for S. ciliata, and a significant step towards a better understanding of its genetics.

Fig. 1. Most common biological functions found in polymorphic genes. The graph was created using the web Gene Ontology Annotation Plotting WEGO 2.0 (Ye et al., 2018).

According to the identified SNPs, individuals from high and low elevations presented similar values of genetic diversity (Table 1, all Wilcoxon rank tests: P ≥ 0.2). Fi was positive in five out of six plants (Table 1) and did not differ between elevations (Wilcoxon rank test: P = 0.7). Previous studies on S. ciliata using neutral markers also found similar estimates of genetic diversity across elevations (García-Fernández et al., 2012; Lara-Romero et al., 2016). Overall, 775 sequences carrying SNPs associated with elevation were selected by at least one of the two approaches, but only 130 were selected by both. About 91% (n = 118) of these 130 sequences were successfully annotated (Table S1), but they were not significantly enriched in any biological process after FDR correction. However, about 15% of these sequences were associated with responses to stress (n = 19) and abiotic stimulus (n = 16) (Table S2). This is particularly interesting for the identification of loci related to adaptation to climate change (Giménez-Benavides et al., 2007, 2018), and for improving the understanding of adaptation processes in the species.
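The enrichment question for a single GO term can be sketched as a one-sided Fisher's exact test, computed here from the hypergeometric tail with the Python standard library. This is an illustration under stated assumptions rather than the analysis that was run: the function name is hypothetical, the 8023 annotated polymorphic sequences are assumed to be the background set, and the number of background sequences carrying the term (1200) is an invented placeholder because it is not reported in the text.

```python
from math import comb


def enrichment_p_value(selected_with_term, selected_total,
                       background_with_term, background_total):
    """One-sided Fisher's exact test for over-representation of a GO term.

    Returns the probability of finding at least `selected_with_term`
    term-carrying sequences when `selected_total` sequences are drawn
    without replacement from the background.
    """
    background_without_term = background_total - background_with_term
    max_k = min(selected_total, background_with_term)
    tail = sum(
        comb(background_with_term, k)
        * comb(background_without_term, selected_total - k)
        for k in range(selected_with_term, max_k + 1)
    )
    return tail / comb(background_total, selected_total)


# 19 of the 118 annotated candidate sequences were related to stress responses
# (values from the text). The background count of 1200 is a hypothetical
# placeholder, and using all 8023 annotated sequences as background is an
# assumption.
p_stress = enrichment_p_value(
    selected_with_term=19,
    selected_total=118,
    background_with_term=1200,
    background_total=8023,
)
print(f"one-sided P = {p_stress:.3f}")
```

Because one test is run per GO term, the resulting P values still need a multiple-testing adjustment such as the FDR correction mentioned above; that step is not shown in this sketch.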

Supplementary material

The supplementary material for this article can be found at https://doi.org/10.1017/S1479262119000157

Acknowledgements

Samples were sequenced at the ETH Zurich Genetic Diversity Centre (GDC). We thank Niklaus Zemp from the GDC for providing access to the reference transcriptome of Silene latifolia, and Alex Widmer and the Plant Ecological Genetics group (ETH Zurich) for their helpful recommendations and discussions throughout the study. This work was supported by projects AdAptA (CGL2012-33528) and EVA (CGL2016-77377-R) of the Spanish Ministry of Economy and Competitiveness (MINECO). CLR was supported by a Juan de la Cierva post-doctoral fellowship (MINECO: FJCI-2015-24712) and by a European Science Foundation ESF Exchange Grant (Reference No. 4794) within the ESF activity entitled ‘Conservation Genomics: Amalgamation of Conservation Genetics and Ecological and Evolutionary Genomics’. SSB was supported by an FPI pre-doctoral fellowship (MINECO: BES-2017-082317).

References

Altschul, SF, Gish, W, Miller, W, Myers, EW and Lipman, DJ (1990) Basic local alignment search tool. Journal of Molecular Biology 215: 403–410.
Bairoch, A and Apweiler, R (2000) The SWISS-PROT protein sequence database and its supplement TrEMBL in 2000. Nucleic Acids Research 28: 45–48.
Bernasconi, G, Antonovics, J, Biere, A, Charlesworth, D, Delph, LF, Filatov, D, Giraud, T, Hood, ME, Marais, GAB, McCauley, D, Pannell, JR, Shykoff, JA, Vyskot, B, Wolfe, LM and Widmer, A (2009) Silene as a model system in ecology and evolution. Heredity 103: 5–14.
Bolger, AM, Lohse, M and Usadel, B (2014) Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics (Oxford, England) 30: 2114–2120.
Council Directive 92/43/EEC of 21 May 1992. On the conservation of natural habitats and of wild fauna and flora. Official Journal of the European Union 206: 7–50.
Danecek, P, Auton, A, Abecasis, G, Albers, CA, Banks, E, DePristo, MA, Handsaker, RE, Lunter, G, Marth, GT, Sherry, ST and McVean, G (2011) The variant call format and VCFtools. Bioinformatics (Oxford, England) 27: 2156–2158.
Dray, AM (1985) Plantas a proteger en Portugal continental. Serviço Nacional de Parques, Reservas e Conservaçao da Natureza, Lisboa.
Escudero, A, Gimenez-Benavides, L, Iriondo, JM and Rubio, A (2005) Patch dynamics and islands of fertility in a high mountain Mediterranean community. Arctic, Antarctic and Alpine Research 36: 518–527.
Fernández, AP, Prieto, JAC, Altuna, JG, Arregui, JL, Gutiérrez, LO, Sánchez, SP and Janices, JV (2007) Flora amenazada presente en la región eurosiberiana de la Comunidad Autónoma del País Vasco. Naturalia Cantabricae 3: 79–91.
García-Fernández, A, Segarra-Moragues, JG, Widmer, A, Escudero, A and Iriondo, JM (2012) Unravelling genetics at the top: mountain islands or isolated belts? Annals of Botany 110: 1221–1232.
García-Fernández, A, Escudero, A, Lara-Romero, C and Iriondo, JM (2015) Effects of the duration of cold stratification on early life stages of the Mediterranean alpine plant Silene ciliata. Plant Biology 17: 344–350.
Gayral, P, Melo-Ferreira, J, Glemin, S, Bierne, N, Carneiro, M, Nabholz, B, Lourenco, JM, Alves, PC, Ballenghien, M, Faivre, N, Belkhir, K, Cahais, V, Loire, E, Bernard, A and Galtier, N (2013) Reference-free population genomics from next-generation transcriptome data and the vertebrate-invertebrate gap. PLoS Genetics 9: e1003457.
Giménez-Benavides, L, Escudero, A and Iriondo, JM (2007) Local adaptation enhances seedling recruitment along an altitudinal gradient in a high mountain Mediterranean plant. Annals of Botany 99: 723–734.
Giménez-Benavides, L, Escudero, A, García-Camacho, R, García-Fernández, A, Iriondo, JM, Lara-Romero, C and Morente-López, J (2018) How does climate change affect regeneration of Mediterranean high-mountain plants? An integration and synthesis of current knowledge. Plant Biology 20(Suppl. 1): 50–62.
Kyrkou, I, Iriondo, JM and García-Fernández, A (2015) A glacial survivor of the alpine Mediterranean region: phylogenetic and phylogeographic insights into Silene ciliata Pourr. (Caryophyllaceae). PeerJ 3: e1193.
Lara-Romero, C, García-Fernández, A, Robledo-Arnuncio, JJ, Roumet, M, Morente-López, J, López-Gil, A and Iriondo, JM (2016) Individual spatial aggregation correlates with between-population variation in fine-scale genetic structure of Silene ciliata (Caryophyllaceae). Heredity 116: 417–423.
Légifrance (2019) 10 mai 1990. Relatif à la liste des espèces végétales protégées en région Auvergne complétant la liste nationale. Journal Officiel de la République Française. NOR: PRME9061196A.
Li, H and Durbin, R (2010) Fast and accurate long-read alignment with Burrows–Wheeler transform. Bioinformatics (Oxford, England) 26: 589–595.
López-Moreno, JI, Vicente-Serrano, JI and Lanjeri, S (2007) Mapping snowpack distribution over large areas using GIS and interpolation techniques. Climate Research 33: 257–270.
Marano, LA, Marcorin, L, Castelli, EC and Mendes-Junior, CT (2017) Evaluation of MC1R high-throughput nucleotide sequencing data generated by the 1000 Genomes Project. Genetics and Molecular Biology 40: 530–539.
Muller, MH, Latreille, M and Tollon, C (2011) The origin and evolution of a recent agricultural weed: population genetic diversity of weedy populations of sunflower (Helianthus annuus L.) in Spain and France. Evolutionary Applications 4: 499–514.
Sanz, JMH, Carreño, MÁC and Albacete, EA (2010) Conservación de flora amenazada en Castilla-La Mancha. Foresta 47: 16–28.
Stölting, KN, Paris, M, Meier, C, Heinze, B, Castiglione, S, Bartha, D and Lexer, C (2015) Genome-wide patterns of differentiation and spatially varying selection between postglacial recolonization lineages of Populus alba (Salicaceae), a widespread forest tree. New Phytologist 207: 723–734.
Swarts, K, Li, H, Romero Navarro, JA, An, D, Romay, MC, Hearne, S, Acharya, C, Glaubitz, JC and Mitchell, S (2014) Novel methods to optimize genotypic imputation for low-coverage, next-generation sequence data in crop plants. The Plant Genome 7: 1–12.
Turner, TL, Bourne, EC, Von Wettberg, EJ, Hu, TT and Nuzhdin, SV (2010) Population resequencing reveals local adaptation of Arabidopsis lyrata to serpentine soils. Nature Genetics 42: 260.
Ye, J, Zhang, Y, Cui, H, Liu, J, Wu, Y, Cheng, Y, Xu, H, Huang, X, Li, S, Zhou, A, Zhang, X, Bolund, L, Chen, Q, Wang, J, Yang, H, Fang, L and Shi, C (2018) WEGO 2.0: a web tool for analyzing and plotting GO annotations, 2018 update. Nucleic Acids Research 46: W71–W75.