Introduction
The detection of allelic variation in plant populations has benefited from advances in genomics. Genome and transcriptome sequence data, comparative genomics and bioinformatics have expedited the identification of candidate gene sequences. Natural variation in candidate genes can then be targeted in germplasm collections through screening approaches such as ecotype TILLING (Ecotilling; Till, 2014) or breeding with rare defective alleles (BRDA; Vanholme et al., 2013). Through germplasm screens, alleles have been identified for drought resistance in rice (Yu et al., 2012), improved oil quality in rapeseed (Wang et al., 2010), early flowering in sugar beet (Frerichmann et al., 2013) and modified lignin in poplar (Vanholme et al., 2013). Our research interests involved examining a blueberry (Vaccinium sp.) germplasm collection for allelic variation in candidate genes controlling flowering and architecture.
The blueberry germplasm collection in the USDA-ARS National Clonal Germplasm Repository is composed primarily of V. corymbosum, but includes other Vaccinium species and hybrids (Ballington, 2001). The ploidy levels of these accessions range from diploid to hexaploid, with most being autotetraploid. Analysis with simple sequence repeat (SSR) markers found a high level of genetic diversity in Vaccinium accessions of the USDA-ARS collection (Boches et al., 2006). The degree of allelic variation in specific genes among accessions of this germplasm collection is not known.
Genomic and bioinformatic resources have been developed for V. corymbosum, including expressed sequence tag (EST) libraries, EST-based molecular markers, genetic linkage maps and an online database to house blueberry genomic information (Rowland et al., 2012; Die and Rowland, 2013). A draft genome sequence of the diploid V. corymbosum selection W8520 was generated by a combination of Roche 454 and Illumina sequencing (Bian et al., 2014). Annotation of the genome assembly with RNA-Seq data identified approximately 60,000 gene models (Gupta et al., 2015). The availability of the genome assembly, annotations and RNA-Seq data through an Integrated Genome Browser platform facilitates the identification of blueberry orthologues of candidate genes.
TERMINAL FLOWER 1 (TFL1) is a gene with the potential to affect both flowering time and plant architecture through its involvement in shoot meristem identity (Bradley et al., 1997; McGarry and Ayre, 2012). TFL1 is a member of the phosphatidylethanolamine-binding protein (PEBP) gene family and has been found to repress the transition from vegetative to reproductive growth in shoot meristems of a variety of plants. In the perennial plants rose and woodland strawberry, TFL1 mutations result in repetitive flowering (Iwata et al., 2012). TFL1 mutations in the annual plants tomato, soybean and cowpea cause a switch from indeterminate to determinate growth (Pnueli et al., 1998; Tian et al., 2010; Dhanasekar and Reddy, 2015). TFL1 has served as a domestication gene for several crops and the traits conferred by TFL1 variants may be of ornamental interest in blueberry. In this study, we identified V. corymbosum orthologues of TFL1 and its family members, examined allelic variation of VcTFL1 among accessions of a blueberry germplasm collection, and characterized a missense mutation likely to be deleterious to VcTFL1 function.
Materials and methods
Germplasm and DNA isolation
Young leaves of Vaccinium sp. were obtained from plants in the blueberry germplasm collection and breeding programme of the USDA-ARS (Corvallis, OR) and stored at −80°C until used. The accessions are listed in online Supplementary Table S1. To isolate genomic DNA, 50–100 mg of frozen leaf tissue was ground in 2 ml Eppendorf Safe-Lock tubes using a bead mill (TissueLyser, Qiagen). DNA was extracted using a modified CTAB (cetyl trimethylammonium bromide) method (Porebski et al., 1997). DNA samples were resuspended in TE (Tris-EDTA) buffer and quantified with a NanoDrop 800 spectrophotometer (Thermo Scientific).
Orthologue identification and primer design
The assembled draft genome sequence of the diploid V. corymbosum W8520 was available through GenSAS v3.0 (http://gensas2.bioinfo.wsu.edu/). A BLASTx search of the genomic database was conducted with Solanum lycopersicum SELF-PRUNING (GenBank AAC26161) as a query sequence. BLASTx analysis with TFL1 orthologues from other plant species (impatiens, apple, peach and Arabidopsis) identified the same V. corymbosum sequences. Coding regions with significant similarity to SlSP (e-value < 10⁻¹⁵) were used to query the SwissProt database (Bairoch and Apweiler, 2000) to confirm their identity. PEBP family protein sequences were compared by the neighbour-joining method in MEGA v6.06 (Tamura et al., 2013), with 100 bootstrap replicates. The gene model of VcTFL1 was predicted using AUGUSTUS (Stanke et al., 2004), after training with S. lycopersicum. PCR primers to amplify exons of VcTFL1 were designed using Primer3 (Rozen and Skaletsky, 2000) and are shown in online Supplementary Table S2.
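The e-value filtering step can be illustrated with a minimal sketch. It assumes the BLAST hits were exported as tab-separated text in the 12-column layout that BLAST+ writes with `-outfmt 6`; the study does not state which output format was used, and the scaffold names and scores below are invented for illustration. Only the 10⁻¹⁵ cutoff comes from the text.

```python
# Hypothetical tabular BLAST output (12 columns: query, subject, % identity,
# alignment length, mismatches, gap opens, qstart, qend, sstart, send,
# e-value, bit score). Scaffold names and all values are invented.
EXAMPLE_HITS = """\
SlSP\tscaffold_A\t78.2\t170\t37\t0\t1\t170\t1200\t1709\t3e-40\t150
SlSP\tscaffold_B\t61.0\t168\t65\t1\t2\t169\t880\t1383\t2e-18\t85.1
SlSP\tscaffold_C\t35.5\t90\t58\t2\t20\t109\t40\t309\t4e-06\t41.2
"""

E_VALUE_CUTOFF = 1e-15  # significance threshold stated in the text


def significant_hits(tabular_text, cutoff=E_VALUE_CUTOFF):
    """Return (subject, e-value) pairs whose e-value is below the cutoff."""
    hits = []
    for line in tabular_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        subject, evalue = fields[1], float(fields[10])
        if evalue < cutoff:
            hits.append((subject, evalue))
    return hits


if __name__ == "__main__":
    for subject, evalue in significant_hits(EXAMPLE_HITS):
        print(subject, evalue)
```

In this invented example only the first two scaffolds pass the cutoff; sequences retained this way would then go on to the SwissProt confirmation step described above.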
PCR and high-resolution melting (HRM) analysis
PCR and HRM were performed with a LightCycler 480 (Roche Diagnostics) in 96-well plates. PCR was conducted in a 20 µl volume containing 20 ng DNA, 10 × HRM master mix, 3.0 mM MgCl2 and 0.8 µM of each primer. Reactions were denatured at 95°C for 10 min, followed by 40 cycles of denaturation at 95°C for 20 s, annealing at a temperature stepped down from 65 to 60°C at a rate of 0.2°C per cycle, and extension at 72°C for 30 s. For exon four, 45 cycles of PCR were performed to allow the amplification curves to reach the saturation point. DNA of the reference genotype W8520 was included in each reaction at 12.5% of the total DNA (2.5 ng). Three technical replicates of each sample–primer pair combination were run.
For HRM analysis, PCR products were denatured at 95°C for 1 min, cooled to 40°C for 1 min for re-annealing, and then heated from 70 to 95°C at 0.02°C/s, while fluorescence was measured continuously at 25 data acquisitions/°C. DNA melting data were analysed with the LC480 Gene Scanning software (Roche Diagnostics) with settings for sensitivity and temperature shifting at 0.3 and 5, respectively. Difference plots were generated through subtraction of normalized and temperature-shifted curves from the melting curve of the reference DNA (V. corymbosum W8520).
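The step-down annealing schedule and the reference DNA spike can be worked through with a short sketch. It assumes that the first cycle anneals at 65°C and that the temperature is held at 60°C once that floor is reached; the text gives only the range and the rate, so both points are assumptions, and the function names are hypothetical.

```python
def annealing_temperatures(start=65.0, end=60.0, step=0.2, cycles=40):
    """Annealing temperature per cycle: drop by `step` each cycle, then hold at `end`."""
    temps = []
    for cycle in range(cycles):
        temp = max(end, start - step * cycle)
        temps.append(round(temp, 1))  # round away floating-point noise
    return temps


def reference_dna_ng(total_ng=20.0, fraction=0.125):
    """Mass of reference DNA when it makes up `fraction` of the template."""
    return total_ng * fraction


if __name__ == "__main__":
    schedule = annealing_temperatures()
    print(schedule[:3], schedule[25], schedule[-1])
    print(reference_dna_ng())
```

Under these assumptions the floor of 60°C is reached on the 26th cycle, leaving the remaining cycles at a constant annealing temperature, and 12.5% of a 20 ng template corresponds to the 2.5 ng of reference DNA quoted in the text.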
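The idea behind a difference plot can be sketched in a few lines. This is a simplified illustration, not the algorithm of the Gene Scanning software: normalization here is a plain min-max rescaling, temperature shifting is omitted, the sign convention (sample minus reference) is an assumption, and the fluorescence readings are invented.

```python
def normalize(curve):
    """Rescale fluorescence readings to 0-100 % using the curve's own min and max."""
    low, high = min(curve), max(curve)
    if high == low:
        raise ValueError("flat curve cannot be normalized")
    return [100.0 * (value - low) / (high - low) for value in curve]


def difference_curve(sample, reference):
    """Normalized sample minus normalized reference at each temperature point."""
    if len(sample) != len(reference):
        raise ValueError("curves must share the same temperature points")
    norm_sample = normalize(sample)
    norm_reference = normalize(reference)
    return [s - r for s, r in zip(norm_sample, norm_reference)]


if __name__ == "__main__":
    # Invented readings taken at the same five temperatures.
    reference = [50.0, 48.0, 30.0, 8.0, 5.0]
    variant = [50.0, 45.0, 22.0, 7.0, 5.0]
    print(difference_curve(variant, reference))
```

A sample identical to the reference gives a flat line at zero, while a sample that melts earlier dips below zero in the transition region, which is the kind of deviation used to flag candidate variants.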
Sequencing and haplotype calling
For sequencing, PCR products were fractionated in a 1% agarose gel and purified by ExoSAP-IT (Affymetrix) PCR cleanup. The resulting DNA samples were sequenced in both directions by the Georgia Genomics Facility (Athens, GA) with an Applied Biosystems 3730xl DNA Analyser.
The sequence data were analysed and aligned for SNP discovery using GENEIOUS 8.1.7 software (Kearse et al., 2012). The allelic dosage for heterozygous SNPs in autotetraploid individuals was predicted using the ‘Find Heterozygotes’ tool available in GENEIOUS. For calling duplex heterozygotes (AAaa), the peak similarity between primary and secondary peaks was set to 70%; for calling simplex (Aaaa) or triplex (AAAa) heterozygotes, it was set to 30%.
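The dosage-calling logic can be illustrated with a rough sketch. It assumes that peak similarity is the height of the smaller chromatogram peak divided by the larger, and that the 70% and 30% settings act as lower bounds for the duplex and simplex/triplex calls; this is one reading of the thresholds above, not a reproduction of the GENEIOUS implementation. The function name and peak heights are hypothetical.

```python
def call_alt_dosage(ref_height, alt_height,
                    duplex_threshold=0.70, heterozygote_threshold=0.30):
    """Estimate the number of alternate-allele copies (0-4) in a tetraploid.

    Similarity is the minor peak height divided by the major peak height.
    """
    major = max(ref_height, alt_height)
    minor = min(ref_height, alt_height)
    if major <= 0:
        raise ValueError("at least one peak must have a positive height")
    similarity = minor / major
    alt_is_minor = alt_height <= ref_height
    if similarity >= duplex_threshold:
        return 2                          # duplex: two copies of each allele
    if similarity >= heterozygote_threshold:
        return 1 if alt_is_minor else 3   # simplex or triplex
    return 0 if alt_is_minor else 4       # no clear secondary peak


if __name__ == "__main__":
    for ref, alt in [(1000, 800), (1000, 400), (400, 1000), (1000, 50)]:
        print(ref, alt, call_alt_dosage(ref, alt))
```

Which allele forms the larger peak decides between simplex and triplex, which is why the sketch takes the two heights as labelled reference and alternate values rather than as primary and secondary peaks.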
In silico SNP analysis
For SNPs leading to non-synonymous amino acid substitutions, in silico analyses were conducted to predict whether the amino acid change had an impact on protein function or stability. Functionally important regions of VcTFL1 were identified using ConSurf (Glaser et al., 2003). The effect of mutations was evaluated using the computational tools PROVEAN (Choi et al., 2012), SIFT (Ng and Henikoff, 2003), PredictSNP (Bendl et al., 2014), PolyPhen-1 and PolyPhen-2 (Adzhubei et al., 2010), Panther (Thomas et al., 2003), MutPred (Li et al., 2009) and MUpro (Cheng et al., 2006). Default parameters of the software programs were used.
Results
Identification of VcTFL1
At the time of analysis, the draft blueberry genome sequence was assembled on 13,787 scaffolds. Five scaffolds contained sequences with significant similarity (e-value < 10⁻¹⁷) to the tomato TFL1 orthologue SlSP. Query of the SwissProt database with the five sequences identified them as members of the PEBP gene family. A comparison of the predicted amino acid sequence of the V. corymbosum sequences with PEBP family members of other plant species found that one candidate sequence grouped with the TFL1 orthologue clade, while the others grouped in FLOWERING LOCUS T (FT), BROTHER OF FT, or CENTRORADIALIS clades (Fig. 1). The V. corymbosum gene that was similar in sequence to other TFL1 orthologues was designated VcTFL1 (GenBank KX834412).
The gene model of VcTFL1 revealed that the 2758 bp sequence consisted of four exons that encode a protein of 174 amino acids, typical of TFL1 orthologues (Fig. 2a). Sequence comparison of VcTFL1 (Fig. 2b) found that it shared a high percentage of identical amino acids with orthologues from Fragaria vesca (83.1%), Medicago truncatula (80.9%), Malus domestica (79.7%), S. lycopersicum (75.7%), Arabidopsis thaliana (76.9%) and Impatiens balsamina (77.5%). The VcTFL1 sequence was used to design primers that amplified coding sequences (Fig. 2a, online Supplementary Table S2).
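Percent identity between two aligned protein sequences can be computed as in the sketch below. It assumes the sequences have already been aligned to equal length and that columns containing a gap are excluded from the denominator; the study does not state how gaps were handled, so that choice is an assumption, and the short peptide strings are invented.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of gap-free alignment columns in which both residues match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Invented toy alignment: six gap-free columns, five of them identical.
    print(round(percent_identity("MKTAY-L", "MKSAYQL"), 1))
```

Other conventions, such as dividing by the full alignment length or by the length of the shorter sequence, give slightly different percentages, so values are only comparable when the same denominator is used throughout.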
Screening for allelic variation in VcTFL1
A collection of 160 blueberry accessions obtained from the USDA-ARS (Corvallis, OR) included 135 V. corymbosum accessions, 15 V. corymbosum × V. darrowii hybrids, one V. corymbosum × V. angustifolium hybrid, and nine accessions of other Vaccinium species, including V. darrowii, V. virgatum, V. fuscatum, V. simulatum and V. angustifolium (online Supplementary Table S1). TFL1 exons from these accessions were examined individually for allelic variation by HRM analysis. Figure 3 shows an example of variation found in exon 1 of five accessions. Relative to exon 1 of the reference genotype W8520, accessions ORUS 060-1 and ORUS 288-1 had a lower melting temperature and accessions ORUS 285-5, Grover, and O'Neal had a higher melting temperature (Fig. 3a). HRM could distinguish the ORUS 285-5, Grover, and O'Neal accessions, although they all varied from W8520 at the same nucleotide position (nt 104; Fig. 3b). GENEIOUS analysis of secondary peaks of chromatograms at SNP104 identified these autotetraploid accessions as nulliplex, simplex and quadruplex haplotypes (Fig. 3c). The melting curves of exon 1 of ORUS 288-1 and ORUS 060-1 are more complex because of the presence of two and five SNPs, respectively.
Allelic variants and potential effects on VcTFL1 function
Analysis of the four VcTFL1 exons by HRM and sequencing identified 18 SNP positions among the 160 accessions (Table 1). Most of these SNPs were synonymous, but three caused non-synonymous amino acid changes (Fig. 2b). In silico analysis of the non-synonymous changes predicted that the SNPs at nucleotides 104 and 149 would have no impact on protein function, but that the SNP at nucleotide 475 would be deleterious.
Table 1 notes: S, synonymous; N, non-synonymous; a, predicted by PROVEAN analysis.
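The synonymous (S) and non-synonymous (N) labels can be derived by translating a codon before and after the base change. The sketch below uses the standard genetic code; the codons and base changes are invented for illustration and are not taken from the VcTFL1 sequence, and the function name is hypothetical.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G order
# with the third codon position varying fastest.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snp(codon, position, new_base):
    """Classify a single-base change in a codon.

    `position` is 1, 2 or 3 within the codon. Returns a tuple of
    ("S" or "N", reference amino acid, variant amino acid).
    """
    if len(codon) != 3 or position not in (1, 2, 3):
        raise ValueError("expected a 3-base codon and a position of 1-3")
    variant = codon[:position - 1] + new_base + codon[position:]
    ref_aa = CODON_TABLE[codon.upper()]
    alt_aa = CODON_TABLE[variant.upper()]
    label = "S" if ref_aa == alt_aa else "N"
    return label, ref_aa, alt_aa


if __name__ == "__main__":
    print(classify_snp("GCT", 2, "T"))  # second-position change
    print(classify_snp("GCT", 3, "C"))  # third-position change
```

In the invented example, a second-position change converts an alanine codon to a valine codon and is labelled N, while a third-position change leaves alanine unchanged and is labelled S.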
SNP475 causes a missense mutation from alanine to valine. Amino acids that are critical to VcTFL1 structure and function were identified by ConSurf analysis (Fig. 4). The alanine at amino acid 159 was predicted to be a buried residue that is conserved in TFL1 orthologues. The consequence of the A159V substitution was further assessed by the software programs MUpro, MutPred and PredictSNP, the last of which combines tools including SIFT, PolyPhen and Panther. In general, the A159V substitution was predicted to be deleterious to TFL1 function (Table 2). A diploid accession (DE596) was found to be heterozygous for this SNP, which can potentially be used to obtain a novel ornamental phenotype through breeding.
Table 2 note: a, the software tools SIFT, PolyPhen and Panther are integrated in PredictSNP, which transformed the score of each to a confidence level of 0–100%.
Discussion
The TFL1 orthologue of V. corymbosum was identified and allelic variation in this gene was discovered in a Vaccinium germplasm collection. Among 160 blueberry accessions, 18 SNP sites were detected, one of which was predicted by bioinformatic analyses to be deleterious to VcTFL1 function. The predominant allele among the accessions analysed was similar to the VcTFL1 present in the sequenced genome of diploid V. corymbosum line W8520, with other alleles having polymorphisms that were synonymous or that caused non-synonymous changes predicted to have no effect on VcTFL1 function. The detection of a genotype with a potentially deleterious VcTFL1 mutation among 160 blueberry accessions is indicative of the heterogeneity within this germplasm collection, as well as the utility of Ecotilling for SNP identification.
The identification of useful alleles for crop improvement in screens of germplasm collections of this size has been reported in other plant species. The size of an Ecotilling population needed to identify functional polymorphisms depends on factors such as ploidy, breeding habit, and heterogeneity. Ecotilling of 117 accessions of three Brassica species identified a SNP in fatty acid elongase1 leading to the loss of FAE1 function (Wang et al., 2010). In maize, screening of 175 inbred breeding lines found a SNP in isopentenyl transferase 2 that was associated with higher kernel weight (Weng et al., 2013). SNPs affecting seed weight were also detected in eight chickpea (Cicer arietinum) transcription factor genes in an analysis of 192 accessions (Bajaj et al., 2016).
The blueberry germplasm collection that was examined is composed primarily of autotetraploid genotypes, which complicated SNP identification because of the presence of two nearly identical subgenomes. HRM analysis was able to discriminate between blueberry accessions with different haplotypes at the same SNP position. Haplotype variants could also be identified in candidate genes of autotetraploid potato genotypes by HRM analysis (De Koeyer et al., 2010). In autotetraploid alfalfa, a combination of next-generation sequencing (NGS) and HRM was used to discover thousands of SNPs (Han et al., 2011). HRM was also found to be an efficient approach for SNP genotyping and mapping in alfalfa (Han et al., 2012).
A diploid V. corymbosum accession, DE596, was identified with an A159V variation in VcTFL1. The alanine at position 159 is in a highly conserved region of exon 4. Exon 4 has been shown to be critical for normal TFL1 function (Ahn et al., 2006). In soybean, a missense mutation in TFL1 exon 4 (R168W) caused a switch from indeterminate to determinate growth habit (Tian et al., 2010). A similar change in growth habit was due to a missense mutation (P139H) in exon 4 of the cowpea TFL1 orthologue (Dhanasekar and Reddy, 2015).
Germplasm database records indicate that V. corymbosum DE596 has growth and flowering phenology typical of blueberry, which is to be expected for an accession with a heterozygous TFL1 mutation. This accession can potentially be used in a breeding programme to develop an ornamental blueberry with continuous flowering (Iwata et al., 2012). TFL1 is a target for breeding remontancy in cultivated strawberry (Koskela et al., 2016). A homozygous TFL1 mutation may also result in an ornamental blueberry with more compact form due to a change in the balance of vegetative and reproductive meristems (McGarry and Ayre, 2012). Obtaining homozygosity of tfl1 will require breeding or doubled haploid technology, similar to an approach used to improve potato with a loss-of-function allele (Muth et al., 2008).
DNA sequencing was used in this study to confirm HRM results and determine haplotypes. With the advantages offered in scale and throughput, NGS will likely be the most efficient means to discover allelic variation. Sets of genes known to be involved in a trait, a biochemical pathway, or a regulatory cascade can be targeted by sequence capture methods such as SureSelect (Gnirke et al., 2009) or NimbleGen (Kiss et al., 2008). For example, Uitdewilligen et al. (2013) used SureSelect to develop sequencing libraries of 807 target genes from 84 autotetraploid potato genotypes. This resulted in the detection of allelic variants associated with tuber flesh colour and plant maturity. The identification of allelic variation in candidate genes can provide functional or ‘perfect’ markers for breeding new traits (Moose and Mumm, 2008).
Supplementary Material
The supplementary material for this article can be found at https://doi.org/10.1017/S1479262116000435
Acknowledgements
The authors are grateful to Drs Nahla Bassil and Chad Finn (USDA-ARS, Corvallis, OR) for providing the blueberry material for this study. R.G. was supported by a research assistantship from the UGA Institute of Plant Breeding, Genetics and Genomics.