
Genome-wide SSR marker development in oil palm by Illumina HiSeq for parental selection

Published online by Cambridge University Press: 22 April 2015

Puntaree Taeprayoon
Affiliation:
Program in Plant Breeding, Faculty of Agriculture at Kamphaeng Saen, Kasetsart University, Kamphaeng Saen, Nakhon Pathom 73140, Thailand
Patcharin Tanya*
Affiliation:
Department of Agronomy, Faculty of Agriculture at Kamphaeng Saen, Kasetsart University, Kamphaeng Saen, Nakhon Pathom 73140, Thailand
Yang Jae Kang
Affiliation:
Department of Plant Science and Research Institute of Agriculture and Life Sciences, Seoul National University, Seoul 151-921, Republic of Korea
Anek Limsrivilai
Affiliation:
Golden Tenera Limited Partnership, Krabi 81000, Thailand
Suk-Ha Lee*
Affiliation:
Department of Plant Science and Research Institute of Agriculture and Life Sciences, Seoul National University, Seoul 151-921, Republic of Korea; Plant Genomics and Breeding Institute, Seoul National University, Seoul 151-921, Republic of Korea
Peerasak Srinives
Affiliation:
Department of Agronomy, Faculty of Agriculture at Kamphaeng Saen, Kasetsart University, Kamphaeng Saen, Nakhon Pathom 73140, Thailand
*Corresponding authors. E-mail: altanya55@yahoo.com; sukhalee@snu.ac.kr

Abstract

Next-generation sequencing is a large-scale plant genome sequencing technique that is faster and cheaper than earlier sequencing technologies. The present work reports the development of new polymorphic simple sequence repeat (SSR) markers in oil palm (Elaeis guineensis Jacq.) using Illumina HiSeq sequencing data. More than 39 Gb (39,086,646,904 bases in total) was generated from the selected oil palm clone, D4. After de novo assembly, a total of 130,840 potential SSRs were identified. For SSR validation, 144 of the 763 SSR primer pairs designed from tri-nucleotide motifs in the D4 contigs were tested. Using 11 lines from three different clones of oil palm, 61 SSR primers revealed polymorphic alleles and high average polymorphic information content (PIC) values. Cluster analysis separated all oil palm plants into three clusters: clones A, B and C. These genome-wide SSR markers will enrich the current genomic resources of the oil palm crop.

Type
Short Communication
Copyright
Copyright © NIAB 2015 

Introduction

Oil palm (Elaeis guineensis Jacq.) (2n = 2x = 32; 1.8 Gb) is a perennial oil crop with a high oil yield per hectare compared with other vegetable oilseeds. The global production of palm oil in 2012–2013 was 55,969 thousand metric tons (USDA, 2014). The global harvested area of oil palm in 2011 was 3.60% of the 252.83 million hectares of total harvested oilseed area. Of the 153.95 Mt of vegetable oil produced in 2011, oil palm accounted for the largest share (36.30%), ahead of soyabean (26.9%) and rapeseed (15.4%) (MPOC, 2014).

Simple sequence repeat (SSR) markers are useful for assessing genetic variation within and between populations, genome mapping and paternity confirmation (Kale et al., 2012). Next-generation sequencing (NGS) is a technology for large-scale genome sequencing and for exploring SSR markers in plants. The Illumina platform is an NGS approach that allows sequencing without genomic library construction, and it is more accurate and more cost and time effective than conventional sequencing technologies (Van et al., 2013). The main objective of our study was to use NGS to shorten the time required for constructing and sequencing a genomic library of oil palm breeding material in order to isolate and pre-evaluate a large number of SSR markers.

Experimental

Three oil palm clones (A, B and C) were obtained from a breeding population of Golden Tenera Limited Partnership (Krabi, Thailand). Each clone comprised F2 plants derived from each F1 plant. Clone A was developed from a cross between Ulu Remis Dura and Dumpy AVROS Pisifera, clone B was derived from Deli Dura and Dumpy Pisifera (SP540P), while clone C was derived from Deli Dura and Dumpy Pisifera (S29/36P). Altogether, six Dura (D) and five Pisifera (P) plants were selected from about 100 F2 plants (10% selection intensity) to establish a set of parents for further testing. All selected plants possessed slow stem growth with no crown disease. The selected Dura plants gave an average yield of over 4 tons/year and over 25% oil/bunch at 7 years old. Clone A included palms no. D1, D2, D4, P2 and P7; clone B included D3, D5, D6, P3 and P4; while clone C was derived from a single ortet, P5.

Genomic DNA was extracted from young leaves following Tanya et al. (2011). One of the maternal parents (D4) was chosen for sequencing on the Illumina HiSeq platform. Genome sequence assembly was carried out using ABySS 1.3.2 (Simpson et al., 2009). SSR motifs were identified with MISA (Thiel et al., 2003), and PRIMER3 (Rozen and Skaletsky, 2000) was used to design SSR primers. PCR amplification was performed in 10 μl reactions containing 10 ng DNA, 10× PCR buffer with MgCl2 (Vivagen, Seoul, Republic of Korea), 2.5 mM dNTPs, 10 μM primer pairs and 1 U Taq polymerase. The amplification program consisted of an initial denaturation at 95°C for 10 min, followed by 30 cycles of 95°C for 30 s, 54°C for 30 s and 72°C for 30 s, with a final extension at 72°C for 5 min in a PTC-100™ Thermal Controller (MJ Research, USA). The Fragment Analyzer™ Automated CE System was used to analyse PCR products. A dendrogram was constructed by the UPGMA method using NTSYS-pc version 2.20e (Rohlf and Sokal, 1981). Polymorphic information content (PIC), expected heterozygosity (He) and observed heterozygosity (Ho) were calculated using PowerMarker 3.25 (Liu and Muse, 2005).
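The SSR identification step can be illustrated with a short scanner for perfect repeats. This is only a minimal sketch of the general idea, not MISA's implementation: the function names `is_primitive` and `find_ssrs` are hypothetical, and the minimum repeat counts are assumed values, because the settings used in this study are not stated.

```python
# Minimum number of repeat units per motif length (1 to 6 bp).
# These thresholds are assumptions for illustration only.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif):
    """Return True if the motif is not a repeat of a shorter unit (e.g. 'ATAT')."""
    for k in range(1, len(motif)):
        if len(motif) % k == 0 and motif[:k] * (len(motif) // k) == motif:
            return False
    return True


def find_ssrs(sequence, min_repeats=MIN_REPEATS):
    """Scan left to right and return (start, motif, repeat_count) for perfect SSRs."""
    seq = sequence.upper()
    hits = []
    i = 0
    while i < len(seq):
        found = False
        for k in sorted(min_repeats):  # try the shortest motif length first
            motif = seq[i:i + k]
            if len(motif) < k or set(motif) - set("ACGT") or not is_primitive(motif):
                continue
            n = 1
            # Count how many consecutive copies of the motif follow position i.
            while seq[i + n * k:i + (n + 1) * k] == motif:
                n += 1
            if n >= min_repeats[k]:
                hits.append((i, motif, n))
                i += n * k  # continue after the repeat so hits do not overlap
                found = True
                break
        if not found:
            i += 1
    return hits


# Toy contig (invented sequence): an (AAT)6 repeat and an (AG)7 repeat.
toy_contig = "GGG" + "AAT" * 6 + "CCC" + "AG" * 7 + "T"
for start, motif, count in find_ssrs(toy_contig):
    print(f"{start}\t({motif}){count}")
```

In practice, the flanking sequence on each side of a hit would then be passed to a primer design tool, which is why repeats too close to a contig end cannot yield a primer pair.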

Discussion

Using 386,996,504 reads of clone D4, de novo assembly generated a total of 218,183 contigs. This Whole Genome Shotgun project has been deposited at DDBJ/EMBL/GenBank under the accession no. JRVM00000000. The version described in this paper is version JRVM01000000. SSR motif search in these contigs identified 76,032 monomers (58.11%), 42,532 di-mers (32.51%), 6604 tri-mers (5.05%), 4859 tetra-mers (3.71%), 585 penta-mers (0.45%) and 228 hexa-mers (0.17%). For mononucleotides, A/T were more abundant (99.72%) than C/G (0.28%), as reported in E. oleifera (Zaki et al., 2012). Feng et al. (2009) did not consider mononucleotides in their analysis. Among di-nucleotides, the AT/AT motif (51.13%) was the most abundant repeat, followed by AG/CT (23.24%) and AC/GT (23.20%). Tri-nucleotides comprised mainly AAT/ATT (41.3%) and AAG/CTT (32.98%). The most abundant tetra-nucleotide motifs were ACAT/ATGT (71.89%) and AAAT/ATTT (16.26%), while the most abundant penta-nucleotide motifs were AATAT/ATATT (34.02%) and AAATT/AATTT (25.98%). Zhao et al. (2013) worked in date palm (Phoenix dactylifera L.) and found the di-nucleotides AG/CT (85.7%), AC/GT (8.2%), AT/TA (5.4%) and GC/CG (0.7%), while among the tri-nucleotides AGG (26.8%) and AAG (9.3%) were most abundant. Ting et al. (2010) reported SSR mining in an oil palm (E. guineensis) EST database and found that the major di-nucleotides were AG/CT (66.9%), AT/AT (21.9%) and AC/GT (10.9%), and the major tri-nucleotides were AAG/CTT (23.3%), AGG/CCT (13.7%) and AAT/ATT (10.8%). Our analysis found 130,840 SSR motifs in the 499,254,157 bp of sequence examined (Table 1). Primers for SSR markers could be designed for only 763 (11.55%) of the 6604 tri-nucleotide sequences.
The other motifs had insufficient flanking sequences to design a pair of primers. The designed markers were classified into ten classes (Table 1). Class 1 was the most abundant (41.3%), whereas class 8 (0.1%) and class 10 (0.3%) were the least abundant. The (AAT)n motifs in class 1 and (AAG)n in class 2 gave the largest numbers, at 100 and 70 SSR primers, respectively. Out of 763 SSR primer pairs, 144 were used for pre-evaluation of primers using the 11 elite oil palms. The numbers of primers designed from classes 1, 2, 3, 6 and 9 were 48, 39, 27, 20 and 10 pairs, respectively. These motifs were commonly found in oil palm, i.e. E. guineensis EST-SSR (Singh et al., 2008) and E. oleifera SSR (Noorharriza et al., 2012). Out of the 144 SSR markers, 61 were amplifiable. Of these, 18.03, 31.15, 26.23, 13.11 and 11.48% came from classes 1, 2, 3, 6 and 9, respectively. These primers detected 371 polymorphic alleles, ranging from 3 to 11 alleles per locus (average of six alleles). The highest and lowest PIC values were 0.75 and 0.62, with an average of 0.68, similar to the value previously reported by Noorharriza et al. (2012) (PIC = 0.63). He (0.72) was higher than Ho (0.41), implying that the oil palm samples might have been affected by repeated rounds of breeding cycles that caused deviation from Hardy–Weinberg equilibrium. The higher Ho in our work compared with that reported in E. oleifera (Ho = 0.16) by Noorharriza et al. (2012) revealed that the E. guineensis used in our experiment had higher heterozygosity. The E. oleifera samples were collected from isolated areas in four countries of South-Central America. The dendrogram separated the palms into three groups at a Jaccard's coefficient of 0.45 (Fig. 1), with a cophenetic correlation of 0.81. We conclude that NGS technology using the Illumina HiSeq platform is a powerful tool for sequencing and developing valuable molecular markers from large and complex genomes.
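The per-locus statistics discussed above can be sketched from diploid genotype calls. The example below assumes the standard textbook definitions (expected heterozygosity as one minus the sum of squared allele frequencies, and PIC as defined by Botstein et al.); PowerMarker's exact estimators may differ, for instance through sample-size corrections. The function name `locus_stats` and the genotype data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def locus_stats(genotypes):
    """Compute He, Ho and PIC for one locus.

    genotypes: list of (allele_1, allele_2) tuples, one per diploid individual.
    """
    alleles = [allele for genotype in genotypes for allele in genotype]
    counts = Counter(alleles)
    freqs = [n / len(alleles) for n in counts.values()]

    # Expected heterozygosity under Hardy-Weinberg proportions.
    he = 1.0 - sum(p * p for p in freqs)
    # PIC subtracts the share of uninformative matings from He.
    pic = he - sum(2.0 * p * p * q * q for p, q in combinations(freqs, 2))
    # Observed heterozygosity: fraction of individuals with two different alleles.
    ho = sum(1 for a, b in genotypes if a != b) / len(genotypes)
    return {"He": he, "Ho": ho, "PIC": pic}


# Invented fragment sizes (bp) for four individuals at one SSR locus.
example = [(150, 150), (150, 156), (156, 156), (150, 156)]
print(locus_stats(example))
```

With these definitions, a locus where Ho falls well below He shows fewer heterozygotes than random mating would predict, which is the pattern reported for the 11 palms.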

Table 1 Summary of sequencing information, genome assembly, SSR screen and frequency of the tri-nucleotide repeat motif used for primer design
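The summary figures quoted in the text can be rechecked from the reported counts. The sketch below uses only numbers given in this article; the variable names are illustrative, and the sequencing depth is a rough estimate that assumes the 1.8 Gb genome size quoted in the Introduction.

```python
# Counts reported in the text for the sequenced palm D4.
total_bases = 39_086_646_904
total_reads = 386_996_504
genome_size_bp = 1.8e9  # approximate genome size from the Introduction
ssr_counts = {
    "mono": 76_032,
    "di": 42_532,
    "tri": 6_604,
    "tetra": 4_859,
    "penta": 585,
    "hexa": 228,
}
tri_with_primers = 763


def percent(part, whole):
    """Share of `part` in `whole`, as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)


total_ssrs = sum(ssr_counts.values())
class_share = {name: percent(n, total_ssrs) for name, n in ssr_counts.items()}
primer_share = percent(tri_with_primers, ssr_counts["tri"])
mean_read_length = total_bases / total_reads
approx_depth = total_bases / genome_size_bp  # rough estimate only

print("total SSRs:", total_ssrs)
for name, share in class_share.items():
    print(f"{name}: {share}%")
print("tri-nucleotide SSRs with primers:", f"{primer_share}%")
print("mean read length (bp):", mean_read_length)
print("approximate depth (x):", round(approx_depth, 1))
```

The class counts sum to the 130,840 SSRs reported, and the tri-nucleotide share works out to about 5.05% and the tetra-nucleotide share to about 3.71%.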

Fig. 1 A dendrogram of genetic relatedness among 11 oil palm parents based on UPGMA clustering. Cluster analysis clearly separated these into three groups with a 0.45 Jaccard's similarity coefficient. Samples from the same clone tended to group together.
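The clustering behind a dendrogram of this kind can be sketched in a few lines. This is a minimal illustration of Jaccard similarity with UPGMA (average linkage), not the NTSYS-pc procedure itself; the function names, sample names and allele labels are all hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two sets of alleles scored as present."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def upgma(profiles):
    """Average-linkage clustering on Jaccard similarity.

    profiles: dict mapping sample name -> set of alleles present.
    Returns the merge history as (cluster_a, cluster_b, similarity) tuples,
    where each cluster is a tuple of sample names.
    """
    def avg_sim(c1, c2):
        # UPGMA: mean similarity over all pairs of original samples.
        pairs = [(x, y) for x in c1 for y in c2]
        return sum(jaccard(profiles[x], profiles[y]) for x, y in pairs) / len(pairs)

    clusters = [(name,) for name in sorted(profiles)]
    merges = []
    while len(clusters) > 1:
        # Join the two clusters with the highest average similarity.
        a, b = max(combinations(clusters, 2), key=lambda pair: avg_sim(*pair))
        merges.append((a, b, avg_sim(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


# Invented allele profiles: "locus_size" labels for four hypothetical palms.
profiles = {
    "X1": {"m1_150", "m2_200", "m3_300"},
    "X2": {"m1_150", "m2_200", "m3_310"},
    "Y1": {"m1_160", "m2_210", "m3_300"},
    "Y2": {"m1_160", "m2_210", "m3_320"},
}
for a, b, sim in upgma(profiles):
    print(a, b, round(sim, 2))
```

Cutting the merge history at a chosen similarity level (0.45 in Fig. 1) gives the groups: every merge made above the threshold stays within a group, and merges below it separate groups.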

Supplementary material

To view supplementary material for this article, please visit http://dx.doi.org/10.1017/S1479262115000143

Acknowledgements

Puntaree Taeprayoon was granted a scholarship (CHE-PhD-SW) from the Commission on Higher Education (CHE). The authors thank (1) the Center of Excellence in Oil Palm Biotechnology for Renewable Energy, CHE, (2) the Center for Advanced Studies for Agriculture and Food (CASAF), Kasetsart University, (3) the Next Generation BioGreen 21 Program (code no. PJ0110262015), Rural Development Administration, Republic of Korea, and (4) Golden Tenera Limited Partnership, Krabi, Thailand.

References

Feng, SP, Li, WG, Huang, HS, Wang, JY and Wu, YT (2009) Development, characterization and cross-species/genera transferability of EST-SSR markers for rubber tree (Hevea brasiliensis). Molecular Breeding 23: 85–97.
Kale, SM, Pardeshi, VC, Kadoo, NY, Ghorpade, PB, Jana, MM and Gupta, VS (2012) Development of genomic simple sequence repeat markers for linseed using next-generation sequencing technology. Molecular Breeding 30: 597–606.
Liu, K and Muse, SV (2005) POWERMARKER: an integrated analysis environment for genetic marker analysis. Bioinformatics 21: 2128–2129.
Malaysian Palm Oil Council (MPOC) (2014) Palm Oil Fact. http://www.mpoc.org.my/Palm_Oil_Fact_Slides.aspx (accessed 20 March 2014).
Rohlf, FJ and Sokal, RR (1981) Comparing numerical taxonomic studies. Systematic Zoology 30: 459–490.
Rozen, S and Skaletsky, H (2000) Primer3 on the WWW for general users and for biologist programmers. Methods in Molecular Biology 132: 365–368.
Simpson, JT, Wong, K, Jackman, SD, Schein, JE, Jones, SJ and Birol, I (2009) ABySS: a parallel assembler for short read sequence data. Genome Research 19: 1117–1123.
Singh, R, Noorhariza, MZ, Ting, NC, Rozana, R, Tan, SG, Low, LET, Ithnin, M and Cheah, SC (2008) Exploiting an oil palm EST database for the development of gene-derived SSR markers and their exploitation for assessment of genetic diversity. Biologia 63: 1–9.
Tanya, P, Taeprayoon, P, Hadkam, Y and Srinives, P (2011) Genetic diversity among Jatropha and Jatropha related species based on ISSR markers. Plant Molecular Biology Reporter 29: 252–264.
Thiel, T, Michalek, W, Varshney, RK and Graner, A (2003) Exploiting EST databases for the development and characterization of gene derived SSR markers in barley (Hordeum vulgare L.). Theoretical and Applied Genetics 106: 411–422.
Ting, NC, Noorhariza, MZ, Rozana, R, Low, LET, Ithnin, M, Cheah, SC, Tan, SG and Singh, R (2010) SSR mining in oil palm EST database: application in oil palm germplasm diversity studies. Journal of Genetics 89: 135–145.
United States Department of Agriculture (USDA) (2014) Oil seeds world market and trade. http://apps.fas.usda.gov/psdonline/psdHome.aspx (accessed 13 August 2014).
Van, K, Rastogi, K, Kim, KH and Lee, SH (2013) Next-generation sequencing technology for crop improvement. SABRAO Journal of Breeding and Genetics 45: 84–99.
Zaki, NM, Singh, R, Rosli, R and Ismail, I (2012) Elaeis oleifera genomic-SSR markers: exploitation in oil palm germplasm diversity and cross-amplification in Arecaceae. International Journal of Molecular Sciences 13: 4069–4088.
Zhao, YL, Roxanne, W, Prakash, CS and He, GH (2013) Identification and characterization of gene-based SSR markers in date palm (Phoenix dactylifera L.). BMC Plant Biology 12: 237.