Genome-wide SSR marker development in oil palm by Illumina HiSeq for parental selection

Puntaree Taeprayoon; Patcharin Tanya; Yang Jae Kang; Anek Limsrivilai; Suk-Ha Lee; Peerasak Srinives

doi:10.1017/S1479262115000143

Published online by Cambridge University Press: 22 April 2015

Affiliations

Puntaree Taeprayoon: Program in Plant Breeding, Faculty of Agriculture at Kamphaeng Saen, Kasetsart University, Kamphaeng Saen, Nakhon Pathom 73140, Thailand
Patcharin Tanya*: Department of Agronomy, Faculty of Agriculture at Kamphaeng Saen, Kasetsart University, Kamphaeng Saen, Nakhon Pathom 73140, Thailand
Yang Jae Kang: Department of Plant Science and Research Institute of Agriculture and Life Sciences, Seoul National University, Seoul 151-921, Republic of Korea
Anek Limsrivilai: Golden Tenera Limited Partnership, Krabi 81000, Thailand
Suk-Ha Lee*: Department of Plant Science and Research Institute of Agriculture and Life Sciences, Seoul National University, Seoul 151-921, Republic of Korea; Plant Genomics and Breeding Institute, Seoul National University, Seoul 151-921, Republic of Korea
Peerasak Srinives: Department of Agronomy, Faculty of Agriculture at Kamphaeng Saen, Kasetsart University, Kamphaeng Saen, Nakhon Pathom 73140, Thailand
*Corresponding authors. E-mail: altanya55@yahoo.com; sukhalee@snu.ac.kr

Abstract

Next-generation sequencing is a new technique for large-scale plant genome sequencing that is faster and cheaper than previous sequencing technologies. The present work reports the development of new polymorphic simple sequence repeat (SSR) markers in oil palm (Elaeis guineensis Jacq.) using Illumina HiSeq sequencing data. More than 39 Gb (39,086,646,904 bases in total) was generated from the selected oil palm clone, D4. After de novo assembly, a total of 130,840 potential SSRs were identified. For SSR validation, 144 of the 762 SSR primer pairs designed from tri-nucleotide motifs in the D4 contigs were tested. Using 11 lines from three different clones of oil palm, 61 SSR primers revealed polymorphic alleles and high average polymorphic information content (PIC) values. Cluster analysis separated all oil palm plants into three clusters: clones A, B and C. These genome-wide SSR markers will enrich the current genomic resources of the oil palm crop.

Keywords

Elaeis guineensis; next-generation sequencing; oil palm; SSR

Type: Short Communication
Information: Plant Genetic Resources, Volume 14, Issue 2, June 2016, pp. 157-160

DOI: https://doi.org/10.1017/S1479262115000143
Copyright: Copyright © NIAB 2015

Introduction

Oil palm (Elaeis guineensis Jacq.) (2n = 2x = 32; 1.8 Gb) is a perennial oil crop with a higher oil yield per hectare than other vegetable oilseeds. Global palm oil production in 2012–2013 was 55,969 thousand metric tons (USDA, 2014). In 2011, oil palm accounted for 3.60% of the 252.83 million hectares of oilseed harvested worldwide. Of the 153.95 Mt of vegetable oil produced in 2011, oil palm contributed the largest share (36.30%), ahead of soyabean (26.9%) and rapeseed (15.4%) (MPOC, 2014).

Simple sequence repeat (SSR) markers are useful for assessing genetic variation within and between populations, for genome mapping and for paternity confirmation (Kale et al., 2012). Next-generation sequencing (NGS) is a technology for large-scale genome sequencing and for exploring SSR markers in plants. The Illumina platform is an NGS approach that allows sequencing without genomic library construction and is more accurate, cheaper and faster than conventional sequencing technologies (Van et al., 2013). The main objective of our study was to use NGS to shorten the time required for constructing and sequencing a genomic library of oil palm breeding material in order to isolate and pre-evaluate a large number of SSR markers.

Experimental

Three oil palm clones (A, B and C) were obtained from a breeding population of Golden Tenera Limited Partnership (Krabi, Thailand). Each clone comprised F₂ plants derived from each F₁ plant. Clone A was developed from a cross between Ulu Remis Dura and Dumpy AVROS Pisifera, clone B was derived from Deli Dura and Dumpy Pisifera (SP540P), while clone C was derived from Deli Dura and Dumpy Pisifera (S29/36P). Altogether, six Dura (D) and five Pisifera (P) plants were selected from about 100 F₂ plants (10% selection intensity) to establish a set of parents for further testing. All selected plants possessed slow stem growth with no crown disease. The selected Dura plants gave an average yield of over 4 tons/year and over 25% oil/bunch at 7 years old. Clone A included palms no. D1, D2, D4, P2 and P7; clone B included D3, D5, D6, P3 and P4; while clone C was derived from a single ortet, P5.

Genomic DNA was extracted from young leaves following Tanya et al. (2011). One of the maternal parents (D4) was chosen for sequencing using the Illumina HiSeq platform. Genome sequence assembly was carried out using the ABySS 1.3.2 software (Simpson et al., 2009). SSR motifs were identified using the MISA software (Thiel et al., 2003). PRIMER3 (Rozen and Skaletsky, 2000) was used to design SSR primers. PCR amplification was carried out in 10 μl reactions containing 10 ng DNA, 10× PCR buffer with MgCl₂ (Vivagen, Seoul, Republic of Korea), 2.5 mM dNTPs, 10 μM primer pairs and 1 U Taq polymerase. The amplification program consisted of denaturation at 95°C for 10 min, followed by 30 cycles of 95°C for 30 s, 54°C for 30 s and 72°C for 30 s, with a final extension at 72°C for 5 min in a PCT-100™ Thermal Controller (MJ Research, USA). The Fragment Analyzer™ Automated CE System was used to analyse PCR products. A dendrogram was constructed by the UPGMA method using NTSYS-pc version 2.20e (Rohlf and Sokal, 1981). Polymorphic information content (PIC), expected heterozygosity (H_e) and observed heterozygosity (H_o) were calculated using the PowerMarker 3.25 program (Liu and Muse, 2005).
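The SSR identification step described above can be illustrated with a minimal Python sketch of a perfect-repeat scan. This is not the MISA program itself; the function names (`find_ssrs`, `is_primitive`) are hypothetical, and the minimum repeat counts are assumed, commonly used MISA-style thresholds, since the paper does not state the settings it used.

```python
import re

# Minimum number of repeats per motif length (1 = mono ... 6 = hexa).
# ASSUMPTION: illustrative MISA-style thresholds, not taken from the paper.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'AA', 'ATAT')."""
    n = len(motif)
    return not any(n % k == 0 and motif[:k] * (n // k) == motif for k in range(1, n))


def find_ssrs(sequence, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    sequence = sequence.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # One captured unit followed by at least (min_rep - 1) exact copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(sequence):
            motif = match.group(1)
            # Skip e.g. 'AA' x 6, which is really a mononucleotide run.
            if is_primitive(motif):
                hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical contig fragment with a mono-, di- and tri-nucleotide repeat.
    contig = "GGC" + "A" * 12 + "CGC" + "AT" * 7 + "GCC" + "AAG" * 5 + "TTC"
    print(find_ssrs(contig))
    # → [(3, 'A', 12), (18, 'AT', 7), (35, 'AAG', 5)]
```

The sketch reports motifs as they occur on the scanned strand; grouping them into strand-independent classes such as AAG/CTT would need an extra canonicalisation step that is left out here.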

Discussion

Using 386,996,504 reads of clone D4, de novo assembly generated a total of 218,183 contigs. This Whole Genome Shotgun project has been deposited at DDBJ/EMBL/GenBank under the accession no. JRVM00000000. The version described in this paper is version JRVM01000000.

The SSR motif search in these contigs identified 76,032 monomers (58.11%), 42,532 di-mers (32.51%), 6604 tri-mers (5.05%), 4859 tetra-mers (3.71%), 585 penta-mers (0.45%) and 228 hexa-mers (0.17%). Among mononucleotides, A/T repeats (99.72%) were far more abundant than C/G repeats (0.28%), as also reported in E. oleifera (Zaki et al., 2012). Feng et al. (2009) did not consider mononucleotides in their analysis. Among di-nucleotides, the AT/AT motif (51.13%) was the most abundant repeat, followed by AG/CT (23.24%) and AC/GT (23.20%). Tri-nucleotides comprised mainly AAT/ATT (41.3%) and AAG/CTT (32.98%). The most abundant tetra-nucleotide motifs were ACAT/ATGT (71.89%) and AAAT/ATTT (16.26%), while the most abundant penta-nucleotide motifs were AATAT/ATATT (34.02%) and AAATT/AATTT (25.98%).

Zhao et al. (2013), working in date palm (Phoenix dactylifera L.), found that the di-nucleotides comprised AG/CT (85.7%), AC/GT (8.2%), AT/TA (5.4%) and GC/CG (0.7%), while among the tri-nucleotides AGG (26.8%) and AAG (9.3%) were most abundant. Ting et al. (2010) reported SSR mining in an oil palm (E. guineensis) EST database and found that the major di-nucleotides were AG/CT (66.9%), AT/AT (21.9%) and AC/GT (10.9%), and the major tri-nucleotides were AAG/CTT (23.3%), AGG/CCT (13.7%) and AAT/ATT (10.8%).

Our analysis found 130,840 SSR motifs in the 499,254,157 bp of sequence examined (Table 1). Primers could be designed for only 763 (11.55%) of the 6604 tri-nucleotide sequences. The other motifs had insufficient flanking sequences to design a pair of primers. The designed markers were classified into ten classes (Table 1). Class 1 was the most abundant (41.3%), whereas class 8 (0.1%) and class 10 (0.3%) were the least abundant. The (AAT)_n motifs in class 1 and the (AAG)_n motifs in class 2 gave the largest numbers of SSR primers, at 100 and 70, respectively.

Out of the 763 SSR primer pairs, 144 were used for pre-evaluation on the 11 elite oil palms. The numbers of primers designed from classes 1, 2, 3, 6 and 9 were 48, 39, 27, 20 and 10 pairs, respectively. These motifs are commonly found in oil palm, i.e. E. guineensis EST-SSR (Singh et al., 2008) and E. oleifera SSR (Noorharriza et al., 2012). Out of the 144 SSR markers, 61 were amplifiable. Of these, 18.03, 31.15, 26.23, 13.11 and 11.48% came from classes 1, 2, 3, 6 and 9, respectively. These primers detected 371 polymorphic alleles, ranging from 3 to 11 alleles per locus (average of six alleles). The highest and lowest PIC values were 0.75 and 0.62, with an average of 0.68, similar to that previously reported by Noorharriza et al. (2012) (PIC = 0.63). Expected heterozygosity (H_e = 0.72) was higher than observed heterozygosity (H_o = 0.41), implying that the oil palm samples might have been affected by repeated breeding cycles that caused deviation from the Hardy–Weinberg equilibrium. The higher H_o in our work compared with that reported in E. oleifera (H_o = 0.16) by Noorharriza et al. (2012) revealed that the E. guineensis used in our experiment had higher heterozygosity. The E. oleifera samples were collected from isolated areas in four countries of South-Central America. The dendrogram separated the palms into three groups at a Jaccard's coefficient of 0.45 (Fig. 1), with a cophenetic correlation of 0.81.

We concluded that NGS technology using the Illumina HiSeq platform is a powerful tool for sequencing and developing valuable molecular markers from large and complex genomes.
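The per-locus diversity statistics discussed above (PIC, H_e and H_o) can be illustrated with a short Python sketch. It uses the standard textbook definitions (H_e = 1 − Σp_i², PIC = H_e − Σ_{i<j} 2p_i²p_j², H_o = fraction of heterozygous plants); whether PowerMarker applies exactly these forms, without small-sample corrections, is an assumption here. The function name `locus_stats` and the genotypes are hypothetical.

```python
from collections import Counter
from itertools import combinations


def locus_stats(genotypes):
    """Return (PIC, He, Ho) for one SSR locus.

    genotypes: list of (allele_a, allele_b) tuples, one per diploid plant,
    e.g. fragment sizes in bp.
    """
    allele_counts = Counter(allele for pair in genotypes for allele in pair)
    total = sum(allele_counts.values())
    freqs = [count / total for count in allele_counts.values()]

    # Expected heterozygosity (gene diversity) under Hardy-Weinberg proportions.
    he = 1.0 - sum(p * p for p in freqs)
    # PIC subtracts the share of matings that are uninformative for linkage.
    pic = he - sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    # Observed heterozygosity: fraction of plants carrying two different alleles.
    ho = sum(1 for a, b in genotypes if a != b) / len(genotypes)
    return pic, he, ho


if __name__ == "__main__":
    # HYPOTHETICAL fragment sizes (bp) for four plants at one locus.
    example = [(150, 150), (150, 156), (156, 162), (162, 162)]
    pic, he, ho = locus_stats(example)
    print(f"PIC={pic:.3f}  He={he:.3f}  Ho={ho:.3f}")
```

In this toy locus H_o falls below H_e, the same direction of deviation as the averages reported in the text, although the numbers themselves are made up.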

Table 1 Summary of sequencing information, genome assembly, SSR screen and frequency of the tri-nucleotide repeat motif used for primer design

Fig. 1 A dendrogram of genetic relatedness among 11 oil palm parents based on UPGMA clustering. Cluster analysis clearly separated these into three groups with a 0.45 Jaccard's similarity coefficient. Samples from the same clone tended to group together.
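The kind of clustering summarised in Fig. 1 can be sketched in plain Python: Jaccard similarity between allele presence/absence profiles, followed by average-linkage (UPGMA) merging. This is an illustrative stand-in for the NTSYS-pc analysis, not a reproduction of it; the function names, sample names and 0/1 profiles are all hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity between two 0/1 allele-presence vectors."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 1.0


def upgma(profiles):
    """Average-linkage clustering on Jaccard similarity.

    profiles: dict mapping sample name -> 0/1 vector.
    Returns the merge history as (cluster_a, cluster_b, similarity) tuples,
    where each cluster is a tuple of sample names.
    """
    names = list(profiles)
    sim = {(a, b): jaccard(profiles[a], profiles[b])
           for a in names for b in names if a != b}

    def avg_sim(c1, c2):
        # UPGMA: unweighted mean over all between-cluster sample pairs.
        return sum(sim[(a, b)] for a in c1 for b in c2) / (len(c1) * len(c2))

    clusters = [(name,) for name in names]
    merges = []
    while len(clusters) > 1:
        # Pick the most similar pair of current clusters.
        best, i, j = max((avg_sim(c1, c2), i, j)
                         for i, c1 in enumerate(clusters)
                         for j, c2 in enumerate(clusters) if i < j)
        merges.append((clusters[i], clusters[j], best))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges


if __name__ == "__main__":
    # HYPOTHETICAL presence/absence scores for six alleles in four palms.
    demo = {
        "palm_a1": [1, 1, 1, 0, 0, 0],
        "palm_a2": [1, 1, 0, 0, 0, 0],
        "palm_b1": [0, 0, 0, 1, 1, 1],
        "palm_b2": [0, 0, 1, 1, 1, 1],
    }
    for left, right, similarity in upgma(demo):
        print(left, "+", right, f"at similarity {similarity:.3f}")
```

Cutting the merge history at a chosen similarity (0.45 in Fig. 1) yields the groups: every merge above the cut-off stays joined, and every merge below it is split.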

Supplementary material

To view supplementary material for this article, please visit http://dx.doi.org/10.1017/S1479262115000143

Acknowledgements

Puntaree Taeprayoon was granted a scholarship (CHE-PhD-SW) from the Commission on Higher Education (CHE). The authors thank (1) the Center of Excellence in Oil Palm Biotechnology for Renewable Energy, CHE, (2) the Center for Advanced Studies for Agriculture and Food (CASAF), Kasetsart University, (3) the Next Generation BioGreen 21 Program (code no. PJ0110262015), Rural Development Administration, Republic of Korea, and (4) Golden Tenera Limited Partnership, Krabi, Thailand.

References

Feng, SP, Li, WG, Huang, HS, Wang, JY and Wu, YT (2009) Development, characterization and cross-species/genera transferability of EST-SSR markers for rubber tree (Hevea brasiliensis). Molecular Breeding 23: 85–97.

Kale, SM, Pardeshi, VC, Kadoo, NY, Ghorpade, PB, Jana, MM and Gupta, VS (2012) Development of genomic simple sequence repeat markers for linseed using next-generation sequencing technology. Molecular Breeding 30: 597–606.

Liu, K and Muse, SV (2005) POWERMARKER: an integrated analysis environment for genetic marker analysis. Bioinformatics 21: 2128–2129.

Malaysian Palm Oil Council (MPOC) (2014) Palm Oil Fact. http://www.mpoc.org.my/Palm_Oil_Fact_Slides.aspx (accessed 20 March 2014).

Rohlf, FJ and Sokal, RR (1981) Comparing numerical taxonomic studies. Systematic Zoology 30: 459–490.

Rozen, S and Skaletsky, H (2000) Primer3 on the WWW for general users and for biologist programmers. Methods in Molecular Biology 132: 365–368.

Simpson, JT, Wong, K, Jackman, SD, Schein, JE, Jones, SJ and Birol, I (2009) ABySS: a parallel assembler for short read sequence data. Genome Research 19: 1117–1123.

Singh, R, Noorhariza, MZ, Ting, NC, Rozana, R, Tan, SG, Low, LET, Ithnin, M and Cheah, SC (2008) Exploiting an oil palm EST database for the development of gene-derived SSR markers and their exploitation for assessment of genetic diversity. Biologia 63: 1–9.

Tanya, P, Taeprayoon, P, Hadkam, Y and Srinives, P (2011) Genetic diversity among Jatropha and Jatropha related species based on ISSR markers. Plant Molecular Biology Reporter 29: 252–264.

Thiel, T, Michalek, W, Varshney, RK and Graner, A (2003) Exploiting EST databases for the development and characterization of gene derived SSR markers in barley (Hordeum vulgare L.). Theoretical and Applied Genetics 106: 411–422.

Ting, NC, Noorhariza, MZ, Rozana, R, Low, LET, Ithnin, M, Cheah, SC, Tan, SG and Singh, R (2010) SSR mining in oil palm EST database: application in oil palm germplasm diversity studies. Journal of Genetics 89: 135–145.

United States Department of Agriculture (USDA) (2014) Oil seeds world market and trade. http://apps.fas.usda.gov/psdonline/psdHome.aspx (accessed 13 August 2014).

Van, K, Rastogi, K, Kim, KH and Lee, SH (2013) Next-generation sequencing technology for crop improvement. SABRAO Journal of Breeding and Genetics 45: 84–99.

Zaki, NM, Singh, R, Rosli, R and Ismail, I (2012) Elaeis oleifera genomic-SSR markers: exploitation in oil palm germplasm diversity and cross-amplification in Arecaceae. International Journal of Molecular Sciences 13: 4069–4088.

Zhao, YL, Roxanne, W, Prakash, CS and He, GH (2013) Identification and characterization of gene-based SSR markers in date palm (Phoenix dactylifera L.). BMC Plant Biology 12: 237.
