
Novel microsatellite markers for the oriental fruit moth Grapholita molesta (Lepidoptera: Tortricidae) and effects of null alleles on population genetics analyses

Published online by Cambridge University Press:  07 November 2016

W. Song
Affiliation:
Institute of Plant and Environmental Protection, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100097, China; College of Life Sciences, Hebei Normal University, Shijiazhuang 071000, China
L.-J. Cao
Affiliation:
Institute of Plant and Environmental Protection, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100097, China
Y.-Z. Wang
Affiliation:
Institute of Plant and Environmental Protection, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100097, China; Key Laboratory of Forest Disaster Warning and Control of Yunnan Province, College of Forestry, Southwest Forestry University, Kunming 650224, China
B.-Y. Li
Affiliation:
Institute of Plant and Environmental Protection, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100097, China; Key Laboratory of Forest Disaster Warning and Control of Yunnan Province, College of Forestry, Southwest Forestry University, Kunming 650224, China
S.-J. Wei*
Affiliation:
Institute of Plant and Environmental Protection, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100097, China
*Author for correspondence. Phone: +86 010-51503439; Fax: +86 010-51503899; E-mail: shujun268@163.com

Abstract

The oriental fruit moth (OFM) Grapholita molesta (Lepidoptera: Tortricidae) is an important economic pest of stone and pome fruits worldwide. We sequenced the OFM genome using next-generation sequencing and characterized the microsatellite distribution. In total, 56,674 microsatellites were identified, with 11,584 loci suitable for primer design. Twenty-seven polymorphic microsatellites, including 24 loci with trinucleotide repeats and three with pentanucleotide repeats, were validated in 95 individuals from four natural populations. The allele numbers ranged from 4 to 40, with an average value of 13.7 per locus. A high frequency of null alleles was observed in most loci developed for the OFM. Three marker panels (all of the loci, the nine loci with the lowest null allele frequencies, and the nine loci with the highest null allele frequencies) were established for population genetics analyses. Null alleles influenced estimations of genetic diversity parameters but not the inferred genetic structure of the OFM. Both a STRUCTURE analysis and a discriminant analysis of principal components, using the three marker panels, divided the four natural populations into three groups. However, more individuals were incorrectly assigned by the STRUCTURE analysis when the marker panel with the highest null allele frequencies was used than with the other two panels. Our study provides empirical evidence on the effects of null alleles on population genetics analyses. The microsatellites developed will be valuable markers for genetic studies of the OFM.

Type
Research Papers
Copyright
Copyright © Cambridge University Press 2016 

Introduction

The oriental fruit moth (OFM) Grapholita molesta (Lepidoptera: Tortricidae) is a major pest of stone and pome fruit, especially species of Rosaceae (Rothschild & Vickers, 1991). Larvae of the OFM cause damage by boring into twigs as well as fruits. Assumed to be native to China, this moth spread during the last century into other stone-fruit growing regions, such as Europe, Africa, South and North America, New Zealand and Australia (Quaintance & Wood, 1916; Rothschild & Vickers, 1991).

Population genetics of the OFM have been widely studied on both global and national scales (Torriani et al., 2010; Kirk et al., 2013; Zheng et al., 2013) because of the OFM's threat to fruit industries and complicated invasion history. The high genetic diversity in the Chinese populations provided strong evidence that China was the native range of the OFM (Kirk et al., 2013; Zheng et al., 2013). Further population genetic structure studies across China and South Korea revealed that this moth originated in Southwestern China and followed a northward dispersal (Wei et al., 2015). Global studies of the OFM indicated a weak but significant genetic structure on a continental scale (Kirk et al., 2013). Although no pattern of isolation by distance was found within invasive populations of South Africa or Italy (Timm et al., 2008; Torriani et al., 2010), Silva-Brandão et al. (2015) found that geographic distance was the main factor affecting genetic structure and gene flow in the invasive populations of Brazil. Structured populations from different orchards within an area indicated that a selective host switch had occurred in certain segments of the population (Zheng et al., 2013). Also, pest management methods, such as fruit bagging, can have an important impact on the levels of genetic diversity and the genetic structures of OFM populations (Zheng et al., 2015).

In previous genetic studies of the OFM, four types of markers were used: amplified fragment length polymorphisms (Timm et al., 2008), microsatellites (Torriani et al., 2010), mitochondrial genes (Wei et al., 2015) and single nucleotide polymorphisms (Silva-Brandão et al., 2015). Among these markers, microsatellites were the most frequently used (Torriani et al., 2010; Kirk et al., 2013; Zheng et al., 2013; Wei et al., 2015; Zheng et al., 2015) because of their high levels of polymorphism, co-dominance and easy detection by polymerase chain reaction (PCR). However, microsatellite development is challenging, especially for species of Lepidoptera (Zhang, 2004). The first set of ten microsatellite markers for the OFM was developed from an enriched library of genomic DNA (Torriani et al., 2010). These microsatellite loci showed relatively high frequencies of null alleles, ranging from 0.005 to 0.208 in Torriani et al. (2010), from 0.07 to 0.25 in Kirk et al. (2013), from 0.112 to 0.241 in Zheng et al. (2015), from 0.13 to 0.28 in Zheng et al. (2013) and from 0.04 to 0.31 in Wei et al. (2015). A new set of nine microsatellite markers was developed using a high-throughput genomic sequencing approach; however, the null allele frequencies were still very high, ranging from 0.05 to 0.33 (Kirk et al., 2013).

Null alleles are a common issue of microsatellite markers in eukaryotic genomes, especially in Lepidoptera (Sinama et al., 2011). High null allele frequencies cause great difficulty in microsatellite genotyping (Anthony et al., 2001; Habel et al., 2008) and limit the transferability of markers between species of the same genus (Flanagan et al., 2002) as well as between populations of the same species (Jiggins et al., 2005). Microsatellite null alleles are caused by flanking sequence variants (point mutations, insertions or deletions) that affect primer binding sites (Wang, 1994), unlike null alleles of isozyme markers (Primmer et al., 1995). Although software programs, such as Micro-checker (Van Oosterhout et al., 2004), GENEPOP (Raymond & Rousset, 1995; Rousset, 2008) and FreeNA (Chapuis & Estoup, 2007), have been developed to estimate the frequencies of null alleles, the effects of null alleles on genetic diversity analyses and population structure inference need further empirical evaluation.

In this study, we developed novel sets of microsatellite markers for the OFM from the genomic sequences and validated them in four natural populations from the native range of China. Additionally, the effects of null alleles on population genetics analyses were assessed.

Materials and methods

Samples and DNA extraction

A larva of the OFM collected from Aksu Prefecture, Xinjiang Province, China was used for genomic sequencing. Eight individuals from eight different locations in China were used for the initial testing of the primer pairs. In total, 95 OFM larvae collected between 2010 and 2012 from four geographic locations across China were used for population-level analyses: 17 from Shilin, Yunnan Province (SL) (E103°19′48.00″, N24°51′48.96″), 39 from Chengdu, Sichuan Province (CD) (E104°03′53.48″, N30°39′30.96″), 20 from Nanjing, Jiangsu Province (NJ) (E118°47′48.76″, N32°03′36.92″) and 19 from Shenyang, Liaoning Province (SY) (E123°25′53.29″, N41°48′20.59″). All of the specimens were preserved in absolute ethanol at −80°C prior to DNA extraction and were kept at the Integrated Pest Management Laboratory of the Beijing Academy of Agriculture and Forestry Sciences. Genomic DNA was extracted from a segment of an individual larva using a DNeasy Blood & Tissue Kit (Qiagen, Hilden, Germany) according to the manufacturer's instructions.

Sequencing and assembly of the OFM genome

A library with a 500-bp insert size was constructed using the Illumina TruSeq DNA PCR-Free HT Library Prep Kit (Illumina, San Diego, CA, USA). The MiSeq Reagent Kit v3 (Illumina, San Diego, CA, USA) was used to sequence the prepared library on an Illumina MiSeq Sequencer. Low-quality reads were removed using SolexaQA software (Cox et al., 2010), and the remaining reads were assembled by IDBA with k-mer values of 80–240 (Peng et al., 2010).

Genome-wide microsatellite survey and primer design

We surveyed all of the potential microsatellite loci in the assembled genomic sequences using the software MSDB (Du et al., 2013), with minimum repeat numbers of 250 (an extremely high value chosen to exclude mononucleotide motifs), 5, 5, 5, 5 and 5 for the mono-, di-, tri-, tetra-, penta- and hexanucleotide motifs, respectively. Repeat motifs were grouped into classes; for example, among the dinucleotide repeats, (AC)n, (CA)n, (TG)n and (GT)n were considered the same class (Jurka & Pethiyagoda, 1995). The program QDD (Meglécz et al., 2010) was used to isolate microsatellites, and primers were designed according to the method of Wang et al. (2016). The annealing temperature for each primer was set between 58 and 62°C, and the difference in annealing temperatures within a primer pair was <4°C. The primer pairs output by QDD were further filtered by the following stringent criteria: (i) the microsatellites had to be pure and specific; (ii) the design strategy ‘A’ was used; and (iii) the distance between the 3′ end of each primer and its target region had to be at least 10 bp. Putative genes of validated microsatellite loci were identified using the BLASTX algorithm against the GenBank database with a maximum e-value of 1e-5.
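The grouping of repeat motifs into classes can be illustrated with a minimal Python sketch. It assumes that two motifs belong to the same class when one is a circular shift of the other or of its reverse complement, which reproduces the (AC)n, (CA)n, (TG)n, (GT)n example above. The function names are hypothetical and this is not the MSDB implementation.

```python
# Illustrative sketch only: canonical class of a repeat motif, assuming that
# circular shifts and reverse complements of a motif belong to one class.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def rotations(seq):
    """Return every circular shift of seq."""
    return [seq[i:] + seq[:i] for i in range(len(seq))]


def motif_class(motif):
    """Return the alphabetically smallest equivalent motif as the class label."""
    motif = motif.upper()
    candidates = rotations(motif) + rotations(reverse_complement(motif))
    return min(candidates)


if __name__ == "__main__":
    for m in ("AC", "CA", "TG", "GT", "AG"):
        print(m, "->", motif_class(m))
```

Using the alphabetically smallest equivalent as the label is an arbitrary but convenient choice, because any two motifs of the same class then map to an identical string.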

Primer validation and polymorphism detection

For primer validation, a primer C tail (PC tail) (5′ CAGGACCAGGCTACCGTG 3′) was added to the 5′ end of each candidate forward primer (Schuelke, 2000; Blacket et al., 2012) to reduce cost. Eight OFM larvae from eight geographical populations were used for the initial test. The final amplification volume was 10 µl, including 0.5 µl of template DNA (5–20 ng µl⁻¹), 5 µl of Master Mix (Promega, Madison, WI, USA), 0.08 µl of PC tail-modified forward primer (10 mM), 0.16 µl of reverse primer (10 mM), 0.32 µl of fluorescence-labeled PC tail (10 mM) and 3.94 µl of ddH2O. Amplification was performed under the following conditions: 4 min at 94°C; 35 cycles of 30 s at 94°C, 30 s at 56°C and 45 s at 72°C; and a final 10-min extension at 72°C. The amplified PCR fragments were analyzed on an ABI 3730xl DNA Analyzer (Applied Biosystems) using the GeneScan 500 LIZ size standard (Applied Biosystems).

Primer pairs for 64 loci (GenBank accession numbers: KX711549–KX711612) were selected for the initial test based on two criteria: (1) at most one primer pair was retained per scaffold and (2) the repeat motif of the expected amplicon was at least three nucleotides long. Primer pairs that failed to amplify target sequences were retested using annealing temperatures of 53 and 50°C. Primer pairs with amplification rates lower than 75% were excluded from genotyping (marked as ‘non-amplification’ or ‘low-success-rate’ in table S1). During genotyping with GENEMAPPER version 4.0 (Applied Biosystems), loci that showed more than two peaks in one individual (marked as ‘non-specific amplification’ in table S1) and those that had fewer than two alleles in the eight test individuals (marked as ‘no polymorphism’ in table S1) were excluded from the large-scale examination. The remaining primer pairs were validated in 95 individuals from four natural populations, following the same procedures.

The genotypes were scored using GENEMAPPER version 4.0 (Applied Biosystems). Stuttering and large allele dropout were detected using MICRO-CHECKER version 2.2.3 (Van Oosterhout et al., 2004) and checked again in GENEMAPPER. The null allele frequencies were estimated using the software FreeNA (Chapuis & Estoup, 2007). Allele frequencies, observed heterozygosity (H_O), expected heterozygosity (H_E) and the polymorphic information content (PIC) were calculated using the macros Microsatellite Tools (Park, 2001). We used GENEPOP version 4.0.11 (Raymond & Rousset, 1995) to test for deviation from Hardy–Weinberg equilibrium (HWE) at each locus/population pair and for linkage disequilibrium (LD) among loci within each population. The allelic richness (A_R) of each locus and the inbreeding coefficient (F_IS) within each population were calculated using the software FSTAT version 2.9.3 (Goudet, 1995). The program LOSITAN (Antao et al., 2008) was used to detect loci potentially under selection with two options: ‘neutral mean F_ST’ and ‘force mean F_ST’.
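For readers unfamiliar with these summary statistics, the following Python sketch shows how H_O, H_E and PIC can be computed for one locus from diploid genotypes. It is an illustration under stated assumptions, not the software used in this study: H_E is taken as 1 − Σp² without a small-sample correction, PIC follows the standard formula 1 − Σp_i² − ΣΣ2p_i²p_j², and the genotypes in the example are invented.

```python
# Illustrative sketch only: per-locus diversity statistics from diploid
# genotypes given as (allele_a, allele_b) tuples. Example data are invented.
from collections import Counter
from itertools import combinations


def diversity_stats(genotypes):
    """Return (H_O, H_E, PIC) for one locus."""
    n = len(genotypes)
    # Observed heterozygosity: share of individuals with two different alleles.
    h_o = sum(1 for a, b in genotypes if a != b) / n
    # Allele frequencies pooled over both gene copies of every individual.
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    freqs = [count / total for count in counts.values()]
    # Expected heterozygosity without small-sample correction.
    h_e = 1.0 - sum(p * p for p in freqs)
    # Polymorphic information content.
    pic = h_e - sum(2 * p * p * q * q for p, q in combinations(freqs, 2))
    return h_o, h_e, pic


if __name__ == "__main__":
    example = [(100, 100), (100, 103), (103, 106), (100, 103)]
    print(diversity_stats(example))
```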

Selection of marker panels

To assess the influence of null alleles on genetic diversity and population genetic structure, we established three marker panels based on the following criteria: (i) all of the loci (ALL), (ii) the one-third of the loci (GM3-S11, GM3-S13, GM5-S18, GM3-S22, GM3-S31, GM3-S34, GM5-S44, GM3-S46 and GM3-S64) with the lowest null allele frequencies (LNA) and (iii) the one-third of the loci (GM3-S04, GM3-S12, GM3-S15, GM3-S33, GM3-S35, GM3-S41, GM3-S49, GM3-S51 and GM3-S61) with the highest null allele frequencies (HNA). The three marker panels were used independently to estimate the genetic diversity and in population structure analyses.
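The panel selection can be sketched in Python as a simple ranking of loci by estimated null allele frequency. The locus names and frequencies below are placeholders generated for illustration, not the values in table 2, and the function name is hypothetical.

```python
# Illustrative sketch only: split loci into ALL, LNA (lowest third) and
# HNA (highest third) panels by null allele frequency. Data are placeholders.


def build_panels(null_freq_by_locus):
    """Return a dict with the ALL, LNA and HNA panels as lists of locus names."""
    ranked = sorted(null_freq_by_locus, key=null_freq_by_locus.get)
    third = len(ranked) // 3
    return {
        "ALL": ranked,
        "LNA": ranked[:third],   # lowest null allele frequencies
        "HNA": ranked[-third:],  # highest null allele frequencies
    }


# Placeholder frequencies for 27 hypothetical loci (0.00 to 0.26, shuffled).
placeholder = {f"locus_{i + 1:02d}": ((i * 7) % 27) / 100 for i in range(27)}
panels = build_panels(placeholder)

if __name__ == "__main__":
    print(panels["LNA"])
    print(panels["HNA"])
```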

Population genetic structure analyses

The pairwise differentiations (F_ST) among the four populations were calculated using both GENEPOP version 4.0.11 (Raymond & Rousset, 1995) and FreeNA (Chapuis & Estoup, 2007). The latter applies the excluding null alleles (ENA) correction to limit the effect of null alleles on estimates of genetic differentiation. Paired t-tests were used to compare the F_ST values calculated by these two methods within each marker panel.
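The paired comparison can be sketched as follows. This minimal Python example computes only the paired t statistic and its degrees of freedom from two matched lists of pairwise F_ST values (four populations give six pairs); the input values are invented, and a P-value would still have to be obtained from a t distribution.

```python
# Illustrative sketch only: paired t statistic for matched F_ST estimates.
# Input values are invented; the function requires non-identical differences.
from math import sqrt
from statistics import mean, stdev


def paired_t(uncorrected, corrected):
    """Return (t, degrees_of_freedom) for paired observations."""
    if len(uncorrected) != len(corrected):
        raise ValueError("both lists must have the same length")
    diffs = [a - b for a, b in zip(uncorrected, corrected)]
    n = len(diffs)
    # Standard error of the mean difference (sample standard deviation).
    standard_error = stdev(diffs) / sqrt(n)
    return mean(diffs) / standard_error, n - 1


if __name__ == "__main__":
    fst_uncorrected = [0.10, 0.12, 0.08, 0.15, 0.11, 0.09]
    fst_corrected = [0.09, 0.12, 0.09, 0.14, 0.11, 0.10]
    print(paired_t(fst_uncorrected, fst_corrected))
```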

The population genetic structure was analyzed using STRUCTURE version 2.3.4 (Pritchard et al., 2000). For cluster identification, we tested K from 1 to 10, with 30 replicate runs for each K, and a burn-in of 100,000 iterations followed by 200,000 Markov chain Monte Carlo iterations. The results were submitted to the online software STRUCTURE HARVESTER version 0.6.94 (Earl & vonHoldt, 2011) (http://taylor0.biology.ucla.edu/structureHarvester/) to determine the optimal K value using the ΔK method. The replicate runs were then processed and visualized with two programs, CLUMPP version 1.1.2 (Jakobsson & Rosenberg, 2007) and DISTRUCT version 1.1 (Rosenberg, 2004). To quantify how well the populations were distinguished, we considered an individual correctly assigned to its population when the Q-value obtained with CLUMPP was at least 0.6. Because the STRUCTURE analysis assumes HWE and linkage equilibrium, an additional discriminant analysis of principal components (DAPC), which does not rely on these assumptions, was implemented in the R package ADEGENET 1.4–2 (Jombart et al., 2008). Briefly, after importing the data, we chose the optimal number of clusters and principal components (PCs); other parameters and settings were determined by the script.
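The ΔK criterion can be written out in a few lines of Python. The sketch assumes the common formulation ΔK = |L(K+1) − 2L(K) + L(K−1)| / sd[L(K)], where L(K) is the mean log-likelihood over replicate runs at a given K and sd is the standard deviation over those runs. The log-likelihood values are invented, and the code is not taken from STRUCTURE HARVESTER.

```python
# Illustrative sketch only: delta-K from replicate log-likelihoods per K.
# Example values are invented; replicates at each K must not be all identical.
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Map each interior K to |L(K+1) - 2 L(K) + L(K-1)| / sd(L(K))."""
    means = {k: mean(values) for k, values in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        # Delta-K is undefined for the smallest and largest K tested.
        if k - 1 in means and k + 1 in means:
            second_difference = abs(means[k + 1] - 2 * means[k] + means[k - 1])
            result[k] = second_difference / stdev(lnp_by_k[k])
    return result


example_runs = {
    1: [-1001.0, -999.0],
    2: [-801.0, -799.0],
    3: [-791.0, -789.0],
    4: [-786.0, -784.0],
}
scores = delta_k(example_runs)
best_k = max(scores, key=scores.get)

if __name__ == "__main__":
    print(scores, best_k)
```

Because the statistic needs both neighbours of K, a run covering K = 1 to 10 yields ΔK values only for K = 2 to 9.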

Results

Genomic sequencing and characteristics of the OFM's microsatellites

In total, 4916 Mb of paired-end sequences with read lengths of 300 bp were generated by the Illumina MiSeq system. Raw sequence data were submitted to the Short Read Archive of the National Center for Biotechnology Information under accession number SRP074918. The high-quality reads were assembled into 65,534 scaffolds with a total length of 243 Mb, a mean size of 2061 bp, an N50 of 2611 bp and approximately 20-fold coverage.
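Two of these assembly statistics can be illustrated with a short Python sketch: N50, taken here as the length of the scaffold at which the cumulative length of scaffolds sorted from longest to shortest first reaches half of the assembly, and coverage, approximated as sequenced bases divided by assembled bases. The scaffold lengths in the example are invented; only the 4916 Mb and 243 Mb figures come from the text.

```python
# Illustrative sketch only: N50 and a rough coverage estimate.
# The toy scaffold lengths are invented for demonstration.


def n50(lengths):
    """Length of the scaffold at which the sorted cumulative sum reaches 50%."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("no scaffold lengths given")


def fold_coverage(sequenced_mb, assembled_mb):
    """Approximate coverage as sequenced bases over assembled bases."""
    return sequenced_mb / assembled_mb


if __name__ == "__main__":
    print(n50([10, 8, 6, 4, 2]))
    print(round(fold_coverage(4916, 243), 1))
```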

We isolated 56,674 microsatellites from the genomic sequences. No mononucleotide repeats were obtained under our selection parameters, while 36,789 (64.913%) dinucleotide, 14,135 (24.941%) trinucleotide, 5005 (8.831%) tetranucleotide, 605 (1.068%) pentanucleotide and 140 (0.247%) hexanucleotide repeats were identified (table S1). The average length of the different repeat types ranged from 13 to 35 bp, while the corresponding frequencies and densities decreased from 75.55 to 0.29 loci Mb⁻¹ and from 956.975 to 10.03 bp Mb⁻¹, respectively. The number of microsatellites decreased as the motif length increased. There were far more dinucleotide repeats than other repeats in our study, which is common in other species of Lepidoptera (A'Hara & Cottrell, 2013; Pavinato et al., 2013), Thysanoptera (Yang et al., 2012) and Coleoptera (Bouanani et al., 2014).

Validation and null alleles of microsatellite loci

In total, 11,584 loci were suitable for primer design. We retained one primer pair for each locus during primer design with the software QDD. Of the 64 primer pairs that passed the stringent selection criteria and were tested on eight individuals in the initial validation, 36 were retained, while the other 28 pairs were discarded because they had an amplification rate lower than 75%, generated more than two peaks in the genotyping process or had low polymorphism. After the second round of screening, 27 primer pairs were retained (table 1, table S2). A BLAST search showed that 14 loci are located in putative coding regions, while the others might be from non-coding regions of the OFM genome.

Table 1. Twenty-seven microsatellite markers developed for Grapholita molesta.

BLASTx, results using the BLASTx algorithm against GenBank.

*Indicates the locus deviated from Hardy–Weinberg equilibrium (HWE) in one population.

**Indicates the locus deviated from HWE in two populations.

***Indicates the locus deviated from HWE in three populations.

****Indicates the locus deviated from HWE in all four populations.

Among the 27 validated microsatellites, six loci had low frequencies of null alleles in the four populations, ranging from 0.002 to 0.078, ten loci had moderate frequencies, ranging from 0.140 to 0.190, and 11 loci had high frequencies, ranging from 0.218 to 0.325 (table 2). Among the four tested populations, CD had the highest null allele frequency (0.213), followed by NJ (0.185), SL (0.176) and SY (0.166).

Table 2. Frequencies of the null alleles of the 27 microsatellite loci developed for Grapholita molesta in four natural populations.

ALL, LNA and HNA in the first column indicate the three marker panels established based on the frequencies of null alleles.

Genetic diversity of the OFM based on the three marker panels

Population-level tests showed that all of the loci were highly polymorphic. The H_O and H_E varied from 0.081 to 0.702 and from 0.194 to 0.850, respectively. The allele numbers ranged from 4 to 40, with an average value of 13.7 per locus. The inbreeding coefficient (F_IS) ranged from −0.500 to 1.000, with an average value of 0.486, and the PIC ranged from 0.167 to 0.810, with an average value of 0.544 (table 3); P-values were adjusted using Holm's correction (Gaetano, 2013).

Table 3. Genetic diversity of the four Grapholita molesta populations calculated using 27 microsatellite markers.

A_R, allelic richness per locus; H_O, observed heterozygosity; H_E, expected heterozygosity; F_IS, inbreeding coefficient; HWE, average P-value of Hardy–Weinberg equilibrium; PIC, polymorphic information content.

Within the marker panel ALL, 12 loci showed significant deviation from HWE (P < 0.05) in different populations (table 3), which might be caused by heterozygote deficiencies and the presence of null alleles, corroborating results previously observed for most lepidopteran microsatellites (An et al., 2014). The H_O value was lower than the H_E value for all of the loci in all of the populations. A high F_IS was found for the loci GM3-S01, GM3-S15, GM5-S23, GM3-S32, GM3-S35, GM3-S41, GM3-S49, GM3-S51 and GM3-S61 in some populations, and might be caused by the Wahlund effect in the CD population (Wahlund, 1928; Dharmarajan et al., 2013). Of the 1404 pairwise comparisons between loci across the four populations, 36 pairs (11 in the CD population, 12 in the SL population, 10 in the NJ population and 3 in the SY population) showed significant LD after correction for multiple tests. Since no pair of loci showed LD in all tested populations of the OFM, the observed LD is unlikely to be due to physical linkage, as in previous studies (Torriani et al., 2010; Wei et al., 2015).
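The number of LD comparisons and the logic of a step-down multiple-test correction can be sketched in Python. The count follows from 27 loci giving 351 locus pairs in each of four populations. The correction shown is the standard Holm procedure, included as an illustration; the P-values are invented and the function name is hypothetical.

```python
# Illustrative sketch only: number of pairwise LD tests and Holm's
# step-down correction applied to invented P-values.
from math import comb


def holm_reject(p_values, alpha=0.05):
    """Return a list of booleans: True where the null hypothesis is rejected."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, index in enumerate(order):
        # The k-th smallest P-value is compared with alpha / (m - k + 1).
        if p_values[index] <= alpha / (m - rank):
            reject[index] = True
        else:
            break  # once one test fails, all larger P-values are retained
    return reject


n_ld_tests = 4 * comb(27, 2)  # four populations, all pairs of 27 loci

if __name__ == "__main__":
    print(n_ld_tests)
    print(holm_reject([0.01, 0.04, 0.03, 0.005]))
```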

When the marker panels LNA and HNA were used for genetic diversity estimations, the average A_R values for the four populations calculated from HNA were the highest, followed by those from LNA and ALL, although the differences were not statistically significant (table 4). The H_O values for these two panels were clearly lower than the H_E values; however, the differences between H_O and H_E calculated from the HNA panel were larger than those from LNA, which indicates that the presence of null alleles might explain the low H_O and the deviation from HWE. The F_IS values of loci from HNA (0.715, 0.702, 0.666 and 0.714 in CD, NJ, SL and SY, respectively) were approximately two to four times those from LNA (0.334, 0.235, 0.211 and 0.183 in CD, NJ, SL and SY, respectively).
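The link between a heterozygote deficit, F_IS and null alleles can be made concrete with a small Python sketch. It uses the simple relation F_IS = 1 − H_O/H_E and, as a stand-in for the estimator implemented in FreeNA, Brookfield's first estimator of null allele frequency, r = (H_E − H_O)/(1 + H_E). Both are simplifications relative to the software used in this study, and the input values are invented.

```python
# Illustrative sketch only: simple heterozygote-deficit measures.
# These are textbook approximations, not the FSTAT or FreeNA algorithms.


def f_is(h_o, h_e):
    """Inbreeding coefficient from observed and expected heterozygosity."""
    return 1.0 - h_o / h_e


def brookfield_null_frequency(h_o, h_e):
    """Brookfield's estimator 1 of the null allele frequency."""
    return (h_e - h_o) / (1.0 + h_e)


if __name__ == "__main__":
    # Invented values for one locus with a marked heterozygote deficit.
    print(f_is(0.30, 0.60))
    print(brookfield_null_frequency(0.30, 0.60))
```

Under this approximation, a larger gap between H_O and H_E raises both F_IS and the estimated null allele frequency, which is why the two cannot be separated from heterozygosity data alone.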

Table 4. Average allelic richness values and comparisons calculated using the three marker panels.

The neutrality test showed that GM3-S22 was a candidate for positive selection and GM3-S12 for balancing selection. The other 25 markers showed no evidence of selection. GM3-S12, GM3-S22 and another ten loci displayed significant deviation from HWE based on the average P-value (<0.05) across populations (table 3).

Population genetic structure of the OFM using three marker panels

No significant differences were observed between the uncorrected pairwise F_ST values and the pairwise F_ST values corrected for null alleles using the excluding null alleles method (t-test: ALL, P = 0.99995; LNA, P = 0.99988; HNA, P = 0.99989) (table S3), which suggests that null alleles did not affect this analysis.

Using the 27 microsatellites for a STRUCTURE analysis (marker panel ALL) resulted in the individuals from the four geographic populations being divided into five clusters. The NJ and SY populations together formed one cluster, the SL population formed another, and the CD population was separated into three distinct clusters (fig. 1a). Because STRUCTURE relies on model assumptions, an additional DAPC was performed, which clearly divided the individuals from the four sampled populations into three genetic clusters (fig. 2a). This was in agreement with the clusters determined by STRUCTURE, except that the CD population was regarded as one integral unit rather than three clusters. Our analyses indicated that the microsatellite markers validated in our study were powerful for detecting population genetic structure.

Fig. 1. Genetic structures of four Grapholita molesta populations inferred from the three marker panels using the software STRUCTURE. (a) Results based on the marker panel ALL (all of the loci) when K = 5; (b) results based on the marker panel ALL when K = 3; (c) results based on the marker panel LNA (the top one-third of the loci with the lowest null allele frequencies) when K = 3; and (d) results based on the marker panel HNA (the top one-third of the loci with the highest null allele frequencies) when K = 3. CD, population of Chengdu, Sichuan Province; NJ, population of Nanjing, Jiangsu Province; SL, population of Shilin, Yunnan Province; SY, population of Shenyang, Liaoning Province.

Fig. 2. Genetic structures of four Grapholita molesta populations inferred from the three marker panels using DAPC. (a) Results based on the marker panel ALL; (b) results based on the marker panel LNA; and (c) results based on the marker panel HNA. Abbreviations for the populations are as in Fig. 1. The NJ and SY populations formed one cluster, while SL and CD were each displayed as independent clusters.

When LNA and HNA were subjected to STRUCTURE analyses, the populations were clearly divided into three clusters. To make the results comparable and consistent, the STRUCTURE results of the ALL panel were also drawn for K = 3 (fig. 1b). Almost all of the loci in the three marker panels displayed a very high degree of distinction among the four natural populations (fig. 1b–d). The average posterior probability (Q) was high in all cases (CD: ALL, Q = 0.969; LNA, Q = 0.863; HNA, Q = 0.811; SL: ALL, Q = 0.972; LNA, Q = 0.966; HNA, Q = 0.924; NJ: ALL, Q = 0.956; LNA, Q = 0.894; HNA, Q = 0.837; SY: ALL, Q = 0.981; LNA, Q = 0.922; and HNA, Q = 0.850), although 1, 4 and 10 individuals were not correctly assigned to their populations with the ALL, LNA and HNA panels, respectively. Similar results for the population genetic structure were obtained using DAPC (fig. 2b, c).
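The assignment rule based on a Q-value threshold of 0.6 can be sketched in Python as follows. The membership coefficients, the cluster indices and the mapping of populations to clusters (with NJ and SY sharing one cluster) are assumptions made for illustration, and the function name is hypothetical.

```python
# Illustrative sketch only: count individuals whose membership coefficient
# for their expected cluster falls below a threshold. Data are invented.


def count_misassigned(q_rows, sampled_pops, pop_to_cluster, threshold=0.6):
    """Count individuals with Q below threshold for their population's cluster."""
    wrong = 0
    for q_values, population in zip(q_rows, sampled_pops):
        expected_cluster = pop_to_cluster[population]
        if q_values[expected_cluster] < threshold:
            wrong += 1
    return wrong


# Assumed cluster indices for K = 3; NJ and SY share a cluster.
cluster_of = {"CD": 0, "SL": 1, "NJ": 2, "SY": 2}
q_matrix = [
    [0.90, 0.05, 0.05],  # CD individual, clearly assigned
    [0.50, 0.30, 0.20],  # CD individual, below the 0.6 threshold
    [0.10, 0.20, 0.70],  # NJ individual, assigned to the shared cluster
    [0.10, 0.85, 0.05],  # SL individual, clearly assigned
]
populations = ["CD", "CD", "NJ", "SL"]

if __name__ == "__main__":
    print(count_misassigned(q_matrix, populations, cluster_of))
```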

Discussion

The genome sizes of sequenced lepidopteran species are usually several hundred megabases, such as Plutella xylostella (343 Mb) (You et al., 2013), Papilio machaon (281 Mb), Papilio xuthus (244 Mb) (Li et al., 2015) and Papilio glaucus (376 Mb) (Cong et al., 2015). Although the size of the OFM genome is currently not available, we assume that the 243-Mb sequences obtained in our study represent a large portion of the OFM genome.

Microsatellites are versatile molecular markers in population genetics and evolution (Estoup & Angers, 1998). Changes, such as mutations and substitutions in flanking region sequences, may prevent the primer from annealing to template DNA during amplification of the microsatellite locus by PCR, resulting in a null allele. The preferential amplification of short alleles owing to inconsistent DNA template quality or quantity, and slippage during PCR amplification (Gagneux et al., 1997), might also result in microsatellite null alleles. Additionally, the enzyme activity can decline markedly towards the end of the amplification reaction, which can also produce null alleles. Thus, markers should be validated in multiple populations to minimize null allele occurrence (Guichoux et al., 2011). In this study, we used eight individuals from eight geographical populations in the native range of the OFM for initial testing and 95 individuals from four natural populations for population-level validation. All loci used for examination had trinucleotide repeats except for GM5-S18, GM5-S23 and GM5-S44 (table 1), which had pentanucleotide repeats. High, moderate and low null allele frequencies were found among the three loci with pentanucleotide repeats, which might indicate that there is no relation between the size of the repeats and the null allele frequency (table 2). Nevertheless, high null allele frequencies were present in most of the remaining loci. Our study, as well as previous reports (Torriani et al., 2010; Kirk et al., 2013; Zheng et al., 2013; Wei et al., 2015; Zheng et al., 2015), indicated that the OFM is plagued by null alleles, which is common in lepidopteran species (Sinama et al., 2011; Jiang et al., 2014), but not in all species (Lebigre et al., 2015; Wang et al., 2016). The null allele frequencies in CD were higher than in the other three populations for almost all of the loci developed in this study. This is consistent with the high genetic diversity in the CD population that was revealed in a previous study (Wei et al., 2015), which may reflect a high mutation rate in the microsatellite-flanking sequences.

The presence of null alleles can cause heterozygote deficiencies that lead to deviations from HWE. Because null alleles create false homozygotes, they are problematic for the exclusion of the true parents in parentage analyses (Dakin & Avise, 2004). Null alleles may also reduce the power to assign individuals to populations correctly, erroneously inflate levels of genetic differentiation (Carlsson, 2008), and cause genetic diversity parameters that rely on HWE to be underestimated (Sousa et al., 2005; Chapuis & Estoup, 2007). In our study, the deviation of most loci from HWE, the low H_O values and the relatively high F_IS values may indicate that null alleles affect the estimates of genetic diversity, although the weak flight ability of the OFM might also cause high rates of inbreeding (Torriani et al., 2010; Wei et al., 2015). The results of the neutrality test also suggest that selection was an important factor causing the deviation.
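The link between null alleles, heterozygote deficiency and elevated F_IS described above can be illustrated with a small calculation. The Python sketch below is not part of the original analysis. It assumes a single locus in a randomly mating population, that heterozygotes carrying a null allele are scored as homozygotes, and that null homozygotes fail to amplify and are treated as missing data. The function name and the example allele frequencies are hypothetical.

```python
from itertools import combinations


def apparent_diversity(visible_freqs, null_freq):
    """Expected apparent H_O, H_E and F_IS at one locus with a null allele.

    Assumptions (illustrative only): random mating, visible/null
    heterozygotes scored as homozygotes, null homozygotes unscored.
    """
    if len(visible_freqs) < 2:
        raise ValueError("need at least two visible alleles")
    if not 0.0 <= null_freq < 1.0:
        raise ValueError("null_freq must be in [0, 1)")
    if abs(sum(visible_freqs) + null_freq - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")

    # Fraction of individuals that yield a genotype (all but null/null).
    scored = 1.0 - null_freq ** 2

    # Only heterozygotes for two visible alleles are scored as heterozygous.
    het = sum(2.0 * p * q for p, q in combinations(visible_freqs, 2))
    h_obs = het / scored

    # Apparent allele frequencies among scored genotypes reduce to p / (1 - r).
    apparent = [p / (1.0 - null_freq) for p in visible_freqs]
    h_exp = 1.0 - sum(q * q for q in apparent)

    f_is = 1.0 - h_obs / h_exp
    return h_obs, h_exp, f_is


# Hypothetical locus: four visible alleles plus a null allele at frequency 0.2.
h_obs, h_exp, f_is = apparent_diversity([0.3, 0.2, 0.2, 0.1], null_freq=0.2)
# Heterozygote-deficiency estimate of the null allele frequency.
r_hat = (h_exp - h_obs) / (h_exp + h_obs)
print(f"H_O={h_obs:.3f}  H_E={h_exp:.3f}  F_IS={f_is:.3f}  r_hat={r_hat:.3f}")
```

Under these assumptions the apparent F_IS equals 2r/(1 + r), where r is the null allele frequency, so a null allele alone can generate a positive F_IS without any inbreeding. Real data would additionally reflect inbreeding, a possible Wahlund effect and sampling error, which this sketch ignores.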

The three panels, with different null allele frequencies, divided the four populations into three groups using STRUCTURE and DAPC, which is congruent with a previous study (Wei et al., 2015). When we used all of the loci for a STRUCTURE analysis, the CD population was divided into three clusters. This apparent substructure might be caused by the specimen sampling locations and by the model implemented in the STRUCTURE algorithm (Pritchard et al., 2000; Falush et al., 2003). Ten, four and one individuals were not assigned to their populations of origin when the marker panels HNA, LNA and ALL were used, respectively, indicating that null alleles affect individual assignment, as previously reported (Carlsson, 2008). The STRUCTURE and DAPC results agreed with the analysis of Wei et al. (2015), in which the CD, SL, NJ and SY populations were divided into CD, SL, CE (Central) and NO (Northern) groups. As shown in table 4, the average A_R values for the four populations calculated from HNA were the highest and those from LNA were lower. The H_O values for these two panels were clearly lower than the H_E values, indicating that the presence of null alleles might cause the low H_O and the deviation from HWE. The F_IS values of loci from HNA were much higher than those from LNA. However, when LNA and HNA were subjected to STRUCTURE analyses, the populations were still clearly divided into three clusters. We conclude that null alleles influenced estimates of genetic diversity parameters but not the inferred genetic structure of the OFM.
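The panel comparison above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used in the study: loci are ranked by a single null allele frequency per locus, the lowest and highest thirds form the LNA and HNA panels, and misassignment rates are computed with all 95 genotyped individuals as the denominator. The locus names and null allele frequencies are placeholders, not the values in table 2; only the misassignment counts (10, 4 and 1) come from the text.

```python
def split_panels(null_freq_by_locus, fraction=1 / 3):
    """Rank loci by null allele frequency and return ALL, LNA and HNA panels."""
    if not null_freq_by_locus:
        raise ValueError("no loci supplied")
    ranked = sorted(null_freq_by_locus, key=null_freq_by_locus.get)
    size = max(1, round(len(ranked) * fraction))
    return {
        "ALL": ranked,
        "LNA": ranked[:size],   # lowest null allele frequencies
        "HNA": ranked[-size:],  # highest null allele frequencies
    }


# Hypothetical null allele frequencies for 27 placeholder loci (not table 2).
null_freqs = {f"locus{i:02d}": 0.01 * i for i in range(1, 28)}
panels = split_panels(null_freqs)

# Misassigned individuals per panel as reported in the text; the denominator
# assumes all 95 genotyped individuals were included in the assignment.
misassigned = {"HNA": 10, "LNA": 4, "ALL": 1}
n_individuals = 95
rates = {panel: count / n_individuals for panel, count in misassigned.items()}

for panel in ("ALL", "LNA", "HNA"):
    print(panel, len(panels[panel]), f"{rates[panel]:.1%}")
```

With 27 loci, a one-third split gives nine loci per panel, matching the panel sizes described in the abstract. Expressing the misassignments as proportions makes the contrast between the HNA panel and the other two panels easier to compare across studies with different sample sizes.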

Conclusions

We characterized the distribution of microsatellites in the genomic sequences and developed a novel set of microsatellite markers for the OFM. A population-level validation showed that the microsatellites of the OFM were plagued by null alleles. Comparisons among the three marker panels showed that null alleles could influence estimates of genetic diversity and individual assignments, but not the inferred division of the populations into genetic groups. The microsatellites developed in our study are useful markers for further genetic studies of the OFM.

Supplementary material

The supplementary material for this article can be found at https://doi.org/10.1017/S0007485316000936.

Acknowledgements

The research was funded by the Beijing Natural Science Foundation (Grant no. 6162010), National Basic Research Program of China (Grant no. 2013CB127600) and the National Natural Science Foundation of China (Grant no. 31472025).

References

A'Hara, S. & Cottrell, J. (2013) Development and characterisation of ten polymorphic microsatellite markers for the pine-tree lappet moth Dendrolimus pini (Lepidoptera: Lasiocampidae). Conservation Genetics Resources 5, 1135–1137.
An, B., Deng, X., Shi, H., Ding, M., Lan, J., Yang, J. & Li, Y. (2014) Development and characterization of microsatellite markers for rice leaffolder, Cnaphalocrocis medinalis (Guenee) and cross-species amplification in other Pyralididae. Molecular Biology Reports 41, 1151–1156.
Antao, T., Lopes, A., Lopes, R.J., Beja-Pereira, A. & Luikart, G. (2008) LOSITAN: a workbench to detect molecular adaptation based on a Fst-outlier method. BMC Bioinformatics 9, 323.
Anthony, N., Gelembiuk, G., Raterman, D., Nice, C. & Ffrench-Constant, R. (2001) Isolation and characterization of microsatellite markers from the endangered Karner blue butterfly Lycaeides melissa samuelis (Lepidoptera). Hereditas 134, 271–273.
Blacket, M.J., Robin, C., Good, R.T., Lee, S.F. & Miller, A.D. (2012) Universal primers for fluorescent labelling of PCR fragments – an efficient and cost-effective approach to genotyping by fluorescence. Molecular Ecology Resources 12, 456–463.
Bouanani, M.A., Magné, F., Lecompte, É. & Crouau-Roy, B. (2014) Development of 18 novel polymorphic microsatellites from Coccinella septempunctata and cross-species amplification in Coccinellidae species. Conservation Genetics Resources 7, 445–449.
Carlsson, J. (2008) Effects of microsatellite null alleles on assignment testing. Journal of Heredity 99, 616–623.
Chapuis, M.P. & Estoup, A. (2007) Microsatellite null alleles and estimation of population differentiation. Molecular Biology and Evolution 24, 621–631.
Cong, Q., Borek, D., Otwinowski, Z. & Grishin, N.V. (2015) Tiger swallowtail genome reveals mechanisms for speciation and caterpillar chemical defense. Cell Reports 10, 910–919.
Cox, M.P., Peterson, D.A. & Biggs, P.J. (2010) SolexaQA: at-a-glance quality assessment of Illumina second-generation sequencing data. BMC Bioinformatics 11, 485.
Dakin, E.E. & Avise, J.C. (2004) Microsatellite null alleles in parentage analysis. Heredity 93, 504–509.
Dharmarajan, G., Beatty, W.S. & Rhodes, O.E. (2013) Heterozygote deficiencies caused by a Wahlund effect: dispelling unfounded expectations. Journal of Wildlife Management 77, 226–234.
Du, L., Li, Y., Zhang, X. & Yue, B. (2013) MSDB: a user-friendly program for reporting distribution and building databases of microsatellites from genome sequences. Journal of Heredity 104, 154–157.
Earl, D.A. & vonHoldt, B.M. (2011) STRUCTURE HARVESTER: a website and program for visualizing STRUCTURE output and implementing the Evanno method. Conservation Genetics Resources 4, 359–361.
Estoup, A. & Angers, B. (1998) Microsatellites and minisatellites for molecular ecology: theoretical and empirical considerations. Advances in Molecular Ecology Nato Sciences 38, 69–75.
Falush, D., Stephens, M. & Pritchard, J.K. (2003) Inference of population structure using multilocus genotype data: linked loci and correlated allele frequencies. Genetics 164, 1567–1587.
Flanagan, N.S., Blum, M.J., Davison, A., Alamo, M., Albarrán, R., Faulhaber, K., Peterson, E. & Mcmillan, W.O. (2002) Characterization of microsatellite loci in neotropical Heliconius butterflies. Molecular Ecology Notes 2, 398–401.
Gaetano, J. (2013) Holm-Bonferroni Sequential Correction: An EXCEL Calculator - Version 1.2. Available online at https://www.researchgate.net/publication/242331583_Holm-Bonferroni_Sequential_Correction_An_EXCEL_Calculator_-_Ver._1.2.
Gagneux, P., Boesch, C. & Woodruff, D.S. (1997) Microsatellite scoring errors associated with noninvasive genotyping based on nuclear DNA amplified from shed hair. Molecular Ecology 6, 861–868.
Goudet, J. (1995) FSTAT (Version 1.2): a computer program to calculate F-statistics. Journal of Heredity 86, 485–486.
Guichoux, E., Lagache, L., Wagner, S., Chaumeil, P., Leger, P., Lepais, O., Lepoittevin, C., Malausa, T., Revardel, E., Salin, F. & Petit, R.J. (2011) Current trends in microsatellite genotyping. Molecular Ecology Resources 11, 591–611.
Habel, J.C., Finger, A., Meyer, M., Schmitt, T. & Assmann, T. (2008) Polymorphic microsatellite loci in the endangered butterfly Lycaena helle (Lepidoptera: Lycaenidae). European Journal of Entomology 105, 361–362.
Jakobsson, M. & Rosenberg, N.A. (2007) CLUMPP: a cluster matching and permutation program for dealing with label switching and multimodality in analysis of population structure. Bioinformatics 23, 1801–1806.
Jiang, W., Zhu, J., Zhan, L., Chen, M., Song, C. & Yu, W. (2014) Isolation and characterization of microsatellite loci in Polytremis nascens (Lepidoptera: Hesperiidae) and their cross-amplification in related species. Applied Entomology & Zoology 49, 177–181.
Jiggins, C.D., Mavarez, J., Beltran, M., McMillan, W.O., Johnston, J.S. & Bermingham, E. (2005) A genetic linkage map of the mimetic butterfly Heliconius melpomene. Genetics 171, 557–570.
Jombart, T., Devillard, S., Dufour, A.B. & Pontier, D. (2008) Revealing cryptic spatial patterns in genetic variability by a new multivariate method. Heredity 101, 92–103.
Jurka, J. & Pethiyagoda, C. (1995) Simple repetitive DNA sequences from primates: compilation and analysis. Journal of Molecular Evolution 40, 120–126.
Kirk, H., Dorn, S. & Mazzi, D. (2013) Worldwide population genetic structure of the oriental fruit moth (Grapholita molesta), a globally invasive pest. BMC Ecology 13, 12.
Lebigre, C., Turlure, C. & Schtickzelle, N. (2015) Characterisation of sixteen additional polymorphic microsatellite loci for the spreading but locally rare European butterfly, Brenthis ino (Lepidoptera: Nymphalidae). European Journal of Entomology 112, 389–392.
Li, X., Fan, D., Zhang, W., Liu, G., Zhang, L., Zhao, L., Fang, X., Chen, L., Dong, Y. & Chen, Y. (2015) Outbred genome sequencing and CRISPR/Cas9 gene editing in butterflies. Nature Communications 6, 8212.
Meglécz, E., Costedoat, C., Dubut, V., Gilles, A., Malausa, T., Pech, N. & Martin, J.F. (2010) QDD: a user-friendly program to select microsatellite markers and design primers from large sequencing projects. Bioinformatics 26, 403–404.
Park, S.D.E. (2001) Trypanotolerance in west african cattle and the population genetic effects of selection. PhD Thesis, University of Dublin, Dublin, Ireland.
Pavinato, V.A.C., Silva-Brandao, K.L., Monteiro, M., Zucchi, M.I., Pinheiro, J.B., Dias, F.L.F. & Omoto, C. (2013) Development and characterization of microsatellite loci for genetic studies of the sugarcane borer, Diatraea saccharalis (Lepidoptera: Crambidae). Genetics and Molecular Research 12, 1631–1635.
Peng, Y., Leung, H.C.M., Yiu, S.M. & Chin, F.Y.L. (2010) IDBA – a practical iterative de Bruijn Graph De Novo Assembler. Lecture Notes in Computer Science 6044, 426–440.
Primmer, C.R., Møller, A.P. & Ellegren, H. (1995) Resolving genetic relationships with microsatellite markers: a parentage testing system for the swallow Hirundo rustica. Molecular Ecology 4, 493–498.
Pritchard, J.K., Stephens, M. & Donnelly, P. (2000) Inference of population structure using multilocus genotype data. Genetics 7, 574–578.
Quaintance, A.L. & Wood, W.B. (1916) Laspeyresia molesta, an important new insect enemy of the peach. Journal of Agricultural Research 7, 373–378.
Raymond, M. & Rousset, F. (1995) GENEPOP (version 1.2): population genetics software for exact tests and ecumenicism. Journal of Heredity 86, 248–249.
Rosenberg, N.A. (2004) distruct: a program for the graphical display of population structure. Molecular Ecology Notes 4, 137–138.
Rothschild, G.H.L. & Vickers, R.A. (1991) Biology, ecology and control of the oriental fruit moth. pp. 389–412 in der Geest, L.P.S. & Evenhuis, H.H. (Eds) Tortricid Pests: Their Biology, Natural Enemies and Control. Amsterdam, Elsevier.
Rousset, F. (2008) genepop ’007: a complete re-implementation of the genepop software for Windows and Linux. Molecular Ecology Resources 8, 103–106.
Schuelke, M. (2000) An economic method for the fluorescent labeling of PCR fragments. Nature Biotechnology 18, 233–234.
Silva Brandão, K.L., Brandão, M.M., Omoto, C. & Sperling, F.A. (2015) Genotyping-by-sequencing approach indicates geographic distance as the main factor affecting genetic structure and gene flow in Brazilian populations of Grapholita molesta (Lepidoptera, Tortricidae). Evolutionary Applications 8, 476–485.
Sinama, M., Dubut, V., Costedoat, C., Gilles, A., Junker, M., Malausa, T., Martin, J.-F., Neve, G., Pech, N., Schmitt, T., Zimmermann, M. & Meglecz, E. (2011) Challenges of microsatellite development in Lepidoptera: Euphydryas aurinia (Nymphalidae) as a case study. European Journal of Entomology 108, 261–266.
Sousa, S.N.D., Finkeldey, R. & Gailing, O. (2005) Experimental verification of microsatellite null alleles in Norway Spruce (Picea abies [L.] Karst.): implications for population genetic studies. Plant Molecular Biology Reporter 23, 113–119.
Timm, A.E., Geertsema, H. & Warnich, L. (2008) Population genetic structure of Grapholita molesta (Lepidoptera: Tortricidae) in South Africa. Annals of the Entomological Society of America 101, 197–203.
Torriani, M.V., Mazzi, D., Hein, S. & Dorn, S. (2010) Structured populations of the oriental fruit moth in an agricultural ecosystem. Molecular Ecology 19, 2651–2660.
Van Oosterhout, C., Hutchinson, W.F., Wills, D.P.M. & Shipley, P. (2004) Micro-checker: software for identifying and correcting genotyping errors in microsatellite data. Molecular Ecology Notes 4, 535–538.
Wahlund, S. (1928) Zusammensetzung von Populationen und Korrelationserscheinungen von Standpunkt der Vererbungslehre aus betrachtet. Hereditas 11, 65–106.
Wang, Y.-Z., Cao, L.-J., Zhu, J.-Y. & Wei, S.-J. (2016) Development and Characterization of Novel Microsatellite Markers for the Peach Fruit Moth Carposina sasakii (Lepidoptera: Carposinidae) using next-generation sequencing. International Journal of Molecular Sciences 17, 362.
Wang, Z. (1994) The genetic bases of allozyme analysis (Part 2). Chinese Biodiversity 2, 213–219.
Wei, S.J., Cao, L.J., Gong, Y.J., Shi, B.C., Wang, S., Zhang, F., Guo, X.J., Wang, Y.M. & Chen, X.X. (2015) Population genetic structure and approximate Bayesian computation analyses reveal the southern origin and northward dispersal of the oriental fruit moth Grapholita molesta (Lepidoptera: Tortricidae) in its native range. Molecular Ecology 24, 4094–4111.
Yang, X.M., Sun, J.T., Xue, X.F., Zhu, W.C. & Hong, X.Y. (2012) Development and characterization of 18 novel EST-SSRs from the western flower thrips, Frankliniella occidentalis (Pergande). International Journal of Molecular Sciences 13, 2863–2876.
You, M., Yue, Z., He, W., Yang, X., Yang, G., Xie, M., Zhan, D., Baxter, S.W., Vasseur, L. & Gurr, G.M. (2013) A heterozygous moth genome provides insights into herbivory and detoxification. Nature Genetics 45, 220–225.
Zhang, D.X. (2004) Lepidopteran microsatellite DNA: redundant but promising. Trends in Ecology & Evolution 19, 507–509.
Zheng, Y., Peng, X., Liu, G., Pan, H., Dorn, S. & Chen, M. (2013) High genetic diversity and structured populations of the oriental fruit moth in its range of origin. PLoS ONE 8, e78476.
Zheng, Y., Qiao, X., Wang, K., Dorn, S. & Chen, M. (2015) Population genetics affected by pest management using fruit-bagging: a case study with Grapholita molesta in China. Entomologia Experimentalis et Applicata 156, 117–127.

Table 1. Twenty-seven microsatellite markers developed for Grapholita molesta.

Table 2. Frequencies of the null alleles of the 27 microsatellite loci developed for Grapholita molesta in four natural populations.

Table 3. Genetic diversity of the four Grapholita molesta populations calculated using 27 microsatellite markers.

Table 4. Average allelic richness values and comparisons calculated using the three marker panels.

Fig. 1. Genetic structures of four Grapholita molesta populations inferred from the three marker panels using the software STRUCTURE. (a) Results based on the marker panel ALL (all of the loci) when K = 5; (b) results based on the marker panel ALL when K = 3; (c) results based on the marker panel LNA (the top one-third of the loci with the lowest null allele frequencies) when K = 3; and (d) results based on the marker panel HNA (the top one-third of the loci with the highest null allele frequencies) when K = 3. CD, population of Chengdu, Sichuan Province; NJ, population of Nanjing, Jiangsu Province; SL, population of Shilin, Yunnan Province; SY, population of Shenyang, Liaoning Province.

Fig. 2. Genetic structures of four Grapholita molesta populations inferred from the three marker panels using DAPC. (a) Results based on the marker panel ALL; (b) results based on the marker panel LNA; and (c) results based on the marker panel HNA. Abbreviations for the populations are as in Fig. 1. The NJ and SY populations formed one cluster, while SL and CD were each displayed as independent clusters.

Supplementary material: Tables S1-S3
