Introduction
Red bean, which is also known as the azuki bean [Vigna angularis (Willd.) Ohwi & Ohashi], is a member of the legume family Fabaceae. The name azuki is a transliteration of the native Japanese name from the Chinese word Shōzu, which means small bean. The Korean name is pat. Red bean may be confused with other red-coloured beans such as the red ball bean, Mexican red bean and red kidney bean. Red bean is the second most important legume after soybean in South Korea. Although production is not as high as that of soybean, the adaptability of red bean is advantageous to growers.
V. angularis var. nipponensis is the presumed wild ancestor of the cultivated red bean (Yamaguchi, 1992). The wild variety, var. nipponensis, can be distinguished from the cultivated red bean, var. angularis, by reproductive traits such as seed number, seed size, seed dispersal and seed dormancy (Kaga et al., 2008). The wild species is distributed across a wide area from Japan to the Korean peninsula, and from China to Nepal and Bhutan (Tomooka et al., 2002). The site and timing of domestication of the red bean are not known, although China, South Korea, Japan and East Asia (mainly in temperate regions) are thought to be the places of origin. China, Japan and Korea span a range of latitudes from tropical or subtropical to temperate and have a range of floral and faunal associations, along with climates controlled in part by the East Asian monsoon (Hsieh, 1973; Ren et al., 1985; Liu, 1988). Hence, red beans from China, Korea and Japan are genetically distinct from each other (Xu et al., 2008).
Cultivated red beans show differences in morphological and physiological traits that might be influenced by selection for adaptation to differing agroecologies. These differences, collectively called the domestication syndrome, result from selection over several thousands of years for adaptation to cultivated environments and human preferences (Hawkes, 1983). Although many researchers from China and Japan have studied the red bean (Yamada et al., 2001; Tomooka et al., 2002; Zong et al., 2003; Wang et al., 2004; Han et al., 2005; Kaga et al., 2005, 2008; Wang et al., 2009; Javadi et al., 2011), limited research has been conducted on assessing the genetic diversity and population structure of the red bean in Korea.
Among molecular genetic markers, simple sequence repeat (SSR), or microsatellite, markers are becoming popular because of their advantages. They have been used to study genetic diversity, phylogenetic relationships, classification, evolutionary processes and quantitative trait loci, and for marker-assisted selection in many crop species (Agrama et al., 2007; Chung and Park, 2010; Moe et al., 2010a, b; Zhao et al., 2010a, b). However, detailed analysis of the genetic diversity and population structure of the Korean red bean using SSR markers has not been reported. Therefore, the present study focused on the assessment of the genetic diversity, population structure and gene flow of the Korean red bean using SSR markers.
Materials and methods
Plant material and DNA extraction
To analyse the genetic diversity, population structure and gene flow of the 178 red bean cultivars in Korea, plant samples were collected from the National Institute of Horticultural and Herbal Science of the Rural Development Administration, Republic of Korea (see Supplementary Table S1, available online only at http://journals.cambridge.org). DNA was extracted from fresh leaves using a DNA extraction kit (Qiagen, Hilden, Germany). The relative purity and concentration of the extracted DNA were estimated with a NanoDrop ND-1000 (NanoDrop Technologies, Inc., Wilmington, DE, USA). The final concentration of each DNA sample was adjusted to 50 ng/μl.
SSR genotyping
Thirty-nine polymorphic SSR markers were selected according to their linkage group from Han et al. (2005). The marker information is available on request from the NIAS Genebank (http://www.gene.affrc.go.jp/databases-marker_information_en.php). The M13-tailed PCR method of Schuelke (2000) was used to measure the size of the PCR products. Conditions for PCR amplification were 94°C for 3 min; 30 cycles of 94°C for 30 s, 55°C (varied according to the annealing temperature requirements of the primers) for 45 s and 72°C for 1 min; ten cycles of 94°C for 30 s, 53°C for 45 s and 72°C for 1 min; and a final extension at 72°C for 10 min. The SSR alleles were resolved on an ABI-3500 Genetic Analyzer (Applied Biosystems, Foster City, CA, USA) using GeneMapper 4.1 software (Applied Biosystems), and were sized precisely using the GeneScan Installation Kit DS-33 (Applied Biosystems) and the GeneScan 600 LIZ Size Standard version 2.0 (Applied Biosystems).
Population structure
Population structure was determined using the model-based software program Structure 2.3.3 (Pritchard et al., 2000; Falush et al., 2003). In this model, several populations (K) are assumed to be present, each of which is characterized by a set of allele frequencies at each locus. Individuals in the sample are assigned to populations (clusters), or jointly to two or more populations if their genotypes indicate that they are admixed. All loci are assumed to be independent, and each of the K populations is assumed to be in Hardy–Weinberg equilibrium (HWE). Posterior probabilities were estimated using a Markov chain Monte Carlo (MCMC) method. The MCMC chains were run with a burn-in of 100,000 iterations, followed by 200,000 iterations, using a model allowing for admixture and correlated allele frequencies. At least three runs of the Structure software were performed with K ranging from 2 to 10, and an average likelihood value, L(K), across all runs was calculated for each value of K. The model choice criterion to detect the most probable value of K was ΔK, which is an ad hoc quantity related to the second-order change in the log probability of data [LnP(D)] with respect to the number of clusters inferred by Structure (Evanno et al., 2005). An individual was assigned to a group if more than 75% of its genome fraction value was derived from that group.
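The ΔK criterion can be illustrated with a minimal sketch. It assumes the commonly used formulation in which the second-order change |L(K+1) − 2L(K) + L(K−1)| is computed on the run means and divided by the standard deviation of LnP(D) across the replicate runs at K. The function name delta_k and all LnP(D) values below are hypothetical and are not results from this study.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Return {K: delta K} from {K: [LnP(D) of each replicate run]}.

    Delta K is undefined for the smallest and largest K tested, because
    it needs the mean likelihood at both K - 1 and K + 1.
    """
    means = {k: mean(values) for k, values in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in means or k + 1 not in means:
            continue
        second_order = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        spread = stdev(lnp_by_k[k])  # variation among replicate runs at K
        if spread > 0:
            result[k] = second_order / spread
    return result


# Invented LnP(D) values for three replicate runs per K (illustration only).
example_runs = {
    2: [-9000.0, -9010.0, -8990.0],
    3: [-8500.0, -8510.0, -8490.0],
    4: [-7000.0, -7005.0, -6995.0],
    5: [-6900.0, -6920.0, -6880.0],
    6: [-6850.0, -6900.0, -6800.0],
}
dk = delta_k(example_runs)
best_k = max(dk, key=dk.get)
```

With these invented values the largest ΔK falls at K = 4, where the likelihood rises sharply and then levels off.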
Diversity analysis
The variability at each locus was measured in terms of the number of alleles, heterozygosity (H), major allele frequency (MAF), gene diversity (GD) and polymorphic information content (PIC) using PowerMarker 3.25 (Liu and Muse, 2005). The neighbour-joining algorithm of MEGA4 software embedded in PowerMarker was used to construct an unrooted neighbour-joining tree of accessions based on the shared allele distances (Tamura et al., 2007). The same program was used to test the HWE and pairwise linkage disequilibrium (LD) values. The allele frequencies were analysed using a microsatellite toolkit in an Excel add-in program.
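As an illustration of these per-locus statistics, the sketch below computes them for one diploid locus using the standard textbook definitions: GD = 1 − Σp², PIC = 1 − Σp² − ΣΣ2p_i²p_j² (i < j), and H as the observed proportion of heterozygous genotypes. It does not reproduce PowerMarker itself, which may apply sample-size corrections; the function name locus_stats and the genotypes are hypothetical.

```python
from collections import Counter
from itertools import combinations


def locus_stats(genotypes):
    """Summarise one diploid locus given a list of (allele, allele) pairs."""
    alleles = [allele for genotype in genotypes for allele in genotype]
    counts = Counter(alleles)
    freqs = [count / len(alleles) for count in counts.values()]

    gene_diversity = 1 - sum(p * p for p in freqs)
    # PIC subtracts the expected share of uninformative heterozygote matings.
    pic = gene_diversity - sum(
        2 * p * p * q * q for p, q in combinations(freqs, 2)
    )
    heterozygosity = sum(a != b for a, b in genotypes) / len(genotypes)

    return {
        "n_alleles": len(counts),
        "maf": max(freqs),
        "gd": gene_diversity,
        "pic": pic,
        "h": heterozygosity,
    }


# Hypothetical genotypes (allele sizes in bp) for four accessions.
stats = locus_stats([(100, 100), (100, 102), (102, 102), (100, 100)])
```

Because PIC removes a non-negative term from GD, PIC can never exceed GD, which matches the ordering of the averages reported for the 39 loci.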
Results
SSR polymorphism
In total, 431 alleles were detected, with an average of 11 alleles per locus, among the 178 tested red bean accessions. Forty-six specific alleles were identified at 20 loci. Locus CEDG090 had the highest number (n = 22) of alleles, whereas only two alleles were observed at loci CEDG144 and CEDC018 (Table 1). The allele frequency data showed that rare alleles (n = 178; frequency < 0.05) comprised 41.3% of all alleles, while intermediate (n = 162; 0.05 < frequency < 0.50) and abundant alleles (n = 91; frequency > 0.50) comprised 37.6 and 21.1% of all detected alleles, respectively; thus, most of the alleles were of low frequency (Table 1). The values for H ranged from 0 to 0.99 with a mean value of 0.49. A Bonferroni correction for multiple comparisons was applied to the HWE test at a significance level of P < 0.05, and 38 loci (all except CEDG102) deviated from the HWE (Table 1). Highly significant (P < 0.01) LD values were observed among the 38 pairs of loci (except between locus CEDG019 and locus CEDG025, and between locus CEDG144 and locus CEDG148). The LD values of these two pairs of loci were significant only at the 5% probability level. The average GD and PIC values were 0.667 and 0.634, respectively, and ranged from 0.011 (CEDG144) to 0.896 (CEDG090), and from 0.011 (CEDG144) to 0.888 (CEDG090), respectively.
NA, number of alleles; GD, gene diversity; H, heterozygosity; PIC, polymorphic information content; HWE(p), likelihood probability value for the HWE; LD(p), probability value for the χ² test for LD.
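The percentages above follow directly from the allele counts, as the short sketch below shows. The counts and the number of loci are taken from the text; the classify helper is hypothetical, and its handling of frequencies exactly equal to 0.05 or 0.50 is an assumption, because the text does not say which class those boundary values belong to.

```python
def classify(freq):
    """Assign an allele frequency to a class (boundary handling assumed)."""
    if freq < 0.05:
        return "rare"
    if freq < 0.50:
        return "intermediate"
    return "abundant"


# Allele counts per frequency class as reported in the text.
counts = {"rare": 178, "intermediate": 162, "abundant": 91}
total = sum(counts.values())

# Share of each class among all detected alleles, in per cent.
shares = {name: round(100 * n / total, 1) for name, n in counts.items()}

# Mean number of alleles per locus across the 39 SSR loci.
alleles_per_locus = total / 39
```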
a Allele occurs only at a specific accession.
b Allele frequency less than 5%.
c Loci deviating from the HWE.
Distance-based phylogeny and genetic diversity
A genetic distance-based analysis was performed by calculating the shared allele distances, and an unrooted phylogram (Fig. 1) was computed using PowerMarker 3.25. All 178 accessions were placed into one of two main groups (G I and G II). The first group, G I, can be divided into two subgroups: G I-1 and G I-2. G I-1 comprised 94 accessions (all from South Korea, including all var. nipponensis and many var. angularis accessions). G I-2 comprised only two outlying accessions from South Korea (both Vigna nakashimae). G II can also be subdivided into two subgroups. G II-1 comprised 72 accessions (all North Korean accessions and some from South Korea). Subgroup G II-2 comprised ten accessions (five each from North and South Korea).
The colours in Fig. 1 indicate the populations determined by Structure analysis. According to these results, the 178 red bean accessions were divided into four populations, P1, P2, P3 and P4, with 83, 8, 11 and 71 accessions, respectively; the remaining five accessions were recognized as admixtures.
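A shared allele distance between two accessions can be sketched as one minus the proportion of alleles they share, averaged over loci. The version below assumes diploid genotypes with no missing data; PowerMarker's own implementation may differ in detail, and the function name and genotypes are hypothetical.

```python
from collections import Counter


def shared_allele_distance(ind_a, ind_b):
    """Distance between two diploid individuals genotyped at the same loci.

    Each individual is a list of (allele, allele) pairs, one pair per locus.
    """
    shared = 0
    for genotype_a, genotype_b in zip(ind_a, ind_b):
        # Multiset intersection counts alleles present in both genotypes.
        common = Counter(genotype_a) & Counter(genotype_b)
        shared += sum(common.values())
    return 1 - shared / (2 * len(ind_a))


# Two hypothetical accessions typed at two loci (allele sizes in bp).
accession_1 = [(100, 100), (150, 152)]
accession_2 = [(100, 102), (150, 152)]
distance = shared_allele_distance(accession_1, accession_2)
```

A matrix of such pairwise distances is the input that a neighbour-joining algorithm turns into an unrooted tree.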
Genetic relationships and population structure analysis
The model-based clustering analysis was performed on all 178 accessions using the 39 selected SSR markers. The exact value of K (gene pool) was not clear because the estimated LnP(D) values increased until K = 9, although a drastic increase in the LnP(D) value occurred between K = 3 and K = 4; after K = 4, the LnP(D) values diminished (Fig. 2(a)). Therefore, the ad hoc quantity ΔK (Evanno et al., 2005) was used to determine the real K value. The highest value of ΔK for the 178 accessions was observed at K = 4 (Fig. 2(b)); however, a similar ΔK was also observed at K = 6. The lowest α value for K = 4 was 0.0251 and that for K = 6 was 0.0258. The relatively small value of the α parameter (α = 0.0251) indicated that most accessions originated from one primary ancestor, with a few admixed individuals (Ostrowski et al., 2006). Five accessions were identified as being derived from mixed ancestry at K = 4, while at K = 6, 13 accessions were found to have been derived from mixed ancestry (inferred genome fraction value < 75%). Cluster 1 consisted of 83 accessions (from South Korea), whereas cluster 2 comprised eight accessions (from South and North Korea), cluster 3 comprised 11 accessions (from South Korea) and cluster 4 contained 71 accessions (from South and North Korea; Figs 1 and 2; Supplementary Table S1, available online only at http://journals.cambridge.org).
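The 75% assignment rule can be expressed as a short sketch. It assumes that each accession comes with a vector of inferred genome fractions, one per cluster, summing to 1; the function name assign_cluster and the example vectors are hypothetical.

```python
def assign_cluster(genome_fractions, threshold=0.75):
    """Return the 1-based cluster index, or None for an admixed accession.

    An accession is assigned only if more than `threshold` of its inferred
    genome fraction comes from a single cluster.
    """
    best = max(range(len(genome_fractions)), key=genome_fractions.__getitem__)
    if genome_fractions[best] > threshold:
        return best + 1
    return None


# Hypothetical genome fraction vectors at K = 4.
clear_member = assign_cluster([0.90, 0.05, 0.03, 0.02])
admixed = assign_cluster([0.50, 0.40, 0.05, 0.05])
```

Here the first accession is placed in cluster 1, while the second is treated as admixed because no single cluster exceeds the threshold.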
Gene flow study
The proportion of different alleles at microsatellite loci was analysed using a microsatellite toolkit. At locus CEDG029, one allele (144 bp) was shared among all three groups of varieties and species (var. nipponensis, var. angularis and V. nakashimae), and three alleles (136, 144 and 188 bp) were shared between the wild ancestors and cultivated varieties (Fig. 3), while at locus CEDG090, one allele (83 bp) was shared among all three groups and 12 alleles were shared between the wild ancestors and cultivated varieties (Fig. 4). The wild ancestor of the red bean is dispersed almost throughout South Korea, while V. nakashimae was found around the Rural Development Administration, Suwon, South Korea.
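Finding alleles shared between groups amounts to a set intersection, as in the sketch below for locus CEDG029. Only the shared alleles (136, 144 and 188 bp) come from the text; the additional group-specific alleles (150, 160 and 170 bp) are invented placeholders so that the example has something to exclude.

```python
# Allele sizes (bp) observed per group at one locus.
# 136, 144 and 188 follow the text; 150, 160 and 170 are hypothetical.
alleles_by_group = {
    "nipponensis": {136, 144, 188, 150},
    "angularis": {136, 144, 188, 160},
    "nakashimae": {144, 170},
}

# Alleles present in every group.
shared_all = set.intersection(*alleles_by_group.values())

# Alleles shared between the wild ancestor and the cultivated variety.
shared_wild_cultivated = (
    alleles_by_group["nipponensis"] & alleles_by_group["angularis"]
)
```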
Discussion
Red bean is the second most important legume in Korea, and it forms complex populations with its wild ancestor. Thus, characterization of these varieties is vital for understanding their genetic relationships for their conservation and management, as well as for crop improvement programmes. Many populations have phenotypically similar variations between wild ancestors and cultivated forms. Such populations are classified as complex populations (Tomooka et al., 2002). Analyses using amplified fragment length polymorphism (Xu et al., 2000a) and random amplified polymorphic DNA (RAPD; Xu et al., 2000b) have shown that a high level of genetic variation can be found in such complex populations. The red bean accessions collected from different regions of Korea (South and North) exhibited a high genetic diversity and PIC for 39 SSR loci, with a high average number of alleles (11) per locus. Cui et al. (2010) reported that the number of alleles detected and GD were strongly correlated with the number of accessions used. However, this result may depend on the specific accessions used in each study (Zhao et al., 2009). The wide variation in genetic distance among the different accessions revealed by SSRs reflected a high level of polymorphism at the DNA level. SSRs have a high degree of variation, which is believed to be due to DNA slippage during replication, unequal crossing over and genetic recombination (Park et al., 2009). The present results also suggested introgression, with complex relationships among the tested accessions.
Introgression is the cause of higher genetic variation in complex populations, which naturally leads to the assumption that some individuals in complex populations are offspring from natural outcrossing between cultivated and wild forms (Wang et al., 2004). Although no precise information exists on the level of outcrossing in the red bean, insects such as the carpenter bee (Xylocopa appendiculata) are thought to contribute to temporal gene flow in red bean populations (Tomooka et al., 2002).
In total, 431 alleles were detected at 39 loci in this population of 178 accessions, which demonstrates the high discriminating power of SSR markers. Our collected population consisted of two groups by origin (South and North Korea) and three groups by variety or species (var. nipponensis, var. angularis and V. nakashimae; Supplementary Table S1, available online only at http://journals.cambridge.org). The genetic structure did not clearly separate the South and North Korean populations. This was unexpected, as it suggests that genetic variation is not primarily affected by the place from which the population was collected. Moreover, the different varieties could not be differentiated clearly from the present data, as V. angularis var. nipponensis is presumed to be the wild ancestor of the cultivated type V. angularis var. angularis (Yamaguchi, 1992). Only V. nakashimae could be differentiated from the two varieties, and only by diversity analysis, not by Structure analysis. The Structure software assumes that, within a population, the loci are in HWE and linkage equilibrium; however, in real populations, many influences disturb the HWE, including non-random mating, mutations, selection, limited population size, random genetic drift, gene flow and meiotic drive (Hardy, 1908). Of the 39 loci, 38 deviated from the HWE and showed highly significant LD. V. nakashimae was proposed to belong to the genus Vigna, subgenus Ceratotropis (Verdcourt, 1970). Furthermore, Tateishi (1985) proposed that V. nakashimae should be treated as a subspecies of Vigna minima, under the name V. minima (Roxb.) Ohwi & Ohashi subsp. nakashimae (Ohwi) Tateishi. Tomooka et al. (1995) found that V. nakashimae and V. minima are in the same subgroup based on RAPD analysis.
Being cleistogamous plants, peas and beans are self-pollinating species. In other words, very little outcrossing can occur in natural populations. Although cleistogamy and cross-incompatibility reduce gene flow, our results indicate that introgression occurs in the red bean. Successful cross-compatibility between the wild ancestor of the red bean (V. angularis var. nipponensis) and V. nakashimae was detected by Siriwardhane et al. (1991). The pattern of variation observed in this study can be explained by gene flow from cultivated red beans, which are grown in the area, to wild plants in South and North Korean populations.
The first report of outcrossing events in natural populations of the red bean, which has always been considered to be a self-pollinating plant, was from Wang et al. (2004). Our results also support the occurrence of outcrossing, given the presence of heterozygosity (H) at multiple loci. The unintended introgression of genes from genetically modified (GM) plants into natural populations by gene flow has become a matter of public concern in many countries. One must seriously consider whether GM red beans should be produced. A risk of unintentional transgene escape into wild populations exists at locations where complex or weedy populations are found (Wang et al., 2004). In fact, the dispersal of transgenes into wild relatives is difficult to control, even in predominantly self-pollinating crops such as red beans, and research is ongoing into the occurrence and probable or potential effects of gene flow into wild relatives.
Here, we demonstrated that some red bean accessions have specific alleles that could have unforeseeable benefit in genetic improvement strategies. Our findings help explain the genetic relationships and population structure of the red bean in Korea, and are useful for designing effective breeding programmes and broadening the genetic base of commercial varieties. Moreover, the results indicate that gene flow occurs between cultivated red beans and wild relatives in a given region.
Acknowledgements
This work was carried out with the support of the Cooperative Research Program for Agriculture Science & Technology Development (Project No. 200908FHT020609001), Rural Development Administration, Republic of Korea.