Introduction
Grass pea (Lathyrus sativus L.) is a legume crop widely cultivated in several arid and semi-arid countries, especially in Ethiopia, where it covers 9% of the total pulse-growing area (CSA, 2007).
DNA markers, particularly simple sequence repeats (SSRs), have proven to be very powerful tools in a variety of genetic studies in plants (Katti et al., 2001). The presence of SSRs in gene coding regions and the availability of extensive full and partial cDNA/expressed sequence tag (EST) sequences for many plant species provide an effective way to develop SSR markers directly from ESTs (EST-SSRs) (Eujayl et al., 2004; Holton et al., 2002; Kantety et al., 2002). This strategy makes it possible to overcome crucial limitations associated with the development of SSR markers, especially in crops for which molecular information is scant, such as grass pea.
Here, we present the first assessment of genetic variability and genetic structure among Ethiopian grass pea accessions using EST-SSRs.
Material and methods
Plant material
A total of 240 plants, representing 20 grass pea accessions from different regions of Ethiopia, were analyzed. The GenElute Plant Genomic DNA Miniprep Kit (Sigma-Aldrich, St. Louis, MO, USA) was used to isolate genomic DNA from 2-3-week-old leaves.
EST-SSR marker development
Nineteen new EST-SSRs were designed from the 65 L. sativus ESTs deposited in the public database (http://www.ncbi.nlM.nih.gov/dbEST), using BatchPrimer3 software (http://probes.pw.usda.gov/cgi-bin/batchprimer3/batchprimer3.cgi). In addition, 24 EST-SSRs from Medicago truncatula, which have proven to be transferable to other legume species (Gutierrez et al., 2005), were selected for use in grass pea.
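The first step of EST-SSR design, locating repeat motifs in EST sequences, can be sketched as follows. This is a minimal illustration only, not the BatchPrimer3 algorithm: the function name `find_ssrs` and the minimum repeat counts (6 for di-, 4 for tri- and 3 for tetranucleotide motifs) are assumptions chosen for the example, and only perfect repeats are detected.

```python
import re


def find_ssrs(seq, min_repeats=None):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    # Hypothetical thresholds: motif length -> minimum number of copies.
    if min_repeats is None:
        min_repeats = {2: 6, 3: 4, 4: 3}
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in sorted(min_repeats.items()):
        # ([ACGT]{k}) captures one motif; \1{n,} requires n further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs built from a shorter unit (e.g. "AA" or "AGAG"),
            # which would duplicate hits found at a smaller motif length.
            if motif in (motif + motif)[1:-1]:
                continue
            hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return sorted(hits)


# Toy sequence containing an (AG)7 and a (GAA)5 repeat.
example = "GGC" + "AG" * 7 + "TTCC" + "GAA" * 5 + "C"
print(find_ssrs(example))
```

Primers would then be designed in the regions flanking each reported repeat, which is the part handled by the primer-design software.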
PCR reaction and fragment analysis
PCR was performed in a final reaction volume of 15 μl containing 30 ng genomic DNA, 5 × PCR buffer, 0.2 mM of each dNTP, 0.5 units of GoTaq® polymerase (Promega), 0.3 μl each of forward and reverse primers and 0.02 mM labelled M13 primer (6-FAM/VIC/PET/NED) (Schuelke, 2000). Amplicons were analyzed using an ABI3130xl genetic analyzer (Applied Biosystems).
Data analysis
Allele size was determined in base pairs using GeneMapper® Software v3.7 (Applied Biosystems). The allelic data were subjected to diversity analysis within and among the accessions using PowerMarker 3.25 (Liu and Muse, 2005). MICRO-CHECKER 2.2.1 (Van Oosterhout et al., 2004) was used to check for potential genotyping errors, such as allelic dropouts, stuttering or null alleles. Analysis of molecular variance (AMOVA) and other population statistics were calculated using GenAlEx 6.1 (Peakall and Smouse, 2006), and population structure was examined using STRUCTURE 2.3.1 (Pritchard et al., 2000; Falush et al., 2003).
Results
Microsatellite validation and variability
Polymorphism screening was performed in five randomly chosen grass pea accessions. The screening revealed seven of the 19 Lathyrus EST-SSRs and four of the 24 Medicago EST-SSRs to be polymorphic, and these 11 markers were used for the variation analysis.
From the 11 polymorphic EST-SSRs, a total of 45 alleles were detected in the 240 individual plants genotyped. The number of alleles per locus ranged from two (locus Ls942) to seven (locus MtBA32F05), with an average of four. Polymorphism information content (PIC) ranged from 0.184 (Ls942) to 0.776 (MtBA32F05), with a mean value of 0.416 (Table 1). The most informative markers were MtBA32F05 and MtBA10B02, with PIC values of 0.776 and 0.639, respectively. Rare alleles (frequency < 0.05) were observed in all markers except Ls932, with the highest number in Ls074. Rare alleles represented 35% of the 45 alleles detected. The correlation between gene diversity (GD) and the number of alleles was high (r = 0.825, P < 0.05).
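For readers who want to see how the per-locus statistics relate to allele frequencies, the sketch below computes gene diversity, PIC and the number of rare alleles for one locus. It uses the commonly applied definitions (GD = 1 − Σp², and PIC as GD minus the term 2Σp_i²p_j² summed over allele pairs); the function names and the toy frequencies are assumptions made for illustration, not values taken from Table 1.

```python
def gene_diversity(freqs):
    """Expected heterozygosity at one locus: 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """Polymorphism information content: GD - sum_{i<j} 2 * p_i^2 * p_j^2."""
    squares = [p * p for p in freqs]
    cross = sum(
        2.0 * squares[i] * squares[j]
        for i in range(len(squares))
        for j in range(i + 1, len(squares))
    )
    return gene_diversity(freqs) - cross


def count_rare(freqs, threshold=0.05):
    """Number of alleles whose frequency is below the rare-allele threshold."""
    return sum(1 for p in freqs if p < threshold)


# Hypothetical allele frequencies for a single locus (must sum to 1).
toy_locus = [0.60, 0.25, 0.12, 0.03]
print(round(gene_diversity(toy_locus), 3), round(pic(toy_locus), 3), count_rare(toy_locus))
```

Because PIC subtracts a further non-negative term from gene diversity, it can never exceed GD at the same locus, and both tend to increase as alleles become more numerous and more evenly distributed.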
Table 1 Characteristics of the EST-SSR markers used

AF, frequency of alleles; H, heterozygosity.
a Calculated over a set of 240 individual plants.
Diversity among and within accessions
Allele frequencies were re-adjusted within populations to account for null alleles, and diversity analysis was performed on the adjusted data. Average values of the effective number of alleles per locus, percentage of polymorphic loci, Ho, He and Shannon's information index (I) were 1.96, 95.5%, 0.404, 0.419 and 0.704, respectively; our data therefore show that moderately high diversity exists among the accessions under study.
Accessions were grouped into seven populations based on their collection site to measure the diversity among regions. Average values of the effective number of alleles per locus, percentage of polymorphic loci, I, Ho and He were 2.09, 97.4%, 0.760, 0.390 and 0.430, respectively. The Gojam, Welo and Gonder regions showed higher values for the diversity measures, whereas the Arsi and Hararge regions exhibited lower levels of diversity.
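The indices reported in this subsection can be illustrated with a short sketch that derives them from raw genotypes at a single locus. It assumes diploid genotypes coded as pairs of allele sizes and uses the usual definitions (Ne = 1/Σp², I = −Σp ln p, Ho as the proportion of heterozygous plants, He = 1 − Σp²); the function name `locus_stats` and the toy genotypes are hypothetical, and no null-allele adjustment is applied.

```python
import math
from collections import Counter


def locus_stats(genotypes):
    """Diversity indices for one locus from a list of (allele_a, allele_b) pairs."""
    n = len(genotypes)
    # Count every allele copy; each diploid plant contributes two copies.
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    freqs = [count / (2 * n) for count in counts.values()]
    sum_sq = sum(p * p for p in freqs)
    return {
        "Ne": 1.0 / sum_sq,                              # effective number of alleles
        "I": -sum(p * math.log(p) for p in freqs),       # Shannon's information index
        "Ho": sum(1 for a, b in genotypes if a != b) / n,  # observed heterozygosity
        "He": 1.0 - sum_sq,                              # expected heterozygosity
    }


# Hypothetical genotypes (allele sizes in bp) for four plants at one locus.
toy = [(100, 100), (100, 104), (104, 104), (100, 104)]
print(locus_stats(toy))
```

Averaging these per-locus values over all loci, within each accession or region, gives summary figures of the kind quoted above.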
Genetic structure
The Ls989 locus showed null alleles in most of the accessions and was excluded from further analysis. STRUCTURE was run for K = 1–10 based on the distribution of the remaining 41 alleles at ten EST-SSR loci among the 240 plants. The STRUCTURE simulations indicated K = 3 as the most likely number of clusters.
STRUCTURE also revealed that cluster I was composed of individuals from the Northern regions (Tigray, Gojam, Gonder and Welo). Cluster II comprised individuals from all the growing regions, and cluster III consisted primarily of individuals from Shewa and Gojam, with a few representatives from Welo and Gonder. None of the clusters contained individuals from a single region only (Fig. 1).

Fig. 1 Estimated population structure of grass pea landraces from Ethiopia. Summary plot of estimates of Q (estimated membership coefficients for each individual in the three clusters) as inferred from STRUCTURE.
AMOVA
AMOVA showed that within-accession diversity explained most of the variation (84%). The mean Φpt value (analogous to FST) of 0.15 indicated a moderate level of differentiation among the accessions, whereas differentiation among regions was low (1%) (Supplementary Table S1, available online only at http://journals.cambridge.org).
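The relationship between the variance partition and the Φ statistics can be sketched as below, assuming a three-level hierarchy (among regions, among accessions within regions, within accessions). The function name is hypothetical, and the 15% among-accession share is inferred here by subtraction from the reported 84% and 1%; because those percentages are rounded, the output only approximates the reported value of 0.15.

```python
def phi_statistics(var_regions, var_accessions, var_within):
    """Phi statistics from hierarchical AMOVA variance components."""
    total = var_regions + var_accessions + var_within
    return {
        # Differentiation among regions relative to the total variance.
        "phi_rt": var_regions / total,
        # Differentiation among accessions within regions.
        "phi_pr": var_accessions / (var_accessions + var_within),
        # Differentiation among accessions relative to the total variance.
        "phi_pt": (var_regions + var_accessions) / total,
    }


# Rounded percentages: 1% among regions, 15% (inferred) among accessions, 84% within.
for name, value in phi_statistics(1.0, 15.0, 84.0).items():
    print(name, round(value, 3))
```

Since only ratios of variance components enter the formulas, percentages can be passed in place of the raw estimated variances.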
Discussion
Development of SSRs from EST databases has been shown to be a feasible option for obtaining high-quality nuclear markers. For under-studied crops, this method is a relatively cheap way of developing sequence-based markers (Gupta et al., 2003; Ellis and Burke, 2007). Thirty-seven per cent (7 out of 19) of the newly designed Lathyrus EST-SSRs and 17% (4 out of 24) of the Medicago EST-SSRs were polymorphic among the accessions under study. Diversity analysis showed a moderate level of diversity among the analyzed accessions. Our analysis also confirmed the transferability of this type of marker to related species, as demonstrated by the successful use of Medicago EST-SSRs in grass pea. Conversely, we also observed successful transfer of Lathyrus EST-SSRs to other species such as groundnut and green peas (data not shown).
High levels of heterozygosity were observed in the Gojam, Gonder and Welo regions. Accessions from Gonder also showed a high number of different alleles. Tadesse and Bekele (2003) reported significant variation among grass pea accessions from Ethiopia, based on morphological data. Their study showed higher variability in accessions from the Gonder and Tigray regions. The lowest diversity estimates using EST-SSR markers were observed in the Arsi and Hararge regions. This might be due to limited sampling, since these two regions were represented by only one accession each, but it could also reflect a genuinely low level of diversity, since grass pea is not common in these two regions. An uneven distribution of alleles was observed among the analyzed samples, as revealed by the number of rare alleles (frequency ≤ 0.05), which accounted for 35% of the total number of alleles detected.
Analysis of population genetic structure across the accessions identified three groups in which individuals clustered independently of their collection region, and it also showed admixture among accessions. This relatively low genetic differentiation among regions could be explained by gene flow due to the movement of seeds, since seed exchange among farmers is a mechanism used to enhance the diversity of local germplasm and avoid crop failure. This results in a wider distribution of alleles among different populations, irrespective of geographical distance (Louette et al., 1997). Grass pea reproductive biology might also have contributed to increased within-population variation. In fact, although the floral biology of grass pea favours self-pollination (Campbell, 1997; Yadav and Bejiga, 2006), there are records of substantial outcrossing, which is dependent on environmental and/or genetic factors (Chowdhury and Slinkard, 1997; Gutiérrez-Marcos et al., 2006). Our results showed a significant level of variation among accessions, although most of the variation was present within populations. The current Ethiopian ex-situ collection of grass pea consists predominantly of samples from the Shewa region (45%). Based on the present results, it might be useful to increase the number of representative samples from other regions. With the establishment of second-generation high-throughput sequencing technologies, the number of EST sequences from Lathyrus deposited in the public database is expected to increase significantly in the near future. This could facilitate the development and application of a large set of molecular markers, such as EST-SSRs, which proved to be very informative and easy to develop.
As a result of the modern genomic revolution, the number of EST sequences deposited in the public database is expected to increase exponentially. Greater use of markers such as EST-SSRs is therefore highly recommended, as this study showed that the EST-SSRs developed here for the first time for grass pea are useful tools for genetic diversity analysis in the species.
In our study, the Gojam, Gonder and Welo regions had a higher level of diversity than the others. This is consistent with the results of Tadesse and Bekele (2003), whose study of grass pea accessions from Ethiopia was based on morphological data. Most of the variation was due to differences between individuals within accessions (84%), with a moderately high level of population differentiation (mean FST = 0.15, P < 0.001). This may be due to the farmers' habit of mixing seeds from different sources before sowing as a means of avoiding crop failure. The results are similar to those reported by Chowdhury and Slinkard (2000), who found that 90.7% of the variability was due to within-region diversity. Our results showed moderate genetic variability in the grass pea populations of Ethiopia, mostly within accessions.
Although still based on a limited number of molecular markers, our study represents the most accurate description to date of L. sativus germplasm from Ethiopia, where this crop is often the only choice available for harvest under adverse growing conditions. Modern breeding strategies for this important but neglected crop might take advantage of the information provided here.