Introduction
Guizotia abyssinica (L.f.) Cass. is an economically important edible oilseed crop that belongs to the tribe Heliantheae in the family Asteraceae. This crop, commonly known as ‘noug’ (in Amharic) or ‘niger’ (in English), is the only domesticated species of the small Afro-montane endemic and exclusively diploid (2n = 30) genus Guizotia (Baagøe, 1974; Hiremath and Murthy, 1992; Dagne, 1995). It is an annual crop widely cultivated in Ethiopia and India (Riley and Belayneh, 1989; Getinet and Sharma, 1996), and also on a small scale in several other African and Asian countries, as an edible oil crop (Murthy et al., 1993; Getinet and Sharma, 1996), and in the USA, mainly as a component of birdseed (Kandel and Porter, 2002). Additionally, the oil is used for various industrial purposes such as soaps, paints, illuminants and lubricants (Baagøe, 1974; Riley and Belayneh, 1989; Kandel and Porter, 2002), and for cultural and medicinal purposes (Geleta et al., 2002). Niger seed oil is rich in linoleic acid (Dutta et al., 1994; Alemaw and Teklewold, 1995; Dagne and Jonsson, 1997; Ramadan and Mörsel, 2002). The species is strictly outcrossing via one or more self-incompatibility mechanisms (Riley and Belayneh, 1989; Nemomissa et al., 1999), and is pollinated mainly by insects, particularly bees (Fichtl and Adi, 1994; Geleta et al., 2002).
Some evidence indicates that the domesticated species originated from G. scabra ssp. schimperi through selection and further cultivation (Baagøe, 1974; Hiremath and Murthy, 1988; Murthy et al., 1993; Dagne, 1994, 1995, 2001).
Ethiopia is the centre of origin and diversity of niger (Harlan, 1969; Zeven and de Wet, 1982), where it has the longest history of cultivation (Baagøe, 1974; Hiremath and Murthy, 1988). It can be cultivated on waterlogged, marginal and poor soils where most other crops fail to grow (Getinet and Sharma, 1996), because of its ability to withstand salinity and anoxia (Abebe et al., 1978). It is also known for its suitability for multiple cropping, especially as a border crop in fields of other crops (Geleta et al., 2002). In Ethiopia, it is grown mainly from 1600 to 2200 m above sea level (asl), where temperatures range from 15 to 23°C and annual rainfall from 500 to 1000 mm (Getinet and Sharma, 1996). Niger occupies about 50% of the total oil crop area and production volume in Ethiopia, where its production is based mainly on local landraces that need genetic improvement in traits such as seed yield, seed oil content, oil fatty acid composition and pest/disease resistance. Such genetic improvement through breeding depends on the magnitude of genetic diversity and the extent to which this diversity is utilized. The few genetic diversity studies published to date are based on morphological characterization (Nayakar, 1976; Alemaw and Teklewold, 1995; Pradhan et al., 1995; Genet and Belete, 2000), and the only published marker-based study used random amplified polymorphic DNA (RAPD) (Geleta et al., 2007).
The present study was undertaken to investigate the extent of genetic variation within and among populations of Ethiopian niger using the amplified fragment length polymorphism (AFLP) marker technique. It also set out to identify any hotspots of diversity that may be important in optimizing both conservation strategies and the utilization of existing genetic diversity.
Materials and methods
Plant material and DNA extraction
Seventeen populations of niger were collected from farmers' fields in Ethiopia. A single farmer's field was considered as a population, and each population was represented by a single seed collected from each of ten individual plants. The populations were sampled from a wide range of altitudes (1590–2550 m asl) representing all regions where niger is currently cultivated. Each region was represented by at least one population (Fig. 1, see also Table 1). Seeds were grown in a greenhouse and fresh 15- to 30-day-old leaves were used for genomic DNA extraction, using the modified cetyl trimethyl ammonium bromide (CTAB) procedure described in Assefa et al. (2003).
Notes to Table 1:
*Populations collected from minor niger-producing regions (MiNPR) and their mean values for different parameters.
**Populations collected from major niger-producing regions (MaNPR) and their mean values for different parameters.
a Mean of each parameter for the 17 populations (corresponding value of PPL is P P).
b Corresponding values of each parameter when all individuals were considered together (corresponding value of PPL is P S).
Amplified fragment length polymorphism
AFLP analysis was performed according to Vos et al. (1995) with modifications as follows. Genomic DNA (1 μg) was sequentially digested first with 5 U MseI at 65°C for 1 h and then with 5 U EcoRI at 37°C for 90 min in a volume of 50 μl in 6.6 mM Tris-acetate, pH 7.9, 2 mM magnesium acetate, 13.2 mM potassium acetate and 20 ng/μl bovine serum albumin (BSA). Ligation was effected in a 10 μl mixture of 0.5 μmol EcoRI adapter, 6 μmol MseI adapter, 50 mM Tris–HCl, pH 7.6, 10 mM MgCl2, 1 mM adenosine triphosphate (ATP), 1 mM dithiothreitol (DTT), 5% (w/v) polyethylene glycol-8000 and 1 U T4 DNA ligase, incubated for 3 h at 37°C. The product was diluted 1:2.3 with T10E0.1 (10 mM Tris–HCl, pH 8.0, 0.1 mM EDTA), of which 10 μl was used as a template in a 20 μl pre-amplification reaction containing 20 mM Tris–HCl, pH 8.55, 16 mM (NH4)2SO4, 0.01% Tween20 and 2 mM MgCl2, 30 ng each of EcoRI-A and MseI-C primers, 0.2 mM dNTP, 1.5 mM MgCl2 and 0.5 U Thermowhite Taq DNA polymerase (Saveen Werner AB). The reaction was subjected to 20 cycles of 92°C/1 min, 60°C/30 s and 72°C/1 min, diluted 1:25 with T10E0.1, and used as a template for selective amplification.
Seven selective primer combinations (PCs) were selected (see Table 2) from a set of 56, on the basis that they detected sufficient polymorphism and generated amplification profiles which were easy to score. The selective amplification primers carried three selective nucleotides (SNs). The 20 μl amplification reaction contained polymerase chain reaction (PCR) buffer (as above), 25 ng EcoRI primer+three SNs, 30 ng MseI primer+three SNs, 0.2 mM dNTP, 1.5 mM MgCl2, 0.5 U Taq DNA polymerase and 10 μl diluted pre-amplification product. The amplification profile was 94°C/2 min, followed by 36 cycles as described in Ferdinandez and Coulman (2004), and a final step of 72°C/2 min. The amplified product was denatured by adding 15 μl of 98% formamide, 10 mM EDTA, 0.05% (w/v) each of bromophenol blue and xylene cyanol FF, and incubating at 96°C for 5 min. The gel was pre-run for 45 min before the samples were loaded. Seven microlitres of the amplification product was loaded on 5% (w/v) polyacrylamide gels and separated at 90 W constant power until the xylene cyanol FF dye had run two-thirds of the length of the plate. DNA bands were visualized using the silver staining technique of Caetano-Anollés and Gresshoff (1994) with the following modifications: (1) 10% acetic acid was used as fixer solution and stopping solution; and (2) the concentration of sodium thiosulphate in the developing solution was 2 mg/l.
Notes to Table 2:
*Mean of values in column.
a Highly significant difference in the Shannon-based estimate of population differentiation (P < 0.001) and in G ST (P = 0.005) between primer combinations.
b No significant difference between primer combinations as revealed by ANOVA.
c E = EcoRI primer (5′-GACTGCGTACCAATTC-3′).
d M = MseI primer (5′-GATGAGTCCTGAGTAA-3′).
Data scoring and analysis
Each AFLP fragment was considered as a single bi-allelic locus with one amplifiable and one null allele. Data were recorded as 1 for the presence and 0 for the absence of each amplified fragment in the size range 50–600 bp. Gels were routinely scored twice. Genetic diversity was calculated based on: (1) Shannon diversity index and (2) Nei's gene diversity with the modification provided by Lynch and Milligan (1994), as described in Geleta et al. (2007), using polymorphic loci only. Gene flow was estimated using Wright's (1951) equation, as modified by Crow and Aoki (1984). The NTSYSpc program (Rohlf, 2000) was used to calculate genetic distances and matrix correlation coefficients, and to perform the Mantel test, cluster analysis and principal coordinate analysis. POPGENE version 1.31 (Yeh and Boyle, 1997) was used for analysis of number and percentage of polymorphic loci. Analysis of molecular variance (AMOVA) was conducted using Arlequin version 2 (Schneider et al., 2000), and the FreeTree-Freeware program (Pavlicek et al., 1999) was used for bootstrap analysis.
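The two diversity measures named above can be illustrated with a minimal sketch that starts from a 0/1 band matrix (rows are individuals, columns are loci). The function names are hypothetical, the Shannon index is assumed to be computed on band presence/absence classes with log base 2, and the gene diversity uses the plain square-root estimate of the null-allele frequency under Hardy-Weinberg equilibrium. The small-sample correction of Lynch and Milligan (1994) used in the study is not implemented here, so values from this sketch would not be expected to match those reported.

```python
import math

def band_frequencies(matrix):
    """Fraction of individuals showing a band at each locus (column)."""
    n = len(matrix)
    return [sum(row[j] for row in matrix) / n for j in range(len(matrix[0]))]

def shannon_index(freqs):
    """Mean per-locus Shannon index over the two classes band present/absent.

    Assumption: log base 2, so a locus with a band frequency of 0.5 scores 1.
    """
    def h(p):
        return -sum(x * math.log2(x) for x in (p, 1.0 - p) if x > 0)
    return sum(h(p) for p in freqs) / len(freqs)

def gene_diversity(freqs):
    """Mean expected heterozygosity for dominant markers.

    Assumption: Hardy-Weinberg proportions and a recessive null allele, so the
    null-allele frequency q is the square root of the band-absent fraction.
    """
    total = 0.0
    for p in freqs:
        q = math.sqrt(1.0 - p)
        total += 2.0 * q * (1.0 - q)
    return total / len(freqs)

# Toy data: four individuals scored at two loci (not data from the study).
toy = [[1, 1], [1, 1], [0, 1], [0, 1]]
freqs = band_frequencies(toy)
print(shannon_index(freqs), gene_diversity(freqs))
```

In practice the matrix would first be restricted to polymorphic loci, as the analysis described above used polymorphic loci only.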
Results
Genetic polymorphism and AFLP primer combinations
A total of 539 fragments were detected among the 170 individual plants, of which 483 (over 89%) were polymorphic (Fig. 2, Table 1). The number of polymorphic loci averaged 69 per primer combination (PC). When each population was considered separately, the percentage of polymorphic loci ranged from 46% (T-1) to 60% (Wl-1), with a mean (P P) of about 51% (Table 1). Comparison of the profiles of individuals across all loci showed that each individual was genetically unique, implying a high level of genetic polymorphism.
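As a rough illustration of how the number and percentage of polymorphic loci can be obtained from a 0/1 band matrix, the sketch below counts the loci at which both band states occur. The function name is hypothetical, and the criterion (any variation at all) is an assumption; POPGENE may apply a different polymorphism criterion.

```python
def polymorphic_loci(matrix):
    """Return (count, percentage) of loci showing both band states.

    matrix: list of rows (individuals), each a list of 0/1 scores per locus.
    """
    n_loci = len(matrix[0])
    count = sum(1 for j in range(n_loci) if len({row[j] for row in matrix}) > 1)
    return count, 100.0 * count / n_loci

# Toy data: three individuals, four loci; loci 0 and 3 vary.
toy = [[1, 1, 0, 1],
       [0, 1, 0, 1],
       [1, 1, 0, 0]]
print(polymorphic_loci(toy))

# Arithmetic check of the overall figure quoted above: 483 of 539 fragments.
print(round(100.0 * 483 / 539, 1))
```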
The AFLP PCs used in this study differed significantly in the number and percentage of polymorphic loci (NPL and PPL) they detected (P < 0.01). Of the seven PCs, E-AGG/M-CTA revealed the highest mean number of polymorphic loci (77), while E-ACA/M-CTA revealed the highest number of unique alleles specific to a given population (Table 2). The PCs did not differ significantly in total genetic variation (the Shannon-based estimate and H T) or in within-population genetic variation (the Shannon-based estimate and H S). In contrast, they differed significantly in population differentiation, both for the Shannon-based estimate (P < 0.001) and for G ST (P < 0.01). Among the PCs, E-AAG/M-CTC revealed the highest estimates of population differentiation (G ST = 0.354; F ST = 0.357; Table 2).
Total and within-population genetic variation
The overall genetic diversity was 0.628 as estimated by the Shannon diversity index and 0.320 as estimated by gene diversity (H T; Nei, 1978). Similarly, the overall within-population variation was 0.434 for the Shannon diversity index and 0.205 for Nei's gene diversity (H S) (Table 2). The genetic diversity of each population was calculated as a Shannon diversity estimate and a gene diversity estimate (H j), each averaged across all polymorphic loci. The Shannon estimate ranged from 0.392 (Sh-2) to 0.500 (Wl-1), while H j ranged from 0.183 (T-1) to 0.241 (Wl-1) (Table 1). Taking the two parameters together, Sh-2, T-1 and Wl-2 showed lower genetic diversity than the other populations, while Wl-1 showed the highest genetic diversity, followed by Gj-1.
The evaluation of the AFLP fingerprints revealed unique alleles in 12 of the populations (Table 1). We grouped the 17 populations into major (MaNPR) and minor (MiNPR) niger-producing regions to determine whether the level of genetic variation differed significantly between them. The mean Shannon diversity estimate and mean H j were 0.438 and 0.206, respectively, for populations from MiNPR, and 0.430 and 0.200, respectively, for populations from MaNPR, indicating a similar level of genetic variation in both groups.
Genetic variation between populations and groups
Population differentiation was calculated from the Shannon diversity index, as G ST from gene diversity estimates (Nei, 1973) and as F ST from AMOVA, giving overall means of 0.330, 0.269 and 0.234, respectively (Table 2). AMOVA revealed that the observed genetic variation among populations is highly significant (P < 0.001; Table 3A). On the other hand, AMOVA showed that the genetic variation between MaNPR and MiNPR populations was less than 1% of the total variation (P > 0.100) (Table 3B). Similarly, AMOVA conducted by grouping the populations into a higher altitude group (>2000 m asl) and a lower altitude group (<2000 m asl) revealed no significant difference between them (Table 3C). We also grouped populations into five groups based on their geographic proximity and better access to germplasm exchange (Table 3D), where AMOVA revealed significant variation between the groups (7.5%; P < 0.001). The presence of unique alleles in each group contributed to the significant variation obtained. For example, 12 unique alleles were recorded in group II (Gj-1 and Gj-2) with frequencies ranging from 0.05 to 0.35 (data not shown). The estimate of gene flow (Nm), calculated based on AMOVA-derived F ST, was 0.924 (Table 3).
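For readers who want to see how a gene flow estimate follows from a differentiation estimate, the sketch below implements Wright's island-model relationship and the finite-population adjustment commonly attributed to Crow and Aoki (1984). The function names are hypothetical, and the exact F ST input and formula variant used for the value reported above are not stated in enough detail to reproduce it, so the sketch is not expected to return 0.924.

```python
def nm_wright(fst):
    """Wright's island-model estimate: Nm = (1/Fst - 1) / 4."""
    return (1.0 / fst - 1.0) / 4.0

def nm_crow_aoki(fst, n_pops):
    """Finite-population form: Nm = (1/Fst - 1) / (4 * alpha).

    alpha = (n / (n - 1)) ** 2, where n is the number of populations.
    """
    alpha = (n_pops / (n_pops - 1)) ** 2
    return (1.0 / fst - 1.0) / (4.0 * alpha)

# Illustrative call with the overall AMOVA-derived F ST and 17 populations.
print(nm_wright(0.234), nm_crow_aoki(0.234, 17))
```

Both functions assume an equilibrium island model, so the result is best read as an indirect, order-of-magnitude indicator of gene flow rather than a direct measurement.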
Abbreviations used in Table 3: AP, among populations; WP, within populations; AG, among groups; APWG, among populations within groups.
Genetic distance, cluster analysis and principal coordinate analysis (PCoA)
The significant population differentiation was further analysed using a genetic distance coefficient and multivariate analyses to identify populations that are more differentiated from the majority and to reveal their clustering pattern. Nei's standard genetic distance coefficient (Nei, 1972) was used to evaluate the extent of genetic similarity between each pair of populations. Genetic distance between pairs of populations varied more than fourfold, from 0.040 (Gr-1 versus Gr-2; Gr-2 versus Wg-1) to 0.175 (H-1 versus T-1), with an overall mean of 0.118 (Supplementary Table 1, available online only at http://journals.cambridge.org). The comparison of matrices of Nei's standard genetic distances and geographic distances through normalized Mantel statistics (Mantel, 1967) with 1000 permutations revealed a significant positive correlation (r = 0.258; P < 0.01).
The cophenetic correlation coefficients between genetic distance and its cophenetic value matrix, and between genetic distance and the distance matrix calculated from the eigenvector matrix, were 0.91 and 0.90, respectively (see Fig. 3a, b). Three clusters (I, II and III) were revealed in the unweighted pair group method with arithmetic average (UPGMA) cluster analysis (Fig. 3a), and were also clearly depicted in the PCoA (Fig. 3b). Clusters I, II and III consist of nine, three and five populations, respectively. The bootstrap value for the branching of cluster I from the other two clusters was the maximum (100), while the other two clusters were separated from one another with a lower bootstrap value (53; Fig. 3a). In the case of PCoA, the first three principal coordinate axes explained 67% of the total variation in the AFLP data. The first principal coordinate axis (PC-I) explained 34% of the total variation, and the three clusters were more clearly discriminated on this axis (Fig. 3b). The second axis (PC-II) explained 20% of the total variation and was better than PC-I in discriminating populations within clusters.
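The distance and Mantel calculations described above can be sketched as follows. This is a minimal standard-library illustration and not the NTSYSpc implementation: the function names are hypothetical, Nei's standard distance is written for bi-allelic loci given the frequency of one allele per locus, and the Mantel test permutes the rows and columns of the second matrix together and reports a one-sided P value.

```python
import math
import random

def nei_distance(px, py):
    """Nei's (1972) standard distance for bi-allelic loci.

    px, py: frequency of one allele at each locus in populations X and Y.
    """
    jxy = sum(a * b + (1 - a) * (1 - b) for a, b in zip(px, py))
    jx = sum(a * a + (1 - a) ** 2 for a in px)
    jy = sum(b * b + (1 - b) ** 2 for b in py)
    return -math.log(jxy / math.sqrt(jx * jy))

def upper_triangle(m):
    n = len(m)
    return [m[i][j] for i in range(n) for j in range(i + 1, n)]

def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def mantel(a, b, permutations=1000, seed=0):
    """Return (r, one-sided P) for two symmetric distance matrices."""
    rng = random.Random(seed)
    n = len(a)
    x = upper_triangle(a)
    r_obs = pearson(x, upper_triangle(b))
    order = list(range(n))
    hits = 0
    for _ in range(permutations):
        rng.shuffle(order)
        # Permute rows and columns of b with the same ordering.
        perm = [[b[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if pearson(x, upper_triangle(perm)) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)

# Toy example: five sites on a line; the second matrix is a rescaled copy.
pos = [0, 1, 3, 7, 12]
geo = [[abs(p - q) for q in pos] for p in pos]
gen = [[2 * d for d in row] for row in geo]
print(mantel(geo, gen))
```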
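To make the cophenetic correlation concrete, the following sketch runs UPGMA (average linkage) on a small distance matrix, records the height at which each pair of populations is first joined, and correlates those cophenetic values with the original distances. It is a plain standard-library illustration with hypothetical function names, not the NTSYSpc routine, and the toy distances are invented.

```python
import math

def upgma_cophenetic(dist):
    """Run UPGMA on a symmetric distance matrix; return the cophenetic matrix."""
    n = len(dist)
    clusters = {i: [i] for i in range(n)}  # cluster id -> member indices
    d = {(i, j): dist[i][j] for i in range(n) for j in range(i + 1, n)}
    coph = [[0.0] * n for _ in range(n)]
    next_id = n
    while len(clusters) > 1:
        (a, b), height = min(d.items(), key=lambda kv: kv[1])
        # Every pair split across the two merged clusters joins at this height.
        for i in clusters[a]:
            for j in clusters[b]:
                coph[i][j] = coph[j][i] = height
        merged = clusters[a] + clusters[b]
        del clusters[a], clusters[b]
        d = {k: v for k, v in d.items() if a not in k and b not in k}
        # Average linkage: mean of all original distances between members.
        for c, members in clusters.items():
            total = sum(dist[i][j] for i in merged for j in members)
            d[(c, next_id)] = total / (len(merged) * len(members))
        clusters[next_id] = merged
        next_id += 1
    return coph

def cophenetic_correlation(dist, coph):
    """Pearson correlation between original and cophenetic distances."""
    n = len(dist)
    x = [dist[i][j] for i in range(n) for j in range(i + 1, n)]
    y = [coph[i][j] for i in range(n) for j in range(i + 1, n)]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((p - mx) * (q - my) for p, q in zip(x, y))
    sxx = sum((p - mx) ** 2 for p in x)
    syy = sum((q - my) ** 2 for q in y)
    return sxy / math.sqrt(sxx * syy)

# Toy distances among three populations.
toy = [[0, 2, 6],
       [2, 0, 8],
       [6, 8, 0]]
coph = upgma_cophenetic(toy)
print(coph, cophenetic_correlation(toy, coph))
```

A coefficient close to 1 indicates that the dendrogram distorts the original pairwise distances only slightly.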
Discussion
Genetic polymorphism and within-population genetic variation
The overall genetic polymorphism within the species was high, with about half of the loci in each population being polymorphic. This makes it easy to distinguish between niger populations and even between individuals within populations. Several different approaches have been used to estimate within-population genetic variation. We used both the Shannon diversity estimate and H S in order to broaden the comparison with previous studies. The overall means of these parameters in this study were slightly higher than those obtained from a RAPD-based study (Geleta et al., 2007), indicating the superiority of AFLP over RAPD in detecting genetic variation in niger populations. The proportion of within-population variation reported by Geleta et al. (2007) for AMOVA and Shannon diversity index was 65% and 57%, respectively, which is lower than that revealed in the present study (77% and 67%, in that order). Despite the relatively higher proportion of within-population variation revealed by AFLP as compared to RAPD, both marker systems demonstrated that a higher proportion of the total variation is to be found within, rather than between, populations.
The extent of genetic variation in niger populations was wide ranging and the number of unique alleles per population also varied. Although Wl-1 had the highest genetic diversity, only one unique allele, with a frequency of 0.2, was recorded in it (Table 1). Thus, the high genetic variation in this population is mainly due to both alleles being maintained at relatively high frequencies at each locus. Here, it is interesting to note that Wl-1 (designated as Wl-2 in Geleta et al., 2007) showed the highest diversity of the 70 populations studied using RAPD, indicating a reasonable degree of agreement between the two marker systems in detecting the extent of genetic diversity. The second most genetically diverse population revealed in the present study was Gj-1 (Table 1). This population differed from Wl-1 in that it contained several unique alleles revealed by five of the seven PCs. The maximum possible values of the Shannon diversity estimate and H j that could be obtained for dominant markers of two alleles at each locus for a population represented by ten individuals are 1.000 and 0.538, respectively (Geleta et al., 2007). Comparing the mean Shannon diversity estimate (0.430) and mean H j (0.200) obtained in the present study to these maximum possible diversity estimates leads to the conclusion that Ethiopian niger has sufficient genetic variability for breeding varieties with desirable traits.
The significantly higher genetic variation in MiNPR populations than in MaNPR populations reported for a RAPD-based study (Geleta et al., 2007) was not supported by the present study. The fact that the extent of genetic diversity was not associated with either the altitude or the extent of cultivation, both in this and in the RAPD-based study, leads to the conclusion that the existing genetic diversity of Ethiopian niger is distributed within all growing regions regardless of the altitude and the extent of cultivation.
Genetic variation and genetic distance between populations and groups
The main evolutionary forces responsible for population differentiation are selection, gene flow and genetic drift, which operate within the historical and biological context of each plant species (Loveless and Hamrick, 1984). Thus, the extent of population differentiation depends on the relative strength of these individual forces in interaction with the type of mating system and other life history traits of the species. Genetic differentiation of populations may occur for any genetically variable trait that is favoured under the existing selection conditions (Bossdorf et al., 2005), and such population differentiation can be estimated using various parameters, such as the Shannon-based estimate, G ST and F ST. Nybom (2004) analysed eight AFLP-based studies of outcrossing species and obtained mean F ST and G ST values of 0.23 and 0.24, respectively, which are comparable with those of the present study. Therefore, this study demonstrated an average level of population differentiation with significant variation among populations.
Some loci were polymorphic in only one population but monomorphic in all others. Such unique alleles may serve as population-specific markers in future generations, provided that they are favoured under both natural and artificial selection conditions and that gene flow between populations is limited. The significant genetic variation between groups of populations defined by geographical proximity and opportunity for gene flow is strong evidence of a considerable degree of ‘regional’ differentiation of niger populations. Such population differentiation into ecotypes is important for the selection of parental materials to maximize heterozygosity in the progeny. Furthermore, genetically differentiated populations are often suggested as candidates for genetic conservation to prevent the loss of unique genetic variants. Thus, conserving a large number of populations from all its growing regions ex situ in gene banks as a complement to on-farm conservation is the best policy for conserving a high level of unique genetic variants in the gene pool.
Although the populations were differentiated to a significant degree, the among-population variation was less than the within-population variation. Moreover, population-specific monomorphic markers were not detected at any of the 483 polymorphic loci. The absence of such population-specific markers is an indication of strong gene flow between niger populations (Nm = 0.924; Table 3), most likely through germplasm exchange. The lower proportion of among-population variation as compared to the proportion of within-population variation is likely a result of the high level of genetic variability maintained by the outcrossing nature of the plant, in agreement with the general understanding that outcrossing species tend to be more diverse within, with less genetic differentiation between, populations (Hamrick and Godt, 1996; Nybom, 2004).
The overall mean Nei's genetic distance (0.118) revealed in this study is lower than that of the RAPD-based study (0.176). A significant positive correlation between geographic distance and Nei's genetic distance was also obtained. Positive correlations between geographic distance and genetic distance in outcrossing species have been reported by several authors (e.g. Ayres and Ryan, 1997; Shim and Jørgensen, 2000).
UPGMA cluster analysis and PCoA
It has been suggested that the use of cluster analysis in combination with PCoA helps to extract maximum information from molecular data (Messmer et al., 1992), as PCoA facilitates the detection of intermediate populations (Lessa, 1990). Our cluster analysis and PCoA fit well with the genetic distance data, as shown by high cophenetic correlation coefficients (Rohlf, 2000). PCoA allows differences among the populations to be visualized and possible groups to be identified, as long as the first two or three axes (PCs) explain most of the variation (Mohammadi and Prasanna, 2003). Three clusters were obtained by applying the principle of an ‘acceptable number of clusters’ (i.e. where the within-cluster genetic distance is less than the overall mean genetic distance and where the between-cluster distances are greater than the within-cluster distance of the two clusters involved; Brown-Guedira et al., 2000). All three clusters (Fig. 3a, b) contain populations from geographically distinct regions, which may indicate long-distance gene flow along with human movement. On the other hand, populations from the same region were clustered together, except for Arsi and Shewa, where the two populations from each region were placed into different clusters. This further supports the previously reported considerable degree of population clustering according to region of origin (Geleta et al., 2007).
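The ‘acceptable number of clusters’ rule quoted above can be expressed as a short check on a distance matrix. The sketch below is one possible reading, with hypothetical function names: within-cluster and between-cluster distances are taken as means of pairwise distances, and each between-cluster mean is compared with the larger of the two within-cluster means involved. Other readings of the criterion are possible.

```python
from itertools import combinations

def mean_within(dist, members):
    """Mean pairwise distance among the given members (0.0 for a singleton)."""
    pairs = list(combinations(members, 2))
    if not pairs:
        return 0.0
    return sum(dist[i][j] for i, j in pairs) / len(pairs)

def mean_between(dist, a, b):
    """Mean distance between members of cluster a and members of cluster b."""
    return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

def clusters_acceptable(dist, clusters):
    """Check the two conditions of the clustering criterion.

    1. Every within-cluster mean distance is below the overall mean distance.
    2. Every between-cluster mean distance exceeds the within-cluster means
       of both clusters involved.
    """
    overall = mean_within(dist, range(len(dist)))
    within = [mean_within(dist, c) for c in clusters]
    if any(w >= overall for w in within):
        return False
    for x, y in combinations(range(len(clusters)), 2):
        between = mean_between(dist, clusters[x], clusters[y])
        if between <= max(within[x], within[y]):
            return False
    return True

# Toy example: four populations placed at positions 0, 1, 10 and 11.
pos = [0, 1, 10, 11]
dist = [[abs(p - q) for q in pos] for p in pos]
print(clusters_acceptable(dist, [[0, 1], [2, 3]]))
print(clusters_acceptable(dist, [[0, 2], [1, 3]]))
```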
This study generated comprehensive information regarding the genetic diversity of niger and demonstrated that AFLP is an appropriate technique for its evaluation. With more than 20% of the total genetic variation found between populations, we conclude that all the populations have unique genetic properties that make each niger population a significant unit for conservation and breeding purposes. Thus, our recommendation is that as many populations as possible should be conserved, as this reduces the risk of losing unique genetic variants due to shifting of cultivation practices and other factors. Furthermore, conserving a large number of genetically differentiated populations would also preserve a larger evolutionary potential of the crop that exists due to co-adaptation of gene complexes and local adaptation of populations. The extent and distribution of genetic variation in niger accessions conserved ex situ could be evaluated reasonably by as few as two AFLP PCs. If populations are to be ranked, emphasis should be given to those with high genetic variation and genetic distance, to capture unique genetic variation. The study also strengthens our previous recommendation of further germplasm collection by giving special emphasis to regions and areas underrepresented in the gene bank collections.
Acknowledgements
The authors wish to thank The Swedish International Development Agency (SIDA/SAREC) for financing this work and the International Science Program (ISP) for coordinating the program and facilitating the working atmosphere for the first author. We are grateful to Mrs Britt Green for her assistance with the laboratory work.