Introduction
It is generally acknowledged that wheat was domesticated in the Near East about 10,000 years ago (Nesbitt, 2001). The putative donor ancestors of wheat still grow in this area and harbour more genetic diversity than was inherited by the cultivated Triticum taxa (Chantret et al., 2005; Haudry et al., 2007). These genetic resources are valuable for improving the wheat gene pool and for maintaining sustainable agroecosystems that are endangered by climate change (Maxted et al., 2008). Therefore, the natural genetic resources of diploid Triticum, including T. monococcum s. lat. and T. urartu, have been studied frequently (Heun et al., 1997; Kilian et al., 2007b; Dvorak et al., 2011; Peng et al., 2011) in the Fertile Crescent, which is well known as the region where wheat was domesticated. However, this gene pool has been largely, although not completely, ignored in one of the oldest centres of wheat domestication, i.e. the Iranian part of the Fertile Crescent (Riehl et al., 2012, 2013) and the areas very close to it, and it can be considered the missing link in the chain of investigations.
Recently, DNA sequencing of nuclear gene loci, including Acc-1 and Pgk-1, which are the focus of the present study, has emerged as an important tool for assessing relationships among and within the Triticeae taxa (Huang et al., 2002a, b; Fan et al., 2007; Kilian et al., 2007a, b; Takenaka et al., 2010). Acc-1 and Pgk-1 are predominantly known as single-copy genes in grasses and encode plastid acetyl-CoA carboxylase and plastid 3-phosphoglycerate kinase, respectively (Huang et al., 2002a, b). The significant role played by these genes in the ecogenetic balance of plants (Longstaff et al., 1989; Sasaki and Nagano, 2004) has made their sequence data important for determining plant genetic structure and relationships across geographical areas (Avise, 2000). The variability of these genes was screened over a vast area by Kilian et al. (2007b) and Golovnina et al. (2009), although the Iranian part has remained mostly untouched.
The main goal of this study was to preliminarily evaluate the molecular diversity of the wild diploid Triticum in Iran, using two functionally important genes, Acc-1 and Pgk-1. In addition, the genetic structure and geographical distribution patterns of the above-mentioned loci were the focus of this study.
Materials and methods
Plant materials
In total, 176 individuals collected from ten locations across the distribution range of T. urartu and T. monococcum ssp. aegilopoides (Table S1, available online) were evaluated. The materials used in this study are deposited in the herbarium of the University of Isfahan, Iran. Taking into account the inbreeding nature of wheat and the topography and coordinates of the area under study, we subdivided the distribution range of the evaluated Triticum collection into three geographical regions, each representing one population (Fig. 1). Taxonomic identifications followed van Slageren (1994) and Nasernakhaei et al. (2013).
DNA extraction and polymerase chain reaction (PCR) amplification
DNA was extracted from fresh leaves of a single plant following the cetyltrimethylammonium bromide (CTAB) method of Gawel and Jarret (1991). The Acc-1 and Pgk-1 loci were amplified with the specific primers Acc1T (2T)s (5′-GGA CTT AGT TTT TTG TCG TCA GTT-3′), Acc1Ta new (5′-CTT CCA AAC GTA AGG ACC AAT ACA-3′), Pgk4Ts (5′-GCT TGG CTC CCC TTG TGC CCC G-3′) and Pgk1Ta (5′-CAC ACT TCT CCA GCA GGG ATT CGA-3′) (Golovnina et al., 2009). The PCR conditions were optimized using temperature and MgCl2 concentration gradient tests. DNA amplifications were carried out in a 25 μl reaction volume containing 40–50 ng of genomic DNA, 200 μM of each dNTP, 0.1 μM of each primer, 2.5 μl of 10 × buffer, 3 mM MgCl2 and 1 U Taq DNA polymerase. The optimized PCR program for each primer pair was as follows: one cycle of 5 min at 94°C; 33 cycles of 45 s at 94°C, 45 s at 61°C (Acc-1) or 67.5°C (Pgk-1) and 1 min at 72°C; and a final extension of 10 min at 72°C.
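The stated reaction conditions imply fixed per-reaction amounts and a nominal cycling time. The following is a minimal sketch of that arithmetic, not part of the original protocol. All variable and function names are illustrative assumptions. Ramping time between temperatures is ignored because it depends on the thermocycler.

```python
# Per-reaction amounts and nominal hold time implied by the stated PCR conditions.
# All names below are illustrative assumptions, not taken from the original protocol.

REACTION_VOLUME_UL = 25.0

# Final concentrations stated in the text, in µM (3 mM MgCl2 = 3000 µM).
FINAL_CONC_UM = {
    "each dNTP": 200.0,
    "each primer": 0.1,
    "MgCl2": 3000.0,
}


def amount_pmol(conc_um: float, volume_ul: float) -> float:
    """Amount in pmol, using the identity 1 µM = 1 pmol per µl."""
    return conc_um * volume_ul


amounts_pmol = {
    name: amount_pmol(conc, REACTION_VOLUME_UL)
    for name, conc in FINAL_CONC_UM.items()
}

# Cycling program from the text, in seconds. The annealing temperature differs
# between loci (61 °C for Acc-1, 67.5 °C for Pgk-1), but its duration does not.
INITIAL_DENATURATION_S = 5 * 60
CYCLE_STEPS_S = {"denaturation": 45, "annealing": 45, "extension": 60}
N_CYCLES = 33
FINAL_EXTENSION_S = 10 * 60


def nominal_run_time_min() -> float:
    """Sum of the programmed hold times; ramping between steps is ignored."""
    cycle_s = sum(CYCLE_STEPS_S.values())
    total_s = INITIAL_DENATURATION_S + N_CYCLES * cycle_s + FINAL_EXTENSION_S
    return total_s / 60


if __name__ == "__main__":
    for name, pmol in amounts_pmol.items():
        print(f"{name}: {pmol:g} pmol per {REACTION_VOLUME_UL:g} µl reaction")
    print(f"nominal run time: {nominal_run_time_min():g} min")
```

Under these figures each reaction contains 2.5 pmol of each primer, 5 nmol of each dNTP and 75 nmol of MgCl2, and the programmed holds add up to 97.5 min for either locus.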
Pre-screening for polymorphisms by non-denaturing gel electrophoresis
PCR products were denatured for 5 min at 100°C and characterized by single-strand conformation polymorphism (SSCP) analysis, as described by Rodriguez et al. (2011). Electrophoresis was carried out at 300 V for 19–21 h at 8°C, and the bands were visualized by silver staining as described by Sanguinetti et al. (1994).
Sequencing of PCR products
Based on the revealed SSCP patterns for each of the loci, a subset of 76 of the 176 individuals representing three geographical areas (Fig. 1) were sequenced in both directions (Bioneer, Korea).
Data analysis
The sequence data were aligned manually using BioEdit version 7 (Hall, 1999). The sequences were edited using ChromasPro version 1.41 (Technelysium Pty Ltd, Tewantin, Australia). For each species, the number of segregating sites, the number of haplotypes, the haplotype diversity (Hd; Nei, 1987) and the nucleotide diversity (Pi; Nei, 1987) were calculated using DnaSP version 5 (Librado and Rozas, 2009). To check the concordance between taxonomic status and genetic structure, the combined haplotype data of Acc-1 and Pgk-1 were screened using the Bayesian, model-based clustering algorithm implemented in STRUCTURE version 2.3.4 (Pritchard et al., 2000). File formats were converted using PGDSpider version 2.0.5.1 (Lischer and Excoffier, 2012). The analysis was run for K = 1 to K = 10 genetic clusters with the admixture, LOCPRIOR and allele frequency-independent models, and it was replicated ten times for each K value. Each run consisted of a burn-in period of 10,000 steps followed by 100,000 Markov chain Monte Carlo replicates. The optimal value of K according to Evanno's ΔK (Evanno et al., 2005) was selected using STRUCTURE HARVESTER (Earl and vonHoldt, 2012), and graphical representations of population assignments were obtained using CLUMPP (Jakobsson and Rosenberg, 2007) and DISTRUCT (Rosenberg, 2004).
Population structure was also investigated by analysis of molecular variance (AMOVA; Excoffier et al., 1992), which hierarchically partitions genetic variation; it was calculated with 1000 permutations using ARLEQUIN version 3.1 (Excoffier et al., 2005). The hierarchical levels tested in the entire dataset were ‘between species’, ‘among populations (different regions) within the species’ and ‘within populations’. Subsequently, separate AMOVA models were used to test the distribution of genetic variance ‘among’ and ‘within’ the populations of T. monococcum ssp. aegilopoides and T. urartu.
The median-joining networks (Bandelt et al., 1999) representing the relationships among the Iranian wild diploid Triticum haplotypes evaluated in this study and those retrieved from NCBI (Kilian et al., 2007a, b; Golovnina et al., 2009) were drawn gene by gene using Network version 4.6.1.1 based on a maximum parsimony approach.
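The two diversity statistics named above can be illustrated with a minimal sketch following the usual formulation of Nei (1987). It does not reproduce DnaSP itself, which handles alignment gaps and missing data in its own way. The function names, the haplotype counts and the toy sequences are hypothetical and are not taken from the dataset.

```python
from itertools import combinations


def haplotype_diversity(counts):
    """Haplotype diversity: n / (n - 1) * (1 - sum of squared haplotype frequencies)."""
    n = sum(counts)
    if n < 2:
        raise ValueError("at least two sampled sequences are required")
    sum_sq = sum((c / n) ** 2 for c in counts)
    return n / (n - 1) * (1.0 - sum_sq)


def nucleotide_diversity(seqs):
    """Mean proportion of differing sites over all pairs of aligned sequences.

    Assumes equal-length, gap-free sequences (a simplification).
    """
    if len(seqs) < 2:
        raise ValueError("at least two sequences are required")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return diffs / (len(pairs) * length)


if __name__ == "__main__":
    # Hypothetical sample: 16 copies of one haplotype and 6 of another.
    print(haplotype_diversity([16, 6]))
    # A monomorphic sample has zero haplotype diversity.
    print(haplotype_diversity([25]))
    # Hypothetical 10 bp alignment: three identical sequences and one single-site variant.
    toy = ["ACGTACGTAC"] * 3 + ["ACGTTCGTAC"]
    print(nucleotide_diversity(toy))
```

A sample containing a single haplotype gives a value of zero for both statistics, which is how a population without detectable variation is summarized.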
Results
Genetic variation
In total, 791 and 679 bp were obtained for the Acc-1 and Pgk-1 loci, respectively. Altogether, 44 sequences for the two loci were deposited in GenBank (accession numbers KC965047–KC965055 and KF023037–KF023071). Six haplotypes, i.e. three for each locus, were found and named A, B and C at each locus (Table 1). At both loci, haplotype B was new for Iran. The Acc-1 locus exhibited four nucleotide substitutions and two insertions/deletions (indels), whereas the Pgk-1 locus exhibited 29 substitutions and four indels (Tables 2 and 3). In T. monococcum ssp. aegilopoides, haplotype A, widely distributed along the Zagros Mountains to the Azerbaijan provinces, had the highest frequency (45/76), and haplotype B, restricted to six individuals collected from North-west Iran, had the lowest (6/76); haplotype C of T. urartu had a frequency of 25/76 (Table 1). Geographically, North-west Iran, where all three haplotypes occurred, appeared to be the most variable zone (Table 1 and Fig. 1). The greatest genetic diversity (Hd = 0.416 ± 0.090 and Pi = 0.007 ± 0.001) was found in the population of T. monococcum ssp. aegilopoides from North-west Iran (Table S2, available online), whereas no genetic diversity was detected in the populations from the other regions or in T. urartu (Hd = 0 and Pi = 0).
–, i and e correspond to insertion/deletion (indels), intron and exon, respectively.
a Only polymorphic sites are indicated.
Population structure
Using the ΔK statistic of Evanno et al. (2005), a sharp peak was observed at K = 2, clearly dividing the evaluated collection into two clusters (Fig. 1), which was fully consistent with the diagnostic morphological features of T. monococcum ssp. aegilopoides and T. urartu (van Slageren, 1994; Nasernakhaei et al., 2013). The most probable K value could not be inferred from LnP(D) because this value showed no abrupt change. T. monococcum ssp. aegilopoides was assigned entirely to cluster 1, whereas T. urartu was assigned mainly to cluster 2, with a small contribution from cluster 1. The relationship between haplotype B (in light green) and the clusters is shown in Fig. 1.
AMOVA carried out on the concatenated sequence of both loci indicated that most of the variation (86.28%) occurred ‘between species’ (Table 4). Within species, most of the variation in T. monococcum ssp. aegilopoides was observed within regions (76%), although the overall FST value among regions was high (FST = 0.24). No variation was detected for T. urartu.
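The ΔK criterion referred to above can be sketched as follows, using the common formulation in which the absolute second-order difference of the mean LnP(D) is divided by the standard deviation of LnP(D) across replicate runs at that K. This is an illustration under stated assumptions and does not reproduce STRUCTURE HARVESTER. The function name and the LnP(D) values are hypothetical, and only three replicates per K are shown for brevity.

```python
from statistics import mean, stdev


def evanno_delta_k(lnp_by_k):
    """Return {K: deltaK} for every K that has both neighbours K - 1 and K + 1.

    lnp_by_k maps each K to the list of LnP(D) values from its replicate runs.
    deltaK is undefined for the smallest and largest K tested, and for any K
    whose replicates have zero standard deviation.
    """
    means = {k: mean(values) for k, values in lnp_by_k.items()}
    delta = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in means or k + 1 not in means:
            continue
        sd = stdev(lnp_by_k[k])
        if sd == 0:
            continue
        second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        delta[k] = second_diff / sd
    return delta


if __name__ == "__main__":
    # Hypothetical LnP(D) values for illustration only.
    toy_runs = {
        1: [-500, -502, -498],
        2: [-300, -301, -299],
        3: [-295, -290, -300],
        4: [-292, -285, -299],
    }
    result = evanno_delta_k(toy_runs)
    best_k = max(result, key=result.get)
    print(result, best_k)
```

Because ΔK needs both neighbouring K values, it cannot be evaluated at K = 1 or at the largest K tested, so a peak at K = 2 is the smallest number of clusters the criterion can support.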
d.f., Degrees of freedom; FCT, variance among groups relative to total variance; FST, variance among populations; FSC, variance among populations within groups (species); NA, statistics that could not be calculated due to a lack of variation.
The haplotype networks (Fig. 2) show, for both loci, the relationships between the haplotypes found in our materials and the relevant haplotypes retrieved from NCBI.
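The fixation indices defined above are related to the AMOVA variance components in the standard way. The sketch below uses our own notation, which is an assumption and is not taken from Table 4: \sigma_a^2, \sigma_b^2 and \sigma_c^2 denote the components among groups (species), among populations within groups, and within populations, respectively.

```latex
\[
\sigma_T^2 = \sigma_a^2 + \sigma_b^2 + \sigma_c^2, \qquad
F_{CT} = \frac{\sigma_a^2}{\sigma_T^2}, \qquad
F_{SC} = \frac{\sigma_b^2}{\sigma_b^2 + \sigma_c^2}, \qquad
F_{ST} = \frac{\sigma_a^2 + \sigma_b^2}{\sigma_T^2},
\]
\[
(1 - F_{ST}) = (1 - F_{SC})\,(1 - F_{CT}).
\]
% Two-level case (regions within one species), with among-region and
% within-region components:
\[
F_{ST} = \frac{\sigma_{\mathrm{among}}^2}{\sigma_{\mathrm{among}}^2 + \sigma_{\mathrm{within}}^2},
\qquad
1 - F_{ST} = \frac{\sigma_{\mathrm{within}}^2}{\sigma_{\mathrm{among}}^2 + \sigma_{\mathrm{within}}^2}
= 1 - 0.24 = 0.76 .
\]
```

Under these definitions, the 76% of variation found within regions and the FST of 0.24 reported for T. monococcum ssp. aegilopoides are two expressions of the same partition, and the 86.28% of variation between species corresponds to an FCT of about 0.86.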
Discussion
Genetic variation and population structure
The genetic diversity of the Iranian wild einkorn wheat populations has been screened by several researchers. Nejat-Boshehri and Fakhr Tabatabaei (2001) demonstrated high variability among seed storage protein profiles. Maleki et al. (2006), using amplified fragment length polymorphism (AFLP), and Naghavi et al. (2009), analysing random amplified polymorphic DNA, AFLP and simple sequence repeat markers, concluded that there is vast variation in this gene pool. In the present study, the polymorphic sites found in Acc-1 and Pgk-1 (Tables 2 and 3) can be considered evidence for high nucleotide diversity within the Iranian wild einkorn.
Our observations are in agreement with those of Kilian et al. (2007b), who reported two haplotypes (equivalent to haplotypes A and C in the present study; Table S3, available online) from two localities (Kermanshah and Lorestan) for each of the two loci among their 15 Iranian lines. Furthermore, two haplotypes (both named B, equivalent to haplotypes Acc-1 Tb1331-V and Pgk-1 Tm15-V evaluated by Kilian et al. (2007b); Table 1, Table S3, available online, and Fig. 1) are reported herein for the first time for Iran.
According to the comprehensive study of Kilian et al. (2007b), haplotypes A and B are the first and second most frequent alleles in Turkey, and the presence of the latter (only six individuals) in Iran can be interpreted as a trace of its general distribution.
At the species level, individuals identified as T. urartu exhibited no genetic variability, which is in agreement with the findings of Kilian et al. (2007b) and Adderley and Sun (2014). This parallels the taxonomic homogeneity of this species (Nasernakhaei et al., 2013). The relatively high genetic diversity observed among the individuals of T. monococcum ssp. aegilopoides collected from North-west Iran (Table S2, available online, and Fig. 1) could be associated with the proximity of this area to the putative centre of diversity of the diploid Triticum gene pool, i.e. South-east Turkey according to Heun et al. (1997).
The genetic leakage from T. monococcum ssp. aegilopoides into T. urartu (Fig. 1), which contrasts with the morphological distinctness of the two taxa, could be explained by occasional outcrossing, despite the fact that einkorn is a typical inbreeder (Zohary and Hopf, 2000; Kilian et al., 2007b).
Relationship between morphology and haplotype variability
In this study, T. monococcum ssp. aegilopoides and T. urartu were recognized based on two morphological features: (1) loose long hairs on the leaf and an anther length of 3–5 mm and (2) dense short hairs on the leaf and an anther length of 1.9–2.8 mm, respectively (Nasernakhaei et al., 2013). However, superimposing the revealed haplotypes on this classification revealed a conflict with the morphological features (Fig. 2 and Fig. S1, available online). It is worth mentioning that we do not intend to generalize from the evolutionary changes of only one or two genes to the evolutionary trend of an organism. Every genetic segment has its own history, and to make a rational comparison between the phylogeny of organisms and gene trees, sequencing of multiple genes (>50) is required (J. Dubcovsky, 2013, pers. commun.).
The 46 bp indel in Acc-1 in haplotype A (Table 2), which is probably a consequence of non-homologous double-strand break repair (Puchta, 2005), was also used by Golovnina et al. (2009) to subdivide their Acc-1 haplotypes into two variants.
In addition, some studies have shown that while the deletion form of this indel characterizes only the sequences equivalent to haplotypes B and C evaluated in the present study, its insertion form is exclusively present in a group of species including Aegilops tauschii Coss., Ae. speltoides Tausch, Hordeum vulgare L. and Secale cereale L., whose sequences are very similar to haplotype A evaluated in the present study (Huang et al., 2002a, b; Kilian et al., 2007a; Goncharov et al., 2008; Golovnina et al., 2009; Kang et al., 2010). This conflict may be explained by the suggestion that T. monococcum s. lat. is a paraphyletic group with respect to this indel alone. The suggestion assumes a common diploid Triticum ancestor whose Acc-1 gene diverged into two lineages, one carrying variant A and the other variant B–C. Taking into account the ‘commonality concept’ of ‘polarization’, variant A can be regarded as the plesiomorphic state of this gene. More recently, T. urartu has derived morphologically from the lineage carrying the B–C variant; a hypothetical visualization of this suggestion is shown in Fig. S1 (available online). The relevant literature shows that the taxonomy of T. monococcum s. lat. is currently confused (van Slageren, 1994; Nasernakhaei et al., 2013), which may be due to the paraphyletic origin of this species.
The explanation of this conflict for Pgk-1 is more complex. The closer relationship of Pgk-1 haplotype B with haplotype C than with haplotype A (Table 3 and Fig. 2) may be interpreted in terms of paraphyly, rapid speciation (unexpectedly short divergence times) and the genomic integration created by ‘introgressive hybridization’. Wicker et al. (2009) demonstrated that quite divergent subspecies and species may share haplotype segments that are much older than the divergence of the (sub)species. They reported that in barley there were regions in the genome that diverged about one million years ago immediately next to regions that diverged only a few hundred thousand years ago. They consider that introgressions of this kind occur quite frequently in both wild and domesticated species. They also found that extremely divergent segments could be exchanged between rye and wheat (T. Wicker, 2013, pers. commun.).
Therefore, when such divergent haplotype segments are present in the materials, the phylogeny of a single gene may contradict the overall phylogeny of the species (Doyle, 1992). This hypothesis is supported by the haplotype networks shown in Fig. 2, in which, although the individuals carrying each haplotype were identified carefully, their taxonomic relationships are overshadowed by the distribution of the haplotypes.
As a generalization, the observations made in this study indicate that part of the haplotype variability present in a local flora may be overlooked in broad-scale studies (haplotype B was not reported for Iran by Kilian et al. (2007b)), and they underline the urgent need for in situ and ex situ conservation of the regional vegetation in areas assumed to be the origin of domestication of some strategic plants. In addition, the present work will be extended by screening important functional loci such as those mentioned by Kilian et al. (2007b). In conclusion, we suggest that SSCP analysis is an applicable molecular tool for pre-screening genetic variability where thorough sequencing of a very large number of DNA samples is time consuming and unaffordable.
Supplementary material
To view supplementary material for this article, please visit http://dx.doi.org/10.1017/S1479262114000549
Acknowledgements
The authors are grateful to Dr Yuan Li for her valuable guidance, comments and corrections and to Drs Thomas Wicker, Jorge Dubcovsky, Benjamin Kilian, Nikolay P. Goncharov, Sidram Dhanagond, Piotr Gawronski, Heidi Lischer and Hong Chang Lim for their valuable help and guidance. They are also grateful to two anonymous reviewers and Dr Robert Koebner for improving the overall quality of the manuscript and to Ms Faye Kalloniatis for her help. This study was financially supported by the University of Isfahan.