Introduction
The plant genus Brassica L. belongs to the tribe Brassiceae, which in turn belongs to the largest family Brassicaceae (Rakow, Reference Rakow, Pua and Douglas2004; El-Esawi, Reference El-Esawi2012). Brassicaceae includes 380 genera and 3000 species (Heywood, Reference Heywood1993), whereas Mabberley (Reference Mabberley1997) recorded 365 genera and 3250 species. Judd et al. (Reference Judd, Campbell, Kellogg and Stevens1999) also recorded 419 genera and 4130 species, while Warwick et al. (Reference Warwick, Francis and Al-Shehbaz2006) recorded 338 genera and 3709 species belonging to this family. Brassica species play an essential role in horticulture and agriculture, as well as contributing to the health of populations around the world (Rakow, Reference Rakow, Pua and Douglas2004; El-Esawi, Reference El-Esawi2012; El-Esawi et al., Reference El-Esawi, Bourke, Germaine and Malone2012a). They are important sources of vegetable oil, vegetables and condiments (Zhao, Reference Zhao2007). Brassica juncea, Brassica napus, Brassica rapa and Brassica carinata provide about 12% of the worldwide vegetable oil supply (Labana and Gupta, Reference Labana, Gupta, Labana, Banga and Banga1993). The oil is either utilized for human consumption or as a biofuel or renewable resource in the petrochemical industry. Moreover, Brassica species are valuable sources of potassium, dietary fibre, phenolics, vitamins A, C and E, and various health-enhancing factors such as anticancer compounds (Fahey and Talalay, Reference Fahey, Talalay, Gustine and Florens1995; Zhao, Reference Zhao2007). Brassicaceae produce glucosinolates that are broken down to isothiocyanates that limit tumour development and provide protection against human cancers and heart diseases (King, Reference King2005; El-Esawi, Reference El-Esawi2012). 
Several biotic and abiotic factors affect Brassicaceae growth (Relf and McDaniel, Reference Relf and McDaniel2009; Consentino et al., Reference Consentino, Lambert, Martino, Jourdan, Bouchet, Witczak, Castello, El-Esawi, Corbineau, d'Harlingue and Ahmad2015; El-Esawi et al., Reference El-Esawi, Glascoe, Engle, Ritz, Link and Ahmad2015; Jourdan et al., Reference Jourdan, Martino, El-Esawi, Witczak, Bouchet, d'Harlingue and Ahmad2015). Moreover, disease resistant Brassica varieties are required in future breeding programmes in order to improve their agricultural production and conservation strategies. Knowledge of the amount and distribution of genetic variability within a species is vital for establishing efficient conservation and breeding practices (Avise, Reference Avise1994; Chaveerach et al., Reference Chaveerach, Sudmoon, Tanee, Mokkamul and Tanomtong2007; El-Esawi, Reference El-Esawi2008, Reference El-Esawi2012). It helps plant breeders develop, through selection and breeding, new or more productive crops, that are resistant to pests and diseases and highly adapted to changing environments. It also provides information for domestication and designing sampling protocols (Bretting and Widrlechner, Reference Bretting and Widrlechner1995; Yu et al., Reference Yu, Mosjidis, Klingler and Woods2001). Therefore, this review discusses the taxonomy, gene pool and Brassica-derived phytochemicals and their nutraceutical importance. It also highlights the recent and current knowledge of the application of morphological, cytological, biochemical and molecular markers in the genus Brassica L. that help understand its genetic variability, phylogeny, conservation and breeding system as a basis for further research to improve the crop and develop new varieties.
Brassica-derived phytochemicals and their nutraceutical importance
Brassica vegetables are valuable sources of nutrients and health-promoting phytochemicals, such as vitamins, carotenoids, dietary fibre, phenolics, soluble sugars, minerals and glucosinolates, which possess nutraceutical and antioxidant activity (Wagner et al., Reference Wagner, Terschluesen and Rimbach2013). Brassicaceae produce glucosinolates, which are broken down to isothiocyanates that limit tumour development and provide protection against a range of human cancers and heart diseases (King, Reference King2005; El-Esawi, Reference El-Esawi2012). Glucosinolates and other sulphur-containing metabolites act as anti-cancer agents due to their ability to induce detoxification enzymes in mammalian cells and to decrease the rate of tumour development. Isothiocyanates modulate phase 1 and phase 2 enzyme activity, neutralize cancer-causing chemicals that damage cells, and interfere with tumour growth (King, Reference King2005). Brassica secondary products have antioxidant, antiviral and antibacterial effects, stimulate the immune system and modulate steroid metabolism (King, Reference King2005). Moreover, Brassica-derived phytochemicals may counteract different genetic pathways and exhibit chemopreventive activity (Wagner et al., Reference Wagner, Terschluesen and Rimbach2013).
Anti-inflammatory properties of Brassica-derived phytochemicals have been reported (Juge et al., Reference Juge, Mithen and Traka2007). These beneficial effects may be mediated through the stimulation of antioxidants and the inhibition of proinflammatory signalling pathways via regulation of different transcription factors that may be further controlled by miRNAs and epigenetic modifications (Wang et al., Reference Wang, Kaur, Cogan, Dobrowolski, Salisbury, Burton, Baillie, Hand, Hopkins, Forster, Smith and Spangenberg2009a, Reference Wang, Cavell, Alwi and Packhamb; Wagner et al., Reference Wagner, Terschluesen and Rimbach2013). Moreover, Brassica-derived phytochemicals show antiviral and anti-infective activity (Yanaka et al., Reference Yanaka, Fahey and Fukumoto2009). The nuclear factor kappa B (NF-κB) transcription factor plays an important role in inflammatory processes, and is an attractive target for treating inflammation-related diseases (Wagner et al., Reference Wagner, Terschluesen and Rimbach2013). Brassica-derived phytochemicals may also mediate anti-inflammatory effects through an interaction with reduced redox regulators such as thioredoxin, glutathione or redox factor 1, resulting in changes of the reducing milieu needed for correct DNA binding (Heiss and Gerhäuser, Reference Heiss and Gerhäuser2005; Wagner et al., Reference Wagner, Terschluesen and Rimbach2013).
Nrf2 is a transcription factor playing an essential role in regulating inflammation and chemoprevention (Wagner et al., Reference Wagner, Terschluesen and Rimbach2013). Under basal conditions, Nrf2 is bound to its cytosolic inhibitor (Keap1) (Surh, Reference Surh2003). In the presence of isothiocyanates, Nrf2 could be activated through two distinct cellular signalling pathways, leading to the liberation of Nrf2 from its inhibitor Keap1. The liberated Nrf2 is transferred to the nucleus and binds together with many cofactors, including small Maf proteins (MafF, MafG and MafK). Brassica-derived phytochemicals also influence epigenetic mechanisms. Epigenetic aberrations, occurring in the early stages of carcinogenesis, represent an initial process of cancer development. Phytochemicals may intervene in this process to prevent cancer (Gerhauser, Reference Gerhauser2013; Wagner et al., Reference Wagner, Terschluesen and Rimbach2013).
Taxonomy and gene pool of Brassica species and their wild relatives
The genus Brassica L. is one of the most economically important genera in the tribe Brassiceae, which in turn belongs to the family Brassicaceae (Cruciferae) (Rakow, Reference Rakow, Pua and Douglas2004). Opinions differ regarding the number of genera and species included in the Brassicaceae. Heywood (Reference Heywood1993) recorded about 380 genera and 3000 species in this family, whereas Mabberley (Reference Mabberley1997) recorded 365 genera and 3250 species. Judd et al. (Reference Judd, Campbell, Kellogg and Stevens1999) recorded 419 genera and 4130 species, while Warwick et al. (Reference Warwick, Francis and Al-Shehbaz2006) recorded 338 genera and 3709 species belonging to this family. The genus Brassica L. comprises a diverse group of species including major vegetable and oilseed crops with a wide range of agronomic traits (Rich, Reference Rich1991; Christopher et al., Reference Christopher, Andrew, Geraldine, Clare, Jacqueline, Gary, German and David2005). The cytogenetic relationship of the six main economically important species of the Brassica genus was depicted in the U triangle (UN, 1935). Three of these species are diploid (B. oleracea, 2n = 18; B. rapa, 2n = 20; B. nigra, 2n = 16), and three are amphidiploid (B. napus, 2n = 38; B. juncea, 2n = 36; B. carinata, 2n = 34). Brassica species are characterized by their ability to adapt to a wide range of habitats and growing environments (King, Reference King2005; Hong et al., Reference Hong, Kwon, Kim, Yang, Park and Lim2008).
Brassica oleracea is an important vegetable crop species which includes many vegetable cultivars called cole crops (Katz, Reference Katz2003). These cole crops comprise cabbage (B. oleracea subspecies capitata), cauliflower (B. oleracea subspecies botrytis), brussels sprout (B. oleracea subspecies gemmifera), broccoli (B. oleracea subspecies italica), kale and collards (B. oleracea subspecies acephala) and kohlrabi (B. oleracea subspecies gongylodes). The cole crops have extreme morphological characteristics. Examples of such morphologies include the enlarged inflorescences of cauliflower and broccoli; the enlarged stems of kohlrabi and marrowstem kale; the enlarged single apical bud of cabbage; and the several axillary buds of brussels sprout (Hong et al., Reference Hong, Kwon, Kim, Yang, Park and Lim2008). Brassica oleracea generally grows slowly and has a large storage capacity for nutrients, which accounts for its adaptation to diverse natural habitats. It has a recent history of cultivation (Gomez-Campo and Prakash, Reference Gomez-Campo, Prakash and Gomez-Campo1999; Navabi, Reference Navabi2009).
Brassica rapa, commonly known as field mustard or turnip mustard, is a crop species widely cultivated as a leaf vegetable, a root vegetable and an oilseed. There are three well-defined groups of B. rapa, based on their morphological characteristics (CFIA, 1999): (1) the oleiferous or oil-type rape, often referred to as Polish rape or summer turnip rape, of which canola is a specific form containing low erucic acid in its oil and low glucosinolate content in its meal protein; (2) the leafy type B. rapa, comprising the chinensis group (pak-choi, celery mustard), the pekinensis group (Chinese cabbage) and the perviridis group (tendergreen); and (3) the rapiferous type B. rapa, including the rapifera group (turnip, rapini) and the ruvo group (turnip broccoli, Italian turnip) (CFIA, 1999). Rakow (Reference Rakow, Pua and Douglas2004) also stated that seven varieties of vegetable B. rapa types are known: var. campestris, var. pekinensis, var. chinensis, var. para-chinensis, var. narinosa, var. japonica and var. rapa. The var. pekinensis is adapted to a cooler climate. The var. chinensis is a leaf vegetable which differentiated from oilseed rape types of middle China, var. para-chinensis is a derivative of the var. chinensis, and var. campestris is the most primitive leaf vegetable. The var. narinosa has a high cold tolerance and is similar to var. chinensis in its adaptation. The var. japonica is a leaf vegetable of Japan. The var. rapa (turnip) is cultivated all over the world as a vegetable and as fodder for animals (Rakow, Reference Rakow, Pua and Douglas2004).
Brassica napus L., commonly known as canola or oilseed rape, is the amphidiploid (allotetraploid) of B. rapa and B. oleracea (Tsunoda, Reference Tsunoda, Tsunoda, Hinata and Gomez-Campo1980; Rakow, Reference Rakow, Pua and Douglas2004). It is the most productive oilseed species under cultivation. Both winter and summer annual forms of B. napus are grown as oilseeds in many countries of the world. Its high-yield potential might be related to the high photosynthetic rate per unit leaf area which is positively related to chloroplast number per unit leaf area and to chloroplast volume. There are also root-forming B. napus types, known as tuber-bearing swede or rutabaga, grown as vegetables and fodder for animals (Rakow, Reference Rakow, Pua and Douglas2004).
Brassica nigra, commonly known as black mustard, is an annual weedy plant cultivated for its seeds. It is found growing wild as a weed in cultivated fields in the Mediterranean region (Tsunoda, Reference Tsunoda, Tsunoda, Hinata and Gomez-Campo1980). Plants of B. nigra can reach a height of up to 2 m and do not require vernalization for flower induction (Rakow, Reference Rakow, Pua and Douglas2004). Brassica juncea L. is an amphidiploid species that originated from crosses between B. rapa and B. nigra (Rakow, Reference Rakow, Pua and Douglas2004). It has a great seed yield potential for semi-arid conditions, and is known to be more drought tolerant than rapeseed species (Rabbani et al., Reference Rabbani, Iwabuchi, Murakami, Suzuki and Takayanagi1999). It is grown as an oilseed and leafy vegetable. Brassica carinata, or Ethiopian mustard, is an amphidiploid species that originated from crosses between B. nigra and B. oleracea, and contains mustard oil (Rakow, Reference Rakow, Pua and Douglas2004).
The genetic resources available for the breeding of Brassica crops are defined by the boundaries of their primary, secondary and tertiary gene pools (Harlan, Reference Harlan1975; Branca and Cartea, Reference Branca, Cartea and Kole2011). Brassica oleracea represents the primary gene pool, but many studies have been conducted to assess the other gene pools and their potential uses (Branca and Cartea, Reference Branca, Cartea and Kole2011). Studies on pachytene chromosome morphology helped in investigating the secondary gene pool, and identified the basic genomes of Brassica crops (Branca and Cartea, Reference Branca, Cartea and Kole2011): AA (2n = 20) for B. rapa, BB (2n = 16) for B. nigra and CC (2n = 18) for B. oleracea. Investigations on genomic libraries of B. napus and B. oleracea revealed shared fragments among the A, B and C genomes, suggesting their partial homology and the origin of the amphidiploid species B. napus, B. carinata and B. juncea from the parental diploid ones (Hosaka et al., Reference Hosaka, Kianian, McGrath and Quiros1990; Branca and Cartea, Reference Branca, Cartea and Kole2011). Phylogenetic studies explain the evolution of Brassica and allied genera from a common ancestor with n = 6 through an increase in the number of chromosomes and partial homology of the A, B and C genomes (Song et al., Reference Song, Osborn and Williams1990; Branca and Cartea, Reference Branca, Cartea and Kole2011). Finally, the tertiary gene pool involves species and genera related to Brassica crops in 36 cytodemes, such as the genera Diplotaxis, Enarthrocarpus, Eruca, Erucastrum, Hirschfeldia, Rhynchosinapis, Sinapis, Sinapodendron and Trachystoma (Harbered, Reference Harbered, Vaughan, MacLeod and Jones1976; Branca and Cartea, Reference Branca, Cartea and Kole2011). These gene pools can confer favourable alleles and useful traits when special methods are used: tissue culture techniques and protoplast culture have helped in the introgression of beneficial genes across genetic boundaries (Branca and Cartea, Reference Branca, Cartea and Kole2011).
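The genome relationships described above can be checked with simple arithmetic: each amphidiploid carries the complete diploid chromosome complement of both parental species. The following is a minimal sketch of that bookkeeping, using only the chromosome numbers and parentage given in the text; the dictionary layout and the function name amphidiploid_profile are illustrative choices, not part of any cited work.

```python
# Diploid species: genome letter and chromosome number (2n), as given in the text.
DIPLOIDS = {
    "B. rapa": ("A", 20),
    "B. nigra": ("B", 16),
    "B. oleracea": ("C", 18),
}

# Parentage of the three amphidiploid species, as described in the text.
AMPHIDIPLOIDS = {
    "B. juncea": ("B. rapa", "B. nigra"),
    "B. napus": ("B. rapa", "B. oleracea"),
    "B. carinata": ("B. nigra", "B. oleracea"),
}


def amphidiploid_profile(species):
    """Return (genome formula, expected 2n) for an amphidiploid species."""
    parent_a, parent_b = AMPHIDIPLOIDS[species]
    genome_a, two_n_a = DIPLOIDS[parent_a]
    genome_b, two_n_b = DIPLOIDS[parent_b]
    # An amphidiploid holds the full diploid complement of both parents,
    # so its genome formula and 2n are the sums of the parental values.
    return genome_a * 2 + genome_b * 2, two_n_a + two_n_b


for name in AMPHIDIPLOIDS:
    formula, two_n = amphidiploid_profile(name)
    print(f"{name}: {formula}, 2n = {two_n}")
```

The computed values (AABB with 2n = 36, AACC with 2n = 38 and BBCC with 2n = 34) agree with the chromosome numbers reported for B. juncea, B. napus and B. carinata, respectively.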
Different wild B. oleracea species with a chromosome number of n = 9 (including Atlantic B. oleracea) were collected. Four wild B. oleracea-related species were found in Sicily (B. rupestris, B. incana, B. villosa and B. macrocarpa) (Branca and Cartea, Reference Branca, Cartea and Kole2011). Each accession was divided into three parts, stored at the UPM (Spain), the University of Tohoku (Sendai, Japan) and the seed banks of the countries where the collection was made (Izmir, Turkey; Thessaloniki, Greece; Bari, Italy; France; Kew, UK) (Branca and Cartea, Reference Branca, Cartea and Kole2011). In Europe, and under the aegis of the European Cooperative Program for Crop Genetic Resources Networks (ECPGR), a working group on Brassicas was established in 1991. One of the major efforts of this group has been to set up a European Brassica database (Bras-EDB), which was developed by the Center for Genetic Resources, Netherlands (Boukema and van Hintum, Reference Boukema and van Hintum1998; Branca and Cartea, Reference Branca, Cartea and Kole2011). This database covers cultivated as well as wild plant materials and comprises 36 collections from 22 countries and more than 19,600 accessions (Branca and Cartea, Reference Branca, Cartea and Kole2011). Brassica collections were characterized in order to broaden their agricultural use, including assessing and utilizing genetic variation in B. carinata for its use as an oilseed crop (Branca and Cartea, Reference Branca, Cartea and Kole2011). The main aim was to create a core collection of four important Brassica species (B. oleracea, B. rapa, B. napus and B. carinata). This project was an important attempt to unify efforts on Brassica germplasm within the EU and it was complementary to the activities of the ECPGR Working Group on Brassica, which also evaluated different wild species based on DNA analysis, morphological traits and quality aspects focused on oils and nutraceutical compounds. The role of the Working Group is to highlight the usefulness of the wild germplasm for breeding practices and its importance for improving Brassica crops, as well as to select the most appropriate accessions for the future European Genebank Integrated System (Astley et al., Reference Astley, Bas, Branca, Daunay, Diez, Keller, van Dooijeweert, van Treuren, Maggioni and Lipman2007; Branca and Cartea, Reference Branca, Cartea and Kole2011).
Genetic diversity of Brassica species
Genetic diversity is the variation of individual genotypes within and among species. It permits species to adjust to a changing environment, whether these changes are due to human or natural factors (Chaveerach et al., Reference Chaveerach, Sudmoon, Tanee, Mokkamul and Tanomtong2007; El-Esawi, Reference El-Esawi2008, Reference El-Esawi2012). The genetic composition of whole populations varies from place to place across a species range. These differences might emerge as a result of chance occurrences, such as the genetic composition of dispersing individuals which form a new population (founder effect), or changes in allele frequencies which result from chance crossings in very small populations (genetic drift) (Meffe and Carroll, Reference Meffe and Carroll1994; Husband and Schemske, Reference Husband and Schemske1996; Falk et al., Reference Falk, Knapp and Guerrant2001). Differences among populations may also arise when the environment in different locations exposes the individuals to different conditions and optima for survival and reproduction (fitness). For these and other reasons, populations can diverge from one another in their genetic profile. This divergence is especially strong and quick when there is little gene flow among populations (e.g., limited dispersal of pollen or seeds or limited movement of animals across physiographic barriers) (Falk et al., Reference Falk, Knapp and Guerrant2001). Over evolutionary time, such among-population genetic differences could accumulate, resulting in the development of a new species (allopatric speciation).
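The effect of population size on genetic drift can be illustrated with a small simulation. The sketch below assumes a Wright-Fisher model (a standard textbook idealization that is not taken from the cited studies) with a single biallelic locus, diploid inheritance, random mating, and no selection, mutation or gene flow; the function name simulate_drift and all parameter values are hypothetical.

```python
import random


def simulate_drift(pop_size, start_freq, generations, seed=0):
    """Track one allele's frequency under pure drift (Wright-Fisher model)."""
    rng = random.Random(seed)  # fixed seed so that runs are repeatable
    n_copies = 2 * pop_size    # a diploid population carries 2N gene copies
    freq = start_freq
    trajectory = [freq]
    for _ in range(generations):
        # Every gene copy of the next generation is drawn at random
        # from the allele pool of the current generation.
        count = sum(1 for _ in range(n_copies) if rng.random() < freq)
        freq = count / n_copies
        trajectory.append(freq)
    return trajectory


small = simulate_drift(pop_size=10, start_freq=0.5, generations=50)
large = simulate_drift(pop_size=1000, start_freq=0.5, generations=50)
print("small population, final allele frequency:", small[-1])
print("large population, final allele frequency:", large[-1])
```

Under these assumptions, the trajectory of the small population typically fluctuates far more than that of the large one and may reach fixation or loss of the allele, which is the chance-driven divergence described above.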
Effective conservation of Brassica genetic resources requires a complementary approach that makes use of both ex situ and in situ conservation methods to maximize the genetic diversity available for use (Karp et al., Reference Karp, Kresovich, Bhat, Ayad and Hodgkin1997). In situ conservation is a method for conserving forest species and wild crop relatives and is based on the maintenance of the whole ecosystem in which the target taxa are present with other species (Hodgkin et al., Reference Hodgkin, Brown, Hintum and Morales1995). In situ conservation helps evolution to continue, increases the amount of diversity which could be conserved, and strengthens the links between conservation workers and the communities who have maintained and used the resources (Karp et al., Reference Karp, Kresovich, Bhat, Ayad and Hodgkin1997).
Ex situ conservation aims at maintaining the accessions without change in their genetic constitution (Frankel et al., Reference Frankel, Brown and Burdon1995). The methodologies used are designed to minimize the possibility of selection, mutation, random genetic drift or contamination. For many species, long-term ex situ conservation can be undertaken by storing seeds for long periods at low humidities and temperatures (Karp et al., Reference Karp, Kresovich, Bhat, Ayad and Hodgkin1997). The major world germplasm collections of Brassica are located in Centre for Genetic Resources (CGN, The Netherlands), Institute for Horticultural Plant Breeding (IVT, The Netherlands), Horticultural Research Institute (HRI, UK) and Gene Bank of Crop Research Institute (UK) (van der Meer et al., Reference van der Meer, Toxopeus, Crisp, Roelofsen and Astley1984). Genetic diversity can be estimated based on morphological, cytological, biochemical and molecular markers.
Morphological traits
Morphological variation in plant species has been described for traits which are controlled by single- or multiple-gene systems. The greater the number of gene loci determining the trait, the more continuous the variability will be (Ayala, Reference Ayala1982; El-Esawi et al., Reference El-Esawi, Bourke, Germaine and Malone2012a). The expression of morphological traits is affected by the environment. Thus, the variation patterns in these morphological traits are considered to be the result of both environmental and genetic attributes. Populations within a species are differentiated by coding the measurements of quantitative traits and evaluating them under different biological and mathematical assumptions using a number of software packages (Rohlf, Reference Rohlf2000). Morphological traits have been used by different evolutionists to assess the genetic variation and phylogenetic relationships among populations of plant species, for example, soybean (Iqbal et al., Reference Iqbal, Arshad, Ashraf, Mahmood and Waheed2008), rice (Rashid et al., Reference Rashid, Cheema and Ashraf2008; Bibi et al., Reference Bibi, Khan, Bughio, Odhano, Asad and Khatri2009), Lactuca (El-Esawi, Reference El-Esawi2008) and Brassica (Rabbani et al., Reference Rabbani, Iwabuchi, Murakami, Suzuki and Takayanagi1999; Kop et al., Reference Kop, Teakle, McClenaghan, Lynn and King2003; Balkaya et al., Reference Balkaya, Yanmaz and Kar2005).
Rabbani et al. (Reference Rabbani, Iwabuchi, Murakami, Suzuki and Takayanagi1999) assessed the morphological variation in oilseed mustard. The results showed a considerable level of variation among all accessions for various traits. Seedling characteristics exhibited less variation, while the largest variation was recorded for flowering and maturity stage characters. Some of the related traits were significantly correlated with each other. Furthermore, Balkaya et al. (Reference Balkaya, Yanmaz and Kar2005) evaluated the morphological variation of white head cabbage (B. oleracea var. capitata subvar. alba). Cluster analysis based on 12 quantitative and ten qualitative variables identified ten groups. Morphological variability was high among the genotypes studied.
Cytological markers and karyotyping
Karyotype study is a beneficial tool in taxonomy either to characterize taxa or to reconstruct their phylogeny (Koopman and De Jong, Reference Koopman and De Jong1996; El-Esawi, Reference El-Esawi2008). Its value for phylogeny reconstruction in Brassica species has been confirmed and demonstrated (Allam et al., Reference Allam, Hussein, Abo-Bakr and Hassan1985). The cytological markers have been used by taxonomists to assess the relationships and genetic variability in plants, for example, barley (Jahan and Vahidy, Reference Jahan and Vahidy2008), Lactuca (Sammour et al., Reference Sammour, Badr, Mustafa and El-Esawi2013; El-Esawi and Sammour, Reference El-Esawi and Sammour2014) and Brassica (Allam et al., Reference Allam, Hussein, Abo-Bakr and Hassan1985; Hasterok and Maluszynska, Reference Hasterok and Maluszynska2000; Kulak et al., Reference Kulak, Hasterok and Maluszynska2002).
Allam et al. (Reference Allam, Hussein, Abo-Bakr and Hassan1985) studied the karyotype and meiotic behaviour of two B. oleracea varieties. The results revealed that the two karyotypes appeared to be similar in their gross morphology though they varied in minute details. In meiosis, the two varieties were almost normal though they varied in their chiasma frequency. Kulak et al. (Reference Kulak, Hasterok and Maluszynska2002) also studied the karyotypes of three amphidiploid species (B. napus, B. juncea and B. carinata). They identified eight out of 19 pairs of chromosomes in B. napus, ten out of 18 pairs in B. juncea and six out of 17 pairs in B. carinata.
The chromosomes of Brassica species are relatively small, morphologically similar and, in the allotetraploids, numerous. Consequently, their analysis based on morphometric and karyological features alone is extremely difficult and requires additional markers (Olin-Fatih and Heneen, Reference Olin-Fatih and Heneen1992; Kulak et al., Reference Kulak, Hasterok and Maluszynska2002). Therefore, isozyme and molecular markers have been established and used.
Biochemical markers
Biochemical markers including storage proteins and isozymes have been used to assess the genetic diversity and phylogenetic relationships of Brassica and several plant species.
Storage proteins
Proteins are the end products of gene expression and can form structural and enzymatic components of plant cells. The transcription and translation of the nucleotide sequences of genes determine the amino acid sequences of proteins (El-Esawi, Reference El-Esawi2008; Kephart, Reference Kephart1990). Therefore, the variation detected in proteins mirrors genetic variation. Proteins have been extracted and assessed using different methodologies including, but not limited to, chromatography, ultracentrifugation and electrophoresis. Electrophoresis proved to be the most suitable method for separation and comparison of proteins (Gordon et al., Reference Gordon, Huang, Pentoney and Zare1988; El-Esawi, Reference El-Esawi2008), and could be used to characterize different plant genotypes (Sammour, Reference Sammour1990, Reference Sammour1999; Sammour et al., Reference Sammour, El-Shourbagy, Aboshady and Abasary1994; DellaGatta et al., Reference DellaGatta, Polignano and Bisignano2002; Liang et al., Reference Liang, Luo, Holbrook and Guo2006; Toosi et al., Reference Toosi, Arumugam, Baki and Tayyab2011). Electrophoretic analysis of seed storage proteins has been used to assess genetic variation, which provides useful information for evaluating taxonomic relationships at the plant species and subspecies levels (Vries, Reference Vries1996; Rabbani et al., Reference Rabbani, Qureshi, Afzal, Anwar and Komatsu2001; Sihag et al., Reference Sihag, Hooda, Vashishtha and Malik2004; Sammour et al., Reference Sammour, Mustafa, Badr and Tahr2007; Kakaei and Kahrizi, Reference Kakaei and Kahrizi2011; Toosi et al., Reference Toosi, Arumugam, Baki and Tayyab2011; Khurshid and Rabbani, Reference Khurshid and Rabbani2012; Khan et al., Reference Khan, Iqbal, Khurshid, Zia, Shinwari and Rabbani2014; Choudhary et al., Reference Choudhary, Rai, Rai, Parveen, Rai and Salgotra2015; Mir et al., Reference Mir, Islam and Kudesia2015). Toosi et al. (Reference Toosi, Arumugam, Baki and Tayyab2011) used sodium dodecyl sulphate polyacrylamide gel electrophoresis (SDS–PAGE) to characterize the protein profiles of B. juncea var. Ensabi at different growth stages. Out of 11 protein bands detected in seed proteins, five bands matched the seed protein profiles of other B. juncea varieties. A comparison of the protein profiles at different growth stages revealed a steady expression of numerous genes encoding different proteins in B. juncea. Furthermore, SDS–PAGE was used to characterize the seed storage protein of 52 accessions of oilseed mustard germplasm from Pakistan (Rabbani et al., Reference Rabbani, Qureshi, Afzal, Anwar and Komatsu2001). The results indicated that the generated protein markers could not distinguish the closely related oilseed cultivars from each other. However, these polypeptides proved to be efficient in distinguishing B. juncea from B. campestris. Khurshid and Rabbani (Reference Khurshid and Rabbani2012) also studied the genetic diversity of Brassica species based on protein markers using the SDS–PAGE technique. The results revealed a considerable degree of polymorphism and distinguished among the genotypes analysed.
Khan et al. (Reference Khan, Iqbal, Khurshid, Zia, Shinwari and Rabbani2014) used SDS–PAGE to analyse the seed storage proteins of a collection of 136 accessions of B. napus L. A total of 21 protein sub-units were detected and used to distinguish among the accessions. Out of these 21 bands, 16 (76.19%) were polymorphic and five (23.81%) were monomorphic. The similarity coefficient among these accessions varied between 0.83 and 0.98. The cluster analysis divided the accessions into five major clusters. The results also showed a low level of genetic variation. Further studies are recommended using two-dimensional (2D) gel electrophoresis along with molecular markers in order to reveal high levels of genetic variation among these accessions. Mir et al. (Reference Mir, Islam and Kudesia2015) evaluated the genetic variation and phylogenetic relationships among B. juncea accessions using SDS–PAGE. The cluster analysis divided the accessions into two main clusters and distinguished among the accessions analysed. SDS–PAGE of seed storage proteins proved to be an efficient method for distinguishing among plant populations.
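Band-based statistics of this kind are usually derived from a binary matrix in which each accession is scored for the presence (1) or absence (0) of each band. The sketch below shows one way such a matrix could be summarized. The Jaccard coefficient is an assumption made for illustration, since the similarity coefficient used in the cited studies is not stated here, and the band profiles are hypothetical toy data rather than published results.

```python
def percent_polymorphic(band_matrix):
    """Percentage of scored bands that are not shared by all accessions."""
    n_bands = len(band_matrix[0])
    polymorphic = 0
    for j in range(n_bands):
        states = {row[j] for row in band_matrix}
        if len(states) > 1:  # band present in some accessions, absent in others
            polymorphic += 1
    return 100.0 * polymorphic / n_bands


def jaccard_similarity(profile_a, profile_b):
    """Shared bands divided by bands present in at least one profile."""
    shared = sum(1 for a, b in zip(profile_a, profile_b) if a and b)
    either = sum(1 for a, b in zip(profile_a, profile_b) if a or b)
    return shared / either if either else 1.0


# Hypothetical presence/absence profiles for three accessions and five bands.
bands = [
    [1, 1, 1, 0, 1],
    [1, 1, 0, 1, 1],
    [1, 0, 1, 1, 1],
]
print(percent_polymorphic(bands))              # 60.0
print(jaccard_similarity(bands[0], bands[1]))  # 0.6

# Arithmetic check of the percentages reported for 16 and 5 of 21 bands.
print(round(100 * 16 / 21, 2), round(100 * 5 / 21, 2))  # 76.19 23.81
```

A matrix of pairwise similarities computed in this way is the usual input for the cluster analyses mentioned above.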
Isozyme markers
Isozymes are multiple molecular forms of an enzyme that differ in structure but share a similar catalytic function (El-Esawi, Reference El-Esawi2008). Allozymes are allelic variants of enzymes encoded by structural genes at the same locus. Isozymes originate through amino acid changes that alter the net charge and spatial structure of the enzyme molecules and, consequently, their electrophoretic mobility. After specific staining, the isozyme profile of individuals can be detected (Dziechciarková et al., Reference Dziechciarková, Lebeda, Doležalová and Astley2004; Kumar et al., Reference Kumar, Gupta, Misra, Modi and Pandey2009). Isozyme traits have been successfully utilized to study genetic variation, phylogenetic relationships, population genetics, taxonomy and developmental biology, as well as for direct use in plant genetic resources management and plant breeding (Dziechciarková et al., Reference Dziechciarková, Lebeda, Doležalová and Astley2004; Kumar et al., Reference Kumar, Gupta, Misra, Modi and Pandey2009; El-Esawi, Reference El-Esawi2015a, Reference El-Esawib), for example, in red clover (Mosjidis et al., Reference Mosjidis, Greene, Klingler and Afonin2004), blue pine (Bakshi and Konnert, Reference Bakshi and Konnert2011), Lactuca (El-Esawi, Reference El-Esawi2008) and Brassica (Lázaro and Auginagalde, Reference Lázaro and Auginagalde1998a; Raybould et al., Reference Raybould, Mogg, Clarke, Gliddon and Gray1999). Lázaro and Auginagalde (Reference Lázaro and Auginagalde1998a) assessed the genetic diversity of B. oleracea based on five enzyme systems. The average values for the expected heterozygosity and percentage of polymorphic loci were 0.224 and 54%, respectively. The intra- and interpopulational variations were 67 and 33%, respectively. Raybould et al. (Reference Raybould, Mogg, Clarke, Gliddon and Gray1999) also evaluated the genetic diversity in natural populations of B. oleracea using four isozymes and seven microsatellite loci. All loci were polymorphic, and the diversity index of microsatellite loci was similar to that of isozymes. Genetic differentiation among accessions (FST) was significant for all loci. The above studies indicate that isozyme traits are efficient for characterizing genetic variability, taxonomic relationships and species identity.
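The two summary statistics quoted for isozyme data, expected heterozygosity and the percentage of polymorphic loci, can be sketched as follows. This is only an illustration under stated assumptions: expected heterozygosity is taken as 1 minus the sum of squared allele frequencies, a locus is treated as polymorphic when more than one allele is present, and the locus names and frequencies are hypothetical.

```python
def expected_heterozygosity(allele_freqs):
    """Expected heterozygosity at one locus: He = 1 - sum(p_i ** 2)."""
    if abs(sum(allele_freqs) - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in allele_freqs)


def summarize_loci(loci):
    """Return (mean He, % polymorphic loci) for a dict of locus -> frequencies."""
    he_values = [expected_heterozygosity(freqs) for freqs in loci.values()]
    mean_he = sum(he_values) / len(he_values)
    # Assumption: a locus counts as polymorphic if no single allele is fixed.
    n_polymorphic = sum(1 for freqs in loci.values() if max(freqs) < 1.0)
    return mean_he, 100.0 * n_polymorphic / len(loci)


# Hypothetical allele frequencies at four loci (illustrative values only).
loci = {
    "locus_A": [0.5, 0.5],
    "locus_B": [1.0],
    "locus_C": [0.8, 0.2],
    "locus_D": [1.0],
}

mean_he, pct_polymorphic = summarize_loci(loci)
```

Some studies apply a stricter criterion for polymorphism (for example, a frequency threshold for the most common allele), so the threshold used here should be read as a simplifying choice.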
Molecular markers
Molecular markers are regions in the genome that are heritable as simple Mendelian traits (Schulman et al., Reference Schulman, Flavell and Ellis2004) and can be used to assess the genetic variation and phylogenetic relationships in plants. Several criteria should be considered in choosing molecular techniques for genetic diversity studies, including the following: whether the techniques are highly reproducible between laboratories and whether the data that are generated can be reliably transferred; whether the markers are dominant or codominant, since only codominant markers allow heterozygotes and homozygotes to be distinguished; the amount of genomic sequence information required; and whether the markers detect highly polymorphic loci (Osman et al., Reference Osman, Jordan, Lessard, Muhammad, Haron, Riffin, Sinskey, Rha and Housman2003; El-Esawi, Reference El-Esawi2008, Reference El-Esawi2012). At present, various dominant and codominant molecular markers are available for assessing genetic diversity in plants, such as random amplified polymorphic DNA (RAPD), amplified fragment length polymorphism (AFLP), restriction fragment length polymorphism (RFLP), microsatellites (simple sequence repeats (SSRs)) and single nucleotide polymorphisms (SNPs).
Random amplified polymorphic DNA
RAPD is a PCR-based technology that relies on the enzymatic amplification of target or random DNA fragments with arbitrary primers (El-Esawi, Reference El-Esawi2008; Kumar et al., Reference Kumar, Gupta, Misra, Modi and Pandey2009). Each product is derived from a genomic region containing two short segments, in inverted orientation on opposite strands, that are complementary to the primer. Amplified products are generally separated on agarose gels in the presence of ethidium bromide and visualized under ultraviolet light (El-Esawi, Reference El-Esawi2008; Kumar et al., Reference Kumar, Gupta, Misra, Modi and Pandey2009). The RAPD system has been used in the characterization of different resistance genes, hybrid origin identification (Friesen et al., Reference Friesen, Fritsch and Bachmann1997) and breeding utilization (Baril et al., Reference Baril, Vehaegen, Vigneron, Bouvet and Kremer1997). The RAPD technique has also been used for assessing the genetic variation in plant species and for establishing differences among lines of apparently closely related populations in germplasm collections, for example, rice (Bibi et al., Reference Bibi, Khan, Bughio, Odhano, Asad and Khatri2009), Lactuca (El-Esawi, Reference El-Esawi2008) and Brassica (Lázaro and Auginagalde, Reference Lázaro and Auginagalde1998b; Crockett et al., Reference Crockett, Bhalla, Lee and Singh2000; Ali et al., Reference Ali, Munir, Ahmad, Muhammad, Ahmed, Durrishahwar, Ali and Swati2007; Saha et al., Reference Saha, Molla, Chandra and Rahman2008).
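The requirement that a single arbitrary primer finds two sites in inverted orientation on opposite strands can be made concrete with a small in-silico sketch. It is a simplification that assumes exact primer matching on a known template; the function names, the 5-base primer and the template sequences are hypothetical and chosen only for illustration.

```python
def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_all(seq, motif):
    """Start indices of every (possibly overlapping) occurrence of motif."""
    positions, start = [], seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)
    return positions


def rapd_products(template, primer, max_len=2000):
    """Lengths of fragments flanked by the primer and, downstream,
    by its reverse complement (the inverted site on the opposite strand)."""
    template, primer = template.upper(), primer.upper()
    forward_sites = find_all(template, primer)
    reverse_sites = find_all(template, reverse_complement(primer))
    lengths = []
    for f in forward_sites:
        for r in reverse_sites:
            length = r + len(primer) - f
            # The inverted site must lie downstream and within amplifiable range.
            if r >= f + len(primer) and length <= max_len:
                lengths.append(length)
    return sorted(lengths)


# Hypothetical primer and templates; the second template carries a
# point mutation in the downstream primer site.
primer = "AGGTC"
template_a = "TTAGGTCAAAAACCCCCGACCTGG"
template_b = "TTAGGTCAAAAACCCCCGATCTGG"

products_a = rapd_products(template_a, primer)
products_b = rapd_products(template_b, primer)
```

In this toy case the first template yields one product and the mutated template yields none, which is why RAPD bands are scored as present or absent, that is, as dominant markers.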
Lázaro and Auginagalde (Reference Lázaro and Auginagalde1998b) used RAPD markers to evaluate the genetic diversity in the B. oleracea group. The average value of Nei genetic distance among taxa was 1.825. Analysis of molecular variance (AMOVA) detected the highest amount of genetic variation in B. oleracea subsp. cretica (variance = 0.519), followed by B. oleracea subsp. rupestris (variance = 0.420). Saha et al. (Reference Saha, Molla, Chandra and Rahman2008) also studied the genetic diversity and relationship among Brassica species based on RAPD markers. The highest percentage of polymorphic loci (37.29%) was found in the accession of BARI sarisha-12 (B. rapa). In conclusion, genetic analysis with dominant RAPD markers is often quicker, less technically demanding and less expensive than analysis with other molecular markers (Kumar et al., Reference Kumar, Gupta, Misra, Modi and Pandey2009).
Restriction fragment length polymorphism
RFLP is a technique in which plants and other organisms may be differentiated by analysis of patterns derived from cleavage of their DNA by restriction enzymes (Kumar et al., Reference Kumar, Gupta, Misra, Modi and Pandey2009). RFLP markers have been used in studying genetic variability and relationships in a range of species, for example, Lathyrus (Chtourou-Ghorbel et al., Reference Chtourou-Ghorbel, Lauga, Combes and Marrakchi2001) and Pleurotus eryngii (Urbanelli et al., Reference Urbanelli, Rosa, Punelli, Porretta, Reverberi, Fabbri and Fanelli2007). RFLP markers have also been used in assessing genetic variation and developing detailed genetic maps of Brassica (Diers and Osborn, Reference Diers and Osborn1994; Diers et al., Reference Diers, McVetty and Osborn1996; Pradhan et al., Reference Pradhan, Gupta, Mukhopadhyay, Arumugam, Sodhi and Pental2003) and lettuce (Landry et al., Reference Landry, Kesseli, Farrara and Michelmore1987; Kesseli et al., Reference Kesseli, Paran and Michelmore1994), in assessing polymorphisms in Brassica lines resistant and susceptible to Xcc (Malvas et al., Reference Malvas, Melotto, Truffi and Camargo2003; El-Esawi, Reference El-Esawi2012), and in identifying the origin of cultivated lettuce (Kesseli et al., Reference Kesseli, Ochoa and Michelmore1991). Malvas et al. (Reference Malvas, Melotto, Truffi and Camargo2003) identified two DNA fragments of RPS2 gene homologues in two Brassica lines resistant and susceptible to Xcc. The digestion of these fragments with restriction enzymes showed polymorphisms at the XbaI restriction sites. In addition to their high genomic abundance and moderate polymorphism, RFLP markers are codominant and have high reproducibility (Kumar et al., Reference Kumar, Gupta, Misra, Modi and Pandey2009).
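How a polymorphism at a restriction site changes the fragment pattern can be sketched with a simple in-silico digest. The sketch uses the XbaI recognition sequence (TCTAGA, cut after the first T) on a linear molecule; the two allele sequences are hypothetical and are not the RPS2 homologue fragments described above.

```python
XBAI_SITE = "TCTAGA"   # XbaI recognition sequence, cut as T^CTAGA
XBAI_CUT_OFFSET = 1    # cut position relative to the start of the site


def digest(seq, site=XBAI_SITE, cut_offset=XBAI_CUT_OFFSET):
    """Fragment lengths after cutting a linear sequence at every site."""
    seq = seq.upper()
    cuts, start = [], seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


# Hypothetical alleles: allele_b has a single-base change that destroys the site.
allele_a = "ACGT" * 3 + "TCTAGA" + "GGCC" * 3
allele_b = "ACGT" * 3 + "TCTGGA" + "GGCC" * 3

fragments_a = digest(allele_a)
fragments_b = digest(allele_b)
```

Because each allele gives its own fragment pattern, a heterozygote would show the fragments of both alleles, which is the basis of the codominance noted above.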
Amplified fragment length polymorphism
AFLP is a DNA fingerprinting method that detects DNA restriction fragments by PCR amplification (Kumar et al., Reference Kumar, Gupta, Misra, Modi and Pandey2009; El-Esawi, Reference El-Esawi2012). The method is based on selectively amplifying a subset of restriction fragments from a mixture of DNA fragments obtained after genomic DNA digestion with restriction endonucleases (Kumar et al., Reference Kumar, Gupta, Misra, Modi and Pandey2009). Polymorphisms can be detected from the length differences of the amplified fragments by PAGE or by capillary electrophoresis. The technique involves four steps: (1) restriction of DNA and ligation of oligonucleotide adapters, (2) preselective amplification, (3) selective amplification and (4) gel analysis of amplified fragments. The major advantages of AFLPs, which are scored as dominant markers, include the short time required to assay large numbers of DNA loci, the effectively unlimited number of loci, and the greatly enhanced performance in terms of reproducibility, sensitivity and resolution (Dziechciarková et al., Reference Dziechciarková, Lebeda, Doležalová and Astley2004; Kumar et al., Reference Kumar, Gupta, Misra, Modi and Pandey2009).
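The selective amplification step, which reduces the complex mixture of restriction fragments to a scorable subset, can be illustrated with a deliberately simplified model. It assumes that the fragments are already digested, that adapter and restriction-site remnants are omitted, and that a fragment is amplified only if the bases at both of its ends match the selective nucleotides of the two primers; the fragment sequences and selective bases are hypothetical.

```python
def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def aflp_fingerprint(fragments, sel_left, sel_right):
    """Sorted lengths of fragments that survive selective amplification.

    sel_left must match the 5' end of the top strand; sel_right must match
    the 5' end of the bottom strand (the reverse complement of the fragment).
    """
    kept = [
        frag for frag in (f.upper() for f in fragments)
        if frag.startswith(sel_left)
        and reverse_complement(frag).startswith(sel_right)
    ]
    return sorted(len(frag) for frag in kept)


# Hypothetical restriction fragments (adapters and site remnants omitted).
fragments = ["ACGGATCCTTAGC", "ATTTGGA", "CGATCGATC", "AGGC"]

fingerprint = aflp_fingerprint(fragments, sel_left="A", sel_right="G")
```

Under random sequence each additional selective base would be expected to retain roughly one quarter of the fragments, which is how the number of bands per primer pair is kept within a range that can be resolved on a gel.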
AFLP is a widely valued technology for accelerating plant improvement and gene mapping studies (Vos et al., Reference Vos, Hogers, Bleeker, Reijans, Lee van de, Hornes, Frijters, Pot, Peleman, Kuiper and Zabeau1995). AFLP markers have been successfully used for evaluating the genetic diversity and relationships in plant species, for example, sesame (Laurentin and Karlovsky, Reference Laurentin and Karlovsky2006), common bean (Kumar et al., Reference Kumar, Sharma, Kero, Sharma, Sharma, Kumar and Bhat2008), potato (Wang et al., Reference Wang, Li, Wang, Zhou and Sun2011) and Brassica (Lombard et al., Reference Lombard, Baril, Dubreuil, Blouet and Zhang2000; Warwick et al., Reference Warwick, Francis and La Fleche2000, Reference Warwick, James and Falk2008; Genet et al., Reference Genet, Viljoen and Labuschagne2005; Ren-Hu and Jin-Ling, Reference Ren-Hu and Jin-Ling2006; Watson-Jones et al., Reference Watson-Jones, Maxted and Ford-Lloyd2006; van Hintum et al., Reference van Hintum, van de Wiel, Visser, van de Treuren and Vosman2007; Christensen et al., Reference Christensen, Bothmer, Poulsen, Maggioni, Phillip, Andersen and Jørgensen2011; Faltusová et al., Reference Faltusová, Kučera and Ovesná2011; El-Esawi, Reference El-Esawi2012). The studies on Brassica populations showed considerable levels of genetic variation.
Genet et al. (Reference Genet, Viljoen and Labuschagne2005) demonstrated that AFLP is a reliable tool for assessing the genetic diversity of B. carinata. Polymorphic rates varied from 50 to 80%. Cluster analysis divided these genotypes into seven distinct clusters. Watson-Jones et al. (Reference Watson-Jones, Maxted and Ford-Lloyd2006) used AFLP markers to evaluate the genetic variation within three species of Brassica (B. nigra, B. oleracea and B. rapa). The results revealed higher diversity within B. oleracea populations than within those of the other two species; B. oleracea also had the highest range of diversity among populations. van Hintum et al. (Reference van Hintum, van de Wiel, Visser, van de Treuren and Vosman2007) also studied the genetic diversity in B. oleracea using AFLP markers. The average genetic diversity within single accessions was 0.13 and the total diversity (HT) was 0.24.
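The relationship between within-accession diversity and total diversity can be sketched using Nei's gene diversity statistics. This is an illustration under stated assumptions rather than the procedure of the cited study: allele frequencies are taken as known (for dominant AFLP markers they have to be estimated indirectly), accessions are weighted equally, and the frequencies below are hypothetical.

```python
def gene_diversity(freqs):
    """Nei's gene diversity at one locus: 1 - sum(p_i ** 2)."""
    return 1.0 - sum(p * p for p in freqs)


def partition_diversity(pop_freqs):
    """Return (HS, HT, GST) for one locus.

    pop_freqs: one allele-frequency list per accession, same allele order.
    HS is the mean within-accession diversity, HT the diversity of the
    pooled (mean) allele frequencies, and GST = (HT - HS) / HT.
    """
    n = len(pop_freqs)
    hs = sum(gene_diversity(freqs) for freqs in pop_freqs) / n
    mean_freqs = [sum(column) / n for column in zip(*pop_freqs)]
    ht = gene_diversity(mean_freqs)
    gst = (ht - hs) / ht if ht > 0 else 0.0
    return hs, ht, gst


# Hypothetical allele frequencies in two accessions (illustrative only).
pop_freqs = [[0.9, 0.1], [0.5, 0.5]]

hs, ht, gst = partition_diversity(pop_freqs)
```

A GST close to 0 indicates that most diversity lies within accessions, whereas larger values indicate stronger differentiation among accessions.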
Warwick et al. (Reference Warwick, James and Falk2008) assessed the genetic variation and relationships among taxa of B. rapa using seven AFLP primer pairs, which displayed similar levels of polymorphism (84–97%) among accessions. Christensen et al. (Reference Christensen, Bothmer, Poulsen, Maggioni, Phillip, Andersen and Jørgensen2011) also studied the diversity and genetic structure among 17 Brassica accessions using AFLP markers. Several landraces showed higher levels of diversity than the wild populations. An AMOVA showed that 62% of the total variation was found within accessions. Furthermore, Faltusová et al. (Reference Faltusová, Kučera and Ovesná2011) evaluated the genetic diversity of B. oleracea using AFLP markers. A total of 806 polymorphic fragments were found across the accessions. The accessions were clustered into two main groups. Special subgroups, reflecting origin, were observed within these groups.
Microsatellites (SSRs)
Microsatellites, alternatively known as SSRs or short tandem repeats, are sections of DNA consisting of tandemly repeating mono-, di-, tri-, tetra- or penta-nucleotide units that are distributed throughout the genomes of most eukaryotic species (Powell et al., Reference Powell, Marchray and Provan1996; Kumar et al., Reference Kumar, Gupta, Misra, Modi and Pandey2009; El-Esawi, Reference El-Esawi2012). The advantages of microsatellites include the codominance of alleles, their high abundance in eukaryotes and their random distribution throughout the genome, with a preferential association with low-copy regions (Morgante et al., Reference Morgante, Hanafey and Powell2002). Because long PCR primers are used, the reproducibility of microsatellites is high, and their analysis does not require a high quality or quantity of DNA. Although microsatellites are codominant markers, mutations in the primer annealing sites may result in the appearance of null alleles (no amplification of the intended PCR product) (Kumar et al., Reference Kumar, Gupta, Misra, Modi and Pandey2009).
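The definition of a microsatellite as a tandem repeat of a 1 to 5 nucleotide unit translates directly into a pattern search. The following is a minimal sketch that assumes perfect (uninterrupted) repeats and an arbitrary minimum of four repeat units; the threshold, the function name and the example sequence are hypothetical.

```python
import re


def find_ssrs(seq, min_repeats=4, max_unit=5):
    """Return (start, unit, n_repeats) for perfect tandem repeats.

    The lazy quantifier makes the regex try the shortest repeat unit first,
    so ATATATAT is reported as (AT)4 rather than (ATAT)2.
    """
    pattern = re.compile(
        r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_repeats - 1)
    )
    results = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        results.append((match.start(), unit, len(match.group(0)) // len(unit)))
    return results


# Hypothetical sequence containing an (AT)5 and an (AAG)4 repeat.
sequence = "CC" + "AT" * 5 + "GC" + "AAG" * 4 + "T"

ssrs = find_ssrs(sequence)
```

Allelic variation at such a locus corresponds to a different number of repeat units between the flanking primer sites, which is what is scored as a length difference after PCR.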
Microsatellites are ideal markers for gene mapping studies (Jarne and Lagoda, Reference Jarne and Lagoda1996) and for assessing genetic variation and relationships in germplasm collections, for example, Oryza sativa (Chakravarthi and Naravaneni, Reference Chakravarthi and Naravaneni2006), wheat (Iqbal et al., Reference Iqbal, Tabasum, Sayed and Hameed2009), Prunus avium (Ercisli et al., Reference Ercisli, Agar, Yildirim, Duralija, Vokurka and Karlidag2011) and Brassica (Flannery et al., Reference Flannery, Mitchell, Coyne, Kavanagh, Burke, Salamin, Dowding and Hodkinson2006; Hasan et al., Reference Hasan, Seyis, Badani, Pons-Kuhnemann, Friedt, Luhs and Snowdon2006; Louarn et al., Reference Louarn, Torp, Holme, Andersen and Jensen2007; Ofori et al., Reference Ofori, Becker and Kopisch-Obuch2008; Moghaddam et al., Reference Moghaddam, Mohammmadi, Mohebalipour, Toorchi, Aharizad and Javidfar2009; Wang et al., Reference Wang, Kaur, Cogan, Dobrowolski, Salisbury, Burton, Baillie, Hand, Hopkins, Forster, Smith and Spangenberg2009a, Reference Wang, Cavell, Alwi and Packhamb; El-Esawi, Reference El-Esawi2012; El-Esawi et al., Reference El-Esawi, Germaine and Malone2012b; Wu et al., Reference Wu, Li, Xu, Gao, Chen, Yan, Wang, Qiao, Li, Li, Zhang, Song and Wu2014). The above studies on Brassica populations helped in understanding their genetic variability and taxonomic relationships.
Flannery et al. (Reference Flannery, Mitchell, Coyne, Kavanagh, Burke, Salamin, Dowding and Hodkinson2006) used ten plastid SSR primer sets to detect polymorphism in Brassica, Arabidopsis, Camelina, Raphanus and Sinapis. Eight loci were polymorphic. SSR data separated the individuals of Brassicaceae into taxon-specific groups (Arabidopsis, Camelina, Sinapis and Brassica genera). Within Brassica, B. oleracea was separated from B. napus and B. rapa. Louarn et al. (Reference Louarn, Torp, Holme, Andersen and Jensen2007) also evaluated 59 B. oleracea cultivars for microsatellite polymorphisms. All SSR markers, except one, had a polymorphic information content (PIC) value of 0.5 or above. Ofori et al. (Reference Ofori, Becker and Kopisch-Obuch2008) used 16 microsatellite markers to evaluate the genetic diversity in European winter B. rapa. The results showed that the majority of genetic variation (83%) resided within cultivars. Furthermore, Moghaddam et al. (Reference Moghaddam, Mohammmadi, Mohebalipour, Toorchi, Aharizad and Javidfar2009) used microsatellite and RAPD markers to assess the genetic variability among 32 rapeseed cultivars. The PIC of microsatellite markers varied from 0.60 to 0.91. Na12-C01, a microsatellite marker that amplifies two different genomic regions in the Brassica genome, was detected in the spring cultivars in one of the two regions.
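The PIC values quoted above can be related to allele frequencies with a short sketch. It assumes the commonly used definition PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2; the cited studies do not state here which formula they applied, and the allele frequencies below are hypothetical.

```python
from itertools import combinations


def pic(allele_freqs):
    """Polymorphic information content of one marker.

    PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2
    """
    if abs(sum(allele_freqs) - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in allele_freqs)
    correction = sum(
        2 * (p * p) * (q * q) for p, q in combinations(allele_freqs, 2)
    )
    return 1.0 - homozygosity - correction


# Hypothetical markers: two equally frequent alleles vs. four.
pic_two_alleles = pic([0.5, 0.5])
pic_four_alleles = pic([0.25, 0.25, 0.25, 0.25])
```

Under this definition a marker with two equally frequent alleles cannot exceed a PIC of 0.375, so values of 0.5 or above, as reported for most SSR markers here, imply more than two alleles per locus.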
Genome sequencing and SNPs
Genomics technologies apply recombinant DNA, DNA sequencing methods, and bioinformatics to sequence, assemble, and analyse the structure and function of genomes (NHGRI, 2014). Sequencing aims to determine the exact order of the bases in a DNA strand. Next-generation sequencing (NGS), also known as high-throughput sequencing, is a term used to describe a number of different novel sequencing technologies including Illumina (Solexa), Roche 454, Ion Torrent Proton/PGM and SOLiD (NHGRI, 2014). These technologies sequence DNA and RNA more quickly and cheaply than the previously used Sanger sequencing, and as such have revolutionized the study of genomics and molecular biology. DNA sequencing can be used to assess mutations that may play a role in developing diseases (NHGRI, 2014). The mutations may be substitutions of a single base (SNPs) or insertions and deletions (INDELs), which range from a single base pair to thousands of bases. Besides their abundance in genomes, SNP markers have the advantages of being codominant and amenable to high-throughput automation (Tsuchihashi and Dracopoli, Reference Tsuchihashi and Dracopoli2002). Whole-genome sequencing and SNP markers have recently been used in genetic analyses of Brassica and other plants, such as phylogenetic analysis, taxonomy, estimation of genetic variation and population structure, genome-wide association studies and construction of genetic linkage maps (Trick et al., Reference Trick, Long, Meng and Bancroft2009; Bancroft et al., Reference Bancroft, Morgan, Fraser, Higgins, Wells, Clissold, Baker, Long, Meng and Wang2011; Lai et al., Reference Lai, Duran, Berkman, Lorenc, Stiller, Manoli, Hayden, Forrest, Fleury and Baumann2012; Huang et al., Reference Huang, Deng, Guan, Li, Lu, Wang, Fu, Mason, Liu and Hua2013).
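The distinction between substitutions (SNPs) and insertions or deletions (INDELs) can be shown on a pair of already aligned sequences. This is a minimal sketch that assumes the alignment is given, with gaps written as "-", and that reports each differing column separately; the sequences and the function name are hypothetical, and real variant calling from NGS reads involves considerably more steps.

```python
def classify_variants(ref_aln, alt_aln, gap="-"):
    """List (position, type, ref_base, alt_base) for each differing column
    of two aligned sequences of equal length."""
    if len(ref_aln) != len(alt_aln):
        raise ValueError("aligned sequences must have the same length")
    variants = []
    for pos, (ref, alt) in enumerate(zip(ref_aln.upper(), alt_aln.upper())):
        if ref == alt:
            continue
        if ref == gap:
            variants.append((pos, "insertion", ref, alt))
        elif alt == gap:
            variants.append((pos, "deletion", ref, alt))
        else:
            variants.append((pos, "SNP", ref, alt))
    return variants


# Hypothetical aligned reference and sample sequences.
reference = "ACGTAC-GT"
sample = "ACCTACAG-"

variants = classify_variants(reference, sample)
```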
Lai et al. (Reference Lai, Duran, Berkman, Lorenc, Stiller, Manoli, Hayden, Forrest, Fleury and Baumann2012) sequenced leaf transcriptomes across a mapping population of B. napus. Analysis of sequence variation and transcript abundance enabled the construction of SNP linkage maps of B. napus comprising 23,037 markers. They also used these analyses to align the B. napus genome with that of Arabidopsis thaliana and with genome sequence assemblies of B. rapa and B. oleracea. Huang et al. (Reference Huang, Deng, Guan, Li, Lu, Wang, Fu, Mason, Liu and Hua2013) identified a total of 892,536 bi-allelic SNPs throughout the B. napus genome. Using the GoldenGate genotyping platform, 94 of 96 sampled SNPs could effectively differentiate the genotypes of 130 lines from two mapping populations, with an average call rate of 92%. The SNPs identified in that study could also be used to directly identify causal genes in association studies. In conclusion, current advances in Brassica genomics are expected to stimulate researchers to use commercially available SNP-based genechips for fast and cost-efficient breeding and population genetic studies in the near future. Sequencing platforms are also expected to continue to improve in output length and quality, and the complementary algorithms and bioinformatic software required to handle large genomes are expected to be enhanced.
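Two of the quantities mentioned for SNP genotyping, the call rate and the number of SNPs that differentiate genotypes, can be computed from a genotype table as sketched below. The sketch assumes that a missing call is stored as None and that a SNP is informative when at least two different genotypes are called among the lines; the SNP names and genotypes are hypothetical and are not data from the cited study.

```python
def call_rate(calls):
    """Proportion of lines with a non-missing genotype call for one SNP."""
    return sum(call is not None for call in calls) / len(calls)


def is_informative(calls):
    """True if at least two different genotypes were called among the lines."""
    return len({call for call in calls if call is not None}) > 1


# Hypothetical genotype calls for three SNPs across four lines.
genotypes = {
    "snp1": ["AA", "AA", "GG", None],
    "snp2": ["CC", "CC", "CC", "CC"],
    "snp3": ["TT", "AT", None, "AA"],
}

mean_call_rate = sum(call_rate(c) for c in genotypes.values()) / len(genotypes)
informative = [snp for snp, calls in genotypes.items() if is_informative(calls)]
```

By the same arithmetic, 94 informative SNPs out of 96 sampled corresponds to about 97.9%.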
Conclusions
In this paper, the issues and recent knowledge of genetic diversity and phylogenetic studies of Brassica genetic resources were reviewed at different levels, extending from morphological traits through cytological and biochemical traits to advanced molecular markers, all of which have proven to be efficient for assessing genetic variability, relationships and species identity. This information could potentially be used to enhance future breeding programmes for agronomically important Brassica species, as well as to improve understanding of their phylogeny and to refine propagation and conservation strategies. Furthermore, it may be utilized for marker-aided selection and quantitative trait loci analyses.
Acknowledgements
This work was supported by Tanta University in Egypt. The author declares that there are no conflicts of interest.