Introduction
India is home to a large number of rice landraces (Pachauri et al., 2013), with all the major rice-growing states having their preferred assemblage (Singh and Singh, 2003). Adaptation to wide-ranging edaphic and climatic factors has shaped >120,000 of them with extensive phenotypic variations (Khush, 1997). Rice accessions from the northeastern (NE) states of India are regarded as a storehouse of agronomically important genes (Choudhury et al., 2013; Das et al., 2013) as they have endured the extreme conditions of this mountainous terrain. It is estimated that about 10,000 such cultivars/landraces of ecological and cultural significance are preserved here (Hore, 2005; Das et al., 2013), and they still await in-depth evaluation.
Sikkim has considerable antiquity in rice cultivation (Rahman and Karuppaiyan, 2011). Rice is grown here at elevations ranging from the foothills to 1700 m above sea level, and in climates from sub-tropical to temperate. It is the second-largest crop after maize, occupying 15% (10,327 ha) of the area under agriculture (Portel, 2015). As the state receives plenty of rainfall (3250 mm/annum), well distributed over 6 months from May to October, rice farming has emerged as the major source of livelihood for the majority of the indigenous communities inhabiting this pristine land. Landraces, which are grown in a traditional way on the hill slopes with and without terracing, dominate the rice production here with a 55–60% share in acreage (Rahman and Karuppaiyan, 2011). More than 70 such landraces have been reported so far (Sharma et al., 2016), and they await detailed characterization. However, barring a few scattered works (Rahman and Karuppaiyan, 2011; Choudhury et al., 2013; Das et al., 2013; Sharma et al., 2016), not much has been done in this direction.
The majority of these landraces suffer from poor yield, lodging and late maturity, although they boast other superior traits such as hardiness, pest tolerance/resistance, low input requirements and weed coexistence (Sharma et al., 2015). In addition, the soils in Sikkim are highly acidic, leached and generally poor in fertility and water holding capacity (Das and Avasthe, 2016). Even though high-yielding varieties (HYV) are available (https://icar-nrri.in/released-varieties), their introduction in place of existing landraces/cultivars is perceived as detrimental to the native gene pool owing to evident genetic erosion (Rathi et al., 2018). Moreover, farmers still prefer landraces over HYVs due to inherent cultural preferences and low input costs (Kapoor et al., 2017). Thus, developing high-yielding, lodging-resistant varieties tolerant to acidic soils by upgrading the prevailing landraces through marker-assisted backcross breeding (MAB) is the need of the hour to simultaneously achieve productivity and acceptability for rice production in Sikkim. More importantly, this will safeguard the local gene pool and cropping system, a heritage which has survived for generations under the custodianship of the native people.
DNA markers have been the tool of choice for the genetic characterization of rice genotypes (Karkousis et al., 2003). Among them, simple sequence repeats (SSRs) are preferred due to their polymorphic nature, abundance in the eukaryotic genome and ease of generation (Tautz, 1989; Morgante and Olivieri, 1993). In rice, SSRs have been used earlier for the assessment of genetic diversity (Rahman et al., 2012; Choudhury et al., 2013; Singh et al., 2016; Suvi et al., 2019), conservation planning (Sharma et al., 2007), marker-assisted selection (Perez-Sackett et al., 2011; Rani and Adilakshmi, 2011), cultivar identification and hybrid purity analysis, besides gene mapping (Weising et al., 1997; Altaf-Khan et al., 2006; Sarao et al., 2010).
Further, any MAB initiative entails the identification of candidate genes/Quantitative Trait Loci (QTLs) for the traits of interest. Typically, linkage mapping and/or association analysis is used for this purpose. The major shortcoming of linkage analysis is that only two alleles at any given locus can be considered in bi-parental crosses, which provides a low mapping resolution (Flint-Garcia et al., 2003). Alternatively, association mapping detects QTLs by exploiting the natural diversity and identifies important genes (Zhu et al., 2008). It has been successfully used in several crop species (Abdurakhmonov and Abdukarimov, 2008) including rice (Famoso et al., 2011; Huang et al., 2012; Zhou et al., 2012; Han and Huang, 2013). The present study was therefore conceived to (a) assess the genetic relationship and population structure among the 53 rice landraces and (b) test their efficacy for the association analysis and advance early insights on the loci controlling the key agronomic (7) and nutritional quality traits (14).
Materials and methods
Plant materials
The rice landraces for the present study were sourced from different agro-climatic regions in Sikkim through collection trips during the years 2017–19. The information on varietal preference, landrace name, traditional uses, etc., was acquired through structured interviews. Prior informed consent was obtained for the accessions collected from the farmers. The collection strategy aimed to capture maximum variability with a minimum number of representative accessions, which were identified from the structured interview data. Based on the preliminary screening, a total of 68 landraces were shortlisted. After eliminating the redundant collections, 53 were retained for the phenotypic and genetic characterization (online Supplementary Table S1). The plants were raised during the Kharif season of 2019 at the experimental farm situated at Nandok, East Sikkim, Sikkim state (27°17′36″ N and 88°36′18″ E) at an elevation of 972 m above sea level. Initially, seeds were sown in a raised bed nursery and, after 30 days, seedlings were transplanted in the field. The experiment was laid out in a randomized block design with three replications. Each entry was transplanted in three rows with a spacing of 20 cm between rows and 14 cm between plants. Standard agronomic practices were followed for crop growth.
Phenotyping
Twenty-one quantitative and qualitative traits of agronomic (7), quality (9) and nutritional (5) significance were evaluated (online Supplementary Table S2). For the seven agronomic traits, namely days to 50% flowering, flag leaf length (cm), flag leaf width (cm), stem thickness (cm), plant height (cm), panicles per plant and number of effective tillers per plant, observations were recorded on five randomly selected plants as per the method prescribed by the National test guidelines for DUS (PPV & FR Act, 2007). For the measurement of quality traits such as grain length, grain width, grain thickness, kernel width and kernel length, 10 grains were chosen randomly and measured in millimetres (mm) using a digital Vernier caliper following the National test guidelines for DUS (PPV & FR Act, 2007). In the case of cooked kernels, 5 g of the dehusked samples were soaked in 15 ml water for 10 min and cooked for 20 min. After cooling, the grains were measured using a digital Vernier caliper. For 100-grain weight, three replicates of 10 whole grains, dried to 13% moisture content, were chosen randomly and weighed in grams (g), and then the average was multiplied by 10. Proximate analysis of crude protein, carbohydrate, crude fat and crude fibre was carried out following the protocols of the Association of Official Analytical Chemists (AOAC, 2004) and the American Association of Cereal Chemists (AACC, 2000). The amylose value was determined using the standard protocol (Juliano, 1979).
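The 100-grain weight calculation described above can be sketched as follows. This is a minimal illustration only; the function name and the replicate weights are hypothetical and are not taken from the study data.

```python
from statistics import mean

def hundred_grain_weight(replicate_weights_g):
    """Scale the mean weight of 10-grain replicates up to a 100-grain weight (g)."""
    # Each replicate is the weight of 10 grains, so the mean is multiplied by 10.
    return mean(replicate_weights_g) * 10

# Hypothetical weights (g) of three replicates of 10 whole grains each
weights = [0.24, 0.25, 0.26]
hgw = hundred_grain_weight(weights)
print(f"100-grain weight: {hgw:.2f} g")
```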
Genotyping
Genomic DNA was isolated from fresh young leaves (14–20 days) following the modified cetyl-trimethylammonium bromide method (Doyle and Doyle, 1987). For genotyping, 45 SSR primer pairs (online Supplementary Table S4) used previously for the association analysis (Islam et al., 2018b) were downloaded from the Gramene's marker database (http://www.gramene.org) and primers were synthesized. These markers offered coverage on all the 12 chromosomes of the rice genome. DNA amplification was carried out in a BIORAD T100 thermal cycler (USA) with a reaction mixture containing 1 μl of genomic DNA (50 ng), 0.5 μl of each primer (10 pmol/μl), 6 μl of PCR master mix (Nucleo Spin Takara) and 2 μl of PCR grade water. PCR amplification was performed with an initial denaturation at 94 °C for 4 min followed by 35 cycles of denaturation, annealing and extension at 94 °C for 1 min, 55 °C for 1 min and 72 °C for 1 min with a final extension at 72 °C for 5 min. The PCR products were resolved on a 3% agarose gel using 1× Tris-boric acid-EDTA (TBE) buffer. After staining with the ethidium bromide, bands were visualized using the Gel Documentation system (BIORAD-Chemi Doc XRS+, USA). All PCR reactions were repeated at least twice before the data analysis.
Allele scoring
Post amplification, a cluster of two to five distinct bands was visible in the stained gel for the majority of the markers. The size of the most intensely amplified band of each microsatellite marker was determined by comparing it with the standard 50 bp DNA ladder. For individual markers, the presence of an allele was recorded as ‘1’ and the absence as ‘0’. The representative gel images for SSR marker RM 314 (a and a1) and RM 253 (c and c1) are provided in online Supplementary Fig. S1.
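The presence/absence scoring described above can be sketched as a small routine that turns the band sizes read from a gel into a binary matrix for one marker. The accession labels and band sizes below are hypothetical, and the function is an illustrative assumption rather than the scoring procedure actually used.

```python
def score_alleles(band_sizes_by_accession):
    """Build a presence (1) / absence (0) matrix for a single SSR marker.

    band_sizes_by_accession: dict mapping accession name -> set of band sizes (bp).
    Returns the sorted list of allele sizes and a dict accession -> list of 0/1.
    """
    # Every distinct band size observed across accessions is treated as one allele.
    alleles = sorted({size for sizes in band_sizes_by_accession.values() for size in sizes})
    matrix = {
        accession: [1 if allele in sizes else 0 for allele in alleles]
        for accession, sizes in band_sizes_by_accession.items()
    }
    return alleles, matrix

# Hypothetical band sizes (bp) estimated against a 50 bp ladder
bands = {"acc_A": {150}, "acc_B": {160}, "acc_C": {150, 160}}
alleles, matrix = score_alleles(bands)
for accession, row in matrix.items():
    print(accession, row)
```

An accession showing both bands (acc_C here) is scored 1 for both alleles, which is how heterozygosity at a locus would appear in such a matrix.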
Data analysis
Power Marker V.3.25 (Liu and Muse, 2005) was used to calculate the major allele frequency (MAF) and the polymorphism information content (PIC). The marker characteristics such as observed heterozygosity (Ho), expected heterozygosity (He), total number of alleles (Na), effective number of alleles (Ne) and percentage of polymorphic loci (%P) were computed using GenAlEx V.6.5 (Peakall and Smouse, 2012). The genetic diversity indices such as total genetic diversity (Ht), genetic diversity within populations (Hs), genetic differentiation (Gst), Shannon's information index (I) and Nei's gene diversity (h) were determined using POPGENE V.1.32 (Yeh et al., 1999). The distance-based clustering was performed with the Unweighted Pair Group Method with Arithmetic mean (UPGMA) using FreeTree software (Pavlicek et al., 1999). The robustness of the UPGMA dendrogram was tested with a bootstrap analysis using 1000 iterations. The Principal Coordinate Analysis (PCoA) plot was constructed using DARwin V.6.0 (Perrier and Jacquemound-Collet, 2006).
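For readers unfamiliar with these indices, the sketch below computes several of them for a single locus from allele counts, using the textbook definitions: He = 1 − Σp², Ne = 1/Σp², I = −Σp ln p, and the PIC formula of Botstein et al. (1980). It is assumed here, not confirmed by the text, that the cited programs use these same definitions; the allele counts are hypothetical.

```python
import math

def locus_stats(allele_counts):
    """Per-locus diversity statistics from a list of allele counts."""
    total = sum(allele_counts)
    p = [count / total for count in allele_counts if count > 0]  # allele frequencies
    sum_sq = sum(x * x for x in p)
    he = 1 - sum_sq                      # expected heterozygosity (gene diversity)
    ne = 1 / sum_sq                      # effective number of alleles
    # PIC subtracts the term for uninformative matings between identical heterozygotes.
    cross = sum(
        2 * p[i] ** 2 * p[j] ** 2
        for i in range(len(p))
        for j in range(i + 1, len(p))
    )
    pic = he - cross
    shannon = -sum(x * math.log(x) for x in p)  # Shannon's information index
    return {"MAF": max(p), "Na": len(p), "Ne": ne, "He": he, "PIC": pic, "I": shannon}

# Hypothetical locus with three alleles observed 60, 30 and 10 times
stats = locus_stats([60, 30, 10])
for name, value in stats.items():
    print(f"{name}: {value:.4f}")
```

Because Ne weights alleles by their frequency, it is always less than or equal to Na, and the gap widens as one allele becomes dominant.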
Population structure
Using STRUCTURE V. 2.3.4 software (Pritchard et al., 2000), the Bayesian model was implemented to ascertain the optimal genetic clusters within the 53 rice accessions. The admixture model with correlated allele frequencies was applied for each run, with a burn-in period of 100,000 iterations followed by 200,000 Markov Chain Monte Carlo (MCMC) iterations. The optimum K value was determined from 10 replicate runs for each value of K (Evanno et al., 2005). The ΔK was based on the change in the log probability [LnP(D)] of the data between successive K values. To obtain a clear peak at the ΔK, the output from the structure analysis was loaded onto STRUCTURE HARVESTER V. 6.0 (Earl and Von Holdt, 2012). Analyses of molecular variance (AMOVA) and pairwise FST were carried out by GenAlEx V. 6.5.
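The ΔK statistic of Evanno et al. (2005) can be sketched as below: the absolute second-order difference of the mean LnP(D) across successive K values, divided by the standard deviation of LnP(D) over the replicate runs at that K. The LnP(D) values are hypothetical, consecutive K values are assumed, and the function is an illustrative sketch rather than the code of STRUCTURE HARVESTER.

```python
from statistics import mean, stdev

def evanno_delta_k(lnp_by_k):
    """Compute delta K for every K that has a neighbour on both sides.

    lnp_by_k: dict mapping K -> list of LnP(D) values from replicate runs.
    Assumes consecutive K values and non-zero variance among replicates.
    """
    ks = sorted(lnp_by_k)
    means = {k: mean(values) for k, values in lnp_by_k.items()}
    delta_k = {}
    for lower, k, upper in zip(ks, ks[1:], ks[2:]):
        second_diff = abs(means[upper] - 2 * means[k] + means[lower])
        delta_k[k] = second_diff / stdev(lnp_by_k[k])
    return delta_k

# Hypothetical LnP(D) values from three replicate runs per K
runs = {
    1: [-5000.0, -5002.0, -4998.0],
    2: [-4700.0, -4704.0, -4696.0],
    3: [-4400.0, -4401.0, -4399.0],
    4: [-4390.0, -4395.0, -4385.0],
    5: [-4385.0, -4380.0, -4390.0],
}
delta_k = evanno_delta_k(runs)
best_k = max(delta_k, key=delta_k.get)
print(delta_k, "best K:", best_k)
```

Note that ΔK is undefined for the smallest and largest K tested, so the range of K should extend beyond the number of clusters expected.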
Marker-trait association
TASSEL V. 5.0 software (Bradbury et al., 2007) was used to perform the marker-trait association (MTA) analysis. Both models, the general linear model (GLM) and the mixed linear model (MLM), were tested. The major limitation of the GLM is that it does not incorporate the kinship matrix (K matrix) and relies only on population structure (Q matrix) for finding an MTA, which sometimes results in false positives. This limitation is overcome by the MLM, which incorporates both the Q and K matrices, reducing spurious associations (Yu et al., 2006). The significance of each MTA was tested at P < 0.05, and the coefficient of determination (R² ≥ 10%) was used to determine the phenotypic variation explained by each MTA.
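To make the R² criterion concrete, the sketch below estimates the proportion of phenotypic variance explained by a single marker allele scored as 1/0. It is a deliberately simplified single-marker regression that ignores both the Q and K matrices, so it is not the GLM or MLM as implemented in TASSEL; the genotype scores and kernel lengths are hypothetical.

```python
def marker_r_squared(genotypes, phenotypes):
    """R squared of a simple regression of a trait on a 0/1 marker score."""
    n = len(genotypes)
    mean_g = sum(genotypes) / n
    mean_y = sum(phenotypes) / n
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    syy = sum((y - mean_y) ** 2 for y in phenotypes)
    sxy = sum((g - mean_g) * (y - mean_y) for g, y in zip(genotypes, phenotypes))
    if sxx == 0 or syy == 0:
        # A monomorphic marker or an invariant trait explains no variance.
        return 0.0
    return sxy ** 2 / (sxx * syy)

# Hypothetical data: allele presence (1/0) and kernel length (mm) for six accessions
allele_present = [1, 1, 1, 0, 0, 0]
kernel_length = [6.1, 6.3, 6.2, 5.4, 5.6, 5.5]
r2 = marker_r_squared(allele_present, kernel_length)
print(f"R2 = {r2:.1%}, passes 10% threshold: {r2 >= 0.10}")
```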
Results
Marker characteristics
In total, 45 SSR primer pairs were used to genotype 53 rice landraces, of which 42 (93.33%) were polymorphic, two (RM 25 and RM 455) were monomorphic, and one (RM 170) failed to amplify. These 42 polymorphic SSR primer pairs were used in all the subsequent analyses. A total of 227 alleles were detected, which ranged from 2 (in many) to 11 (RM 20) with an average of 5.262 alleles per locus. The major allele frequency (MAF) was high with a mean value of 0.679 and ranged from 0.460 to 0.930. The number of different alleles (Na) per locus varied from 1 to 9 with a mean value of 4.721, while the number of effective alleles (Ne) per locus varied from 1.210 to 6.384 with a mean of 3.344. The fact that Na > Ne indicates that only a handful of alleles contributed to the overall diversity. The expected heterozygosity (He) ranged from 0.175 (RM 322) to 0.842 (RM 284) with an average of 0.632. The observed heterozygosity (Ho) had a mean value of 0.222, and ranged from 0.00 (RM 237, RM 513, RM 178, RM 277, RM 271, RM 44) to 0.98 (RM 514). The value of PIC, which measures the extent of polymorphism revealed by the markers based on the number of alleles per locus, ranged from 0.121 (RM 322) to 0.375 (RM 44) with an average of 0.323. While the major portion of the microsatellite markers (34) was moderately polymorphic with PIC values ranging between 0.25 and 0.50, eight markers generated low polymorphism with PIC < 0.25. The details of the marker attributes are provided in online Supplementary Table S3.
Genetic diversity
The mean values of Nei's gene diversity (h) and Shannon's information index (I) in our landrace collection were 0.217 and 0.354 respectively signifying moderate genetic diversity. The mean value of total gene diversity (Ht = 0.322) further supported this. Two populations were assumed based on the presence/absence of aroma trait of which the first population (population-1) comprised of 16 aromatic accessions, while the second population (population-2) represented 37 non-aromatic accessions. In population-1, about 97.67% of SSR markers were polymorphic, while 96.51% were polymorphic in population-2. Among the two groups, the aromatic group with I = 0.365 and h = 0.223 was slightly more diverse vis-a-vis a non-aromatic group with I = 0.343 and h = 0.209. The summary of genetic diversity is presented in Table 1.
Na, total number of alleles per locus; Ne, number of effective alleles per locus; I, Shannon's information index; Ho, observed heterozygosity; He, expected heterozygosity; SE, standard error.
The UPGMA dendrogram resolved 53 landraces, with significant bootstrap support, into two major clusters (Fig. 1) – CL-I and CL-II – with a single outlier accession. Of these, CL-I included 48 accessions belonging to both aromatic and non-aromatic groups and was further divided into two smaller groups – sub-cluster-IA (SCL-IA) and sub-cluster-IB (SCL-IB). Interestingly, all the aromatic accessions such as Raja bara, Ram jeera, Ram bhog, Rudhua, Shyam jeera, Krishna bhog, Kalo dhan, Kalo nunia-I, Kalo nunia-II, Kaanchi, Dharnali and Birinful grouped within SCL-IB with a few non-aromatic rice accessions, while SCL-IA was entirely composed of non-aromatic rice accessions (Table 2). The CL-II encompassed only four non-aromatic accessions namely Taprey, Thamba, Thulo tulashi and Tulashi. Doodh kalami was an outlier in UPGMA cluster. Overall, the grouping did not fully correspond to either the geographical origin or aroma trait except for the noticeable influence of aroma in the grouping of SCL-IB accessions. A similar dispersion was observed in the PCoA (Fig. 2) plot where the first two axes accounted for 14.82 and 8.86% of the variation, respectively.
Population structure
The population structure analysis using STRUCTURE revealed K = 3 as the highest log-likelihood value signifying three genetic groups within our landrace collection (Fig. 3). These were named RC-I, RC-II and RC-III, respectively. Based on the Q-value (membership proportions), accessions were further designated as pure (Q-value ≥80%) and admixtures (Q-value <80%) which revealed 39 pure and 14 admixtures in our collection. Out of the 39 pure accessions, 19 were grouped with RC-I with five aromatic accessions, 13 aligned in RC-II with nine aromatic accessions, while RC-III included seven non-aromatic accessions only. The analysis of molecular variance (AMOVA) showed 99% of the molecular variance was contributed by within-population differences, while only 1% was due to variation between the populations. The results of AMOVA displayed highly significant genetic differences (P < 0.001) among the individuals and within the individuals.
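The pure/admixture designation described above reduces to a threshold on the largest membership coefficient of each accession. A minimal sketch follows, assuming Q-values are given as proportions in the cluster order RC-I, RC-II, RC-III; the example Q-values are hypothetical.

```python
def classify_accession(q_values, threshold=0.80):
    """Return (cluster number, status) for one accession.

    q_values: membership proportions for each cluster (summing to about 1).
    An accession is 'pure' if its largest Q-value reaches the threshold,
    otherwise it is an 'admixture'.
    """
    best = max(range(len(q_values)), key=lambda i: q_values[i])
    status = "pure" if q_values[best] >= threshold else "admixture"
    return best + 1, status

# Hypothetical Q-values for three accessions at K = 3
print(classify_accession([0.92, 0.05, 0.03]))  # largest share in cluster 1, above threshold
print(classify_accession([0.55, 0.40, 0.05]))  # largest share in cluster 1, below threshold
print(classify_accession([0.10, 0.80, 0.10]))  # exactly at the threshold counts as pure
```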
Marker-trait association
The MTA analysis for the two trait groups – biochemical and yield-related – was carried out using 42 polymorphic SSR markers across 53 rice accessions. Counting both GLM (62) and MLM (53), a total of 115 significant MTAs were observed (P < 0.05) for 21 traits which were contributed by 71% (30) of the SSR markers. In total, 25 MTAs for 17 traits were identical in both the models (Table 3), and the remaining five MTAs were detected in either of them.
In the case of GLM, a total of 62 significant MTAs (P < 0.05, R² ≥ 10%) were found, with the number per trait ranging from a minimum of 1 to a maximum of 7. The minimum number of MTAs was detected for 100-grain weight, kernel width after cooking, crude fat and amylose content, while the maximum number of MTAs was recorded for grain thickness. The R² value for GLM ranged from 10 to 26%. The highest R² value (26%) was between carbohydrate content and marker RM 528. In the case of MLM, 53 significant (P < 0.05, R² ≥ 10%) MTAs were detected. The minimum number of MTAs (1) was detected for grain width, kernel width after cooking, kernel length after cooking, number of effective tillers, stem thickness and amylose value. The maximum number of MTAs (4) was detected for 100-grain weight, number of panicles per plant, flag leaf width, carbohydrate content and crude fibre content. The R² value for MLM ranged from a minimum of 10% in several MTAs to a maximum of 22%. The highest R² value (22%) was found between the kernel length and marker RM 489. In addition, the following seven markers were associated with two or more traits: RM 5 (associated with grain width and kernel width); RM 408 (associated with grain length and crude fibre); OSR 13 (associated with grain thickness and effective tillers per plant); RM 551 (associated with grain thickness and number of panicles per plant); RM 10 (associated with grain thickness and kernel thickness); RM 489 (associated with 100-grain weight and kernel length); RM 447 (associated with the number of panicles per plant and carbohydrate content). Further, the markers RM 447, RM 6, RM 408 and RM 118 were found to be associated with carbohydrate, crude protein, crude fibre and crude fat contents. Similarly, markers OSR 13, RM 551 and RM 10 were associated with grain thickness, and RM 489, RM 215 and RM 44 were associated with kernel length only. Likewise, RM 551, RM 447 and RM 515 were associated with the number of panicles per plant.
Discussion
Rice has great antiquity in the Sikkim-Himalayan region as evident from the etymological significance of its old name ‘Denzong’ meaning ‘the valley of rice’. Rice farming has been an inspiring enterprise here as it has evolved in response to the dynamic needs of a multi-ethnic society with diverse food habits and limited landholdings, inhabiting difficult topographies in a harsh climate. Their enduring creative labour, for centuries, has led to the selection of an array of landraces, which occupy a significant niche in the local rice production system (Rahman and Karuppaiyan, 2011). However, of late, like other Himalayan states, rice production in Sikkim is witnessing a steady shift from the traditional to modern agro-ecosystem. Consequently, landraces are slowly being phased out by the high-yielding varieties (FS&ADD, 2018–19). It is thus cautioned that – in the absence of appropriate scientific intervention – indigenous rice landraces and their farming technologies in the region might become history with time (Kambewa et al., 1997; Zuberi, 1997).
The majority of these landraces suffer from low yield and lodging (Sharma et al., 2016). Besides, high soil acidity is adversely impacting their production potential (Das and Avasthe, 2016). Identifying favourable markers/alleles for yield, tolerance to high soil acidity and lodging resistance via association mapping can provide fast and efficient means to strategize marker-assisted breeding programmes to improve these traits. Before that, there is a need to adequately characterize the local germplasm and generate information related to diversity and population structure. Hence, the main aim of the present work was to determine the genetic diversity and relationship among the rice landraces of Sikkim. In addition, we have explored the prospects of association mapping through an extended experiment.
In our genetic diversity estimation, the number of alleles ranged from 2 to 11 with a mean value of 5.27 per locus. This suggests superior allelic diversity in our germplasm when compared with aromatic rice varieties of Assam (3.7; Talukdar et al., 2017), rice varieties released in India during 1940–2013 (3.11; Singh et al., 2016) as well as elite rice varieties of Bangladesh (4.18; Rahman et al., 2012) and Malaysia (4.09; Aljumaili et al., 2018); but lower as compared with eastern Himalayan landraces (Choudhury et al., 2013; 7.9 – Das et al., 2013; 8.49 – Roy et al., 2016). The variability in the number of alleles/locus observed in pan-Himalayan studies, including ours, reflects the diverse nature of the rice germplasm of this region. Besides, the number of effective alleles per locus ranged from 1.21 to 6.84, with a mean of 3.34, a value close to that (3.77) reported by Chen et al. (2017), indicating that a good number of alleles contributed to the genetic diversity. The major allele frequency across all the 42 SSR primers ranged from 0.46 (RM 277) to 0.93 (RM 322) with an average of 0.679 (online Supplementary Table S3), which is higher than in the previous studies on Indian (0.53; Upadhyay et al., 2012) and Korean landraces (0.5; Li et al., 2014). The greater number of alleles indicates the usefulness of the selected SSR markers for genetic studies.
The PIC of a marker is a good measure of its usefulness for linkage analysis (Elston, 2014). Often, PIC values are a reflection of allelic diversity among the varieties (Meti et al., 2013). In our study, a moderate PIC value with a mean of 0.32 (range 0.121–0.375) was observed for 53 accessions. A similar PIC value (0.37) was reported earlier by Choudhury et al. (2013) in rice landraces of NE states of India. Saha et al. (2013) and Pachauri et al. (2013) also found mean PIC values of 0.37 and 0.38 in their studies on different Indian rice varieties. However, higher PIC values up to 0.74 and 0.65 have also been reported in some other studies from the eastern Himalayan region (Das et al., 2013; Roy et al., 2016). This may be due to specific variability held by these landraces. Further, as PIC and inbreeding coefficient (FIS) are also functions of how heterozygosity is partitioned within and among the accessions based on differences in allele frequencies (Mulualem et al., 2018), it can be inferred that a large majority of the loci targeted by our markers are homozygous.
Our landraces revealed a gene diversity estimate of 0.217, which is much lower than the 0.51 reported by Umakanth et al. (2017) for rice landraces of northeast India. The mean observed heterozygosity was 0.217, which is analogous to other studies on Indian rice varieties (Choudhury et al., 2014; Nachimuthu et al., 2015). Shannon's information index (I) ranged from 0.343 to 0.365 with an average of 0.354. This is lower than the findings of Aljumaili et al. (2018) and Suvi et al. (2019), who reported an index of 0.88 and 0.82, respectively. The moderate value of Shannon's information index is another indication of the moderate genetic diversity among the on-hand germplasm. Further, the inbreeding coefficient (FIS) represents the average deviation of the population's genotypic proportions from Hardy–Weinberg equilibrium for a locus. The FIS values in our study revealed that only three of the 42 markers (RM 215, OSR 13 and RM 514) showed an excess of heterozygotes (−0.081, −0.085 and −0.27).
In our UPGMA analysis, landraces from the same geographic locations, or sharing similar traits such as aroma, did not correspond to specific groups, indicating the nominal influence of these attributes on the genetic relationship. Similar results have been observed for 67 hill rice accessions of NE India (Roy et al., 2016) as well as 48 traditional aromatic rice landraces from eastern India (Meti et al., 2013). According to Mulualem et al. (2018), this may be due to their common descent as they are obtained from the nearby areas. Alternatively, high gene flow aided by cross-pollination might have also caused this (Musyoki et al., 2015). However, the accretion of a significant number of aromatic accessions in a particular sub-group of UPGMA calls for further investigation on the role of aroma in shaping the phylogenetic relationships in rice landraces.
Population structure analysis using STRUCTURE revealed the highest ΔK value at K = 3, indicating that the 53 rice accessions represented three sub-populations (Fig. 3). They comprised 39 pure accessions and 14 admixtures. The AMOVA showed significant genetic differences (P ≤ 0.001) among and within individuals and lower variation among the population groups. Of the total genetic variation, 70% was due to variation among the individuals. A similar trend was reported in earlier studies (Anandan et al., 2016; Roy et al., 2016; Islam et al., 2018a). The presence of such variability provides for generating wide crosses and creating the desirable heterotic group in the base breeding population (Alam et al., 2015). On the other hand, the contribution of variation among the population groups was 1%, indicating only a small collection within a given source captured the genetic diversity present in the test accessions. Further, the low mean values of FST (0.011) and GST (0.028) signify poor genetic differentiation (Wright, 1978), and the allele frequencies between population groups are likely to be similar. This was further supported by a very high estimate of gene flow (Nm: 3.96). Put together, these results point to a close evolutionary lineage of the rice accessions of Sikkim-Himalayas or extensive out-crossing between the accessions (Nuijten et al., 2009). It is plausible that the exchange of rice accessions between the farmers may have also contributed to gene flow across different landraces.
In our association analysis, the majority of the associations detected in the MLM were also supported by the GLM approach, with 25 MTAs for 17 traits being common to both (Table 3). In earlier works on association analysis, Zhou et al. (2012) reported a total of 16 MTAs (P < 0.01) in the japonica rice varieties collected from China. Similarly, Anandan et al. (2016) reported 16 MTAs in 629 rice accessions using 39 SSR markers for the identification of genes associated with early seedling vigour. Furthermore, Patil et al. (2014) found 13 significant SSR markers associated with 18 agronomic traits in 58 rice accessions of Chhattisgarh, India.
Among the MTAs revealed, markers RM 489 and RM 408 were significantly associated with QTL controlling the grain length (GL) and grain width (GW). This is consistent with the findings of Huggins et al. (2019), who reported a strong association of RM 489 (chromosome 3) with grain length in brown rice, and grain width in rough rice of the USDA National Small Grains Collection. Similarly, RM 6, RM 215 and RM 44 located on chromosomes 2, 9 and 8, respectively, were significantly associated with the protein content, kernel breadth (KB) and kernel length (KL). A related earlier study by Septiningsih et al. (2003) had revealed RM 6 and RM 215 to be associated with the grains per plant (GPP) and panicle length (PL). These parallels between our results and similar studies in rice from across the world suggest the reliability of our MTAs for further testing in the relevant breeding programmes. For instance, markers RM 271 and OSR 13 mapped on chromosomes 10 and 3 are significantly associated with the flag leaf length (FLL) and flag leaf width (FLW) in our study. An earlier work reporting QTLs for leaf width (Anandan et al., 2016) also established their involvement in leaf-related traits. Thus, it is worthwhile examining whether they could be used to improve leaf traits in Thulo Attey – a popular rice landrace of Sikkim. Similarly, the other MTAs could be explored as potential leads for testing in the relevant breeding programmes.
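The sign of FIS follows directly from the standard relation FIS = 1 − Ho/He for a locus: an excess of heterozygotes (Ho > He) gives a negative value, while a deficit gives a positive one. The sketch below illustrates this with hypothetical Ho and He values; it is assumed, not stated in the text, that the software used reports FIS on this basis.

```python
def inbreeding_coefficient(ho, he):
    """Fixation index for a locus: F = 1 - Ho/He (equivalently (He - Ho)/He)."""
    if he == 0:
        raise ValueError("He must be non-zero (monomorphic locus)")
    return 1 - ho / he

# Hypothetical loci: one with a heterozygote deficit, one with an excess
deficit = inbreeding_coefficient(ho=0.20, he=0.60)
excess = inbreeding_coefficient(ho=0.98, he=0.77)
print(f"deficit locus FIS = {deficit:.3f}")
print(f"excess locus FIS = {excess:.3f}")
```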
Conclusion
In conclusion, moderate diversity, low divergence and high gene flow among the rice landraces of Sikkim-Himalaya signify their prospects for association mapping. The reliability of the deployed SSR markers implies their utility in genetic studies. The identified MTA signals scope for marker-assisted breeding aiming at the improvement of locally relevant traits in these landraces. However, given the unique requirements of the region, a participatory plant breeding approach is likely to provide the right direction for these efforts. The future work must also focus on comprehensive screening of the on-hand germplasm for genes/markers related to up-land agriculture and other traits relevant to this region.
Supplementary material
The supplementary material for this article can be found at https://doi.org/10.1017/S1479262121000411.
Acknowledgement
The authors acknowledge the financial support from the Department of Biotechnology (DBT), Government of India (Grant No. BCIL/NER-BPMC/2016). They also thank The Director, National Rice Research Institute (NRRI) Cuttack, Odisha, India for extending the necessary facilities for this work. Further, all the authors duly acknowledge the contribution of local farmers of the Sikkim state for providing the rice germplasm/accession for this study.