Introduction
The loci controlling genetic variation for quantitative traits are identified by so-called quantitative trait loci (QTL) analysis, in which trait values are compared between molecular marker genotypes that are linked to different alleles of the causal genes. This type of analysis requires large populations with accurate and dense genetic marker maps, as well as accurate phenotyping. Accurate phenotyping can be achieved by working with homozygous lines, which can be analysed in replicates. Such immortal material can be derived by repeated selfing of initially heterozygous individuals or by the use of naturally occurring inbred genotypes. Immortal mapping populations also have the advantage that experiments analysing different traits and conditions can be compared on the basis of QTL map positions. This has already led to the identification of new pleiotropic functions of known genes (Koornneef et al., 2004). In this review, we describe some of the properties and characteristics of a number of immortal populations used in Arabidopsis thaliana.
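The comparison of trait values between marker genotypes can be illustrated with a minimal single-marker sketch. It is not the method of any of the cited studies: the function name `single_marker_effect`, the genotype codes 'A' and 'B', and the flowering-time values are hypothetical, and a Welch t statistic is used here only as one simple way to express the difference between the two homozygous classes.

```python
from math import sqrt
from statistics import mean, variance


def single_marker_effect(genotypes, trait_values):
    """Compare trait means between the two homozygous classes at one marker.

    genotypes    -- list of 'A' or 'B', the parental allele carried by each line
    trait_values -- list of line means for the trait, in the same order
    Returns (mean difference B - A, Welch t statistic).
    """
    a = [y for g, y in zip(genotypes, trait_values) if g == "A"]
    b = [y for g, y in zip(genotypes, trait_values) if g == "B"]
    if len(a) < 2 or len(b) < 2:
        raise ValueError("need at least two lines per genotype class")
    diff = mean(b) - mean(a)
    # Standard error of the difference, allowing unequal variances (Welch).
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0:
        raise ValueError("no within-class variation; t statistic undefined")
    return diff, diff / se


# Hypothetical data: eight homozygous lines scored at one marker.
genotypes = ["A", "A", "A", "A", "B", "B", "B", "B"]
flowering_days = [20.0, 22.0, 21.0, 21.0, 27.0, 29.0, 28.0, 28.0]

diff, t_stat = single_marker_effect(genotypes, flowering_days)
print(f"mean difference (B - A): {diff:.2f} days, t = {t_stat:.2f}")
```

In a real analysis this comparison would be repeated along the marker map, with significance thresholds corrected for multiple testing; dedicated QTL software usually applies interval or multiple-QTL models rather than this marker-by-marker test.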
Types of mapping populations used in Arabidopsis
Quantitative variation in complex traits and its control by heritable factors are broadly acknowledged in Arabidopsis (Alonso-Blanco and Koornneef, 2000). This observation, together with the ease of generating segregating material and the availability of genomic and genetic resources, has led to numerous QTL studies aiming at the detection of the genes underlying this natural variation (reviewed in Alonso-Blanco et al., 2009).
Although genetic analysis can be performed on any type of segregating population, this review focuses on collections of homozygous lines. Widely used are recombinant inbred lines (RILs) derived by single seed descent from F2 individuals, which are the progeny of a hybrid between two distinct homozygous (inbred) lines. The largest disadvantage of inbred lines, compared with natural (outbreeding) populations, is the limited number of effective meiotic recombination events. To overcome this limitation, large population sizes are needed to sample enough meiotic events. Alternatively, one can artificially increase the number of meiotic events, and thereby the mapping resolution, by additional rounds of selective outcrossing. Several RIL populations have been derived after inter-crossing F2 plants and later generations (Balasubramanian et al., 2009). However, even after repeated inter-crossing, RIL populations still suffer from slowly decaying linkage disequilibrium (LD), which, together with their often limited population size, means that identified QTLs have rather large confidence intervals. Resolution can be increased by using large populations, or by combining QTL analyses of different populations, as was recently demonstrated for the analysis of seed dormancy in six RIL populations (Bentsink et al., 2010).
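The effect of single seed descent on homozygosity can be made concrete with a short calculation. The sketch below assumes a biparental cross, strict selfing from the F1 onwards and no selection; under these assumptions the expected fraction of heterozygous loci (among loci that differ between the parents) halves every generation. The function name `expected_heterozygosity` is introduced here for illustration only.

```python
def expected_heterozygosity(generation):
    """Expected fraction of heterozygous loci in generation F<generation>.

    Assumes a biparental cross followed by strict selfing without selection,
    and counts only loci at which the two parents differ. The F1 is fully
    heterozygous at those loci, and each round of selfing halves the fraction.
    """
    if generation < 1:
        raise ValueError("generation must be 1 (F1) or higher")
    return 0.5 ** (generation - 1)


# Expected residual heterozygosity from the F1 to the F8.
for n in range(1, 9):
    print(f"F{n}: {expected_heterozygosity(n):.4f}")
```

Under these assumptions, lines taken at the F5 are expected to be still heterozygous at about 6% of the segregating loci, whereas by the F8 this has dropped below 1%. Multi-parent designs may label generations differently, so the numbers apply directly only to the biparental case.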
Studying only one QTL at a time avoids the complications that arise from the segregation of multiple loci (e.g. epistasis). This can be achieved by the construction of introgression or near isogenic lines (NILs), in which small chromosomal regions from a donor parent are introduced into the genetic background of a recurrent parent (Koumproglou et al., 2002; Keurentjes et al., 2007; Torjek et al., 2008). NILs, although often providing less accurate map positions, allow the detection of minor QTLs that cannot be detected in RILs (Keurentjes et al., 2007). The use of heterogeneous inbred line families, selected from early generation inbred lines that are still heterozygous in the region of interest (Tuinstra et al., 1997), achieves the same goal, which is often referred to as ‘Mendelizing’ a QTL. Apart from the inaccuracy of the map positions, biparental populations have the disadvantage that they allow only the analysis of the genetic variation present between the two parents.
A solution for both mapping accuracy and the analysis of more variation is the use of genome-wide association (GWA) mapping (Nordborg and Weigel, 2008). This method uses all available variants (in practice, a subset of them) and aims to associate trait differences with specific genotypes. Because LD decays fast in natural Arabidopsis accessions, often within less than 10 kb (Kim et al., 2007), this procedure requires dense genetic maps (Weigel and Mott, 2009). Although GWA is a powerful approach, a number of factors limit its successful use. One is population structure, which is often present and results in false positive associations. Compensating for the effects of population structure often also removes true positives. These so-called false negatives (Brachi et al., 2010) do show up in experimental populations and are therefore much better studied there. Another important limitation is that low-frequency alleles, even those with strong effects, remain undetected. An example is the CRY2 allele from the Cvi accession, which results in a major QTL in crosses (El-Assal et al., 2001; Brachi et al., 2010) but is not detected in GWA analyses (Atwell et al., 2010).
Intermediate mapping populations that combine the advantages of both approaches consist of experimental crosses with increasing numbers of parents, population sizes and recombination events. One solution is the combined analysis of multiple populations, as was demonstrated by nested association mapping in maize, where 25 RIL populations were generated by crossing diverse parents with a common reference (Buckler et al., 2009). In Arabidopsis, this was demonstrated at a smaller scale, using Ler (Bentsink et al., 2010) or Col (Brachi et al., 2010) as a common parent. The use of multiple parents in different combinations is another alternative, and two such multi-parent populations have now been described for Arabidopsis. One is the so-called MAGIC population developed by Kover et al. (2009). This population was derived by inter-crossing 19 parents for several generations, after which RILs were developed. The other, the so-called AMPRIL population (Paulo et al., 2008; Huang et al., 2011), was derived by inter-crossing four pairwise hybrids from eight founder lines, followed by RIL construction via single seed descent up to the F5.
QTL analysis of flowering time
Table 1 summarizes the results obtained with different population types. Many QTLs were identified at similar positions and most likely represent the same loci. Only when field conditions were employed were predominantly different QTLs detected (Brachi et al., 2010), indicating the importance of genotype × environment interactions. Among the multiple parent populations, the number of QTLs was surprisingly low in the MAGIC and AMPRIL populations (Kover et al., 2009; Huang et al., 2010). This can partly be explained by the reduced power that results from multiple alleles segregating at multiple loci in multi-parent populations. A recent GWA study included flowering time analyses from various experiments (Atwell et al., 2010). The study showed the complication of population structure, which is relatively strong for flowering time, as well as the known issues of significance thresholds and false positives. By combining the RIL and GWA approaches in the same experiment, these alleged false positives could be reduced and clear candidates could be indicated (Brachi et al., 2010). The recent developments in GWA mapping, together with the relative ease of generating experimental mapping populations, either as F2/F3 populations and/or through the efficient development of doubled haploids using centromere-mediated genome elimination (Ravi and Chan, 2010), will continue to contribute to our understanding of complex traits.
We therefore argue that multiple types of populations and approaches are needed for a thorough understanding of the complex genetic architecture of quantitative traits.
Table 1 QTL analyses of flowering time in studies using different types of mapping populations
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20160921032120319-0766:S1479262111000086:S1479262111000086_tab1.gif?pub-status=live)
R², the total explained variance.
a Data of diverse populations using a common reference accession (Ler or Col).
b Number of detected QTLs and total explained variance in GWA analyses depends strongly on the applied methods and threshold levels.