Introduction
Cotton is the most important fibre crop and is grown over an area of around 32 million hectares (ICAC, 2015). It is considered the backbone of the textile industry. For farmers, the most important trait is seed cotton yield, whereas for the textile industry, fibre traits are more important. Breeders therefore consider both yield and fibre traits as primary objectives of cotton improvement. The cotton boll is the basic unit of seed cotton yield and fibre quality (Tang and Xiao, 2013, 2014; Jones et al., 2014). This basic unit can be divided into several components such as number of locules per boll, seed cotton yield per locule, seed cotton yield per boll and lint percentage. These components play a vital role in the selection of cotton genotypes for seed cotton yield with better fibre quality traits (Wu et al., 2004; Tang and Xiao, 2013; Bechere et al., 2014). However, the success of a breeding programme depends on the breeder's ability to identify and evaluate the genetic association among these yield components.
Seed cotton yield is a complex trait that depends on the contribution of several other traits. Genetic correlation among these traits describes their genetic association and pleiotropic effects (Wu et al., 2004; Karademir et al., 2010). Path coefficient analysis splits the genetic association into direct and indirect effects to explain the pleiotropic magnitude of other traits (Bhatt, 1973; Bechere et al., 2014). This analysis also helps to quantify the contribution of each component trait to the complex trait (Board et al., 1997; Salahuddin et al., 2010).
Within-boll yield-related traits are very important in determining the seed cotton yield. Because within-boll yield-related traits are not easy to measure, only a small selection pressure is exerted on these traits (Coyle and Smith, 1997; Wu et al., 2004). Number of bolls per plant, number of locules per boll, seed cotton yield per boll and lint percentage have been reported to correlate positively with the seed cotton yield (Wu et al., 2004; Desalegn et al., 2009; Imran et al., 2012). Some basic within-boll-related traits such as number of locules per boll, seed cotton yield per locule and number of seeds per locule have been ignored in previous studies. Previous studies (Bednarz et al., 2007; Bechere et al., 2014) have suggested the need for continuous research to investigate the contribution of within-boll yield-related traits to the seed cotton yield. They reported that cotton boll traits were controlled by multiple genes, and that with the development of new genotypes, there may be changes in the genetic association among the traits.
Given the above-mentioned facts, the main objectives of the present study were to (1) investigate the association of basic cotton boll traits with seed cotton yield and (2) determine whether the genetic association of within-boll yield-related traits changed with the development of new genotypes.
Materials and methods
Plant material for genetic studies
A total of 25 genotypes of cotton, collected from different research institutes of Pakistan, were sown in polythene bags in May 2012 to study their genetic diversity. Leaves were taken from 30-day-old seedlings. For DNA extraction, two true leaf samples were taken from each genotype. DNA was extracted by the CTAB method, and the quality of the extracted DNA was assessed by gel electrophoresis (0.86% agarose gel). The PCR reaction mixture included 10 μl of 2.5 mM dNTPs, 2 μl Taq polymerase enzyme, 1.5 μl primer, 3 μl MgCl2, 2 μl PCR buffer and 2 μl DNA template. For each of the 50 polymorphic SSR primers, 35 cycles were run to amplify the required segment of DNA. The amplified products were examined by gel electrophoresis (1.5% agarose gel), with ethidium bromide as the fluorescent dye. The SSR data obtained were converted to binary scores for the analysis of the similarity index. The similarity matrix was subsequently used for the construction of a dendrogram by the unweighted pair group method with arithmetic mean (UPGMA).
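The clustering step can be illustrated with a minimal sketch. The text does not state which similarity index was used, so the Jaccard coefficient is assumed here purely for illustration; the genotype names `G1` to `G4`, the band scores and the function names `jaccard_similarity` and `upgma` are hypothetical and are not study data or the authors' software.

```python
from itertools import combinations


def jaccard_similarity(a, b):
    """Jaccard similarity of two binary marker profiles (1 = band present)."""
    shared = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    return shared / either if either else 1.0


def upgma(names, profiles):
    """Return (tree, merges): a nested-tuple dendrogram and the merge heights."""
    clusters = {i: (name, 1) for i, name in enumerate(names)}  # id -> (tree, size)
    # Work on distances (1 - similarity) between every pair of genotypes.
    dist = {
        frozenset((i, j)): 1.0 - jaccard_similarity(profiles[i], profiles[j])
        for i, j in combinations(range(len(names)), 2)
    }
    merges = []
    next_id = len(names)
    while len(clusters) > 1:
        pair = min(dist, key=dist.get)  # closest pair of clusters
        a, b = sorted(pair)
        height = dist[pair]
        (tree_a, size_a), (tree_b, size_b) = clusters[a], clusters[b]
        # Distance from the merged cluster to every other cluster is the
        # size-weighted mean of the two old distances (the UPGMA update).
        for c in clusters:
            if c in (a, b):
                continue
            dist[frozenset((next_id, c))] = (
                size_a * dist[frozenset((a, c))] + size_b * dist[frozenset((b, c))]
            ) / (size_a + size_b)
        for key in [k for k in dist if a in k or b in k]:
            del dist[key]
        del clusters[a], clusters[b]
        clusters[next_id] = ((tree_a, tree_b), size_a + size_b)
        merges.append((tree_a, tree_b, height))
        next_id += 1
    tree, _ = next(iter(clusters.values()))
    return tree, merges


# Hypothetical band scores for four placeholder genotypes (not study data).
names = ["G1", "G2", "G3", "G4"]
profiles = [
    [1, 1, 0, 0, 1, 0],
    [1, 1, 0, 0, 1, 1],
    [0, 0, 1, 1, 0, 1],
    [0, 1, 1, 1, 0, 1],
]
tree, merges = upgma(names, profiles)
print(tree)
for left, right, height in merges:
    print(left, right, round(height, 3))
```

Genotypes that join the tree only at large merge heights are the most dissimilar, which is the kind of information used to choose diverse parents from a dendrogram.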
Development of F3 populations
From the dendrogram, eight diverse genotypes were selected. The following cross combinations were attempted in July 2012, and the F1 populations were raised in the tunnel during November–March: (1) Bt-86 × CRSM-2007; (2) Bt-06 × CH-57; (3) Bt-1401 × CRSM-38; (4) CRSM-2007 × CIM-707; (5) CIM-707 × MNH-6070.
The F2 populations were raised in May 2013, and selection was made on the basis of the within-boll yield-related traits.
Morphological and agronomic trait evaluations
The F3 populations were raised during May–October 2014, with ten F3 rows of each cross grown from the seed collected from the F2 populations. Plants were sown at a line-to-line distance of 75 cm and a plant-to-plant distance of 30 cm, and 20 plants were maintained in each line. For parental data, two rows of each parent were also raised in a randomized complete block design with three replications. The soil was sandy loam (sand, silt and clay ratio 50:30:20) with a pH of 8.1. All standard agronomic and plant protection practices were carried out equally for all rows and crosses. For data collection of boll traits, 50 plants were randomly tagged from each population. To avoid any possible loss of bolls during the collection of boll data, 50 plants were tagged separately for seed cotton yield per plant, of which 30 plants were harvested randomly to record the data.
For the within-boll yield-related traits, data for 30 bolls (each value being the average of five bolls from one plant) were randomly taken from each population, and data of number of locules per boll, number of seeds per locule, seed cotton yield per locule, number of seeds per boll, seed cotton yield per boll and lint percentage were recorded.
Statistical analysis
For correlation and path coefficient studies, segregating generations of all the five crosses were analysed separately to determine whether the production of different cross combinations can break the strength of certain correlations. Phenotypic correlation coefficients among the different traits were calculated by using Pearson's correlation coefficient formula (Gomez and Gomez, 1984) as follows:
$$r = \frac{n\sum XY - \sum X\sum Y}{\sqrt{\left[ n\sum X^{2} - \left( \sum X \right)^{2} \right]\left[ n\sum Y^{2} - \left( \sum Y \right)^{2} \right]}}$$
where n is the number of observations; X is the first variable; and Y is the second variable.
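As a minimal sketch, Pearson's coefficient named above can be computed directly from the sums over n paired observations. The function name `pearson_r` and the example values are hypothetical and are not taken from the study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences (computational form)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    if denominator == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return numerator / denominator


# Hypothetical per-plant values: seeds per boll and seed cotton yield per boll (g).
seeds_per_boll = [28, 30, 32, 34, 36]
yield_per_boll = [3.9, 4.3, 4.4, 4.9, 5.0]
print(round(pearson_r(seeds_per_boll, yield_per_boll), 3))
```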
Genotypic correlation among the studied traits was calculated using the following formula:
$$r_{g} = \frac{CoV_{F3} - \overline{CoVp}}{\sqrt{\left[ V_{F3}( X ) - \overline{Vp( X )} \right]\left[ V_{F3}( Y ) - \overline{Vp( Y )} \right]}}$$
where $$CoV_{F3}$$ is the covariance of the F3 generation; $$\overline{CoVp} $$ is the average covariance of parents; $$V_{F3}( X )$$ is the variance of the X trait in the F3 generation; $$V_{F3}( Y )$$ is the variance of the Y trait in the F3 generation; $$\overline{Vp( X )} $$ is the average variance of parents for the X trait; and $$\overline{Vp( Y )} $$ is the average variance of parents for the Y trait.
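The quantities defined above suggest the following sketch, in which the parental averages serve as the environmental estimate that is subtracted from the F3 variances and covariance. This arrangement of the terms, the use of sample (n − 1) estimators and the function names are assumptions made for illustration; the input values are hypothetical.

```python
from math import sqrt


def sample_covariance(x, y):
    """Unbiased (n - 1) sample covariance of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)


def sample_variance(x):
    """Unbiased (n - 1) sample variance."""
    return sample_covariance(x, x)


def genotypic_correlation(f3_x, f3_y, parents):
    """Estimate r_g from F3 data and a list of (x_values, y_values) per parent."""
    mean_cov_p = sum(sample_covariance(px, py) for px, py in parents) / len(parents)
    mean_var_px = sum(sample_variance(px) for px, _ in parents) / len(parents)
    mean_var_py = sum(sample_variance(py) for _, py in parents) / len(parents)
    # Genetic components: F3 statistic minus the average parental statistic.
    gen_cov = sample_covariance(f3_x, f3_y) - mean_cov_p
    gen_var_x = sample_variance(f3_x) - mean_var_px
    gen_var_y = sample_variance(f3_y) - mean_var_py
    if gen_var_x <= 0 or gen_var_y <= 0:
        raise ValueError("non-positive genetic variance estimate")
    return gen_cov / sqrt(gen_var_x * gen_var_y)


# Hypothetical measurements for one F3 population and its two parents.
f3_x = [1, 2, 3, 4, 5]
f3_y = [2, 4, 6, 8, 10]
parents = [([1, 2, 3], [2, 4, 6]), ([1, 2, 3], [2, 4, 6])]
print(genotypic_correlation(f3_x, f3_y, parents))
```

Because the variance components are differences of estimates, they can turn out zero or negative in small samples; the sketch raises an error in that case rather than returning a meaningless value.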
Path coefficient analysis was performed according to the method given by Dewey and Lu (1959) for the within-boll yield components, considering seed cotton yield per plant as the resultant variable and within-boll yield components, i.e. number of locules per boll, number of seeds per locule, seed cotton yield per locule, number of seeds per boll, seed cotton yield per boll and lint percentage, as causal variables.
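In path coefficient analysis of this kind, the direct effects are commonly obtained by solving the simultaneous equations that link the correlations among the causal variables to their correlations with the resultant variable; the indirect effect of trait i via trait j is then the correlation between i and j multiplied by the direct effect of j. The sketch below illustrates this under stated assumptions: the three-trait correlation values are hypothetical, and `solve_linear_system` and `path_analysis` are illustrative names, not the authors' software.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("correlation matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        known = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - known) / aug[row][row]
    return solution


def path_analysis(trait_corr, yield_corr):
    """Return (direct, effects); effects[i][j] is the effect of trait i via trait j.

    The diagonal of effects holds the direct effects, and each row sums to the
    correlation of that trait with the resultant variable.
    """
    direct = solve_linear_system(trait_corr, yield_corr)
    n = len(direct)
    effects = [
        [direct[j] if i == j else trait_corr[i][j] * direct[j] for j in range(n)]
        for i in range(n)
    ]
    return direct, effects


# Hypothetical genotypic correlations among three causal traits and with yield.
trait_corr = [
    [1.0, 0.6, 0.4],
    [0.6, 1.0, 0.5],
    [0.4, 0.5, 1.0],
]
yield_corr = [0.5, 0.6, 0.55]
direct, effects = path_analysis(trait_corr, yield_corr)
for i, row in enumerate(effects):
    print([round(v, 3) for v in row], "row total:", round(sum(row), 3))
```

Each printed row total reproduces the corresponding entry of `yield_corr`, which is a convenient check that the correlation has been fully partitioned into direct and indirect effects.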
Results
Genetic diversity of parents
Diversity analysis of 25 cotton genotypes using 50 SSR markers produced a dendrogram (Fig. 1) on the basis of genetic similarities and homology. As shown in the dendrogram, these 25 genotypes can be divided into three diverse groups. Group A contained only one genotype, i.e. Bt-86, group B consisted of three genotypes, i.e. Bt-06, Bt-1401 and CRSM-2007, and group C was further divided into two subgroups, of which one contained three genotypes and the other was further subdivided into two subgroups (Fig. 1). The dendrogram revealed that Bt-86, Bt-06, Bt-1401, CRSM-2007, CRSM-38, MNH-6070, CIM-707 and CH-57 were diverse genotypes on the basis of the 50 SSR markers used in the study, as all these genotypes belonged to different groups and subgroups in the dendrogram. Based on genetic diversity, five crosses were attempted among these eight genotypes and subsequent generations were developed for further studies. Genetic diversity reflects differences in genetic constitution among genotypes and is essential for association studies.
Correlation analysis
With a few exceptions, all the within-boll yield components showed highly significant phenotypic and genotypic correlations among themselves (Table 1). For all cross combinations, number of locules per boll showed highly significant and positive phenotypic and genotypic correlations with all the within-boll yield components, except seed cotton yield per locule and number of seeds per boll. The phenotypic correlation between number of locules per boll and seed cotton yield per locule was always negative and non-significant, whereas the genotypic correlation between the same traits was non-significant but positive. The maximum genotypic and phenotypic correlations were observed between number of locules per boll and seed cotton yield per boll for all cross combinations, and the maximum value was observed in the cross combination CRSM-2007 × CIM-707 (0.72 and 0.77, respectively). Number of seeds per locule showed a highly significant genotypic and phenotypic correlation with all the studied traits, except lint percentage, with which its correlation was non-significant. The direction of the correlation was positive with all the within-boll yield components, but negative with seed cotton yield per plant. The maximum strength of the genotypic correlation was observed between number of seeds per locule and seed cotton yield per locule in all the cross combinations, and the maximum value was observed in the cross combination Bt-1401 × CRSM-38 (0.82). The maximum strength of the phenotypic correlation was also observed between number of seeds per locule and seed cotton yield per locule in all the cross combinations, except CIM-707 × MNH-6070, in which the maximum strength was observed for number of seeds per boll (0.77). Number of seeds per locule showed a maximum phenotypic correlation coefficient with seed cotton yield per locule (0.95) in the cross combination CRSM-2007 × CIM-707.
Seed cotton yield per locule had a highly significant genotypic and phenotypic correlation with all the traits except number of locules per boll, with which its correlation was non-significant. The strongest genotypic association was observed between seed cotton yield per locule and number of seeds per boll in all the crosses, while the phenotypic association was found to be strongest between seed cotton yield per boll and number of seeds per boll. The highest genotypic and phenotypic correlation coefficient values were observed between number of seeds per boll and seed cotton yield per locule in the cross combination Bt-86 × CRSM-2007 (0.99 and 0.98, respectively). Number of seeds per boll had a highly significant correlation with all the studied traits in all the cross combinations except number of locules per boll, with which its correlation was non-significant. Seed cotton yield per boll had a positive and highly significant genotypic and phenotypic correlation with all the other studied traits in all the cross combinations. Lint percentage had a significant correlation with number of seeds per locule in the positive direction and with seed cotton yield per plant in the negative direction in all the cross combinations. Seed cotton yield per plant was negatively associated with number of seeds per locule and lint percentage, but it was positively associated with all the other traits, and the correlation coefficient was highly significant in all the cross combinations.
L/B, number of locules per boll; S/L, number of seeds per locule; SCY/L, seed cotton yield per locule; S/B, number of seeds per boll; SCY/B, seed cotton yield per boll; L%, lint percentage; SCY/P, seed cotton yield per plant.
* Significant at the 5% level.
** Significant at the 1% level. Bold values are diagonals.
Path coefficient analysis
Path coefficient analysis partitioned the genetic correlation of the different within-boll yield components with seed cotton yield per plant into direct and indirect effects (Table 2). The genetic correlation coefficient between number of locules per boll and seed cotton yield ranged from 0.493 to 0.620 in the different crosses studied, but the direct effect was very low, always less than 0.1 in magnitude and in the negative direction. The maximum value (−0.086) was found in the cross combination Bt-06 × CH-57. The maximum indirect effect was exerted by number of seeds per boll, ranging from 0.299 to 0.374 in the different crosses studied, with the maximum observed in the cross combination Bt-06 × CH-57. Number of seeds per locule showed a negative correlation with seed cotton yield per plant, whose value ranged from −0.394 to −0.440, and the maximum value was observed in the cross combination Bt-1401 × CRSM-38. The direct effect of number of seeds per locule ranged from 0.144 to 0.157 and was in the positive direction. The maximum indirect effect of number of seeds per locule was exerted by number of seeds per boll, with the value ranging from −0.442 to −0.494. Seed cotton yield per locule was highly correlated with seed cotton yield per plant, with the genotypic correlation coefficient value ranging from 0.592 to 0.703 in the different crosses studied. The direct effect of seed cotton yield per locule ranged from 0.229 to 0.279, and the maximum value was observed in the cross combination CIM-707 × MNH-6070. The indirect effect of seed cotton yield per locule was very high (range 0.677–0.804), and the maximum value was observed in the cross combination Bt-06 × CH-57.
Number of seeds per boll had a highly significant genotypic correlation with seed cotton yield per plant, and the value ranged from 0.503 to 0.604, where the maximum correlation coefficient value was observed in the cross combination Bt-1401 × CRSM-38. The direct effect of number of seeds per boll was high, ranging from 0.696 to 0.895, and the maximum value of the direct effect was observed in the cross combination Bt-1401 × CRSM-38. The indirect effect of number of seeds per boll exerted by seed cotton yield per boll was high, but it was in the negative direction. The value of the indirect effect ranged from −0.413 to −0.491, and the maximum value was observed in the cross combination Bt-1401 × CRSM-38. Seed cotton yield per boll had a highly significant genotypic correlation with seed cotton yield per plant, whose value ranged from 0.486 to 0.639 in the different crosses studied, and the maximum value was observed in the cross combination Bt-86 × CRSM-2007. Its direct effect was low, ranging from 0.024 to 0.032. The indirect effect of seed cotton yield per boll exerted by number of seeds per boll was high, ranging from 0.430 to 0.565 in the different crosses studied, and the maximum value was observed in the cross combination Bt-86 × CRSM-2007. Lint percentage was negatively correlated with seed cotton yield per plant, with the correlation coefficient value ranging from −0.157 to −0.213. It had a very small direct effect in the positive direction, with the maximum value being 0.063. Most of the indirect effects were also low, but the indirect effect exerted by number of seeds per boll was relatively high and in the negative direction, with values ranging from −0.139 to −0.198 in the different crosses studied.
L/B, number of locules per boll; S/L, number of seeds per locule; SCY/L, seed cotton yield per locule; S/B, number of seeds per boll; SCY/B, seed cotton yield per boll; L%, lint percentage; SCY/P, seed cotton yield per plant.
Discussion
Correlation describes the strength of association of different traits, whereas path coefficient analysis partitions this association into direct and indirect effects. The present study investigated five crosses of genetically diverse parents, and analysed each cross combination separately to determine whether the strength of association between different traits remained the same in all the crosses. A highly significant correlation among within-boll yield components has also been reported in previous studies (Smith and Coyle, 1997; Bechere et al., 2014; Jones et al., 2014; Tang and Xiao, 2014). The highly significant genotypic and phenotypic association of number of locules per boll with other traits showed the significance of this trait in selection. Locule traits such as number of locules per boll, number of seeds per locule and seed cotton yield per locule are ignored in most of the within-boll yield-related studies (Smith and Coyle, 1997; Imran et al., 2012; Jones et al., 2014); however, in this study, the highly significant genotypic and phenotypic correlation of these traits highlights their importance. All within-boll yield components, except number of locules per boll, were significantly correlated with seed cotton yield per locule and number of seeds per boll. The negative correlation of number of seeds per locule and lint percentage with seed cotton yield (Smith and Coyle, 1997; Imran et al., 2012; Jones et al., 2014) increases the importance of these traits in the selection for yield.
Our results are contrary to those observed in some studies (Desalegn et al., 2009; Karademir et al., 2010) reporting a positive correlation between lint percentage and seed cotton yield per plant. These contradicting results can be attributed mainly to the use of entirely different genotypes in different agroecological zones, as gene expression may change under such conditions (Hoffmann and Merilä, 1999; Saranga et al., 2001), leading to a change in the direction of association among the traits. Data for the five crosses were analysed separately to determine whether genetic differences among genotypes change the correlation direction and strength. The direction of association among the different traits studied remained the same in all the crosses, but there were differences in the strength of association. The maximum difference was found in the correlation between seed cotton yield per locule and seed cotton yield per plant, where the strength in Bt-1401 × CRSM-38 was reduced to less than half of the maximum value observed in the cross combination CIM-707 × MNH-6070. These results are in accordance with the findings of other researchers (Ali and Wynne, 1994; Biradar et al., 2010; Hussain et al., 2010) who investigated segregating populations of different crosses and reported differences in the strength of association.
Correlation analysis describes the strength of association among traits, but does not explain the direct and indirect components of the association. Path coefficient analysis divides the correlation coefficient into the direct effect of a particular trait and the indirect effects exerted by other traits. Number of locules per boll had a highly significant positive correlation with seed cotton yield per plant, but its direct effect was very low and in the negative direction. Path coefficient analysis showed that it did not contribute directly to seed cotton yield per plant, but contributed via the other traits, especially number of seeds per boll (Table 2). Number of seeds per boll contributed a maximum value through the direct effect, while other traits such as number of seeds per locule, seed cotton yield per locule, seed cotton yield per boll and lint percentage contributed more through the indirect effects exerted by the other traits, and their direct effects were relatively low. Although the association between seed cotton yield per plant and lint percentage was negative, the direct effect of lint percentage was positive and small, and the indirect effects exerted by these two traits were also very small; a compromise in selection for these traits might therefore help to increase the seed cotton yield per plant. These results are supported by the findings of other researchers (Rauf et al., 2004; Bechere et al., 2014). These results indicated that direct selection for within-boll yield components can be successful only in the case of number of seeds per boll, while for the others, direct selection may be misleading (Hussain et al., 2010) as their maximum effect came indirectly through other traits.
Seed cotton yield per boll had a highly significant genotypic correlation with seed cotton yield per plant, but it also had a large negative effect via number of seeds per locule and seed cotton yield per boll, so care should be taken when considering these traits as selection criteria. Some researchers (Board et al., 1997; Bechere et al., 2014) have suggested excluding such traits from selection criteria. All traits showed maximum indirect effects via number of seeds per boll, so inclusion of this trait in the selection criteria may be helpful for increasing the seed cotton yield. Overall, differences were observed for direct and indirect effects in the populations of different crosses, but the direction of association was the same in all the cases. It has been reported that within-boll yield components are controlled by many genes, and gene frequencies might differ among populations, which possibly caused the change in the magnitude of genetic association (Bednarz et al., 2007; Bechere et al., 2014).
Conclusion
Number of locules per boll, seed cotton yield per locule, number of seeds per boll and seed cotton yield per boll showed a highly significant positive correlation with seed cotton yield per plant and a significant positive correlation among themselves, so these component traits can be used as selection criteria for the improvement of seed cotton yield. Number of seeds per boll and seed cotton yield per locule had the largest direct effects, and the indirect effects exerted by these two traits were also high, so these component traits can be considered of prime importance in selection. However, care should be taken in selecting for number of seeds per locule and lint percentage, as these component traits are negatively correlated with seed cotton yield. Differences were observed in the strength of genetic association and in the direct and indirect effects among the populations studied, but the direction of association was always the same.