INTRODUCTION
Tobacco use makes a significant contribution to the global burden of disease. In 2000, approximately 4·83 million deaths worldwide were attributable to tobacco smoking, accounting for 12% of the estimated global adult mortality for that year (Ezzati & Lopez, 2003, 2004). In Australia, tobacco smoking is the single most influential risk factor for disease burden, responsible for roughly 12% and 7% of total disease burden in men and women respectively (Mathers et al. 2001). Thus tobacco use remains an important public health issue, despite the decline in smoking prevalence associated with the implementation of tobacco control programmes since the 1980s (White et al. 2003).
Epidemiological research indicates an inter-relationship between different aspects of smoking behaviour. Studies suggest that age at smoking initiation is related to subsequent aspects of smoking behaviour, such as cigarette consumption, nicotine dependence and smoking cessation (Breslau & Peterson, 1996; Hymowitz et al. 1997; Everett et al. 1999; Khuder et al. 1999; Lando et al. 1999). In general, such studies have shown that the younger an individual is when initiating smoking, the greater their cigarette consumption and risk of nicotine dependence, and the lower their likelihood of quitting. Thus some risk factors may influence more than one stage of smoking progression. What currently remains unclear is whether those risk factors are genetic or environmental in origin.
Twin studies demonstrate that both genetic and environmental factors influence variability in smoking behaviours (Sullivan & Kendler, 1999; Li, 2003). While some of these influences will be specific to certain smoking behaviours, many will be shared across traits. Multivariate twin analyses have shown some overlap between genetic and environmental factors influencing smoking initiation and persistence (Heath, 1990; Heath & Martin, 1993; Madden et al. 1999, 2004; Heath et al. 2002), initiation and cigarette consumption (Koopmans et al. 1999), and initiation, regular tobacco use and nicotine dependence (Kendler et al. 1999; Maes et al. 2004).
However, twin analyses of smoking are complicated by the fact that all aspects of smoking behaviour are observed conditional on the subject initiating smoking; we cannot observe nicotine dependence in those who have never smoked (for discussion see Neale et al. 2006a, b). Heath et al. (2002) developed a two-stage model of substance use that overcomes this limitation, incorporating non-smokers into analyses in a methodical manner. This model requires a measure of smoking initiation for the first ‘stage’, with other smoking-related traits making up the second (and subsequent) stages. Initiation is defined as a multiple category variable, with one category being non-smokers, having a single underlying distribution of liability. The second dimension phenotype can be considered Missing At Random (Little & Rubin, 1987) for non-smokers; as they are below the threshold for initiation, their status for the second dimension variable is unknown (Heath et al. 2002; Neale et al. 2006b). Using this model, the strength of genetic and environmental correlations between different stages of smoking can be estimated.
Heath and colleagues fitted this model to data on smoking age-at-onset and persistence from Australian adult twins. We have extended this original model in several ways. First, we used a substantially larger sample. Second, we incorporated data from both twins and their siblings. Third, and importantly, we extended the range of behaviours assessed to look at three distinct aspects of smoking (age-at-onset, cigarette consumption and smoking persistence), all of which have been shown to be inter-related. The aim of this study was to elucidate the interactions between these phenotypes, which have been observed previously in epidemiological studies, and to shed light on shared sources of risk.
METHOD
Subjects
Phenotypic measures of smoking behaviour were taken from four complementary self-report questionnaires mailed to adult twins and their family members between 1989 and 1993. Twins were recruited in two cohorts from the Australian National Health and Medical Research Council (NH&MRC) Twin Registry, and were asked to provide the contact details of other family members who would also be willing to participate. All studies were approved by the Queensland Institute of Medical Research Human Research Ethics Committees and the Australian Twin Registry. In total, 21 222 participants responded to the questionnaires, including 5321 twin pairs, 986 single twins and 3715 siblings.
Individual response rates for the twins were 81% for the first (older) cohort and 84% for the second (younger) cohort. Response rates were 60% and 56% for relatives of the older and younger cohorts respectively, although there was some variation in response rate depending upon the type of relative (Lake et al. 2000). As this questionnaire represented a follow-up survey for twins from the older cohort (previously surveyed in 1981), some information was available regarding the characteristics of non-responders. Current or ex-smoking as assessed in 1981 predicted non-response to the 1989 survey for males only. Non-response to the 1989 survey was not predicted by quantity smoked or smoking persistence in 1981 for either sex (Heath et al. 1995). Similar information was not available for the younger twin cohort or relatives of either cohort. Zygosity was determined using standard questionnaire methods, validated by genotyping subsets of twins. More detailed descriptions of these studies can be found elsewhere (Heath & Martin, 1994; Kirk et al. 2000; Lake et al. 2000).
Phenotypic measures
Participants answered a number of questions about their own smoking behaviour and the smoking behaviour of their family members. They classified themselves and other family members as non-smokers, ex-smokers or current smokers. They also reported their average number of cigarettes consumed per day, either currently or when they previously smoked if an ex-smoker (scaled as never smoker, 1–4 cigarettes, 5–10, 11–20, 21–40, 40+), the ages at which they began and had successfully quit smoking (if they were an ex-smoker), and the total number of years for which they had smoked. Responses to the different questions were used to verify that participants' answers were internally consistent.
Three measures of smoking behaviour were defined using the questionnaire data. Following Heath et al. (2002), the smoking initiation dimension was defined according to self-reported age-at-onset of smoking. As most users begin in adolescence, we categorized smokers as those who began smoking by age 18 (standard onset) and those who began smoking after age 18 (late onset). Assuming a single underlying distribution of liability, these categories were ordered as never smoker, late-onset smoker, and standard-onset smoker, as those who begin smoking early are more likely to experience difficulty quitting (Lando et al. 1999; Lewinsohn et al. 1999). Average cigarette consumption was defined as a five-category variable, including all categories from the questionnaire except ‘never smoker’. The third dimension, smoking persistence, was defined as whether the participant was a self-reported current smoker or an ex-smoker at the time of survey.
Structural equation modelling
All categorizations of smoking behaviour were treated as ordinal phenotypes, using a liability-threshold model. This model assumes that liability to a trait is normally distributed, with the boundaries between categories representing arbitrary thresholds along the distribution. Biometrical modelling of threshold traits is discussed in more detail in Neale & Cardon (1992). In brief, classical twin analysis permits variance in liability to an ordinal trait to be decomposed into the following latent sources: additive genetic (A), non-additive genetic (D), environment shared by family members (C), and environment unique to each family member (E). However, C and D are confounded in analyses consisting of only twins reared together, and thus only one of these parameters can be estimated within a model (Grayson, 1989; Hewitt, 1989).
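The link between category prevalences and thresholds on the liability scale can be illustrated with a minimal sketch. It assumes a standard normal liability distribution and uses hypothetical category proportions (not values from this study); the function name liability_thresholds is likewise an illustrative choice and is not part of Mx.

```python
from statistics import NormalDist


def liability_thresholds(proportions):
    """Return the z-score thresholds that cut a standard normal liability
    distribution into ordered categories with the given proportions."""
    if abs(sum(proportions) - 1.0) > 1e-9:
        raise ValueError("category proportions must sum to 1")
    std_normal = NormalDist()  # mean 0, standard deviation 1
    thresholds = []
    cumulative = 0.0
    for p in proportions[:-1]:  # k ordered categories need k - 1 thresholds
        cumulative += p
        thresholds.append(std_normal.inv_cdf(cumulative))
    return thresholds


# Hypothetical proportions, ordered as in the text:
# never smoker, late-onset smoker, standard-onset smoker.
example = liability_thresholds([0.50, 0.15, 0.35])
print(example)
```

In the fitted models the thresholds are estimated jointly with the other parameters (and adjusted for covariates), so this calculation only conveys the interpretation of a threshold, not the estimation procedure.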
The standard twin design can be extended to include information from additional non-twin siblings, who are parameterized as for dizygotic (DZ) twins (Posthuma et al. 2003). When non-twin siblings are included, the influence of environmental factors specific to twin pairs (T) can also be estimated (Koeppen-Schomerus et al. 2003). Differences between males and females in the strength of latent factors can be determined by fitting a common-effects sex-limitation model, which permits the magnitude of genetic and environmental influences to differ between the sexes (Neale & Cardon, 1992; Medland, 2004; Neale et al. 2006c).
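The expectations implied by this design can be written out as a worked sketch. The notation is an assumption introduced here for illustration: a, c, t and e denote standardized path coefficients for the A, C, T and E factors, and the subscripts f and m denote female and male. These are the textbook expectations for an ACTE model on the liability scale, not a transcription of the Mx script used.

```latex
\begin{align}
a^2 + c^2 + t^2 + e^2 &= 1 \\
r_{\mathrm{MZ}} &= a^2 + c^2 + t^2 \\
r_{\mathrm{DZ}} &= \tfrac{1}{2}a^2 + c^2 + t^2 \\
r_{\text{twin--sib}} = r_{\text{sib--sib}} &= \tfrac{1}{2}a^2 + c^2 \\
r_{\mathrm{DZ,\,opposite\ sex}} &= \tfrac{1}{2}a_f a_m + c_f c_m + t_f t_m
\end{align}
```

Under these expectations the twin-specific component is identified by the excess of the DZ twin correlation over the twin–sibling and sibling–sibling correlations, and the common-effects sex-limitation model allows the magnitudes of the paths, but not the identity of the underlying factors, to differ between the sexes.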
Multivariate biometrical analyses can be used to partition the covariation between phenotypes, in addition to partitioning the variance within a phenotype. In the three-stage model, a series of latent factors is assumed to explain the variance and covariance of a series of traits. The three-stage model we fitted consisted of three latent factors for each of four sources of variance (A, C, E and T). The first set of latent factors was hypothesized to influence smoking age-at-onset, and to explain part of the variance in the two remaining phenotypes (cigarette consumption and smoking persistence). The second set of latent factors was hypothesized to explain the remaining variance of the cigarette consumption phenotype and part of the variance in smoking persistence. The third set of latent factors was hypothesized to explain the variance in smoking persistence not already accounted for by the first two sets of latent factors.
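The triangular structure described above corresponds to a Cholesky-type decomposition, which can be sketched for a single source of variance. The covariance matrix below is hypothetical and chosen only for illustration; the function names are illustrative, and in the actual analysis the paths were estimated by maximum likelihood in Mx rather than computed from a known matrix.

```python
import math


def cholesky_lower(matrix):
    """Lower-triangular L with L L^T = matrix. Column j of L holds the paths
    from latent factor j to trait j and to every later trait."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - partial)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower


def latent_correlations(lower):
    """Rebuild the covariance L L^T and standardize it to correlations."""
    n = len(lower)
    cov = [[sum(lower[i][k] * lower[j][k] for k in range(n)) for j in range(n)]
           for i in range(n)]
    return [[cov[i][j] / math.sqrt(cov[i][i] * cov[j][j]) for j in range(n)]
            for i in range(n)]


# Hypothetical additive genetic covariance matrix, ordered as
# age-at-onset, cigarette consumption, smoking persistence.
genetic_cov = [[0.60, 0.30, 0.10],
               [0.30, 0.50, 0.25],
               [0.10, 0.25, 0.45]]
paths = cholesky_lower(genetic_cov)
genetic_corr = latent_correlations(paths)
print(paths)
print(genetic_corr)
```

The same construction is repeated for each source of variance, so that the correlation for a given source between two traits is obtained by standardizing the covariance implied by that source's paths.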
To investigate the inter-relationship of the smoking phenotypes, we fitted a common-effects sex-limitation three-stage model including smoking age-at-onset, cigarette consumption, and smoking persistence. Model fitting was undertaken in a stepwise manner. Prior to biometrical model fitting, a number of assumptions regarding the data were formally tested.
First, the fit of a single liability dimension (SLD) model to the age-at-onset variable was tested. This can be achieved using a contingency table script, fitted to the data using Mx, as outlined by Heath et al. (2002). However, when sibling data are included in the analysis, this provides an imperfect method as families with incomplete data will not be included (e.g. a family consisting of a twin and a sibling will not contain any of the requisite pairings). Consequently, a Contingent Causal Common pathway (CCC) model (Kendler et al. 1999) with sex-specific thresholds was fitted to the first dimension variable. For the CCC model, the age of initiation variable was split into two binary variables: initiation of regular smoking, and late versus standard smoking onset. This model is a constrained multivariate model that assumes that genetic and environmental factors influencing initiation only affect age-at-onset via a single common pathway. Because a model with the causal path constrained to unity implies that all genetic and environmental influences are shared between the two traits, the estimate of the causal path can be used to determine whether the three-category definition of age-at-onset has a single distribution of liability (Neale et al. 2006b).
Second, assumptions regarding the homogeneity of phenotype prevalence and correlations for smoking age-at-onset were tested. A basic model incorporating a threshold model for smoking age-at-onset (including corrections for year of birth, sex, and an interaction between the two) and a calculation of polychoric correlations between different pairings of twins and siblings was fitted to the data. Thresholds were specified separately by sex for both twins and siblings. Correlations were allowed to vary for each zygosity group and between twin–twin, twin–sibling and sibling–sibling pairs. These model parameters were equated progressively within and between zygosity groups until the fit of the model worsened significantly. The appropriateness of including the regression terms in the threshold model was also tested by successively dropping each term from the model until the fit worsened significantly.
A common-effects sex-limitation univariate model was fitted to smoking age-at-onset to assess which parameters to include in the full multivariate model. To determine the appropriate threshold models for cigarette consumption and smoking persistence in the multivariate model, the percentage of individuals in each category was estimated separately for male and female twins and siblings (including those twins from opposite-sex pairs) using Stata version 8.2 (StataCorp, 2005).
Model fitting was conducted with twins and a maximum of two additional siblings. All models were fitted to the raw data by the method of maximum likelihood as implemented in Mx (Neale et al. 2003). The significance of the different latent factors was assessed by using the likelihood-ratio χ² test to compare the fit of a model including the factor to one in which the factor was not estimated (Neale & Cardon, 1992). While estimates of the confidence intervals (CIs) for the multivariate model parameters would be desirable, computing them was not feasible, as the large sample size and complexity of the model resulted in prohibitively long computation times.
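The likelihood-ratio comparison of nested models can be sketched as follows. The difference in minus twice the log-likelihood between the reduced and the full model is referred to a χ² distribution with degrees of freedom equal to the number of parameters dropped. The −2 log-likelihood values in the example are hypothetical, and the function names are illustrative; the tail probability uses the standard closed-form series for integer degrees of freedom so that no external library is assumed.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df,
    computed from the closed-form finite series."""
    if df < 1 or x < 0:
        raise ValueError("need df >= 1 and x >= 0")
    if x == 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{r=0}^{df/2 - 1} (x/2)^r / r!
        term, total = 1.0, 1.0
        for r in range(1, df // 2):
            term *= half / r
            total += term
        return math.exp(-half) * total
    # Odd df: two-sided normal tail plus a finite correction series.
    root = math.sqrt(x)
    tail = math.erfc(root / math.sqrt(2.0))
    density = math.exp(-half) / math.sqrt(2.0 * math.pi)
    term, total = 0.0, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        term = root if r == 1 else term * x / (2 * r - 1)
        total += term
    return tail + 2.0 * density * total


def likelihood_ratio_test(minus2ll_reduced, minus2ll_full, df_diff):
    """Return the chi-square statistic and p-value for two nested models."""
    statistic = minus2ll_reduced - minus2ll_full
    return statistic, chi2_sf(statistic, df_diff)


# Hypothetical -2 log-likelihoods for a reduced and a full model
# that differ by four parameters.
stat, p_value = likelihood_ratio_test(10016.10, 10000.00, 4)
print(round(stat, 2), p_value)
```

A significant result indicates that dropping or equating the parameters in question worsens the fit, so the fuller model is retained.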
RESULTS
Sample demographics and phenotypic measures
For this study, analyses were conducted with a maximum of four siblings (parent, offspring and spouse information was not used). Relevant phenotypes were available for 14 472 individuals from 6247 families. Family structures and sizes are shown in Table 1.
Table 1. The number of families, and their structures, for individuals with relevant phenotypes

T, Twins; S, siblings; MZ, monozygotic; DZ, dizygotic; F, female; M, male; OS, one female, one male twin.
Females made up 60·6% of the sample. The distribution of year of birth was similar for males and females: 50% of women were born in 1958 onwards (range 1902–1974) and 50% of men were born in 1961 onwards (range 1903–1973). The prevalence of current smoking (as compared to non- and ex-smokers) in this sample was 24% for women and 28% for men, and similar to estimates from Australian general population-based samples during the same time period (1989–1992) (White et al. 2003). The proportion of males and females in the different phenotypic categories, shown separately for twins and siblings, is detailed in Table 2.
Table 2. Percentages (with standard errors) of male and female twins and siblings in each of the phenotypic categories

Overall, women were less likely to report ever having smoked: 43·5% reported themselves to be ex- or current smokers compared to 49·2% of men. The proportions of twins and siblings reporting ever having smoked were similar for females and for males. Average daily cigarette consumption was also similar between twins and siblings for each sex. However, the prevalence of current versus ex-smoking among twins was significantly different from that among siblings for both males and females (p<0·0001 in both cases). Siblings were more likely to be ex-smokers, with approximately 55% of male and female siblings reporting this as their current smoking status. In comparison, only 43% of female twins and 41% of male twins reported themselves to be ex-smokers.
Testing model assumptions
The assumption that smoking initiation, as defined by age of onset, represented a single underlying distribution of liability was tested using the CCC model. The estimate of the causal path for this model was very close to unity (point estimate 0·92, 95% CI 0·91–0·94). The saturated model for assumption testing for smoking age-at-onset included separate sex-specific thresholds for twins and siblings for each zygosity group. For twins, prevalence estimates were not significantly different for monozygotic (MZ) and DZ twins, for either sex (χ²₄=7·92, n.s.). The prevalence estimate for females from the DZ opposite-sex pairs was also not significantly different from that of the other female twins (χ²₂=0·36, n.s.). However, there was a significantly lower proportion of non-smokers among the males from the DZ opposite-sex group (0·46), as compared to other male twins (0·54) (χ²₂=18·58, p<0·001).
Male and female prevalence estimates for the siblings could be equated to those of the twins (excluding the male twins from DZ opposite-sex pairs) without a significant loss of model fit (χ²₄=8·07, n.s.). However, the estimates for males and females were significantly different (χ²₄=24·76, p<0·001). The regression coefficient for the interaction term between sex and year of birth (YOB) could not be dropped from the model without a significant loss of fit (β_YOB×Sex=0·31, χ²₁=717·88, p<0·001; β_YOB=−0·25), suggesting that YOB influences smoking age-at-onset differently in males and females.
For the best-fitting threshold model, correlations for male and female twins could be equated both within MZ twin pairs and within same-sex DZ pairs (χ²₂=0·46, n.s.). However, equating the DZ opposite-sex correlation to that for the same-sex DZ pairs significantly reduced the fit of the model (χ²₁=13·59, p<0·001), suggesting that the magnitude of genetic and environmental influences on smoking age-at-onset may differ between males and females.
Consequently, further assumption testing proceeded without the DZ twin correlations equated, so that sex-specific correlation comparisons could be made. Twin–sibling and sibling–sibling correlations could be equated for the opposite-sex pairs, as well as both the male and female same-sex pairs, without a significant loss of fit (χ²₁=0·52, n.s.; χ²₁=2·07, n.s.; χ²₁=0·02, n.s.; respectively). These correlations could be equated to the respective DZ twin correlations without a significant loss of fit for the opposite-sex groups, but with a worsening of model fit for either the male or female same-sex groups (χ²₁=4·18, p=0·04; χ²₁=8·18, p<0·01; χ²₁=15·41, p<0·001; respectively). This was caused by the higher twin–twin correlations.
Univariate model fitting
Based upon the assumption testing results presented above, a common-effects sex-limitation model including a twin-specific environmental component was fitted for smoking age-at-onset. Male and female variance component estimates from the full common-effects sex-limitation model could not be equated without a significant loss of model fit (χ²₄=16·10, p<0·01). This was predominantly due to the difference in the magnitude of the twin-specific environmental component. No latent factors could be removed from the model without a significant loss of fit (all tests p<0·01). Thus the full common-effects sex-limitation model was the best-fitting model. Variance components estimates, with 95% CIs, for this model are shown in Table 3.
Table 3. Univariate variance components estimates (with 95% confidence intervals) for the common-effects sex-limitation model

A, Additive genetic; C, common environmental; T, twin-specific environmental; E, unique environmental; f, female; m, male.
Multivariate model fitting
Based on the results from the various stages of assumption testing, a common-effects sex-limitation three-stage model including A, C, E and T parameters was fitted to the data for smoking age-at-onset, cigarette consumption and persistent smoking. The specification of thresholds for this model was complex, given the results of the preliminary analyses. For smoking age-at-onset, separate thresholds were specified for males and females, with a unique threshold model specified for the male opposite-sex twins. For cigarette consumption, this specification of thresholds was retained as there does not appear to be a significant difference between twins and siblings for average daily cigarette consumption. However, as the prevalence of current and ex-smokers differed significantly between twins and siblings in this sample, for the persistence phenotype thresholds were specified separately for male and female twins and siblings, with an additional, separate threshold again specified for the males from the opposite-sex twin pairs.
As expected, the multivariate model retrieved parameter estimates for smoking age-at-onset that were very close to the univariate model results. Results for the full model with relevant estimates for females and males are shown in Figs 1 and 2 respectively. Constraining all the paths to be equal for males and females significantly worsened the fit of the model (χ²₂₄=45·84, p<0·01). Equating the common environmental path coefficients for males and females significantly worsened the fit of the model (χ²₆=29·65, p<0·001). However, individually equating the unique environmental, twin-specific environmental or additive genetic paths did not significantly alter the model fit (χ²₆=8·07, n.s.; χ²₆=3·83, n.s.; χ²₆=9·48, n.s.; respectively).

Fig. 1. Three-stage model results for smoking age-at-onset (SA), cigarette consumption (CC) and smoking persistence (SP) in females. Point estimates of additive genetic (A), twin-specific environmental (T), common environmental (C), and unique environmental (E) variances and correlations are shown.

Fig. 2. Three-stage model results for smoking age-at-onset (SA), cigarette consumption (CC) and smoking persistence (SP) in males. Point estimates of additive genetic (A), twin-specific environmental (T), common environmental (C), and unique environmental (E) variances and correlations are shown.
Under the saturated model, all three phenotypes were moderately heritable, with estimates ranging from 0·40 to 0·62 (see Figs 1 and 2). Twin-specific environmental factors were estimated to have a reasonable influence on variability in smoking age-at-onset (estimates of 0·19 in females and 0·12 in males) but little influence on cigarette consumption and smoking persistence (less than 0·10 in both sexes). Common environmental estimates were small for all variables in both sexes, with the exception of cigarette consumption in females. Unique environmental estimates were moderate (0·17 and 0·19) for both sexes for smoking age-at-onset, but accounted for a substantial proportion of the variability in cigarette consumption and smoking persistence in both sexes, ranging from 0·39 to 0·49.
The pattern of correlations between the latent factors was somewhat different for males and females. In both sexes, there was a strong positive genetic correlation between age-at-onset and cigarette consumption, and another strong to moderate positive genetic correlation between cigarette consumption and smoking persistence. In males the genetic correlation was small and positive between smoking age-at-onset and smoking persistence, but in females the correlation was effectively zero. Twin-specific environmental correlations were also different between the sexes; females showed strong positive correlations between age-at-onset and the other two traits, but little correlation between consumption and persistence. In males, these correlations were low to moderate, and negative in the case of smoking age-at-onset and smoking persistence. In both sexes strong negative unique environmental correlations were observed between smoking initiation and smoking persistence.
DISCUSSION
To explore the relationship between smoking age-at-onset, cigarette consumption and smoking persistence, we fitted a three-stage model allowing for sex differences in the magnitude of genetic and environmental effects. The variance estimates and correlations obtained from this multivariate model provide insight into observed sex differences in smoking behaviour, and have implications for both phenotype definition and further multivariate modelling of smoking.
All three phenotypes were estimated to be moderately heritable in both sexes, with additive genetic influences accounting for 40–60% of the variance. For females, there were moderate genetic correlations between initiation and consumption, and between consumption and persistence. In males there were moderate to high genetic correlations for all three variables. This suggests that although additive genetic factors influence variability in the traits to a similar extent in males and females, these factors are shared across traits to a much greater extent in males. As has been found in other studies (Heath et al. 1993, 2002; True et al. 1997; Hettema et al. 1999; Maes et al. 2004), common environmental factors did not have a major influence on any of these phenotypes in either sex.
The influence of environmental factors shared by members of a twin pair on smoking age-at-onset was substantial for both sexes, although there appeared to be a stronger influence in females. This has previously been identified for tobacco and other substance use (e.g. Rhee et al. 2003), and suggests that sibling interaction plays an important role in smoking initiation. This is also supported by the finding that having a sibling of the same age is a strong influence on smoking uptake (Vink et al. 2003). In women, there were also strong twin-specific environmental correlations between age-at-onset and cigarette consumption, and between age-at-onset and smoking persistence. This suggests that although the influence of these factors on subsequent smoking behaviours is comparatively small, many of them are shared with smoking age-at-onset. These results imply that for men the inter-relationship between different aspects of smoking behaviour is most strongly mediated by additive genetic factors. While additive genetic factors are still an important aspect of the relationship between these behaviours in women, environmental factors may make a greater contribution than is seen in men.
The unique environmental correlations from this model raise some questions about appropriate definition of smoking phenotypes. For both males and females a strong, negative unique environmental correlation was observed between smoking age-at-onset and smoking persistence. Although this result has been observed before (Heath et al. 2002), it seems counter-intuitive because the categorization of smoking age-at-onset was based upon the assumption that individuals who started smoking later in life would be more likely to quit smoking. Thus, these results could suggest that this parameterization of smoking initiation was incorrect.
However, it is also possible that these results were caused by censoring of the data. The mean age at survey for this sample was 34 years (median=30 years, mode=26 years), so individuals starting to smoke after age 18 would, on average, have had fewer opportunities to quit smoking than those who began smoking by age 18. Thus although late-onset smokers may be more likely to quit in the long term, we may have observed them at the beginning or middle of their smoking career, making them more likely to be current smokers. In comparison, the early-onset smokers will include individuals who only experimented with smoking in their teens or early 20s, potentially resulting in a larger proportion of ex-smokers in comparison to the late-onset smokers.
The negative unique environmental correlations that we observed highlight the need for a better phenotypic definition of smoking persistence. While all measures of smoking persistence (and also smoking initiation) are subject to censoring, the binary measure of persistence used here, and in other behavioural genetic studies of smoking (e.g. Heath et al. 1993; True et al. 1997; Madden et al. 1999, 2004), is particularly susceptible. We suggest that future studies of smoking behaviour work towards developing more informative measures of persistence.
These results also demonstrate how twin analysis can be used to explore relationships between aspects of smoking behaviour initially observed only at a phenotypic level in epidemiological studies. The three traits analysed in the multivariate model only show a low to moderate degree of phenotypic correlation. Correlations between smoking age-at-onset and cigarette consumption, and between cigarette consumption and smoking persistence, were only moderate in either sex, at around 0·35. There was no substantial phenotypic correlation between smoking age-at-onset and persistence in males or females. However, multivariate analysis revealed strong twin-specific environmental correlations between the three variables in women. Additionally, in men, and to a lesser extent in women, there was a strong additive genetic correlation between age-at-onset and consumption, and another between consumption and persistence. These results indicate much stronger inter-relationships between certain sources of risk factors for these different aspects of smoking behaviour than might be initially expected from the phenotypic correlations. This type of information may have ramifications not only for study design but also for the development of more effective tobacco control strategies.
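The contrast between modest phenotypic correlations and strong latent correlations follows from the standard decomposition of a phenotypic correlation, sketched here under assumed notation. For two traits x and y with standardized liabilities, a, c, t and e denote the square roots of the standardized variance components for each trait, and r_A, r_C, r_T and r_E denote the corresponding latent correlations; none of these symbols is taken from the model output.

```latex
\begin{equation}
r_P(x, y) = a_x a_y\, r_A + c_x c_y\, r_C + t_x t_y\, r_T + e_x e_y\, r_E
\end{equation}
```

Each latent correlation is weighted by the product of the relevant path coefficients, so a strong correlation attached to a small variance component contributes little to the phenotypic correlation. Contributions of opposite sign, such as a positive genetic term and a negative unique environmental term, can also partly cancel.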
These results should be interpreted within the potential limitations of the modelling strategy used. They are limited by the fact that this research was based on retrospective self-reports, which can be unreliable, particularly for substance use (Johnson & Mott, 2001; Johnson & Schultz, 2005). Additionally, if concordance for initiation is high, multivariate analyses may have insufficient power to differentiate genetic and common environmental sources of covariation between initiation and subsequent use phenotypes (Pergadia et al. 2006). However, given our large sample size, and the inclusion of siblings, this is unlikely to be a substantial problem for the analyses presented here. Finally, the first dimension variable used in the multivariate model did not perfectly meet the SLD assumption. However, as this variable produced an estimate for the causal path of the CCC model that was very close to unity, it is unlikely that this would have had a substantial impact on the results.
Overall, this research suggests that the inter-relationship between smoking age-at-onset, cigarette consumption and smoking persistence observed in epidemiological studies is likely to be mediated by both genetic and unique environmental factors that are shared between these traits. Despite the limitations outlined above, it is likely that similar analyses using more specialized measures of these phenotypes would produce similar results; namely, that the inter-relationship between smoking age-at-onset, cigarette consumption and smoking persistence is strongly influenced by genetic factors. Thus these results suggest that genes with pleiotropic effects on smoking behaviour are likely to exist, and that multivariate studies aimed at identifying these genes may prove informative.
ACKNOWLEDGEMENTS
We thank the twins and their families for their participation in the various studies. We also thank Drs Sarah Medland, Julie Grant, Valerie Knopik and Michele Pergadia for helpful discussions on multivariate model fitting, and Professor Wayne Hall and two anonymous reviewers for their useful comments. We acknowledge the funding sources that supported this project, and earlier studies that collected the data used: NIH grants (DA00272, DA12854, DA12540, CA75581, AA07535, and AA07728), Australian NH&MRC grants (971232 and 941177). K.I.M. is supported by an Ian Scott Fellowship from the Australian Rotary Health Research Fund.
DECLARATION OF INTEREST
None.