Externalizing (EXT) behaviors consist of harmful actions towards others and/or self-directed behaviors, including physical or relational aggression, theft, destruction of property, as well as traits including substance use behaviors, impulsivity, and disorganization (Achenbach, Reference Achenbach1966; Krueger et al., Reference Krueger, Hobbs, Conway, Dick, Dretsch, Eaton, Forbes, Forbush, Keyes, Latzman, Michelini, Patrick, Sellbom, Slade, South, Sunderland, Tackett, Waldman and Waszczuk2021; Lahey et al., Reference Lahey, Rathouz, Van Hulle, Urbano, Krueger, Applegate, Garriock, Chapman and Waldman2008). EXT behaviors result in substantial negative impacts to the self, family, and society more broadly. Youth EXT behaviors are prospectively associated with higher rates of school dropouts and adult unemployment (Bradshaw et al., Reference Bradshaw, Schaeffer, Petras and Ialongo2010; Roy, Reference Roy2008). They are also strongly associated with internalizing disorders, including depression and anxiety (Bartels et al., Reference Bartels, Hendriks, Mauri, Krapohl, Whipp, Bolhuis, Conde, Luningham, Fung Ip, Hagenbeek, Roetman, Gatej, Lamers, Nivard, van Dongen, Lu, Middeldorp, van Beijsterveldt, Vermeiren and Boomsma2018; Goldstein et al., Reference Goldstein, Chou, Saha, Smith, Jung, Zhang, Pickering, Ruan, Huang and Grant2017; Nivard et al., Reference Nivard, Lubke, Dolan, Evans, St. Pourcain, Munafò and Middeldorp2017). Additionally, individuals exhibiting high levels of EXT behaviors utilize a disproportionately greater amount of public services over their lifespans, including involvement in the criminal legal system and utilization of healthcare and social welfare services (Fairchild, Reference Fairchild2018; Foster & Jones, Reference Foster and Jones2005; Rivenbark et al., Reference Rivenbark, Odgers, Caspi, Harrington, Hogan, Houts, Poulton and Moffitt2018). 
Comprehensive and tailored interventions that occur earlier in development tend to be most effective for youth EXT (Frick, Reference Frick2012); thus, understanding how risk factors differentially impact pathways of EXT is paramount for early identification and prevention of EXT behaviors among youths.
Challenges in genetic studies of EXT behaviors
Twin and family studies have long indicated the prominence of genetic effects underlying EXT behaviors (Hicks et al., Reference Hicks, Krueger, Iacono, McGue and Patrick2004). In recent years, there have been major methodological advancements in molecular genetics (Barr & Dick, Reference Barr and Dick2019), including the development of novel genetic methods that leverage large scale genome wide association (GWA) study findings to identify genes for EXT (e.g., Grotzinger et al., Reference Grotzinger, Rhemtulla, de Vlaming, Ritchie, Mallard, Hill, Ip, Marioni, McIntosh, Deary, Koellinger, Harden, Nivard and Tucker-Drob2019). However, genetically informed studies of EXT remain complicated by several critical challenges in this literature: 1) the high degree of polygenicity underlying EXT, 2) heterogeneity in presentations of EXT behaviors, and 3) individual differences in patterns of development with respect to EXT behaviors over the lifespan. By addressing these challenges, this research can lead to better ways of identifying and supporting individuals with high risk for EXT behaviors.
First, very few EXT researchers subscribe to the notion that a single (or even a few) gene(s) explains much of the variance in EXT behaviors (Kim-Cohen et al., Reference Kim-Cohen, Caspi, Taylor, Williams, Newcombe, Craig and Moffitt2006; Noble, Reference Noble1998; Pappa et al., Reference Pappa, St Pourcain, Benke, Cavadino, Hakulinen, Nivard, Nolte, Tiesler, Bakermans-Kranenburg, Davies, Evans, Geoffroy, Grallert, Groen-Blokhuis, Hudziak, Kemp, Keltikangas-Järvinen, McMahon, Mileva-Seitz and Tiemeier2016). It is now widely accepted that EXT behaviors have a polygenic architecture, influenced by many genes, each with individually small effects on EXT. The high degree of polygenicity underlying EXT is evidenced from a GWA study of EXT behaviors conducted in a multi-cohort sample of over 1.5 million individuals, which identified 579 single nucleotide polymorphisms (SNPs) significantly associated with EXT (Karlsson Linnér et al., Reference Karlsson Linnér, Mallard, Barr, Sanchez-Roige, Madole, Driver, Poore, de Vlaming, Grotzinger, Tielbeek, Johnson, Liu, Rosenthal, Ideker, Zhou, Kember, Pasman, Verweij, Liu and Dick2021). EXT behaviors were operationalized as problematic alcohol use, lifetime cannabis use, age at first sex, number of sexual partners, general risk tolerance, irritability, and lifetime smoking initiation. GWA summary statistics can be used to calculate a polygenic score (PGS), which quantifies the degree of genetic liability for a given trait of a genotyped individual (Bogdan et al., Reference Bogdan, Baranger and Agrawal2018). Compared to single genetic variants alone, PGSs are significantly more powerful predictors of EXT behaviors. For instance, the most statistically significant SNP in Karlsson Linnér et al. 
(Reference Karlsson Linnér, Mallard, Barr, Sanchez-Roige, Madole, Driver, Poore, de Vlaming, Grotzinger, Tielbeek, Johnson, Liu, Rosenthal, Ideker, Zhou, Kember, Pasman, Verweij, Liu and Dick2021) was rs993137 (p = 6.50e-59), an intron variant near the cell adhesion molecule 2 gene, which encodes a protein in the immunoglobulin superfamily of cell adhesion molecules. However, the average standardized regression coefficient of the rs993137 effect allele in predicting EXT outcomes was a minuscule -0.01, indicating that despite its high degree of statistical significance, rs993137 on its own has an essentially null effect in terms of EXT prediction. In contrast, EXT PGSs explained over 10% of the variance in EXT-related traits in independent samples, with average standardized regression coefficients of around 0.20. Thus, combining the effects of many genes through PGSs has the potential to explain and predict EXT behaviors across development.
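To make the aggregation step concrete, the following is a minimal sketch of how a PGS can be formed as a weighted sum of effect-allele dosages. It is not the scoring pipeline used by the cited studies: the function name, SNP labels, and weights are hypothetical, and in practice the weights would be effect sizes estimated from GWA summary statistics.

```python
from typing import Mapping


def polygenic_score(dosages: Mapping[str, float], weights: Mapping[str, float]) -> float:
    """Sum effect-allele dosages (0 to 2) weighted by per-SNP effect sizes.

    Only SNPs present in both mappings contribute to the score.
    """
    shared = sorted(dosages.keys() & weights.keys())
    return sum(weights[snp] * dosages[snp] for snp in shared)


# Hypothetical weights and genotype for illustration only (not real estimates).
example_weights = {"snp_a": -0.01, "snp_b": 0.03, "snp_c": 0.005}
example_dosages = {"snp_a": 2, "snp_b": 1, "snp_c": 0}
example_score = polygenic_score(example_dosages, example_weights)
```

Each SNP contributes very little on its own, so any predictive signal in such a score comes from summing across many variants.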
However, PGS prediction signals may be confounded or possibly inflated by the high degree of heterogeneity in EXT. Phenotypically, factor analytic studies have suggested at least two prominent subdimensions of EXT (Kotov et al., Reference Kotov, Krueger, Watson, Achenbach, Althoff, Bagby, Brown, Carpenter, Caspi, Clark, Eaton, Forbes, Forbush, Goldberg, Hasin, Hyman, Ivanova, Lynam, Markon and Zimmerman2017; Krueger et al., Reference Krueger, Markon, Patrick, Benning and Kramer2007, Reference Krueger, Hobbs, Conway, Dick, Dretsch, Eaton, Forbes, Forbush, Keyes, Latzman, Michelini, Patrick, Sellbom, Slade, South, Sunderland, Tackett, Waldman and Waszczuk2021; Olson et al., Reference Olson, Sameroff, Lansford, Sexton, Davis-Kean, Bates, Pettit and Dodge2013): the antisocial behaviors (ASB) dimension that reflects interpersonal conflicts, hostility, deceitfulness, and a general lack of regard for other people (Krueger et al., Reference Krueger, Hobbs, Conway, Dick, Dretsch, Eaton, Forbes, Forbush, Keyes, Latzman, Michelini, Patrick, Sellbom, Slade, South, Sunderland, Tackett, Waldman and Waszczuk2021; Mullins-Sweatt et al., Reference Mullins-Sweatt, Bornovalova, Carragher, Clark, Corona Espinosa, Jonas, Keyes, Lynam, Michelini, Miller, Min, Rodriguez-Seijas, Samuel, Tackett and Watts2022) and the substance use behaviors (SUB) dimension, including harmful tobacco, marijuana, and alcohol use and/or dependence (Kotov et al., Reference Kotov, Krueger, Watson, Cicero, Conway, DeYoung, Eaton, Forbes, Hallquist, Latzman, Mullins-Sweatt, Ruggero, Simms, Waldman, Waszczuk and Wright2021; Krueger et al., Reference Krueger, Hobbs, Conway, Dick, Dretsch, Eaton, Forbes, Forbush, Keyes, Latzman, Michelini, Patrick, Sellbom, Slade, South, Sunderland, Tackett, Waldman and Waszczuk2021; Mullins-Sweatt et al., Reference Mullins-Sweatt, Bornovalova, Carragher, Clark, Corona Espinosa, Jonas, Keyes, Lynam, Michelini, Miller, Min, Rodriguez-Seijas, Samuel, Tackett and Watts2022). 
These subdimensions are further supported by behavior genetic studies that have shown differing heritability by subdimension (Kotov et al., Reference Kotov, Krueger, Watson, Achenbach, Althoff, Bagby, Brown, Carpenter, Caspi, Clark, Eaton, Forbes, Forbush, Goldberg, Hasin, Hyman, Ivanova, Lynam, Markon and Zimmerman2017; Krueger et al., Reference Krueger, Markon, Patrick, Benning and Kramer2007, Reference Krueger, Hobbs, Conway, Dick, Dretsch, Eaton, Forbes, Forbush, Keyes, Latzman, Michelini, Patrick, Sellbom, Slade, South, Sunderland, Tackett, Waldman and Waszczuk2021; Olson et al., Reference Olson, Sameroff, Lansford, Sexton, Davis-Kean, Bates, Pettit and Dodge2013), with estimates ranging from 47% to 71% for ASB (Polderman et al., Reference Polderman, Benyamin, de Leeuw, Sullivan, van Bochoven, Visscher and Posthuma2015; Thapar et al., Reference Thapar, Harrington and McGuffin2001; Viding et al., Reference Viding, Jones, Paul, Moffitt and Plomin2008) and 23% to 50% for SUB (Han et al., Reference Han, McGue and Iacono1999; Kendler et al., Reference Kendler, Schmitt, Aggen and Prescott2008; Virtanen et al., Reference Virtanen, Kaprio, Viken, Rose and Latvala2019; Ystrom et al., Reference Ystrom, Kendler and Reichborn-Kjennerud2014). This research suggests that accurate genetic predictions require considering different forms of EXT behaviors due to their unique genetic influences.
Furthermore, there is substantial heterogeneity in the ways EXT behaviors develop over time in individuals. Studies on the developmental heterogeneity of EXT behaviors, particularly as they pertain to ASB, are well documented in the literature. Using data from the Dunedin Multidisciplinary Health and Development Study, Moffitt (Reference Moffitt1993) originally described two distinct pathways of ASB development over the lifespan, including adolescence-limited (AL) and life course-persistent (LCP) pathways. Those in the AL pathway exhibited high levels of ASB in adolescence, which then sharply declined by early adulthood, whereas those on the LCP pathway exhibited ASB at earlier ages that persisted into and throughout adulthood. Odgers and colleagues (2008) used follow up data from the Dunedin Multidisciplinary Health and Development Study and subsequently identified four trajectories of ASB (i.e., low, childhood limited, adolescent-onset, and LCP) for both males and females, which suggested invariance of these trajectories by sex. Another set of studies that examined ASB were conducted using longitudinal data from the National Longitudinal Study of Adolescent to Adult Health (Add Health), which also identified similar pathways of development for ASB measured from early adolescence (age 13) to adulthood (age 32) (Li, Reference Li2017; Morrison et al., Reference Morrison, Martinez, Hilton and Li2019). Importantly, the study by Morrison et al. (Reference Morrison, Martinez, Hilton and Li2019) provided evidence that ASB trajectories across this age range were relatively invariant (in terms of form factors) across two racial-ethnic groups (white and Black/African American) in Add Health.
Although developmental trajectories of SUB have also been identified, this literature is less developed than that for ASB (Halladay et al., Reference Halladay, Woock, El-Khechen, Munn, MacKillop, Amlung, Ogrodnik, Favotto, Aryal, Noori, Kiflen and Georgiades2020; Nelson et al., Reference Nelson, Van Ryzin and Dishion2015), given that SUB typically does not emerge until adolescence and because different substances become more accessible once individuals reach legal ages for access. However, there are at least two aspects of convergence in the longitudinal literature for SUB. First, for most individuals, SUB tends to emerge in adolescence, increase in early adulthood, and remain stable or decline into adulthood (P. Chen & Jacobson, Reference Chen and Jacobson2012; Jackson et al., Reference Jackson, Sher and Schulenberg2008; Vergunst et al., Reference Vergunst, Chadi, Orri, Brousseau-Paradis, Castellanos-Ryan, Séguin, Vitaro, Nagin, Tremblay and Côté2021; Zellers et al., Reference Zellers, Iacono, McGue and Vrieze2022). Second, there is substantial overlap between the initiation and continued use of different substances such as alcohol, marijuana, and cigarettes, such that the frequency of use of these substances may follow similar trajectories (Nelson et al., Reference Nelson, Van Ryzin and Dishion2015). Furthermore, prior studies examining longitudinal trajectories of polysubstance use have observed heterogeneous trajectories that may be distinct in ways other than by severity alone. For example, Ou and colleagues (2024) used growth mixture modeling to examine group-based trajectories of the use of cigarettes, e-cigarettes, excessive alcohol, cannabis, painkillers, and cocaine across five waves of data from over 15,000 adult participants in the Population Assessment of Tobacco and Health Study. 
Their analyses identified five complex and relatively distinct trajectory groups, including a unique “low-risk polysubstance” use trajectory that represented 10.7% of their sample, alongside three other trajectory groups characterized by the use of a single substance that preceded polysubstance use at later ages. In another study, Lanza and colleagues (2021) used parallel growth mixture modeling to identify polysubstance use from the use of three tobacco (nicotine vaping, cigarette, and hookah) and five cannabis (combustible, blunt, edible, vaping, and dabbing) products in a sample of adolescents and young adults followed from 11th grade to 1–2 years post-high school. Their analysis produced five developmental trajectories of polysubstance use, including a “young adult-onset poly-substance/poly-product users” class that represented 15.8% of their sample. Within trajectory groups, patterns of tobacco and cannabis use were similar, but across trajectory groups, there was significant variation in which groups increased, decreased, or remained stable in their polysubstance use over time. Regardless, trajectory analyses can be useful for distinguishing between sets of individuals and provide an approximation of reality to inform clinical decision making (Nagin & Odgers, Reference Nagin and Odgers2010). Overall, longitudinal studies of ASB and SUB indicate not only differential patterns of development for each EXT phenotype, but also the possibility that these phenotypes have different underlying etiologies.
Differential genetic influences on longitudinally modeled EXT behaviors
Quantitative genetic evidence suggests that underlying genetic influences for EXT may also differ depending on developmental epoch (Kendler et al., Reference Kendler, Jaffee and Romer2011), with heritability estimates ranging from 81%–88% in children (Jaffee et al., Reference Jaffee, Moffitt, Caspi, Taylor and Arseneault2002; Wichers et al., Reference Wichers, Gardner, Maes, Lichtenstein, Larsson and Kendler2013), 34%–86% in adolescents (Hicks et al., Reference Hicks, Krueger, Iacono, McGue and Patrick2004, Reference Hicks, Blonigen, Kramer, Krueger, Patrick, Iacono and McGue2007, Reference Hicks, South, DiRago, Iacono and McGue2009; Nikstat & Riemann, Reference Nikstat and Riemann2020; Teeuw et al., Reference Teeuw, Klein, Mota, Brouwer, van ‘t Ent, Al-Hassaan, Franke, Boomsma and Hulshoff Pol2022; Wichers et al., Reference Wichers, Gardner, Maes, Lichtenstein, Larsson and Kendler2013), 36%–51% in young adults (Hicks et al., Reference Hicks, Blonigen, Kramer, Krueger, Patrick, Iacono and McGue2007; Nikstat & Riemann, Reference Nikstat and Riemann2020; Wichers et al., Reference Wichers, Gardner, Maes, Lichtenstein, Larsson and Kendler2013), and 56% in middle-aged adults (Gustavson et al., Reference Gustavson, Franz, Panizzon, Lyons and Kremen2020). Studies have also found that life-course-persistent forms of EXT may be more heritable than less persistent forms (Moffitt, Reference Moffitt1993; Rhee & Waldman, Reference Rhee and Waldman2002; Walters, Reference Walters2002; Zheng et al., Reference Zheng, Brendgen, Dionne, Boivin and Vitaro2019; Zheng & Cleveland, Reference Zheng and Cleveland2015). For instance, Barnes and colleagues (2011) used the twin subsample from Add Health and found that the heritability for persistent ASB ranged from 56% to 70%, while the heritability for adolescent-limited ASB was only 35% (Barnes et al., Reference Barnes, Beaver and Boutwell2011). 
However, quantitative genetics relies almost exclusively on family-based designs to infer genetic effects via decomposition of variance. Longitudinal studies of EXT behaviors that leverage powerful molecular genetic methods like PGSs, which provide more direct measures of genetic effects, have only recently emerged in the literature.
We summarize three studies that utilized PGSs to examine ASB and/or SUB using a longitudinal design, as well as existing gaps in knowledge that we aim to address in the current study. First, Salvatore and colleagues (Reference Salvatore, Aliev, Bucholz, Agrawal, Hesselbrock, Hesselbrock, Bauer, Kuperman, Schuckit, Kramer, Edenberg, Foroud and Dick2015) used data from the Prospective Study of the Collaborative Study on the Genetics of Alcoholism (COGA) and found that EXT PGSs were associated with ASB (i.e., aggression, vandalism, theft, deceitfulness), explaining 5% of its variance in adolescents and 1% in young adults (Salvatore et al., Reference Salvatore, Aliev, Bucholz, Agrawal, Hesselbrock, Hesselbrock, Bauer, Kuperman, Schuckit, Kramer, Edenberg, Foroud and Dick2015). However, their study suffered from some limitations, including the fact that the GWA used to generate the EXT PGSs was relatively underpowered (n = 1,249 from the COGA adult sample) and their sample was not truly longitudinal in that there was only data on two developmental periods of its participants. In another study, Li and colleagues (2017) used PGSs for alcohol dependence to predict trajectories of SUB (i.e., heavy alcohol use) in the COGA Prospective Study (Li et al., Reference Li, Cho, Salvatore, Edenberg, Agrawal, Chorlian, Porjesz, Hesselbrock, Investigators and Dick2017). They found that the PGSs explained between 0.8%–2.3% of the variance in the initial status of SUB. Further, they were differentially predictive across ages, such that there was a main effect for PGSs for rates of SUB from adolescence to young adulthood, but not from young adulthood to adulthood. Collectively, these findings suggest the possibility that genetic influences may have unique impacts on the initiation and progression of ASB and SUB outcomes over time (Waszczuk et al., Reference Waszczuk, Zavos and Eley2021). Besides age, it is unclear whether genetic influences differ based on developmental subtype. 
In a more recent study Tielbeek and colleagues (Reference Tielbeek, Uffelmann, Williams, Colodro-Conde, Gagnon, Mallard, Levitt, Jansen, Johansson, Sallis, Pistis, Saunders, Allegrini, Rimfeld, Konte, Klein, Hartmann, Salvatore, Nolte and Posthuma2022) conducted a GWA meta-analysis (N = 85,359) on severe forms of ASB, operationalized as conduct disorder symptoms, aggressive behavior, and delinquency spanning 28 different discovery samples. Then, using the out-of-sample Dunedin Study (N = 1,037), they identified growth mixture model trajectories from ages 7 to 26, and found that individuals in the life course-persistent ASB trajectory had the highest levels of ASB PGSs relative to individuals in either the childhood-limited or adolescence-onset ASB trajectories (Tielbeek et al., Reference Tielbeek, Uffelmann, Williams, Colodro-Conde, Gagnon, Mallard, Levitt, Jansen, Johansson, Sallis, Pistis, Saunders, Allegrini, Rimfeld, Konte, Klein, Hartmann, Salvatore, Nolte and Posthuma2022). However, their GWA and resultant PGS association analyses focused exclusively on ASB without also considering SUB. Additionally, their GWA study may be unrepresentative of general risks for EXT in the population, given that they only included individuals from population-based cohorts with a clinical diagnosis. Despite these limitations, the study by Tielbeek et al. (Reference Tielbeek, Uffelmann, Williams, Colodro-Conde, Gagnon, Mallard, Levitt, Jansen, Johansson, Sallis, Pistis, Saunders, Allegrini, Rimfeld, Konte, Klein, Hartmann, Salvatore, Nolte and Posthuma2022) provides compelling preliminary evidence that genetic influences (in the form of PGSs) may be differentially prominent depending on the developmental subtype of EXT. In other words, genes may influence EXT behaviors differently based on how and when they develop.
Present study
The current study had two objectives. First, we sought to characterize the different developmental pathways for ASB and SUB over the course of nearly 30 years (ages 13–41) using a large, prospective longitudinal sample in Add Health. Second, we tested whether EXT PGSs, informed by the largest GWA study on EXT behaviors to date (Karlsson Linnér et al., Reference Karlsson Linnér, Mallard, Barr, Sanchez-Roige, Madole, Driver, Poore, de Vlaming, Grotzinger, Tielbeek, Johnson, Liu, Rosenthal, Ideker, Zhou, Kember, Pasman, Verweij, Liu and Dick2021), would be differentially predictive of membership in the different ASB and SUB trajectories identified in our first objective. We made no specific hypotheses for the first objective due to its exploratory nature (i.e., identification of developmental trajectories for ASB and SUB), although based on prior longitudinal literature on ASB and SUB, we generally expected to identify pathways that vary by chronicity/persistence and that peak at or during certain developmental periods, such as adolescence and early adulthood. For the second objective, we hypothesized that higher EXT PGSs would be more strongly predictive of membership in the more chronic/persistent trajectories of ASB and SUB compared to the less chronic/persistent trajectories, which would converge with evidence derived using quantitative genetic approaches (Barnes et al., Reference Barnes, Beaver and Boutwell2011; Tielbeek et al., Reference Tielbeek, Uffelmann, Williams, Colodro-Conde, Gagnon, Mallard, Levitt, Jansen, Johansson, Sallis, Pistis, Saunders, Allegrini, Rimfeld, Konte, Klein, Hartmann, Salvatore, Nolte and Posthuma2022).
Method
Preregistration
A portion of this study was preregistered via the Open Science Framework (OSF) and can be found at https://osf.io/ednvb/?view_only=1ceaeba1100c4558b5ba54061293c38b. Code and scripts to reproduce our analyses are available on the OSF project page. Although Add Health is publicly available, these data can only be directly accessed by researchers with approved access by the Add Health team. We note that there were several major changes to our originally planned analyses as detailed in our preregistration. First, we originally intended to focus our PGS analyses exclusively on predicting trajectories of ASB but later expanded our analysis to also include a PGS association analysis of trajectories of SUB. This change was made to address early feedback from our colleagues and collaborators that ASB and SUB are both phenotypically and genetically correlated, and that the findings from our project would be significantly strengthened if we analyzed ASB and SUB trajectories in the same project. Another change to our preregistration was that we decided to narrow the overall scope of our project, which was originally intended to quantify direct and indirect effects of EXT PGSs on ASB trajectories. During the early stages of our original analysis, we came to realize that the multitude of steps required to produce direct and indirect genetic effects for PGSs required significant empirical justification (e.g., identification and extraction of trajectories for ASB and SUB, testing main effects of “traditional PGSs” on ASB and SUB trajectories, acquiring independent trio-based data to estimate unique direct and indirect effect sizes on EXT via a GWA study, production of novel PGSs in Add Health stratified by direct and indirect effects informed by trio-based GWA study), the entirety of which was too expansive for a single study. The current research, however, remains conceptually and theoretically aligned with our original preregistration. 
Thus, per OSF Preregistration Support Guidelines (https://help.osf.io/article/145-preregistration), we consider the current preregistration valid (Simmons et al., Reference Simmons, Nelson and Simonsohn2011).
Participants
Participants were from Add Health, an ongoing study on adolescent health and behavior in the United States that began in 1994 (Harris et al., Reference Harris, Halpern, Whitsel, Hussey, Tabor, Entzel and Udry2009). Data were obtained from adolescents in grades 7–12 using stratified random sampling from high schools across the United States. Adolescents, parents, peers, school administrators, siblings, friends, and romantic partners participated in data collection across five waves: wave I (1994–1995, ages 12–21, N = 20,745), wave II (1995–1996, ages 12–23, N = 14,738), wave III (2001–2002, ages 18–28, N = 15,197), wave IV (2007–2008, ages 25–34, N = 15,701), and wave V (2016–2018, ages 33–44, N = 12,300). Forty-nine and a half percent of the sample identified as male, and the self-reported racial-ethnic composition included 62.1% “Caucasian (including Hispanic or Latino),” 23% “Black or African American,” 7.1% “Asian or Pacific Islander,” 1.2% “Native American,” and 6.6% “other.” The mean household income at wave I was 45.73 thousand dollars (SD = 51.62 thousand) and the modal highest parental educational attainment at wave I was a high school degree/diploma (25% of the sample). Patterns of attrition in Add Health have been found for gender, age, socioeconomic status, urban residence, immigrant status, and self-reported race across time (Harris et al., Reference Harris, Halpern, Whitsel, Hussey, Killeya-Jones, Tabor and Dean2019b). In general, response rates were higher for female, younger, higher socioeconomic status, urban, native-born, and white participants at waves III and IV. Response rates for Add Health exceed those of other national studies (wave I = 79%, wave II = 88.6%, wave III = 77.4%, wave IV = 80.3%, wave V = 72%) (Harris, Reference Harris2022; Harris et al., Reference Harris, Halpern, Whitsel, Hussey, Killeya-Jones, Tabor and Dean2019b).
Measures
Genotyping and quality control
Saliva samples were obtained at wave IV. Genotyping was conducted on the Omni1-Quad BeadChip and the Omni2.5-Quad BeadChip. Add Health European genetic ancestry samples were imputed using Release 1.1 of the Haplotype Reference Consortium (HRC r1.1). Non-European genetic ancestry groups were imputed using the 1000 Genomes Phase 3 reference panel. After a set of standard genotype quality control procedures, imputed genotype data containing 9,664,514 markers were available for a total of 9,974 Add Health participants. The combination of multiple chips and multiple genetic ancestries used by the Add Health team resulted in a complex quality control pipeline; as such, additional details of the quality control are available online (https://www.cpc.unc.edu/projects/addhealth/documentation/guides).
The predictive performance of PGSs is known to drop substantially in samples with non-European genetic ancestries. This is because many GWA studies only include (or mostly include) European genetic ancestry individuals (Martin et al., Reference Martin, Kanai, Kamatani, Okada, Neale and Daly2019). For this reason, and to reduce the risk of confounding by population stratification (Price et al., Reference Price, Patterson, Plenge, Weinblatt, Shadick and Reich2006), we restricted our main analyses to European genetic ancestry individuals. We also generated PGSs for the other available genetic ancestry groups in Add Health (African, Hispanic, and East Asian genetic ancestry groups) and conducted stratified analyses for each group. Genetic ancestry stratified results are reported in Supplementary Tables 1–6.
Note. ASB = antisocial behaviors; SUB = substance use behaviors; EXT PGS = externalizing polygenic score.
a Age measured in years. b Income measured in thousands. c ASB included property damage, stealing something greater than $50, selling drugs, pulling a knife or gun on someone, and shooting or stabbing someone. These items were dichotomized and summed to create a composite score for each wave. d SUB included frequency of alcohol consumption, cigarette smoking, and marijuana use. Each item ranged from zero to six, with zero indicating no substance use and six indicating daily/almost daily substance use. These items were summed to create a composite score for each wave. e EXT PGS were standardized with a mean of zero and standard deviation of one.
EXT PGSs
EXT PGSs were computed from summary statistics produced by a genomic structural equation model analysis conducted on EXT behaviors (Karlsson Linnér et al., Reference Karlsson Linnér, Mallard, Barr, Sanchez-Roige, Madole, Driver, Poore, de Vlaming, Grotzinger, Tielbeek, Johnson, Liu, Rosenthal, Ideker, Zhou, Kember, Pasman, Verweij, Liu and Dick2021; Williams et al., Reference Williams, Poore, Tanksley, Kweon, Courchesne-Krak, Londono-Correa, Mallard, Barr, Koellinger, Waldman, Sanchez-Roige, Harden, Palmer, Dick and Karlsson Linnér2023), which included GWA summary statistics derived from the following cohorts: UKB, 23andMe, Psychiatric Genomics Consortium, International Cannabis Consortium, GWA Study & Sequencing Consortium of Alcohol and Nicotine use (GSCAN), Million Veteran Program, and Social Science Genetic Association Consortium. Summary statistics of the EXT GWA study were directly obtained via the EXT Consortium (https://externalizing.rutgers.edu). We also signed a Data Use Agreement with the 23andMe Research Team (correspondence, Data Transfer Agreement, and Statement of Work available upon request of the last author). These data were then used to compute PGSs in Add Health for participants. We used the GWA study by Karlsson Linnér and colleagues (2021) to compute PGSs because it is the largest available and well-powered study of broad EXT to date, and its phenotypic focus is a good match for the two prominent subdimensions of EXT at the center of the current study (i.e., ASB and SUB). PGSs generated from this GWA study also demonstrated superior predictive performance compared to other GWA studies of broad EXT behaviors available (Barr et al., Reference Barr, Salvatore, Wetherill, Anokhin, Chan, Edenberg, Kuperman, Meyers, Nurnberger, Porjesz, Schuckit and Dick2020). 
Incorporating a GWA study examining broad EXT was important given both the genetic overlap and unique genetic influences underlying ASB and SUB (Waszczuk et al., Reference Waszczuk, Eaton, Krueger, Shackman, Waldman, Zald, Lahey, Patrick, Conway, Ormel, Hyman, Fried, Forbes, Docherty, Althoff, Bach, Chmielewski, DeYoung, Forbush and Kotov2020).
EXT PGSs were computed using PRS-CS, which uses Bayesian regression and a continuous shrinkage prior to infer posterior effect sizes of SNPs using GWA summary statistics and an external linkage disequilibrium (LD) reference panel (Ge et al., Reference Ge, Chen, Ni, Feng and Smoller2019). In other words, PRS-CS weights SNPs based on the observed LD in various genetic ancestry groups, tuning the European genetic ancestry GWA study to target populations of different genetic ancestries. This method has been found to have better predictive performance across genetic ancestries and phenotypes compared to other PGS generation methods (Ahern et al., Reference Ahern, Thompson, Fan and Loughnan2023; Kachuri et al., Reference Kachuri, Chatterjee, Hirbo, Schaid, Martin, Kullo, Kenny, Pasaniuc, Auer, Conomos, Conti, Ding, Wang, Zhang, Zhang, Witte and Ge2024). EXT PGSs (N = 9,974) were standardized with a mean of zero and standard deviation of one within genetic ancestry groups (European genetic ancestry n = 5,728; African genetic ancestry n = 1,976; Hispanic genetic ancestry n = 988; East Asian genetic ancestry n = 437) to aid interpretability. We further accounted for potential risk of confounding via population stratification by controlling for the top 10 genetic principal components of the covariance matrix of the Add Health genotypic data (Price et al., Reference Price, Patterson, Plenge, Weinblatt, Shadick and Reich2006) in all analyses.
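The within-group standardization described above can be sketched as follows. This is an illustrative example under stated assumptions, not the authors' code: the function name and toy values are hypothetical, and the use of the population standard deviation (rather than the sample standard deviation) is an assumption, since the text does not specify which was used.

```python
from collections import defaultdict
from statistics import mean, pstdev
from typing import Hashable, List, Sequence


def standardize_within_groups(
    scores: Sequence[float], groups: Sequence[Hashable]
) -> List[float]:
    """Z-score each value using the mean and SD of its own group."""
    if len(scores) != len(groups):
        raise ValueError("scores and groups must have the same length")
    by_group = defaultdict(list)
    for score, group in zip(scores, groups):
        by_group[group].append(score)
    # Assumption: population SD; a sample SD would differ slightly.
    stats = {g: (mean(v), pstdev(v)) for g, v in by_group.items()}
    for group, (_, sd) in stats.items():
        if sd == 0:
            raise ValueError(f"group {group!r} has zero variance")
    return [(s - stats[g][0]) / stats[g][1] for s, g in zip(scores, groups)]


# Toy raw scores for two hypothetical ancestry groups on different scales.
raw = [1.0, 2.0, 3.0, 10.0, 20.0, 30.0]
labels = ["A", "A", "A", "B", "B", "B"]
z_scores = standardize_within_groups(raw, labels)
```

After this step each group has a mean of zero and a standard deviation of one, so a one-unit difference refers to one standard deviation within that group rather than across the pooled sample.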
ASB
ASB was assessed during the Add Health in-home interviews (“delinquency scale” and “fighting and violence”) conducted at waves I-IV, and during the Mixed-Mode Survey at wave V (Harris et al., Reference Harris, Halpern, Biemer, Liao and Dean2019a). To facilitate the longitudinal analysis (i.e., growth mixture modeling), five identical or highly similar items were selected from each wave reflecting non-aggressive rule-breaking behaviors (e.g., property damage, stealing something greater than $50, selling drugs) and aggressive rule-breaking behaviors (e.g., pulling a knife or gun on someone, shooting or stabbing someone). Although there were more than five items related to ASB in Add Health, we only used the five which were measured across all five waves. Items were dichotomized and summed to create a composite score (range = 0-5) at each wave. The scale demonstrated high internal consistency across waves (ordinal αs = .87, .87, .81, .83, .83 for waves I, II, III, IV and V, respectively).Footnote 1
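As a rough illustration of the scoring rule described above (dichotomize each of the five items, then sum), consider the following sketch. The cut-point (any endorsement counts as 1) and the handling of missing items (the composite is left missing) are assumptions made for illustration. The function name is hypothetical.

```python
def asb_composite(item_responses):
    """Dichotomize each item (any endorsement -> 1) and sum the five items.

    Returns None when any item is missing (an assumed rule, not taken from the paper).
    """
    if any(r is None for r in item_responses):
        return None
    return sum(1 if r > 0 else 0 for r in item_responses)

# Hypothetical responses to the five items at one wave.
example_score = asb_composite([0, 2, 0, 1, 0])
```

With five dichotomized items, the composite can only take integer values from 0 to 5, which matches the range reported in the text.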
SUB
SUB was assessed during the Add Health in-home interviews conducted at waves I-IV, and during the Mixed-Mode Survey at wave V (Harris et al., Reference Harris, Halpern, Biemer, Liao and Dean2019a). Three identical or highly similar items were extracted from each wave, reflecting the presence and frequency of alcohol, marijuana, and cigarette use (e.g., “During the past 30 days, on how many days did you use marijuana?”). Because several items had varying response scales, items across waves I-V were re-scaled for consistency and increased interpretability, ranging from 0 to 6, with 0 indicating no substance use endorsement (i.e., abstainers) and 6 indicating daily/almost daily substance use endorsement behaviors. The three items from waves I-V (alcohol, marijuana, and cigarette use) were summed to create a composite score (range = 0–18). Composite scores were calculated for all participants. These scales demonstrated high to moderate internal consistency across waves (ordinal αs = .82, .80, .68, .56, .50 for waves I, II, III, IV, and V, respectively).Footnote 2
Analytic plan
Step 1. Growth mixture models (GMM) of ASB and SUB
Composite scores of ASB and SUB were modeled longitudinally using GMM for the entire sample (i.e., for participants of all ancestries)Footnote 3 across all five waves of Add Health data, spanning ages 13 to 41. GMM is a group-based analytic method that identifies subpopulations characterized by their observed trajectories (Jung & Wickrama, Reference Jung and Wickrama2008). GMM allows for within-class variation of the growth parameters (Muthén & Muthén, Reference Muthén and Muthén2015). Following several other studies using Add Health (Barboza, Reference Barboza2020; Li et al., Reference Li, Zhang, Wang and Lu2022; Wang et al., Reference Wang, Walsh and Li2023), we used GMM to capture individual variation in ASB and SUB trajectories. The skewed nature of the composite scales was accounted for using a zero-inflated Poisson model.
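For readers unfamiliar with the zero-inflated Poisson distribution mentioned above, the sketch below gives its probability mass function. It mixes a point mass at zero (probability `pi`) with an ordinary Poisson count (rate `lam`). The parameter names and example values are hypothetical. In a growth mixture model the rate would itself depend on the class-specific growth factors, which this sketch leaves out.

```python
import math

def zip_pmf(k, lam, pi):
    """P(Y = k) under a zero-inflated Poisson with rate lam and zero-inflation pi."""
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        # Zeros arise either from the inflation component or from the Poisson itself.
        return pi + (1 - pi) * poisson
    return (1 - pi) * poisson

# Hypothetical parameters: 30% structural zeros, Poisson mean of 2.
p_zero = zip_pmf(0, lam=2.0, pi=0.3)
```

The extra mass at zero is what lets the model accommodate composites on which most respondents report no behavior at all.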
To model growth trajectories from adolescence into adulthood, data were represented by age rather than by wave, resulting in “missing data by design” (Little, Reference Little2013; Muthén & Muthén, Reference Muthén and Muthén2015). The decision to restructure data by age rather than wave (i.e., accelerated longitudinal design or cohort-sequential design) was due to the age heterogeneity within each wave (e.g., ages 12–21 at wave I, 12–23 at wave II, 18–28 at wave III, etc.). Because participants of very different ages were assessed at the same wave, examining ASB and SUB by wave would have led to serious interpretation problems. Each participant had at most five data points (one per wave), meaning that most participants had large amounts of missing data across the full age range (i.e., “missing data by design”). Mplus handles this type of missingness using the expectation maximization algorithm (Duncan et al., Reference Duncan, Duncan, Strycker and Chaumeton2007).
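The restructuring by age can be sketched as follows. This is an illustrative assumption about the data layout, not the authors' procedure: the function name, the record format, and the example participant are hypothetical.

```python
def restructure_by_age(records, ages):
    """Turn (person, age_at_interview, score) rows into one age-indexed row per person."""
    table = {}
    for person, age, score in records:
        # Start every person with a full row of missing values, one slot per age.
        row = table.setdefault(person, {a: None for a in ages})
        if age in row:
            row[age] = score
    return table

# Hypothetical participant observed at five waves (values are illustrative only).
records = [("p1", 15, 2), ("p1", 16, 3), ("p1", 21, 1), ("p1", 28, 0), ("p1", 37, 0)]
ages = range(13, 42)  # ages 13 to 41, the span modeled in the text
wide = restructure_by_age(records, ages)
missing = sum(v is None for v in wide["p1"].values())
```

With 29 possible ages and at most five observations per person, at least 24 cells per participant are empty by construction.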
Models were evaluated based on interpretability (i.e., meaningful interpretation and consistency with prior literature) in addition to model fit (i.e., Akaike Information Criterion, Bayesian Information Criterion [BIC], sample-adjusted BIC, and adjusted Lo-Mendell-Rubin test). Finally, regarding potential sex differences, developmental pathways of ASB have not been found to significantly differ by sex in prior investigations (Moffitt et al., Reference Moffitt, Caspi, Harrington and Milne2002; Odgers et al., Reference Odgers, Moffitt, Broadbent, Dickson, Hancox, Harrington, Poulton, Sears, Thomson and Caspi2008). Similar results have been found in studies of SUB (Keyes et al., Reference Keyes, Martins, Blanco and Hasin2010). To reduce Type I error rates related to multiple testing by stratification, the GMM was conducted for the entire sample rather than separately for males and females.
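The information criteria named above follow standard definitions and can be computed from a model's log-likelihood, number of free parameters, and sample size. The sketch below uses those textbook formulas, with the sample-size adjustment (n + 2) / 24 for the adjusted BIC. The function name and input values are hypothetical and are not taken from the supplementary tables.

```python
import math

def information_criteria(log_likelihood, n_params, n_obs):
    """Return AIC, BIC, and sample-size-adjusted BIC (lower values indicate better fit)."""
    deviance = -2.0 * log_likelihood
    return {
        "aic": deviance + 2.0 * n_params,
        "bic": deviance + n_params * math.log(n_obs),
        # The adjusted BIC replaces n with (n + 2) / 24 in the penalty term.
        "sabic": deviance + n_params * math.log((n_obs + 2) / 24.0),
    }

# Hypothetical fit: log-likelihood of -1000 with 10 parameters and 1,000 observations.
ic = information_criteria(-1000.0, 10, 1000)
```

Because all three indices penalize additional parameters, comparing them across class solutions guards against selecting a model only because it is more complex.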
Step 2. Multinomial logistic regressions (MLRs) predicting ASB and SUB GMM trajectory membership
PGS associations were analyzed using MLRs, in which trajectory membership for ASB and for SUB was regressed on EXT PGS in separate models. We used trajectory groups rather than parameter estimates of trajectories to aid in clinically relevant interpretations. The MLR models included the following covariates: age at wave I, biological sex at wave I (1 = male; 2 = female), self-reported race (1 = white; 2 = Black or African American; 3 = American Indian or Native American; 4 = Asian or Pacific Islander; 5 = other), household income at wave I (M = $45,730, SD = $51,620), and highest parental education at wave I (1 = less than high school; 2 = high school; 3 = some college; 4 = college degree; 5 = post-college education). These variables were included as covariates because they are known to covary with the outcomes in the current study (i.e., ASB and SUB) (Ingoldsby et al., Reference Ingoldsby, Shaw, Winslow, Schonberg, Gilliom and Criss2006; McHugh et al., Reference McHugh, Votaw, Sugarman and Greenfield2018; Patrick et al., Reference Patrick, Wightman, Schoeni and Schulenberg2012; Thibodeau et al., Reference Thibodeau, Cicchetti and Rogosch2015; White et al., Reference White, Labouvie and Papadaratsakis2005).
In addition to the above covariates, the MLR model for ASB controlled for SUB class membership. Likewise, the model for SUB controlled for ASB class membership. The decision to distinguish ASB from SUB outcomes was largely driven by compelling theoretical and quantitative evidence that ASB and SUB tend to co-occur at notably greater than chance levels (Krueger et al., Reference Krueger, Hobbs, Conway, Dick, Dretsch, Eaton, Forbes, Forbush, Keyes, Latzman, Michelini, Patrick, Sellbom, Slade, South, Sunderland, Tackett, Waldman and Waszczuk2021).Footnote 4
Results
Descriptive statistics
Table 1 and Supplementary Table 7 provide descriptive statistics and correlations for the analytic sample, respectively.
Step 1: GMM of ASB and SUB
ASB
Intercept-only, linear, quadratic, and cubic models were tested for two, three, four, five, and six class solutions (see Supplementary Table 8 for fit indices). Due to the improvement in fit statistics and the clarity of interpretation, the best fitting model was determined to be the quadratic solution with four classes (N = 20,722). The four classes that emerged were Low (67% of the sample), Moderate (18.9% of the sample), Adolescence-Peaked (10.6% of the sample), and High Decline (3.6% of the sample) (Figure 1a). On average, the Low class exhibited consistently minimal levels of ASB from age 13 until age 41. The Moderate class exhibited slightly higher levels of ASB at age 13 than the Low class, which increased very minimally until approximately age 23. Following age 23, ASB began to decline and reached nearly zero by age 41. The Adolescence-Peaked class was characterized by a higher initial status of ASB than Moderate and Low, which then sharply increased and peaked at age 16 before sharply decreasing after approximately age 16. Finally, the High Decline class exhibited a high level of ASB at age 13 that was substantially higher than any of the other three classes. Following age 13, the mean level of ASB consistently declined and approached zero by age 41. Though it was declining steadily throughout ages 13 to 41, High Decline maintained an average level of ASB that was higher than any of the other classes.
SUB
Intercept-only, linear, quadratic, and cubic models were tested for two, three, four, five, and six classes (see Supplementary Table 9 for fit indices across each model); cubic models did not converge and were not reported. Evaluation of fit indices and theoretical interpretability suggested that the best fitting model from the GMM was the quadratic solution with three classes (N = 20,692). Notably, the quadratic solution with six classes resulted in fit indices similar to those of the three-class model, but we selected the three-class model because of its stronger alignment with prior research findings. To ensure that the three-class and six-class models did not differ substantively, we cross-tabulated their frequencies in Supplementary Table 10. As expected, there was significant overlap between the classes, χ2(10) = 23,885.75, p < .001, strongly suggesting that the unique classes that emerged in the six-class model could be substantively accounted for by the three-class model with minimal reduction in interpretability.
The classes that emerged from the three-class model were Low Use (23% of the sample), Typical Use (41.7% of the sample), and High Use (35.2% of the sample) (Figure 1b). The Low Use class exhibited almost no SUB at age 13, but these behaviors gradually increased up to age 41. Low Use exhibited the lowest levels of SUB compared to the other two classes from age 13 to 41. Typical Use showed a slightly higher initial status of SUB than Low Use but increased in trajectory up to approximately age 30 before plateauing. By approximately age 34, SUB in the Typical Use class began to slowly decline until age 41. We labeled this Typical Use because the greatest proportion of the sample fell into the class compared to the other two classes. Additionally, epidemiological studies on adolescent alcohol and drug use consistently show some degree of SUB is more prevalent than complete abstinence during this developmental period (Miech et al., Reference Miech, Johnston, Patrick, O’Malley, Bachman and Schulenberg2023; Substance Abuse and Mental Health Services Administration, 2021). Finally, SUB for High Use at baseline was higher than either Low Use or Typical Use classes. SUB increased steadily until peaking around age 24, where it remained stable until around age 30. From ages 30-41, the High Use class declined, but at a relatively slow rate. The High Use class exhibited the highest persistent levels of SUB compared to the other two classes.
Step 2: MLRs predicting ASB and SUB trajectories
MLRs were tested to assess the relative risk ratios for ASB and SUB class memberships based on one’s EXT PGS, controlling for biological sex, household income, highest parental education, age at wave I, ASB or SUB class membership, and the first 10 genetic principal components. Add Health survey weights were included to account for cluster effects from schools in all MLRs. Effect sizes were converted from logits to relative risk ratios to aid interpretability. A relative risk ratio indicates the probability of an outcome relative to the probability of the reference outcome. In this study, it represents how the probability that an individual belongs to a given developmental trajectory of ASB or SUB, relative to the reference trajectory (Low for ASB and Typical Use for SUB), changes with their EXT PGS. The analytic sample size for each MLR was 4,416.
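The conversion from logits to relative risk ratios amounts to exponentiating the multinomial logit coefficient and the limits of its confidence interval. The sketch below assumes a normal-approximation (Wald) interval with a critical value of 1.96. Survey-weighted analyses may use a different critical value. The function name and the example coefficient and standard error are hypothetical and are not estimates from this study.

```python
import math

def relative_risk_ratio(logit, std_error, z=1.96):
    """Exponentiate a multinomial logit coefficient and its Wald confidence limits."""
    rr = math.exp(logit)
    lower = math.exp(logit - z * std_error)
    upper = math.exp(logit + z * std_error)
    return rr, lower, upper

# Hypothetical coefficient for a one-SD increase in a predictor.
rr, lower, upper = relative_risk_ratio(0.30, 0.05)
```

A ratio above 1 means that higher values of the predictor are associated with greater relative risk of the given class versus the reference class. A confidence interval that excludes 1 corresponds to a statistically significant coefficient.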
ASB
The Low class was the reference class because it was the most prevalent outcome in the sample (67%). There was a significant association between EXT PGSs and High Decline, such that participants with a one standard deviation increase in EXT PGSs had a 42% increased relative risk of belonging to the High Decline class rather than the Low class (RR = 1.42, 95% CI [1.02, 1.98], p = .04; Table 2). In other words, the risk of belonging to the High Decline class (compared to Low) was 1.42 times higher for individuals with a one standard deviation increase in EXT PGSs. This suggests the risk of belonging to the High Decline class was elevated among individuals with higher EXT PGSs compared to those with lower PGSs. The association between EXT PGSs and belonging to the Moderate class was not significant (RR = 1.11, 95% CI [0.97, 1.26], p = .10), nor was the association between EXT PGSs and belonging to the Adolescence-Peaked class (RR = 1.13, 95% CI [0.96, 1.34], p = .14). Figure 2a shows proportions of the four ASB classes as a percent of the total EXT PGS distribution.
Note. The Low class was used as the reference class in this model. Genetic principal components were also covaried; this data is available upon request. SUB = substance use behaviors; EXT PGS = externalizing polygenic score.
ᵃMale used as comparison group. ᵇWhite used as comparison group. ᶜLess than high school used as comparison group. ᵈTypical Use used as comparison group.
SUB
Typical Use was selected as the reference class given that the highest proportion of the sample belonged to this class (41.7%). There was a significant association between EXT PGSs and High Use, such that participants with a one standard deviation increase in EXT PGSs had a 34% increased relative risk of belonging to High Use rather than Typical Use (RR = 1.34, 95% CI [1.22, 1.48], p < .001; Table 3). In other words, the risk of belonging to the High Use class (compared to Typical Use) was 1.34 times higher for individuals with a one standard deviation increase in EXT PGSs. This suggests the risk of belonging to the High Use class was elevated among individuals with higher EXT PGSs compared to those with lower PGSs. The association between EXT PGSs and Low Use relative to Typical Use was also significant, such that participants with a one standard deviation increase in EXT PGSs had a 16% lower relative risk of belonging to the Low Use class compared to Typical Use (RR = 0.84, 95% CI [0.75, 0.94], p < .01). This indicates that individuals with higher EXT PGSs were less likely to belong to the Low Use class and more likely to be in the Typical Use or High Use classes. Figure 2b shows proportions of the three SUB classes as a percent of the total EXT PGS distribution.
Note. The Typical Use class was used as the reference class in this model. Genetic principal components were also covaried; this data is available upon request. ASB = antisocial behaviors; EXT PGS = externalizing polygenic score.
ᵃMale used as comparison group. ᵇWhite used as comparison group. ᶜLess than high school used as comparison group. ᵈLow used as comparison group.
Secondary analyses: chi-square tests
Based on the results of MLRs, post hoc chi-square tests of independence were conducted to compare the frequencies of ASB class membership and SUB class membership, mirroring the MLR analyses conducted in the previous step. As shown in the frequencies cross-tabulated in Supplementary Table 11, there was a significant association between ASB class and SUB class, χ2(6) = 11,897.44, p < .001. However, not all individuals belonging to the High Decline ASB trajectory also belonged to the High Use SUB trajectory. These results confirm that while class trajectories of ASB and SUB were strongly associated, membership in one did not fully determine membership in the other.
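A Pearson chi-square test of independence on a cross-tabulation of class memberships can be sketched as below. The counts are hypothetical and are not those in Supplementary Table 11. The function returns the test statistic and its degrees of freedom, (rows − 1) × (columns − 1). Obtaining a p value would additionally require the chi-square distribution function, which is omitted here.

```python
def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of the row and column variables.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df

# Hypothetical 4 x 3 table: four ASB classes (rows) by three SUB classes (columns).
counts = [[50, 30, 20], [20, 30, 50], [10, 20, 30], [5, 10, 25]]
statistic, df = chi_square_independence(counts)
```

For a table of four ASB classes by three SUB classes, the degrees of freedom are (4 − 1) × (3 − 1) = 6, matching the value reported above.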
Discussion
We discovered unique pathways of development for ASB and SUB when examined across early adolescence into middle adulthood. Using prospective longitudinal data from Add Health, four distinct trajectories of ASB emerged, represented by High Decline, Moderate, Adolescence-Peaked, and Low pathways. For SUB, three trajectories emerged consisting of Low Use, Typical Use, and High Use pathways. Furthermore, we found that EXT PGSs had differential associations with trajectories of ASB and SUB, such that higher EXT PGSs increased the relative risk of belonging to the chronic/persistent forms of ASB and SUB–High Decline and High Use, respectively–when compared to Low and Typical Use. Collectively, these findings reinforce the prominence of unique pathways of development within EXT, and subsequently, the unique genetic contributions for ASB and SUB that appear to be sensitive to developmental typology.
The identification of unique developmental pathways for ASB and SUB from adolescence into middle adulthood adds to emerging evidence that these constructs may have unique etiologies (Kotov et al., Reference Kotov, Krueger, Watson, Achenbach, Althoff, Bagby, Brown, Carpenter, Caspi, Clark, Eaton, Forbes, Forbush, Goldberg, Hasin, Hyman, Ivanova, Lynam, Markon and Zimmerman2017). For ASB, the developmental trajectories we identified were largely consistent with prior longitudinal research using Add Health data (F. R. Chen & Jaffee, Reference Chen and Jaffee2015; Li, Reference Li2017; Morrison et al., Reference Morrison, Martinez, Hilton and Li2019) and other longitudinal datasets (Moffitt, Reference Moffitt1993; Odgers et al., Reference Odgers, Moffitt, Broadbent, Dickson, Hancox, Harrington, Poulton, Sears, Thomson and Caspi2008; Tielbeek et al., Reference Tielbeek, Uffelmann, Williams, Colodro-Conde, Gagnon, Mallard, Levitt, Jansen, Johansson, Sallis, Pistis, Saunders, Allegrini, Rimfeld, Konte, Klein, Hartmann, Salvatore, Nolte and Posthuma2022), albeit within narrower age ranges. Given the wider age range of the current study, we also found that most trajectories of ASB diminished by middle age (Gottfredson & Hirschi, Reference Gottfredson and Hirschi2016; Hirschi & Gottfredson, Reference Hirschi and Gottfredson1983; Sweeten et al., Reference Sweeten, Piquero and Steinberg2013), including for those in the persistent trajectory.Footnote 5 Trajectories for SUB differed from those we identified for ASB; we found most individuals belonged to a SUB class other than Low Use, which supports the notion that SUB tend to be normative during adolescence (Levy et al., Reference Levy, Campbell, Shea and DuPont2018).Footnote 6 Further, High Use continued to have the highest level of SUB across the entire time period measured, while this pattern did not hold true for ASB (i.e., High Decline did not increase at any point).
Although trajectories of ASB and SUB may unfold differently over time for different individuals, we found that genetic effects via PGSs were strongest for those belonging to the most chronic forms of ASB and SUB. This complements previous lines of behavioral and molecular genetic evidence which have shown that heritability and PGS effects, respectively, may be higher for chronic EXT behaviors than for less chronic or adolescent-limited EXT behaviors (Rhee & Waldman, Reference Rhee and Waldman2002). For example, Zheng and Cleveland (Reference Zheng and Cleveland2015) used twin data to find that genetic factors were more influential in ASB for individuals in the life course-persistent pathway than in adolescent-limited pathways. And while there is less prior research on genetic differences in trajectories of SUB, one study using twin data identified three trajectories of alcohol use from ages 13–17: low (15.1% of the sample), early onset (8.2%), and normative increasing (76.7%) (Zheng et al., Reference Zheng, Brendgen, Dionne, Boivin and Vitaro2019). Compared to the low group, the early onset and normative increasing groups were found to have the highest level of genetic liability in belonging to those trajectories (34.7% and 37.7%, respectively). However, their findings were limited given their narrow age range and their focus on alcohol. Most recently, PGSs for ASB were most associated with persistent ASB from ages 7–26 in the out-of-sample Dunedin Study (Tielbeek et al., Reference Tielbeek, Uffelmann, Williams, Colodro-Conde, Gagnon, Mallard, Levitt, Jansen, Johansson, Sallis, Pistis, Saunders, Allegrini, Rimfeld, Konte, Klein, Hartmann, Salvatore, Nolte and Posthuma2022). The present study replicated and extended this result by using a non-clinical sample and examining a significantly extended age range (up to age 41). 
Moreover, our study provides crucial evidence that PGS effects can also vary by developmental subtypes, rather than just age (Elam et al., Reference Elam, Ha, Neale, Aliev, Dick and Lemery-Chalfant2021; Li et al., Reference Li, Cho, Salvatore, Edenberg, Agrawal, Chorlian, Porjesz, Hesselbrock, Investigators and Dick2017).
One explanation for why individuals with high polygenic liability for EXT may be more likely to belong to the chronic or persistent trajectories of ASB and SUB may be because they are also more likely to evoke or select into adverse environments that also tend to underlie the development of chronic or persistent EXT outcomes (e.g., being raised in a harsh or more permissive home environment, greater involvement with deviant peers). For example, high PGSs for attention-deficit/hyperactivity disorder in Add Health were negatively associated with wave I measures of supportive parenting (e.g., parental warmth, closeness, communication quality) and school connectedness (e.g., belongingness, teacher support, safety at school), over the effects of participant sex, age, highest level of parental education, and household income (Li, Reference Li2019). Using data from two different population cohorts (Dunedin and Environmental Risk), Wertz and colleagues (Reference Wertz, Caspi, Belsky, Beckley, Arseneault, Barnes, Corcoran, Hogan, Houts, Morgan, Odgers, Prinz, Sugden, Williams, Poulton and Moffitt2018) found that polygenic risk for educational attainment was negatively associated with familial socioeconomic deprivation and parental antisocial behaviors, further implicating the prominent role of gene-environment correlations with respect to PGSs for EXT outcomes (Wertz et al., Reference Wertz, Caspi, Belsky, Beckley, Arseneault, Barnes, Corcoran, Hogan, Houts, Morgan, Odgers, Prinz, Sugden, Williams, Poulton and Moffitt2018). In turn, the influence of parents, peers, and schools also associate with the later development of ASB (Burt, Reference Burt2022) and SUB (Bosk et al., Reference Bosk, Anthony, Folk and Williams-Butler2021; Henneberger et al., Reference Henneberger, Mushonga and Preston2021), over and above genetic effects (Burt et al., Reference Burt, Clark, Gershoff, Klump and Hyde2021; Klahr et al., Reference Klahr, McGue, Iacono and Burt2011). 
There is emerging evidence that PGSs may reflect indirect effects on mental health trajectories, via environmental exposure (Li et al., Reference Li, Hilton, Lu, Hong, Greenberg and Mailick2019; Li, Reference Li2019). Thus, it is plausible that via gene-environment correlation (i.e., either active or evocative forms), high genetic liability for EXT may increase one’s exposure to (and experience of) more hostile or adverse environments, which increase one’s likelihood of developing chronic negative outcomes associated with ASB and SUB.
Another reason why higher PGSs for EXT may be more strongly associated with membership in chronic trajectories of ASB and SUB is that the EXT GWA study from which EXT PGSs were computed was based on predominantly adult-related phenotypes of EXT (e.g., lifetime cannabis use, number of sexual partners) and/or populations (i.e., UK Biobank, 23andMe, GSCAN) (Karlsson Linnér et al., Reference Karlsson Linnér, Mallard, Barr, Sanchez-Roige, Madole, Driver, Poore, de Vlaming, Grotzinger, Tielbeek, Johnson, Liu, Rosenthal, Ideker, Zhou, Kember, Pasman, Verweij, Liu and Dick2021). The GWA discovery sample may have largely identified genetic variants associated with EXT outcomes that tend to be expressed later in life and are thus more likely to be reflected in the chronic EXT trajectories we observed in Add Health. Conversely, it is plausible that EXT PGSs might yield weaker prediction signals for trajectories in which most variation occurs during the adolescent years (e.g., Adolescence-Peaked trajectory of ASB, Low Use trajectory of SUB). Indeed, there is evidence that genetic liability for alcohol use frequency during adolescence may be distinct from the genetic liability for later developmental periods (i.e., early adulthood and adulthood) (Thomas et al., Reference Thomas, Gillespie, Chan, Edenberg, Kamarajan, Kuo, Miller, Nurnberger, Tischfield, Dick and Salvatore2024). As Thomas and colleagues noted (p. 164), the omission of developmental considerations (e.g., age specificity of genetic effects, trajectory analyses of phenotypes) may substantially limit the predictive power of PGSs for phenotypic predictions across the lifespan (Elam et al., Reference Elam, Ha, Neale, Aliev, Dick and Lemery-Chalfant2021; Kandaswamy et al., Reference Kandaswamy, Allegrini, Plomin and Stumm2021; Thomas et al., Reference Thomas, Gillespie, Chan, Edenberg, Kamarajan, Kuo, Miller, Nurnberger, Tischfield, Dick and Salvatore2024).
Future applications of PGSs should be generated from discovery samples that are more closely age-matched to the target study. Furthermore, future GWA studies may also benefit from directly identifying genes associated with EXT trajectories themselves.
There are several limitations of the present study that should be noted. First, despite characterizing ASB and SUB with nearly 30 years of prospective longitudinal data, retrospective childhood ASB and SUB (i.e., before age 13) could not be examined. As such, we were precluded from making strong inferences regarding whether our findings reflect lifespan trajectories of ASB and SUB. Second, our measures of ASB and SUB were somewhat narrow, given our focus on items that were similarly measured across each wave to conduct GMM analyses. This may have limited the types of trajectories we could have identified. Future work should aim to include a range of items reflecting broader forms of ASB and SUB. Third, our study examined outcomes using a group-based approach for greater interpretability, but focusing on continuous outcomes within groups (e.g., intercepts, slopes) may be more suitable when evaluating narrower phenotypes, such as alcohol use (Vachon et al., Reference Vachon, Krueger, Irons, Iacono and McGue2017). Fourth, we did not account for other co-occurring mental health dimensions that are known to correlate with EXT, such as internalizing behaviors (Achenbach & Rescorla, Reference Achenbach and Rescorla2003), as this would have been well beyond the scope of the current study. Future studies of EXT behaviors should consider co-modeling internalizing behaviors (Wang et al., Reference Wang, Walsh and Li2023) alongside EXT in Add Health. Fifth, the reliance on self-reports may be subject to social desirability bias in the form of under-reporting of ASB and SUB. Finally, though the present study used a nationally representative, racially and ethnically diverse sample for GMM, the GWA study used to compute EXT PGSs was limited to participants of predominantly European genetic ancestry. It remains imperative that GWA studies continue to increase recruitment of ancestrally diverse samples to ensure that future genetically informed discoveries are generalizable.
The current study findings may also have important clinical implications. PGSs can provide insights into understanding where individuals may fall on the continuum of risk with respect to their development of EXT, which may in turn be relevant for early detection and prevention efforts. For example, we found that the difference in severity of SUB between classes at age 13 was minimal compared to later ages. PGSs may be clinically informative at baseline (age 13), when clear differences in severity of SUB have yet to emerge. The differential association of PGSs with ASB and SUB trajectories further suggests that there are other important risk and protective factors to consider, beyond genetics alone. While there has been some research to examine the role that various environmental risk and protective factors may have in the prediction of EXT trajectories broadly (Figge et al., Reference Figge, Martinez-Torteya and Weeks2018), questions remain regarding how these factors may moderate EXT PGS associations for ASB and SUB trajectories (Domingue et al., Reference Domingue, Trejo, Armstrong-Carter and Tucker-Drob2020; Kendler & Eaves, Reference Kendler and Eaves1986). Finally, there may be ways to adapt and personalize interventions based on one’s genotype. Gene-by-intervention interaction studies suggest that genetic factors may be useful in predicting which interventions might be most effective for certain individuals (Belsky & Van Ijzendoorn, Reference Belsky and Van Ijzendoorn2015). For example, youths with greater genetic risks for EXT behaviors can be targeted for more intensive forms of interventions. It is also possible that interventions may be tailored to developmental stages to enhance their effectiveness. Utilizing PGSs as predictors and developmental trajectories as outcomes (rather than an outcome at a singular time point) may lead to more individualized and possibly more efficacious interventions for EXT behaviors across the lifespan.
Supplementary material
The supplementary material for this article can be found at https://doi.org/10.1017/S0954579424001962.
Acknowledgements
This research uses data from Add Health, a program project directed by Kathleen Mullan Harris and designed by J. Richard Udry, Peter S. Bearman, and Kathleen Mullan Harris at the University of North Carolina at Chapel Hill and funded by grant P01-HD31921 from the Eunice Kennedy Shriver National Institute of Child Health and Human Development, with cooperative funding from 23 other federal agencies and foundations. Special acknowledgement is due to Ronald R. Rindfuss and Barbara Entwisle for assistance in the original design. Information on how to obtain the Add Health data files is available on the Add Health Web site (http://www.cpc.unc.edu/addhealth). No direct support was received from grant P01-HD31921 for this analysis.
We would also like to thank The Externalizing Consortium for sharing the GWAS summary statistics of externalizing. The Externalizing Consortium: Principal Investigators: Danielle M. Dick, Philipp Koellinger, K. Paige Harden, Abraham A. Palmer. Lead Analysts: Richard Karlsson Linnér, Travis T. Mallard, Peter B. Barr, Sandra Sanchez-Roige. Significant Contributors: Irwin D. Waldman. The Externalizing Consortium has been supported by the National Institute on Alcohol Abuse and Alcoholism (R01AA015416 -administrative supplement), and the National Institute on Drug Abuse (R01DA050721). Additional funding for investigator effort has been provided by K02AA018755, U10AA008401, P50AA022537, as well as a European Research Council Consolidator Grant (647648 EdGe to Koellinger). The content is solely the responsibility of the authors and does not necessarily represent the official views of the above funding bodies. The Externalizing Consortium would like to thank the following groups for making the research possible: 23andMe, Add Health, Vanderbilt University Medical Center’s BioVU, Collaborative Study on the Genetics of Alcoholism (COGA), the Psychiatric Genomics Consortium’s Substance Use Disorders working group, UK10K Consortium, UK Biobank, and Philadelphia Neurodevelopmental Cohort.
Funding statement
This study was supported in part by grants from the National Institute of Mental Health (R01MH128371 and R01MH134039 to JJL, R21MH123908 to KGJ and MAW, and R01MH135119 to KGJ). JJL was also supported in part by a core grant to the Waisman Center from the National Institute of Child Health and Human Development (P50HD105353).
Competing interests
None to disclose.