Latent variable mixture modeling in psychiatric research – a review and application

J. Miettunen; T. Nordström; M. Kaakinen; A. O. Ahmed

doi:10.1017/S0033291715002305

Published online by Cambridge University Press: 03 November 2015

J. Miettunen*: Center for Life Course Epidemiology and Systems Medicine, University of Oulu, Oulu, Finland; Medical Research Center Oulu, Oulu University Hospital and University of Oulu, Oulu, Finland; Department of Psychiatry, Research Unit of Clinical Neuroscience, University of Oulu, Oulu, Finland; Department of Psychiatry, Oulu University Hospital, Oulu, Finland
T. Nordström: Center for Life Course Epidemiology and Systems Medicine, University of Oulu, Oulu, Finland; Medical Research Center Oulu, Oulu University Hospital and University of Oulu, Oulu, Finland
M. Kaakinen: Center for Life Course Epidemiology and Systems Medicine, University of Oulu, Oulu, Finland; Unit of General Practice, Oulu University Hospital, Oulu, Finland; Biocenter Oulu, University of Oulu, Oulu, Finland; Department of Epidemiology and Biostatistics, MRC-PHE Centre for Environment and Health, Imperial College London, London, UK
A. O. Ahmed: Department of Psychiatry, Weill Cornell Medical College – Westchester Division, White Plains, NY, USA
*Address for correspondence: J. Miettunen, Center for Life Course Epidemiology and Systems Medicine, PO Box 5000, 90014 University of Oulu, Finland. (Email: jouko.miettunen@oulu.fi)

Abstract

Latent variable mixture modeling is a flexible approach to investigating population heterogeneity by sorting cases into latent but non-arbitrary subgroups that are more homogeneous. The purpose of this selective review is to provide a non-technical introduction to mixture modeling in a cross-sectional context. Latent class analysis is used to classify individuals into homogeneous subgroups (latent classes). Factor mixture modeling is a newer approach that fuses latent class analysis and factor analysis. Factor mixture models can represent both categorical and dimensional states of affairs. This article provides an overview of latent variable mixture models and illustrates these methods by applying them to the study of the latent structure of psychotic experiences. The flexibility of latent variable mixture models makes them adaptable to the study of heterogeneity in complex psychiatric and psychological phenomena. They also allow researchers to address research questions that directly compare the viability of dimensional, categorical and hybrid conceptions of constructs.

Keywords

Factor mixture models; latent class analysis; psychosis; statistics

Type: Review Article
Information: Psychological Medicine, Volume 46, Issue 3, February 2016, pp. 457–467

DOI: https://doi.org/10.1017/S0033291715002305
Copyright: Copyright © Cambridge University Press 2015

Introduction

Researchers in the fields of psychiatry and psychology are fundamentally interested in the nature of individual differences that explain variability observed among people, objects, relationships and other entities. The forms of heterogeneity that are of interest are features of observations obtained from experimental, psychopharmacological, physiological, clinical, genetic, epidemiological and psychometric data. Latent variable modeling is a flexible analytical approach that allows researchers to study patterns of observations in data and make inferences about unobserved sources of population heterogeneity. The flexibility of latent variable modeling also allows researchers to study patterns of observations in data in relation to other external variables including covariates and distal outcomes. Latent variable mixture modeling (LVMM) is an extension of latent variable modeling to occasions when researchers are interested in testing hypotheses about categorical sources of population heterogeneity within a dataset. The purpose of this review is to provide a primer on LVMM and illustrate the use of LVMM by analysing cross-sectional data on psychotic experiences.

A brief primer on LVMMs

Researchers interested in the sources of individual differences face the added conundrum of distinguishing among kinds of latent heterogeneity, that is, whether the observed pattern of responses is best explained by a small set of dimensions or domains, or by non-arbitrary boundaries that distinguish ‘types’ of responders. The LVMM framework provides tools that allow the researcher to sort cases into homogeneous subgroups of respondents who are more similar to each other than to members of other subgroups. LVMM allows researchers to directly test the comparative viability of alternate conjectures about the number of subgroups that underpin observations in the data. Recent analytic strategies also allow researchers to distinguish data with: (1) a categorical latent structure, in which the observed distribution comprises two or more latent categories or subgroups of responders; (2) a dimensional structure, wherein observations come from a single population and are best explained by one or more latent dimensions; and (3) a hybrid structure, with observations explained by both categorical and dimensional sources of heterogeneity.

Fundamentally, all latent variable models involve mathematical representations of the relation between latent variables and observed variables (Muthén & Muthén, 2008). Models investigated depend on the distribution of the observations (dichotomous, ordered/unordered categorical, continuous, count, and mixed) and of the latent variable, and on the cross-sectional v. longitudinal nature of the data. For example, latent class analysis (LCA) models categorical latent variables with categorical cross-sectional observations. Latent profile analysis is an extension of LCA to situations in which the indicator variables are continuous rather than categorical (Marsh et al. 2009). Factor analysis models dimensional latent variables whereas LVMMs involve at least one categorical latent variable. Whereas individuals are represented along dimensions in factor analysis, mixture models assign cases into classes or subgroups. In LVMM, posterior probabilities, which represent the probability that an individual belongs to each class, are obtained and used to infer the most likely class membership. Posterior probabilities may be non-zero for several classes (i.e. partial membership); however, individuals are assigned to the class for which their posterior probability is highest.
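
As an illustration of how posterior probabilities lead to class assignment, the following minimal Python sketch computes them for a latent class model with binary items. It assumes the mixing proportions and item endorsement probabilities have already been estimated; the function name and all numerical values are hypothetical and are not taken from the NCS analysis.

```python
import numpy as np

def posterior_class_probabilities(responses, class_weights, item_probs):
    """Posterior P(class | response pattern) for binary items.

    responses:     (n_persons, n_items) array of 0/1 responses
    class_weights: (n_classes,) mixing proportions that sum to 1
    item_probs:    (n_classes, n_items) endorsement probability per class and item
    """
    responses = np.asarray(responses, dtype=float)
    class_weights = np.asarray(class_weights, dtype=float)
    item_probs = np.asarray(item_probs, dtype=float)
    # Log-likelihood of each response pattern within each class,
    # treating items as independent within a class.
    log_lik = responses @ np.log(item_probs).T + (1.0 - responses) @ np.log1p(-item_probs).T
    log_joint = log_lik + np.log(class_weights)
    # Subtract the row maximum before exponentiating for numerical stability.
    log_joint -= log_joint.max(axis=1, keepdims=True)
    post = np.exp(log_joint)
    return post / post.sum(axis=1, keepdims=True)

# Hypothetical two-class, three-item example.
class_weights = np.array([0.8, 0.2])
item_probs = np.array([[0.05, 0.10, 0.02],
                       [0.60, 0.70, 0.40]])
responses = np.array([[0, 0, 0],
                      [1, 1, 0]])

posterior = posterior_class_probabilities(responses, class_weights, item_probs)
modal_class = posterior.argmax(axis=1)  # assign each person to the most probable class
```

Each row of `posterior` sums to one, and `modal_class` holds the class with the highest posterior probability for each respondent, which corresponds to the modal assignment described above.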

More complex latent variable models may incorporate causal or path relations (structural equation models, structural equation mixture models) or combine multiple latent variable models into hybrid models [e.g. factor mixture models (FMMs), growth mixture models] (Hoyle, 2012). Fig. 1 provides a decision tree for selecting among several psychometric models for studying heterogeneity in datasets.

Fig. 1. A decision tree for selecting among different psychometric models for studying heterogeneity in datasets. ANOVA, Analysis of variance.

Previous reviews that discussed the application of latent variable models in psychiatry or psychology research (Raykov et al. 1991; Crowley & Fan, 1997; MacCallum & Austin, 2000; Curran & Hussong, 2003; Streiner, 2005, 2006; Nelson et al. 2008) are often quite old or focus on specific latent variable methods. The more recent reviews include those by Masyn et al. (2010), Sterba & Bauer (2010), Hallquist & Wright (2014) and Wright & Hallquist (2014). In this article we present a mixture modeling framework that allows a researcher to determine whether a psychiatric construct is best viewed as categorical, dimensional or both (such that it conforms to a hybrid model). This framework seeks evidence of the best structural model by comparing purely dimensional models using the general factor model to categorical models (see, for example, Kuo et al. 2008; Ahmed et al. 2013, 2015; Wright et al. 2013; Eaton et al. 2014; Lubke & Miller, 2015). The current effort focuses on the modeling of cross-sectional data with LCA and factor mixture modeling. We will illustrate the use of mixture models to study the latent structure of psychotic symptoms using data from the National Comorbidity Survey (NCS; Kessler et al. 1994).

LCA

We begin with LCA, a statistical procedure that can be used to classify individuals into homogeneous subgroups or latent classes. The latent class model is essentially a regression of observed indicator variables onto a set of one or more latent class variables (with dummy-coded variables representing individual categories of the latent class variable). The latent class model can be configured for outcome indicators that are binary, polytomous, ordered/unordered categorical, count or a combination of these response types. LCA may be appropriate for occasions in which the researcher believes that sample heterogeneity is best explained by two or more subgroups. LCA can be used in such a context to determine the number of classes or subgroups of respondents needed to sufficiently explain the differences in observed response patterns (Geiser, 2013). LCA makes a strong assumption of local independence, in which the latent class variable accounts for all of the variation and covariation observed in the data. Within individual classes, there is zero covariance and dependence among the observed indicator variables. In practice, the local independence assumption may be unrealistic for many psychiatric and psychological constructs. For example, within clinical samples, symptoms such as hallucinations, delusions, depression and anxiety are often fairly correlated and may be apt to violate local independence assumptions in ordinary LCA. Alternate configurations of the latent class model may relax assumptions of local independence and allow some observed variables to covary within classes (Vermunt & Magidson, 2000). In such models, direct effects or correlated errors may be incorporated to account for non-zero dependence among pairs of observations. Models that relax the independence assumption may be more appropriate for psychiatric data and the researcher may consider such models instead of ordinary LCA. It is, however, reasonable to begin by fitting ordinary LCA to the data and then compare it with a model that relaxes the local independence assumption. This allows the validity of the local independence assumption to be checked for the class solution imposed on the data, and thus the appropriateness of ordinary LCA to be judged. Independence assumptions can be checked through an examination of the bivariate residuals. More on LCA can be found in previous reviews (Muthén, 2001; Hagenaars & McCutcheon, 2002; Collins & Lanza, 2010).
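
For readers who find a formula helpful, the local independence assumption can be written out for the special case of binary items. The notation below is introduced here for illustration only: $K$ classes, $J$ items, mixing proportions $\pi_k$, and class-specific endorsement probabilities $\rho_{jk}$. Under ordinary LCA, the probability of respondent $i$'s response pattern $\mathbf{y}_i$ is a weighted sum over classes of a product over items.

```latex
P(\mathbf{y}_i) \;=\; \sum_{k=1}^{K} \pi_k \prod_{j=1}^{J} \rho_{jk}^{\,y_{ij}} \,(1-\rho_{jk})^{\,1-y_{ij}},
\qquad \sum_{k=1}^{K} \pi_k = 1 .
```

The product over items is what encodes local independence: conditional on class $k$, the items are treated as unrelated. Models that relax the assumption replace this product with a within-class joint distribution that allows selected item pairs to covary.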

One consideration in the application of LCA to psychiatric data is determining the number of classes to fit to the data or the class models to compare. It is helpful in this case to be informed by theoretical conjectures regarding the latent structure of the construct. LCA may provide a means to compare the relative viabilities of alternate conjectures regarding subtypes or subgroups. One common example is comparing a dimensional (one-class) hypothesis to dichotomous or polytomous conjectures. Many studies using LCA have chosen to compare several class solutions by progressively increasing the number of classes fitted on the data until a preferred solution is reached. However, fitting multiple structural models to the same data might inflate type 1 error rates for failing to reject less complex models. Therefore, the results may be less reliable using this approach.

FMMs

More recent investigations of subgroups in samples have used a collection of modeling techniques that represent a fusion of LCA and factor analysis (Yung, 1997; Dolan & van der Maas, 1998; Arminger et al. 1999; Vermunt & Magidson, 2002; Lubke & Muthén, 2005; Muthén, 2006). This family of FMMs is advantageous because its members are capable of representing categorical and dimensional states of affairs (Muthén, 2006). FMMs can provide the number of latent classes that best describes the data. FMMs can also provide factor scores, which may aid in determining whether differences across groups on indicators are best viewed as qualitative or quantitative through an examination of class differences in factor scores. FMMs are based on the premise that the observed relationship among a set of indicators is influenced by one latent class variable and one or more continuous factors. FMMs draw from LCA in their assumption that a latent class variable influences the observations in the population, but depart from LCA in that they do not assume local independence; rather, they posit that within classes, one or more factors influence the indicator variables, causing them to covary. As enumerated by Muthén and colleagues, FMMs vary in their degree of complexity (Lubke & Muthén, 2005; Clark et al. 2013). Indeed, ordinary LCA is a very restricted form of FMM with zero factor loadings or factor variance. Compared with other FMMs, however, LCA tends to extract many more classes given that response patterns and associations among observed variables are attributed to mean differences across classes. Fewer classes are needed to explain observed associations in FMM if within-class covariance is permitted.

Latent class factor analysis (LCFA)

LCFA (Muthén, 2006) represents another simple configuration of FMM. Relative to other FMMs, LCFA is characterized by class-invariant factor loadings and item thresholds: LCFA and other simple configurations of FMMs may fix factor loadings and item thresholds to be equal across classes. In this case, the computed factor variance and covariance are zero whereas item probabilities are unequal. In more complex configurations, constraints on factor variance/covariance across classes can be relaxed and item thresholds may be unequal across classes (Clark et al. 2013). The flexibility of FMMs also allows the researcher to incorporate the effect of covariates on latent variables. For example, if a researcher is interested in whether a family history of alcoholism contributes to the likelihood of membership in a particular class, this variable may be incorporated into the model as a covariate on which factor and class variables can be regressed. Lubke & Muthén (2007) conceptualize the final step as a multinomial logistic regression with the covariate predicting the log odds of belonging to one class as compared with belonging to the reference category.
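
The multinomial logistic step can be sketched as follows. This is an illustrative Python fragment rather than the Mplus implementation; the function name, the coefficient values and the binary family-history covariate are hypothetical. The reference class has its intercept and slope fixed at zero.

```python
import numpy as np

def class_membership_probabilities(x, intercepts, slopes):
    """Multinomial logistic model for class membership given one covariate.

    x:          (n_persons,) covariate values
    intercepts: (n_classes,) intercepts, with the reference class fixed at 0
    slopes:     (n_classes,) covariate effects, with the reference class fixed at 0
    """
    x = np.asarray(x, dtype=float)
    logits = np.asarray(intercepts, dtype=float) + np.outer(x, slopes)
    # Subtract the row maximum before exponentiating for numerical stability.
    logits -= logits.max(axis=1, keepdims=True)
    expd = np.exp(logits)
    return expd / expd.sum(axis=1, keepdims=True)

# Hypothetical coefficients: class 0 is the reference category.
intercepts = np.array([0.0, -2.0])
slopes = np.array([0.0, 1.2])
family_history = np.array([0, 1])  # 0 = no family history, 1 = family history

probs = class_membership_probabilities(family_history, intercepts, slopes)
log_odds = np.log(probs[:, 1] / probs[:, 0])  # log odds of class 1 v. the reference class
```

With these made-up coefficients, the log odds of belonging to class 1 rather than the reference class are -2.0 without a family history and -0.8 with one, so the slope of 1.2 is the change in log odds associated with the covariate.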

The flexibility of the FMM approach allows the researcher to build measurement models that may be most reasonable for response patterns observed in the data. In this context, decisions have to be made about: (1) the number of factors to be imposed on the model; (2) the number of classes to be fitted; and (3) whether factor loadings should vary or be invariant across conjectured classes. As with other latent variable models, a theory-driven approach may confer the advantage of limiting the number of factor–class configurations fitted to the data, thereby limiting the risk of type 1 error. Competing conjectures about the factor structure and/or class structure of the construct may be tested directly in FMM. In the absence of strong theories, an exploratory approach informed by previous findings regarding factor and/or class structure may be useful. Depending on the source of the data, one may expect items that assess psychiatric symptoms to have low probabilities in general population samples but higher probabilities in hospital-based samples. Psychiatric patients would be expected to endorse such items at higher rates than non-clinical cases. LCFA may be most appropriate for occasions in which item probabilities are not expected to be starkly different across classes, that is, when the observed variable is equally relevant to each class. It might, however, be worthwhile to free factor loadings and variances across classes to allow for more complex response patterns in general population samples. This would allow items to vary in the degree to which they contribute to within-class severity differences and allow classes to differ in their level of heterogeneity.

Interpretational framework

The family of LVMMs allows researchers to compare the relative fits of competing structural models to the research data. The structural hypotheses to be compared often vary by the number of classes to be imposed on the data and the number of underlying factors or dimensions that explain the covariance among outcome variables. Suppose a researcher were interested in the optimal number of classes and dimensions that underlie data. As previously discussed, existing conjectures about the construct would serve well to inform model specification and analytic strategy. In the absence of guiding theory, an exploratory approach could be taken but results have to be interpreted with caution. The first step would be to run a series of confirmatory factor analyses (CFAs) to compare the relative fits of either theoretically conjectured or exploratory factor solutions to the data. The best factor structure can be selected using relevant fit indices, which include information criteria (discussed below), the comparative fit index (CFI), the Tucker–Lewis index (TLI) and the root mean square error of approximation (RMSEA). The available fit indices depend on the estimator selected to analyse the model. The next step would be to run a series of LCAs testing various class solutions. In the final step, FMMs can be conducted, informed by the best factor solution from CFA while imposing several class solutions. Given the propensity of LCA to favor models with a larger number of classes and to extract too many classes from the data, it is suggested that the LCA solution set an upper limit for the number of classes extracted in the FMM (see, for example, Ahmed et al. 2013).

Model fit

The preferred class model is selected using fit indices; however, the selection of preferred models should also be informed by theory and previous research (Bauer & Curran, 2003; Ram & Grimm, 2009). Likelihood ratio-based statistics and three sets of information criteria including the Akaike information criterion (AIC; Akaike, 1987), Bayesian information criterion (BIC; Schwarz, 1978) and sample size-adjusted BIC (aBIC; Sclove, 1987) are available (e.g. in Mplus) as fit indices (Nylund et al. 2007). For more information on these fit indices and on available statistical software, see the online Supplementary Appendices S1 and S2.
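
The three information criteria are simple functions of the maximized log-likelihood, the number of free parameters and the sample size. The sketch below uses their standard definitions, with the sample size-adjusted BIC replacing n by (n + 2)/24; the function name and the numbers in the example are hypothetical and do not come from Tables 2 or 3.

```python
import math

def information_criteria(log_likelihood, n_parameters, n_observations):
    """Return AIC, BIC and sample size-adjusted BIC; lower values indicate better fit."""
    deviance = -2.0 * log_likelihood
    return {
        "AIC": deviance + 2.0 * n_parameters,
        "BIC": deviance + n_parameters * math.log(n_observations),
        # The adjusted BIC replaces n with (n + 2) / 24.
        "aBIC": deviance + n_parameters * math.log((n_observations + 2.0) / 24.0),
    }

# Hypothetical comparison of two competing models fitted to the same data.
model_a = information_criteria(log_likelihood=-1000.0, n_parameters=10, n_observations=500)
model_b = information_criteria(log_likelihood=-990.0, n_parameters=21, n_observations=500)
preferred_by_bic = "A" if model_a["BIC"] < model_b["BIC"] else "B"
```

Because the BIC penalizes each additional parameter more heavily than the AIC once the sample is reasonably large, the two criteria can disagree about the preferred number of classes, as they do in the LCA illustration later in this article.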

Mixture modeling of psychosis data from the NCS

This section illustrates the use of mixture modeling to examine the latent structure of psychosis in the NCS (Kessler et al. 1994). To investigate whether classes underpin psychotic experiences, several considerations can inform the model specification and analytic strategy. Competing theories about the existence of boundaries in the psychosis phenotype can inform a research question and analytic strategy. The current Diagnostic and Statistical Manual of Mental Disorders (DSM) classification of psychotic disorders is informed by the Kraepelinian dichotomy that distinguishes psychosis in schizophrenia and other psychotic disorders from psychotic experiences in mood disorders (Craddock & Owen, 2005; Tandon et al. 2008). The DSM thus posits a categorical approach to psychosis with two or more classes. In contrast, dimensional models of psychosis posit that psychotic experiences cut across diagnostic boundaries so that they are a feature of psychotic disorders, mood disturbances and other psychiatric illnesses (Strauss, 1969; Allardyce et al. 2007). Dimensional views also argue that psychotic experiences, which are prevalent among even the unaffected, exist on a severity gradient, with people with psychotic disorders at the severe end and the rest of the population at various levels of the gradient (Strauss, 1969). Previous LCA studies of psychosis have tended to identify three or four classes (Shevlin et al. 2007; Murphy et al. 2010; Gale et al. 2011; Ndetei et al. 2012; Mamah et al. 2013). Shevlin et al. (2007), who used data from the NCS, found evidence of four classes of psychotic experiences. These studies, however, suggested that a latent dimension may underlie the classes they detected without actually testing a dimensional hypothesis.

Factor analytic studies of subclinical psychotic experiences have been less consistent, mostly due to differences in the item coverage of symptoms (Kitamura et al. 1998; Wuthrich & Bates, 2006; Fonseca-Pedrero et al. 2010; Murphy et al. 2010; Bakhshaie et al. 2011; Heering et al. 2013). Factor solutions have ranged from two to as many as five factors. One study using data from the NCS produced evidence of a three-factor structure of psychotic experiences (Murphy et al. 2010). No studies have, however, examined the structure of psychosis using hybrid FMMs. The greater question of whether psychotic experiences are categorical, dimensional or possess elements of both can be addressed by comparing purely dimensional models with class models and hybrid FMMs. Previous investigations in the NCS data have favored as many as four classes and three factors; thus it might make sense to fit as many classes and factors in LCA and CFA models, respectively.

Although psychotic experiences were prevalent in the NCS sample, endorsed by 28.4% of respondents, the actual prevalence of non-affective psychosis in the sample was no greater than 0.7% (Kendler et al. 1996). One question is how to handle the majority of respondents who neither meet criteria nor endorse any psychotic experiences. One option would be to exclude those individuals from the analysis, but this is undesirable given that zero-response patterns are an inherent property of many psychiatric constructs when measured in the general population. Moreover, covariates of zero-response patterns may be useful and informative in some contexts. A more defensible approach is to incorporate a zero class in which factor means and variances are set to zero for the specified zero class whereas those of the non-zero classes are freely estimated (Muthén & Asparouhov, 2006). The degree to which incorporating a zero class improves or degrades a model can be investigated.

With regard to psychiatric theory, purely dimensional models may suggest that factor loadings should be naturally invariant across the entire sample and similar for both individuals who meet criteria for non-affective psychosis and unaffected cases. Hybrid theoretical models can, however, accommodate the presence or absence of measurement invariance. An example is the similarity of the dimensions that underlie schizotypy across clinical samples and unaffected individuals (Reynolds et al. 2000). Analytically, it is possible to fit and compare models that assume that the same factors influence item probabilities across classes (i.e. LCFA) and more complex models that relax this assumption (more complex FMMs). In the latter, the dimensions are not equivalent across classes.

Sample size considerations

Adequate sample sizes are needed for implementing mixture models. Although conventions common to latent variable models, such as 10 observations for each parameter estimated, have been applied to LVMM, they have been found to be unreliable (Wolf et al. 2013). Moreover, the sample size requirements depend on many factors including the complexity of the models to be estimated, the number of variables and their distribution, and the number of classes to be estimated (Muthén & Muthén, 2002; Wolf et al. 2013). The use of Monte Carlo samples has been recommended as a method for determining the adequacy of sample size for latent variable models (Muthén & Muthén, 2002; Wolf et al. 2013). The models to be evaluated are imposed on several samples that mimic the characteristics of the research data. Adequate sample sizes are necessary for attenuating bias in the estimation of parameters and standard errors, and for decreasing the risk of non-convergence and improper solutions. Relevant and meaningful classes, particularly those with small class probabilities, may be poorly represented and difficult to detect in small samples. The use of an epidemiological sample in the current demonstration, however, ensures that putative classes are well represented for the analysis.
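
The data-generation half of such a Monte Carlo study can be sketched in a few lines. The Python fragment below draws repeated samples from a hypothetical three-class latent class model and records how many respondents fall in the smallest class at two candidate sample sizes. The population values and function names are made up for illustration; a full study would also refit the candidate models to every simulated sample and summarize bias, coverage and convergence, which is not shown here.

```python
import numpy as np

def simulate_lca_sample(n, class_weights, item_probs, rng):
    """Draw class memberships and binary item responses from a latent class model."""
    classes = rng.choice(len(class_weights), size=n, p=class_weights)
    # Each respondent's items are endorsed with the probabilities of his or her class.
    responses = (rng.random((n, item_probs.shape[1])) < item_probs[classes]).astype(int)
    return classes, responses

def smallest_class_sizes(n, class_weights, item_probs, n_replications=200, seed=0):
    """Size of the smallest class in each of n_replications simulated samples."""
    rng = np.random.default_rng(seed)
    sizes = np.empty(n_replications, dtype=int)
    for r in range(n_replications):
        classes, _ = simulate_lca_sample(n, class_weights, item_probs, rng)
        sizes[r] = np.bincount(classes, minlength=len(class_weights)).min()
    return sizes

# Hypothetical population: one large low-endorsement class and one rare high-endorsement class.
class_weights = np.array([0.82, 0.16, 0.02])
item_probs = np.array([[0.02, 0.03, 0.01],
                       [0.20, 0.25, 0.10],
                       [0.70, 0.60, 0.50]])

sizes_small = smallest_class_sizes(200, class_weights, item_probs)
sizes_large = smallest_class_sizes(5000, class_weights, item_probs)
```

Comparing `sizes_small` with `sizes_large` shows how sparsely a rare class is represented when the total sample is small, which is the concern raised above about classes with small class probabilities.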

The NCS was completed by the Survey Research Center of the University of Michigan between 1990 and 1992. The epidemiological survey assessed several psychiatric symptoms in 8098 non-institutionalized individuals aged 15–54 years, living in the USA. The NCS obtained information about psychotic experiences from 5908 respondents using survey questions drawn from the World Mental Health Composite International Diagnostic Interview (WMH-CIDI). The NCS psychosis questionnaire comprised 90 items that assessed positive symptoms, age of onset, duration, rule outs and treatment seeking. For ease of running the mixture models, we selected 20 symptom items from the psychotic disorders section (Table 1). The selected items assessed hallucinations of all sensory domains, although many more items focused on auditory hallucinations. Participants’ responses to the 20 items were submitted to LCA, CFA, LCFA and a less restrictive form of FMM. All analyses were completed using Mplus 5 (Muthén & Muthén, 2008). Annotated syntaxes are presented in the online Supplementary Appendix S3.

Table 1. Selected National Comorbidity Survey Items, endorsement rates and loadings from exploratory factor analysis ^a

^a Loadings were generated from a Quartimin rotation.

^b Preferred factor assignment.

CFA of NCS psychosis data

First, factor models were fitted to the psychosis symptom items. We fitted one to four factors because fitting these models would allow us to compare the three-factor model of Murphy et al. (2010) of the NCS psychosis items with alternative models. Each CFA was tested using two estimators: maximum likelihood parameter estimates with standard errors robust to non-normality (robust maximum likelihood; MLR) and robust weighted least squares (WLSM). The fit statistics for the factor models are presented in Table 2, including maximum log-likelihood estimates, information criteria, CFI, TLI and RMSEA. Examination of the fit statistics suggests that a three-factor model is the preferred model. The information criteria are lowest for a three-factor model. Moreover, examination of the CFI, TLI and RMSEA shows little improvement from a three-factor to a four-factor model. The CFA results confirmed the finding of Murphy et al. (2010) of a three-factor structure for the NCS psychosis items. Table 2 also includes results of an exploratory factor analytic run, which similarly favored a three-factor model.

Table 2. Factor analysis of National Comorbidity Survey psychosis data: model fit results

LL, Log-likelihood; k, number of free parameters; AIC, Akaike information criterion; BIC, Bayesian information criterion; aBIC, sample size-adjusted BIC; CFI, comparative fit index; TLI, Tucker–Lewis index; RMSEA, root mean square error of approximation; CFA, confirmatory factor analysis; EFA, exploratory factor analysis.

^a Preferred model.

LCA of NCS psychosis data

Next, the 20 psychosis items were submitted to LCA with solutions ranging from two to five classes. Fitting this number of models will allow us to confirm/disconfirm prevalent LCA models of psychosis that often suggest three or four classes, including the finding of Shevlin et al. (2007) of a four-class model. Table 3 summarizes the fit statistics obtained from all analytic models. The AIC and the aBIC slightly favored the four-class model whereas the BIC favored the three-class model. While the four-class preference of Shevlin et al. (2007) is supported, an argument can also be made for a three-class model. The aBIC only decreases by nine points and the AIC by about 22 points going from the three-class to the four-class model, suggesting that not much is gained beyond the three-class model. Fig. 2 depicts the item profiles of the three-class solution with the item numbers on the x-axis and the item probability on the y-axis. The item endorsement profiles are parallel across classes, and the classes appear to be ordered from lowest to highest in item endorsement probabilities. The mixing proportions of the three classes from class 1, which has the highest endorsement probability, to class 3, which has the lowest, are 2.34, 81.95 and 15.71%. The classification quality (entropy), or the degree to which classes can readily be distinguished, is adequate at 0.826.
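
The entropy statistic is computed from the matrix of posterior class probabilities. The following Python sketch uses the usual relative entropy definition, in which 1 indicates perfectly separated classes and 0 indicates posterior probabilities that are no better than chance. The function name and example values are hypothetical, and the sketch is not a reproduction of the Mplus output.

```python
import numpy as np

def relative_entropy(posterior):
    """Relative entropy of a (n_persons, n_classes) matrix of posterior probabilities."""
    posterior = np.asarray(posterior, dtype=float)
    n_persons, n_classes = posterior.shape
    # Replace zeros by 1 inside the logarithm so that 0 * log(0) contributes 0.
    safe = np.where(posterior > 0, posterior, 1.0)
    total_uncertainty = -(posterior * np.log(safe)).sum()
    return 1.0 - total_uncertainty / (n_persons * np.log(n_classes))

# Hypothetical posterior probabilities for four respondents and three classes.
posterior = np.array([[0.97, 0.02, 0.01],
                      [0.05, 0.90, 0.05],
                      [0.10, 0.15, 0.75],
                      [0.40, 0.35, 0.25]])
entropy = relative_entropy(posterior)
```

Respondents such as the fourth one in this example, whose posterior probabilities are spread almost evenly across classes, pull the statistic towards zero.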

Fig. 2. Item profiles of the three-class solution with the item numbers on the x-axis and the item probability on the y-axis.
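The entropy statistic summarizes how sharply cases are assigned to classes. The following is a minimal sketch of how posterior class probabilities and relative entropy can be computed for binary items, assuming local independence within classes and the common relative entropy formula E = 1 − Σ_i Σ_k (−p_ik ln p_ik) / (n ln K). The function names, mixing proportions and item probabilities are hypothetical and are not the estimates shown in Fig. 2.

```python
import math


def class_posteriors(pattern, class_weights, item_probs):
    """Posterior probability of each latent class given one response pattern.

    pattern: list of 0/1 item responses
    class_weights: mixing proportions, one per class
    item_probs: item_probs[c][j] is P(item j endorsed | class c)
    Assumes items are independent within a class (local independence).
    """
    joint = []
    for weight, probs in zip(class_weights, item_probs):
        likelihood = weight
        for response, p in zip(pattern, probs):
            likelihood *= p if response == 1 else (1.0 - p)
        joint.append(likelihood)
    total = sum(joint)
    return [value / total for value in joint]


def relative_entropy(posteriors):
    """Relative entropy in [0, 1]; values near 1 mean clear classification.

    posteriors: one list of class probabilities per case.
    """
    n_cases = len(posteriors)
    n_classes = len(posteriors[0])
    uncertainty = 0.0
    for row in posteriors:
        for p in row:
            if p > 0.0:  # treat 0 * ln(0) as 0
                uncertainty -= p * math.log(p)
    return 1.0 - uncertainty / (n_cases * math.log(n_classes))


# Hypothetical two-class model with three binary items
weights = [0.8, 0.2]
probs = [[0.1, 0.1, 0.1], [0.7, 0.6, 0.5]]
patterns = [[0, 0, 0], [1, 1, 0], [1, 1, 1]]
posteriors = [class_posteriors(p, weights, probs) for p in patterns]
print(relative_entropy(posteriors))
```

In practice the posterior probabilities would come from the fitted model for every case in the sample; the three patterns above merely illustrate the calculation.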

FMM of NCS psychosis data

Given that the endorsement of psychosis items is highest among people who met criteria for a psychotic disorder and lowest among unaffected individuals, it made sense to fit a FMM that relaxes measurement invariance assumptions to the data. We compared this relatively more complex FMM with a simple FMM with class-invariant factor loadings and item thresholds (LCFA). All of the fitted hybrid models included three factors, informed by the best-fit CFA model. We set a four-class structure as the upper limit for the number of classes, informed partly by the preferred model from the LCA. It should be noted that for both the LCFA and the more complex FMM, the three-factor four-class solution could not be estimated by the program. In some cases, this may suggest that too many classes are being extracted from the data.

The results of the FMM and the fit statistics are summarized in Table 3. When the LCFA and the complex FMM are compared, the LCFA is a slightly better fit when all of the fit indices are examined. However, when all of the hybrid models are compared with the three-factor CFA, the information criteria are lower for the latter. Next, we examine the p values obtained when the three-factor solution is nested within the more complex three-factor two-class solution. Here, the p values produced by the LMR, aLMR and bootstrap likelihood ratio test exceed the 0.05 α level. We therefore fail to reject the three-factor solution. Across all latent variable models (CFA, LCA, LCFA and FMM), the fit indices suggest that the psychosis data are best described by three factors and one population (i.e. a purely dimensional model).

Table 3. Latent variable mixture modeling of National Comorbidity Survey psychosis data: model fit results ^a

LL, Log-likelihood; k, number of free parameters; AIC, Akaike information criterion; BIC, Bayesian information criterion, aBIC, sample size-adjusted BIC; LMR, Lo–Mendell–Rubin test; aLMR, sample size-adjusted LMR; BLRT, bootstrap likelihood ratio test; LCA, latent class analysis; LCFA, latent class factor analysis; FMM, factor mixture models.

^a Note that the three-factor, four-class models were non-identified.

^b Preferred model.

On the latent structure of the psychosis phenotype

Responses to psychosis items on the NCS conform to a three-factor structure, suggesting that psychotic experiences exist on continua of severity in the general population. The absence of distinct boundaries in psychotic experiences is consistent with dimensional conceptions of the psychosis phenotype (Strauss, 1969; Allardyce et al. 2007; Ahmed et al. 2012a, b).

Conclusions

Latent variable models are a family of flexible procedures that allow researchers to describe the associations that exist among latent variables, their manifest indicators and their covariates. The flexibility of latent variable models allows the latent variables of interest and their manifest indicators to be continuous, categorical or both. Factor analytic models depict associations between a continuous latent variable and a set of categorical and/or continuous manifest (outcome) variables. LVMM is an extension of latent variable modeling to occasions when researchers are interested in testing hypotheses about categorical sources of population heterogeneity within a dataset. LCA depicts the association between a categorical latent variable and its categorical and/or dimensional manifest variables. The family of FMMs combines factor analysis and LCA to estimate models that conjecture the existence of co-occurring categorical and dimensional latent variables that both influence manifest variables.
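The idea that a categorical and a dimensional latent variable jointly influence the manifest variables can be illustrated by simulating data from a simple factor mixture. The sketch below assumes one factor, binary items with a logistic link, and class-invariant loadings and thresholds, so that classes differ only in their factor mean. All names and parameter values are hypothetical and are not estimates from the NCS data.

```python
import math
import random


def simulate_fmm_case(rng, class_weights, factor_means, loadings, thresholds):
    """Simulate one respondent from a one-factor mixture model.

    rng: a random.Random instance
    class_weights: mixing proportions of the latent classes
    factor_means: factor mean within each class (unit variance assumed)
    loadings, thresholds: class-invariant item parameters
    Returns (latent class, factor score, list of 0/1 responses).
    """
    # Categorical latent variable: class membership
    latent_class = rng.choices(range(len(class_weights)), weights=class_weights)[0]
    # Dimensional latent variable: severity within the class
    factor_score = rng.gauss(factor_means[latent_class], 1.0)
    responses = []
    for loading, threshold in zip(loadings, thresholds):
        # Endorsement probability rises with the factor score
        p = 1.0 / (1.0 + math.exp(-(loading * factor_score - threshold)))
        responses.append(1 if rng.random() < p else 0)
    return latent_class, factor_score, responses


rng = random.Random(2015)
sample = [
    simulate_fmm_case(
        rng,
        class_weights=[0.85, 0.15],
        factor_means=[0.0, 2.0],
        loadings=[1.0, 1.5, 0.8],
        thresholds=[2.0, 2.5, 1.5],
    )
    for _ in range(1000)
]
# Mean number of endorsed items per class
for c in (0, 1):
    counts = [sum(r) for cls, _, r in sample if cls == c]
    print(c, sum(counts) / len(counts))
```

Setting all factor means equal reduces this sketch to a single-population factor model, whereas removing the within-class variation of the factor score moves it toward a latent class model.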

The application of LVMMs to psychiatric and psychological constructs has increased, no doubt due to the simultaneous development of computer capacity and the availability of analysis software. LVMMs, however, remain underutilized. Although FMMs are ideal for this sort of investigation, very few studies have used them to study discontinuities in psychiatric disorders. This contrasts unfavorably with the wide application of LCA and multivariate taxometric methods (Meehl, 1999; Haslam et al. 2012). The current application of LVMM to NCS psychosis data demonstrates that it can provide results that are consistent with those of taxometric methods. Specifically, the current demonstration further supports the dimensional structure of psychotic experiences in the general population.

LVMMs require several choices from the researchers – the variables that are to be used in analyses, the associations to be modeled and how the latent variables are named and interpreted. All these choices should be well justified. LVMM offers substantial advantages over more traditional statistical methods and use of these methods is recommended for various research questions and designs in psychiatric and psychological research.

Supplementary material

For supplementary material accompanying this paper visit http://dx.doi.org/10.1017/S0033291715002305

Acknowledgements

This study has been funded by the Academy of Finland (project grant 268336), the Jalmari and Rauha Ahokas Foundation, the Sigrid Jusélius Foundation and the Northern Finland Health Care Support Foundation. The funders had no role in the conduct of the study.

Declaration of Interest

None.

References

Ahmed, AO, Buckley, PF, Mabe, PA (2012a). Latent structure of psychotic experiences in the general population. Acta Psychiatrica Scandinavica 125, 54–65.

Ahmed, AO, Green, BA, Buckley, PF, McFarland, ME (2012b). Taxometric analyses of paranoid and schizoid personality disorders. Psychiatry Research 196, 123–132.

Ahmed, AO, Green, BA, Goodrum, NM, Doane, NJ, Birgenheir, D, Buckley, PF (2013). Does a latent class underlie schizotypal personality disorder? Implications for schizophrenia. Journal of Abnormal Psychology 122, 475–491.

Ahmed, AO, Strauss, GP, Buchanan, RW, Kirkpatrick, B, Carpenter, WT (2015). Are negative symptoms dimensional or categorical? Detection and validation of deficit schizophrenia with taxometric and latent variable mixture models. Schizophrenia Bulletin 41, 879–891.

Akaike, H (1987). Factor analysis and AIC. Psychometrika 52, 317–332.

Allardyce, J, Suppes, T, van Os, J (2007). Dimensions and the psychosis phenotype. International Journal of Methods in Psychiatric Research 16 (Suppl. 1), S34–S40.

Arminger, G, Stein, P, Wittenberg, J (1999). Mixtures of conditional mean and covariance structure models. Psychometrika 64, 475–494.

Bakhshaie, J, Sharifi, V, Amini, J (2011). Exploratory factor analysis of SCL90-R symptoms relevant to psychosis. Iranian Journal of Psychiatry 6, 128–132.

Bauer, DJ, Curran, PJ (2003). Distributional assumptions of growth mixture models: implications for overextraction of latent trajectory classes. Psychological Methods 8, 338–363.

Clark, SL, Muthén, B, Kaprio, J, D'Onofrio, BM, Viken, R, Rose, RJ, Smalley, SL (2013). Models and strategies for factor mixture analysis: two examples concerning the structure underlying psychological disorders. Structural Equation Modeling: A Multidisciplinary Journal 20, 681–703.

Collins, LM, Lanza, ST (2010). Latent Class and Latent Transition Analysis: With Applications in the Social, Behavioral, and Health Sciences. John Wiley & Sons: New York.

Craddock, N, Owen, MJ (2005). The beginning of the end for the Kraepelinian dichotomy. British Journal of Psychiatry 186, 364–366.

Crowley, SL, Fan, X (1997). Structural equation modeling: basic concepts and applications in personality assessment research. Journal of Personality Assessment 68, 508–531.

Curran, PJ, Hussong, AM (2003). The use of latent trajectory models in psychopathology research. Journal of Abnormal Psychology 112, 526–544.

Dolan, CV, van der Maas, HLJ (1998). Fitting multivariate normal finite mixtures subject to structural equation modeling. Psychometrika 63, 225–253.

Eaton, NR, Krueger, RF, Docherty, AR, Sponheim, SR (2014). Toward a model-based approach to the clinical assessment of personality psychopathology. Journal of Personality Assessment 96, 283–292.

Fonseca-Pedrero, E, Lemos-Giráldez, S, Paino, M, Sierra-Baigrie, S, Villazón-García, Ú, García-Portilla González, MP, Muñiz, J (2010). Dimensionality of hallucinatory predisposition: confirmatory factor analysis of the Launay–Slade Hallucination Scale-revised in college students. Anales de Psicología 26, 41–48.

Gale, CK, Wells, JE, McGee, MA, Oakley Browne, MA (2011). A latent class analysis of psychosis-like experiences in the New Zealand Mental Health Survey. Acta Psychiatrica Scandinavica 124, 205–213.

Geiser, C (2013). Data Analysis with Mplus. The Guilford Press: New York.

Hagenaars, JA, McCutcheon, AL (2002). Applied Latent Class Analysis, pp. 476. Kluwer: Dordrecht, The Netherlands.

Hallquist, MN, Wright, AGC (2014). Mixture modeling methods for the assessment of normal and abnormal personality part I: cross-sectional models. Journal of Personality Assessment 96, 256–268.

Haslam, N, Holland, E, Kuppens, P (2012). Categories versus dimensions in personality and psychopathology: a quantitative review of taxometric research. Psychological Medicine 42, 903–920.

Heering, HD, van Haren, NEM, Derks, EM (2013). A two-factor structure of first rank symptoms in patients with a psychotic disorder. Schizophrenia Research 147, 269–274.

Hoyle, RH (editor) (2012). Handbook of Structural Equation Modeling. The Guilford Press: New York.

Kendler, KS, Gallagher, TJ, Abelson, JM, Kessler, RC (1996). Lifetime prevalence, demographic risk factors, and diagnostic validity of nonaffective psychosis as assessed in a US community sample: the National Comorbidity Survey. Archives of General Psychiatry 53, 1022–1031.

Kessler, RC, McGonagle, KA, Zhao, S, Nelson, CB, Hughes, M, Eshleman, S, Wittchen, H-U, Kendler, KS (1994). Lifetime and 12-month prevalence of DSM-III-R psychiatric disorders in the United States: results from the National Comorbidity Survey. Archives of General Psychiatry 51, 8–19.

Kitamura, T, Okazaki, Y, Fujinawa, A, Takayanagi, I, Kasahara, Y (1998). Dimensions of schizophrenic positive symptoms: an exploratory factor analysis investigation. European Archives of Psychiatry and Clinical Neuroscience 248, 130–135.

Kuo, PH, Aggen, SH, Prescott, CA, Kendler, KS, Neale, MC (2008). Using a factor mixture modeling approach in alcohol dependence in a general population sample. Drug and Alcohol Dependence 98, 105–114.

Lubke, GH, Miller, PJ (2015). Does nature have joints worth carving? A discussion of taxometrics, model-based clustering and latent variable mixture modeling. Psychological Medicine 45, 705–715.

Lubke, GH, Muthén, B (2005). Investigating population heterogeneity with factor mixture models. Psychological Methods 10, 21–39.

Lubke, G, Muthén, B (2007). Performance of factor mixture models as a function of model size, covariate effects, and class-specific parameters. Structural Equation Modeling 14, 26–47.

MacCallum, RC, Austin, JT (2000). Applications of structural equation modeling in psychological research. Annual Review of Psychology 51, 201–226.

Mamah, D, Owoso, A, Mbwayo, AW, Mutiso, VN, Muriungi, SK, Khasakhala, LI, Barch, DM, Ndetei, DM (2013). Classes of psychotic experiences in Kenyan children and adolescents. Child Psychiatry and Human Development 44, 452–459.

Marsh, HW, Lüdtke, O, Trautwein, U, Morin, AJS (2009). Classical latent profile analysis of academic self-concept dimensions: synergy of person- and variable-centered approaches to theoretical models of self-concept. Structural Equation Modeling: A Multidisciplinary Journal 16, 191–225.

Masyn, KE, Henderson, CE, Greenbaum, PE (2010). Exploring the latent structures of psychological constructs in social development using the dimensional–categorical spectrum. Social Development 19, 470–493.

Meehl, PE (1999). Clarifications about taxometric method. Applied and Preventive Psychology 8, 165–174.

Murphy, J, Shevlin, M, Adamson, G, Houston, JE (2010). Positive psychosis symptom structure in the general population: assessing dimensional consistency and continuity from “pathology” to “normality”. Psychosis: Psychological, Social and Integrative Approaches 2, 199–209.

Muthén, B (2001). Latent variable mixture modeling. In New Developments and Techniques in Structural Equation Modeling (ed. Marcoulides, G. A. and Schumacker, R. E.), pp. 1–33. Lawrence Erlbaum Associates: Mahwah, NJ.

Muthén, B (2006). Should substance use disorders be considered as categorical or dimensional? Addiction 101 (Suppl. 1), 6–16.

Muthén, B, Asparouhov, T (2006). Item response mixture modeling: application to tobacco dependence. Addictive Behaviors 31, 1050–1066.

Muthén, LK, Muthén, BO (2002). How to use a Monte Carlo study to decide on sample size and determine power. Structural Equation Modeling 9, 599–620.

Muthén, LK, Muthén, BO (2008). Mplus (version 5.1). Muthén & Muthén: Los Angeles, CA.

Ndetei, DM, Muriungi, SK, Owoso, A, Mutiso, VN, Mbwayo, AW, Khasakhala, LI, Barch, DM, Mamah, D (2012). Prevalence and characteristics of psychotic-like experiences in Kenyan youth. Psychiatry Research 196, 235–242.

Nelson, TD, Aylward, BS, Steele, RG (2008). Structural equation modeling in pediatric psychology: overview and review of applications. Journal of Pediatric Psychology 33, 679–687.

Nylund, KL, Asparouhov, T, Muthén, B (2007). Deciding on the number of classes in latent class analysis and growth mixture modeling. A Monte Carlo simulation study. Structural Equation Modeling: A Multidisciplinary Journal 14, 535–569.

Ram, N, Grimm, KJ (2009). Growth mixture modeling: a method for identifying differences in longitudinal change among unobserved groups. International Journal of Behavioral Development 33, 565–576.

Raykov, T, Tomer, A, Nesselroade, JR (1991). Reporting structural equation modeling results in psychology and aging: some proposed guidelines. Psychology and Aging 6, 499–503.

Reynolds, CA, Raine, A, Mellingen, K, Venables, PH, Mednick, SA (2000). Three-factor model of schizotypal personality: invariance across culture, gender, religious affiliation, family adversity, and psychopathology. Schizophrenia Bulletin 26, 603–618.

Schwarz, G (1978). Estimating the dimension of a model. Annals of Statistics 6, 461–464.

Sclove, LS (1987). Application of model-selection criteria to some problems in multivariate analysis. Psychometrika 52, 333–343.

Shevlin, M, Murphy, J, Dorahy, MJ, Adamson, G (2007). The distribution of positive psychosis-like symptoms in the population: a latent class analysis of the National Comorbidity Survey. Schizophrenia Research 89, 101–109.

Sterba, SK, Bauer, DJ (2010). Matching method with theory in person-oriented developmental psychopathology research. Development and Psychopathology 22, 239–254.

Strauss, JS (1969). Hallucinations and delusions as points on continua function. Rating scale evidence. Archives of General Psychiatry 21, 581–586.

Streiner, DL (2005). Finding our way: an introduction to path analysis. Canadian Journal of Psychiatry 50, 115–122.

Streiner, DL (2006). Building a better model: an introduction to structural equation modeling. Canadian Journal of Psychiatry 51, 317–324.

Tandon, R, Keshavan, MS, Nasrallah, HA (2008). Schizophrenia, “Just the Facts”: what we know in 2008. Schizophrenia Research 100, 4–19.

Vermunt, JK, Magidson, J (2000). Latent class models for classification. Computational Statistics 41, 531–537.

Vermunt, JK, Magidson, J (2002). Latent class cluster analysis. In Applied Latent Class Analysis (ed. Hagenaars, J. A. and McCutcheon, A. L.), pp. 89–106. Cambridge University Press: Cambridge, UK.

Wolf, EJ, Harrington, KM, Clark, SL, Miller, MW (2013). Sample size requirements for structural equation models: an evaluation of power, bias, and solution propriety. Educational and Psychological Measurement 73, 913–934.

Wright, AGC, Hallquist, MN (2014). Mixture modeling methods for the assessment of normal and abnormal personality part II: longitudinal models. Journal of Personality Assessment 96, 269–282.

Wright, AGC, Krueger, RF, Hobbs, MJ, Markon, KE, Eaton, NR, Slade, T (2013). The structure of psychopathology: toward an expanded quantitative empirical model. Journal of Abnormal Psychology 122, 281–294.

Wuthrich, VM, Bates, TC (2006). Confirmatory factor analysis of the three-factor structure of the Schizotypal Personality Questionnaire and Chapman schizotypy scales. Journal of Personality Assessment 87, 292–304.

Yung, YF (1997). Finite mixtures in confirmatory factor analysis models. Psychometrika 62, 297–330.
