Introduction
In contrast with emerging dimensional psychiatric classification systems (e.g. Research Domain Criteria; Cuthbert, 2014), the latest revisions of the predominant psychiatric diagnostic systems [e.g. Diagnostic and Statistical Manual of Mental Disorders, fifth edition (DSM-5); American Psychiatric Association, 2013] continue to operationalize mood disorders as categorical phenomena in at least two ways. First, clinically significant depressive or manic experiences are operationalized as discrete episodes, temporally separated by well periods and phenomenologically separated from subsyndromal symptomatology. Second, these discrete episodes form the building blocks of the prototypical mood disorder diagnoses (e.g. bipolar I, bipolar II, and unipolar depressive disorders), which are themselves categorical phenomena; the presence of a mood episode is necessary and sufficient (barring the presence of exclusionary criteria) to diagnose a mood disorder, whereas the absence of a mood episode precludes diagnosis of a mood disorder. The sharp temporal and phenomenological boundaries that define mood episodes and mood disorder diagnoses were originally created by a committee of experts for DSM-III to bolster diagnostic reliability and have changed little since that time, despite mounting evidence that the boundaries are misplaced (e.g. Judd & Akiskal, 2003; Angst et al. 2012) or that the practice of forming boundaries itself is inappropriate given a demonstrated lack of discontinuity between subsyndromal and syndromal mood experiences (e.g. Ruscio & Ruscio, 2000; Ruscio et al. 2004; Prisciandaro & Roberts, 2005, 2011).
Evidence supporting the continuity of subsyndromal and syndromal mood phenomena has not yet been incorporated into the predominant nosological systems, perhaps, in part, because the evidence has been inconsistent and/or incomplete. First, most investigations have focused on depressive symptoms; only two studies have investigated manic symptoms (Ahmed et al. 2011; Prisciandaro & Roberts, 2011). Second, nearly all investigations have focused on the boundary between disorder and normal functioning by evaluating large population-based samples containing both individuals with and without diagnosable mood disorders. Another important boundary, between ‘sick’ and ‘well’ periods in patients already diagnosed with mood disorders, has been virtually neglected in the literature. There is increasing awareness that characterization of mood disorders as having an episodic course, with clearly defined inter-episodic well periods, is misleading because most patients experience impairing subsyndromal symptoms between episodes (Altshuler et al. 2002; Judd & Akiskal, 2003; Bonnin et al. 2012). Nevertheless, the validity of existing mood episode thresholds has not been directly tested in patients with mood disorders. Third, although the preponderance of evidence supports the continuity of depressive phenomenology with normal human functioning (e.g. Ruscio & Ruscio, 2000; Ruscio et al. 2004; Prisciandaro & Roberts, 2005), there have been notable exceptions (Solomon et al. 2006; Ruscio et al. 2007). Furthermore, out of the two studies that examined the continuity of manic phenomenology with normal human functioning, one found support for discontinuity (Ahmed et al. 2011) while the other did not (Prisciandaro & Roberts, 2011).
The above investigations have primarily relied on a statistical methodology called taxometrics, which evaluates whether a given disorder is continuous or discontinuous with normal functioning based on the observed pattern of associations among its symptoms (Meehl, 1995). Although knowledge of the boundary conditions of taxometric methods, and the resulting development of implementation standards, has been increasing in recent years (Ruscio et al. 2006), implementations of taxometric methods have historically been methodologically diverse (i.e. involving diverse taxometric procedures and/or consistency tests, or the same procedures/tests implemented in diverse ways), making cross-study comparisons challenging. This methodological heterogeneity may be in part responsible for the relative lack of consistency in findings across studies. One potential solution to this issue would be to apply multiple appropriate statistical methodologies to the same data; this approach was taken by Prisciandaro & Roberts (2011) in an investigation of the continuity/discontinuity of mania. Assuming convergence across multiple statistical methodologies, the conclusions drawn from such a study are arguably robust to the methodological idiosyncrasies of any single statistical approach. In the present study, taxometric analyses were supplemented with two additional analytic approaches. The first additional approach, information theoretic latent distribution modeling (ITLDM), was used to compare the relative fit of a variety of discrete and continuous latent variable models. As implemented in the present study, ITLDM has a goal similar to that of taxometrics (i.e. to test whether the latent distribution of a disorder exhibits discontinuities), but it accomplishes this goal within a well-developed, formal statistical framework, and includes comparison of a wider variety of latent variable models. The second supplementary approach, semi-parametric structural equation mixture modeling (SEMM), was used in the present study to examine a different type of continuity/discontinuity relative to taxometrics and ITLDM: continuity/discontinuity between manic and depressive symptoms and their sequelae (Flett et al. 1997). Specifically, SEMM was used to investigate potential non-linearity between depressive and manic symptomatology and general functioning (Bauer, 2005). Although research has historically focused on the continuity/discontinuity of the latent distribution of psychopathological constructs (using methods like taxometrics), many have argued that the continuity/discontinuity between psychopathological constructs and general functioning is equally important because discontinuity (e.g. an inflection point) between manic or depressive symptomatology and general functioning could form the basis of a valid categorical diagnostic threshold (Kessler, 2002).
The present study was designed to address gaps in the existing literature regarding the continuity/discontinuity of mood disorder phenomenology. Using data from the largest treatment study ever conducted in bipolar disorder, the Systematic Treatment Enhancement Program for Bipolar Disorder (STEP-BD), the present study examined the validity of manic and depressive episode symptom thresholds in patients with bipolar disorder using three statistical methodologies: taxometrics, ITLDM and SEMM.
Method
Participants and measures
STEP-BD was a multicenter study that prospectively evaluated clinical outcomes in individuals with bipolar disorder (Sachs et al. 2003). All patients meeting DSM-IV criteria for bipolar disorder at participating STEP-BD centers were eligible to participate in the study. Potential participants provided written informed consent, and 15- to 17-year-old participants additionally provided parental assent, prior to completing an extensive baseline assessment interview. Bipolar diagnoses, retrospective course and the Global Assessment of Functioning (GAF; American Psychiatric Association, 2000) were assessed via the Affective Disorders Evaluation [which included the mood/psychosis modules of the Structured Clinical Interview for DSM-IV (First, 1998)]; co-morbid Axis I diagnoses were assessed using the Mini-International Neuropsychiatric Interview (Sheehan et al. 1998; Sachs et al. 2003). Current (i.e. past-week) manic and depressive symptoms were assessed using the Young Mania Rating Scale (YMRS; Young et al. 1978) and the Montgomery–Åsberg Depression Rating Scale (MADRS; Montgomery & Åsberg, 1979), respectively. General functioning was assessed via the clinician/rater-administered Range of Impaired Functioning Tool (LIFE-RIFT; Leon et al. 1999). Quality of life was assessed via the Quality of Life Enjoyment and Satisfaction (Short Form) questionnaire (Q-LES-QSF; Endicott et al. 1993). Demographic information (e.g. age, sex, employment status) was collected via a study-specific questionnaire. A maximum sample of 3721 individuals was available for the present analysis; however, effective sample sizes varied slightly (i.e. <1.5%) across analyses (e.g. taxometric programs employ listwise deletion of missing data).
Analytic strategy
Taxometric analyses were conducted using Ruscio's taxometric programs (Ruscio et al. 2006), code version 2012–01–09, for the R platform (R Core Team, 2011). ITLDM and SEMM were conducted using Mplus version 6.1 (Muthén & Muthén, 2011). Each of these three methodologies was applied twice, once to the 10 ordinal items of the MADRS and again to the 11 ordinal items of the YMRS.
Taxometrics
Taxometric methods evaluate whether the pattern of covariances among observed variables conforms to expectations of a one-population (dimensional) or a two-population (categorical) structural model (Meehl, 1995). Maximum covariance (MAXCOV; Meehl & Yonce, 1996) was conducted on all possible input/output indicator (i.e. symptom) configurations. Summed input indicators were used, and subsamples were created by dividing the sample into 25 overlapping windows with 90% overlap, to provide more interpretable results (Walters & Ruscio, 2010). Mean above minus below a cut (MAMBAC; Meehl & Yonce, 1994) was also conducted with summed input indicators. The first and last cuts were made 25 cases from each input indicator's distributional tails (Ruscio et al. 2006). To improve interpretability, the number of cuts for MAMBAC was held equal to the number of overlapping windows for MAXCOV. MAMBAC was repeated until all indicators had served as output. For both MAXCOV and MAMBAC, 10 internal replications were implemented per run to reduce the impact of tied cases on results (Ruscio et al. 2006). Also, for both procedures, using a larger number of windows/cuts (i.e. 500) was explored; results are provided for 25 windows/cuts only because increasing the number of windows/cuts did not alter the results. Finally, the comparison of 100 sets of simulated taxonic (i.e. categorical) and dimensional data with observed data was used to aid in interpretation (Ruscio & Kaczetow, 2008). Taxonic data were simulated using the base rate classification method (Ruscio, 2009); individuals with the highest total scores were assigned to the taxon group based on the mean base rate estimate from taxometric analyses of the observed data. Simulated data were submitted to the same analyses as the observed data, and results from the observed data were compared with results from the simulated data both visually and using fit indices (i.e. comparison curve fit index; CCFI). CCFI values over 0.60 support taxonic structure and values under 0.40 support dimensional structure with at least 90% confidence (Ruscio et al. 2010); values between 0.40 and 0.60 are interpreted as ambiguous.
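The MAMBAC and CCFI logic described above can be sketched in a few lines of Python. This is an illustration only and not the Ruscio R programs used in the study: the function names `mambac_curve`, `ccfi` and `interpret_ccfi` are hypothetical, internal replications and the simulation of comparison data are omitted, and the CCFI is assumed to take its usual form, the root mean square residual (RMSR) against the averaged dimensional curve divided by the sum of the RMSRs against the averaged dimensional and taxonic curves.

```python
import numpy as np


def mambac_curve(input_scores, output_scores, n_cuts=25, tail=25):
    """Mean of the output indicator above each cut minus the mean below it.

    Cases are sorted on the (summed) input indicator; the first and last cuts
    sit `tail` cases in from either end of the sorted sample.
    """
    order = np.argsort(input_scores, kind="stable")
    sorted_output = np.asarray(output_scores, dtype=float)[order]
    n = sorted_output.size
    cuts = np.linspace(tail, n - tail, n_cuts).astype(int)
    return np.array(
        [sorted_output[c:].mean() - sorted_output[:c].mean() for c in cuts]
    )


def ccfi(observed, simulated_dimensional, simulated_taxonic):
    """Comparison curve fit index (assumed form: RMSR_dim / (RMSR_dim + RMSR_tax))."""
    observed = np.asarray(observed, dtype=float)
    rmsr_dim = np.sqrt(np.mean((observed - np.asarray(simulated_dimensional)) ** 2))
    rmsr_tax = np.sqrt(np.mean((observed - np.asarray(simulated_taxonic)) ** 2))
    return rmsr_dim / (rmsr_dim + rmsr_tax)


def interpret_ccfi(value, lower=0.40, upper=0.60):
    """Apply the interpretive thresholds stated in the text."""
    if value > upper:
        return "taxonic"
    if value < lower:
        return "dimensional"
    return "ambiguous"
```

Under this definition, a CCFI near 0 means the observed curve sits close to the simulated dimensional curve, and a CCFI near 1 means it sits close to the simulated taxonic curve.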
ITLDM
ITLDM (Markon & Krueger, 2006) consists of estimating and comparing the parsimony-adjusted fit of various latent variable models using information criteria (e.g. the Bayesian information criterion; BIC; Schwarz, 1978). Following the example in Markon & Krueger (2006), models were derived from three special cases of a generalized latent variable model: (1) nominal, qualitative, latent class models (LCMs) with varying numbers (i.e. 1–8) of classes; (2) discrete metrical, integer, latent trait models (LTMs) with varying numbers (i.e. 1–8) of latent values (distributed according to a rescaled binomial distribution); and (3) a normally distributed, continuous, metrical LTM. All models were estimated using maximum likelihood with robust standard errors (MLR). Information regarding the clustering of the data by study site was included to properly model STEP-BD's complex sample design. Comparisons between LTMs with few versus many latent values evaluated the relative fit of discrete metrical and continuous models, respectively. Comparisons between LTMs and LCMs with the same number of values/classes evaluated whether the target construct consisted of ordered or unordered categories. Because improperly specifying a multidimensional target construct as unidimensional can produce spurious evidence for a discrete latent structure (Markon & Krueger, 2006), prior to fitting LTMs and LCMs to the YMRS and MADRS data, exploratory factor analyses (EFAs) were conducted separately for each measure to ensure that each scale was essentially unidimensional. EFAs were estimated using a weighted least squares mean and variance adjusted (WLSMV; Muthén, 1989) estimator to properly account for the ordinal nature of the item data. Root mean square error of approximation (RMSEA) ≤0.06 was taken as evidence for essential unidimensionality (Hu & Bentler, 1999).
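The model comparison step can be illustrated with a short sketch of the BIC, computed as k ln(n) − 2 ln(L), where k is the number of parameters, n the sample size and ln(L) the log likelihood; lower values indicate better parsimony-adjusted fit. The helper names `bic` and `best_model` and the model labels and values in the usage note are hypothetical placeholders, not results from the study.

```python
import math


def bic(log_likelihood, k, n):
    """Bayesian information criterion: k * ln(n) - 2 * ln(L); lower is better."""
    return k * math.log(n) - 2.0 * log_likelihood


def best_model(models, n):
    """Return the label of the model with the lowest BIC.

    `models` maps a label to a (log_likelihood, number_of_parameters) pair.
    """
    return min(models, key=lambda label: bic(models[label][0], models[label][1], n))
```

For example, `best_model({"model A": (-1000.0, 10), "model B": (-990.0, 30)}, n=3721)` selects "model A" for these made-up inputs, because the 20 extra parameters in "model B" cost more under the BIC penalty than the gain in log likelihood is worth.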
SEMM
Semi-parametric structural equation mixture models, where classes are used to represent regions of the association between two latent variables, have recently been developed (Bauer, 2005) and implemented (Markon, 2010) to examine non-linear relationships among latent variables. In this application of SEMM, non-linearities between latent variables are modeled via a mixture of multivariate normal distributions; while each latent variable's measurement model is constrained across classes, the association between the latent variables is free to vary across classes. Because distributional non-normalities within the latent variables can spuriously suggest non-linearities in the association between the latent variables, non-linear models, in which both the distributions of the latent variables and the association between the variables are free to vary across classes, are compared with ‘slope-fixed’ models, in which only the distributions of the latent variables are free to vary across classes, and ‘distribution-fixed’ models, in which the distribution of one latent variable and the association between the variables are free to vary across classes (Bauer, 2005; Markon, 2010). Within each type of model, models with differing numbers of classes are estimated using MLR and compared using information-theoretic criteria (e.g. BIC, sample size-adjusted BIC; Sclove, 1987) as well as the Vuong–Lo–Mendell–Rubin (VLMR) likelihood-ratio test for nested (i.e. K v. K − 1 class) models. The purpose of SEMM in the present study was to explore potential non-linear associations between depressive or manic symptoms and general functioning. The GAF (American Psychiatric Association, 2000), current employment (i.e. whether the patient was currently employed, yes/no) and total scores on the LIFE-RIFT and Q-LES-QSF were used as indicators of a general functioning latent variable. Appropriateness of the general functioning latent variable was evaluated via confirmatory factor analysis using the WLSMV estimator (Muthén, 1989); model fit was evaluated according to published guidelines (Hu & Bentler, 1999). Given adequate fit of the general functioning latent variable, the full complement of SEMM model comparisons was executed twice, once for the MADRS and general functioning and again for the YMRS and general functioning. Information regarding the clustering of the data by study site was included to properly model STEP-BD's complex sample design.
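To make concrete how a mixture of normal components can represent a non-linear association, the following sketch computes the aggregate regression function implied by a set of within-class linear regressions: the expected outcome at a given predictor value is the average of the class-specific lines, weighted by the posterior probability of each class at that value. This is a simplified illustration on the latent scale, not the Mplus models fitted in the study; the function names and all parameter values passed to them are hypothetical.

```python
import numpy as np


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return np.exp(-0.5 * z**2) / (sd * np.sqrt(2.0 * np.pi))


def implied_regression(x, weights, means, sds, intercepts, slopes):
    """Expected outcome at each x under a mixture of within-class linear regressions.

    weights, means, sds : class proportions and the predictor's mean and SD per class
    intercepts, slopes  : within-class regression of the outcome on the predictor
    """
    x = np.asarray(x, dtype=float)[:, None]          # shape (n_points, 1)
    density = np.asarray(weights) * normal_pdf(x, np.asarray(means), np.asarray(sds))
    posterior = density / density.sum(axis=1, keepdims=True)  # P(class | x)
    class_lines = np.asarray(intercepts) + np.asarray(slopes) * x
    return (posterior * class_lines).sum(axis=1)
```

When every class shares the same intercept and slope, the posterior weights cancel and the function reduces to a single straight line; when slopes differ across classes, the curve bends as the predictor moves from a region dominated by one class to a region dominated by another.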
Ethical Standards
All procedures contributing to this work comply with the ethical standards of the relevant national and institutional committees on human experimentation and with the Helsinki Declaration of 1975, as revised in 2008.
Results
Taxometrics
Averaged MAXCOV results are presented in the top panels (MADRS: top-left, YMRS: top-right), and averaged MAMBAC results are presented in the bottom panels (MADRS: bottom-left, YMRS: bottom-right), of Fig. 1 alongside averaged results from simulated categorical and dimensional comparison data. For both the MADRS and YMRS, across both MAXCOV and MAMBAC procedures, results from the research data were more visually consistent with results from simulated dimensional data than with results from simulated categorical data. CCFI values confirmed this superior fit (MAXCOV: MADRS CCFI = 0.30, YMRS CCFI = 0.22; MAMBAC: MADRS CCFI = 0.34, YMRS CCFI = 0.25). As such, a dimensional, continuous interpretation of depressive and manic symptoms was consistently supported by taxometric analyses.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20160921045306012-0851:S0033291715000513:S0033291715000513_fig1g.gif?pub-status=live)
Fig. 1. Averaged maximum covariance (MAXCOV; top) and mean above minus below a cut (MAMBAC; bottom) results for the Montgomery–Åsberg Depression Rating Scale (MADRS) (left) and the Young Mania Rating Scale (YMRS) (right). Within each panel of results, results from research data are overlaid on results from simulated categorical (left) and dimensional (right) data.
ITLDM
EFAs suggested that both the YMRS (RMSEA = 0.06) and MADRS (RMSEA < 0.06) were sufficiently unidimensional. For ITLDM results, see Tables 1 (MADRS) and 2 (YMRS). For both the MADRS and YMRS, the best-fitting model was a ‘discrete metrical’ model with eight latent values; this was the highest number of discrete latent values that we tested. The superior fit of the eight-value discrete metrical LTMs relative to the normally distributed continuous LTMs suggests that both depressive and manic symptoms have continuous, but potentially non-normally distributed, latent structures (Markon & Krueger, 2006).
Table 1. Fit criteria for discrete and continuous models of depressive symptoms
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20160921045306012-0851:S0033291715000513:S0033291715000513_tab1.gif?pub-status=live)
k, Number of parameters; ln(L), log likelihood; BIC, Bayesian information criterion.
a Fit criteria for the best-fitting model.
Table 2. Fit criteria for discrete and continuous models of manic symptoms
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20160921045306012-0851:S0033291715000513:S0033291715000513_tab2.gif?pub-status=live)
k, Number of parameters; ln(L), log likelihood; BIC, Bayesian information criterion.
a Fit criteria for the best-fitting model.
SEMM
The general functioning latent variable was clearly interpretable, fit the data well (comparative fit index = 0.98; Tucker–Lewis index = 0.95; RMSEA = 0.04), and all indicator loadings (i.e. employment = 0.35; GAF = 0.61; LIFE-RIFT = −0.79; Q-LES-QSF = 0.68) were statistically significant. Although non-linear, slope-fixed, and distribution-fixed models containing between one and five latent classes were estimated, only results from one- and two-class models are presented because most (i.e. 78%) models with >2 classes were un-estimable and/or contained at least one class with ≤5% of the total sample; the remaining admissible >2-class models uniformly fit the data more poorly than their one- or two-class counterparts. For results from all one- and two-class SEMM models, see Table 3. Although BIC values supported two-class non-linear and slope-fixed, but not distribution-fixed, models relative to their one-class counterparts, VLMR likelihood-ratio tests suggested that, across all non-linear, slope-fixed and distribution-fixed models, two-class models did not provide significantly better fit to the data relative to their one-class counterparts. In single-class models, the estimated association between the MADRS and general functioning was large (i.e. r = −0.88, p < 0.001), whereas the estimated association between the YMRS and general functioning was relatively small (i.e. r = −0.25, p < 0.001); this difference was probably determined in part by the greater variability of MADRS versus YMRS scores in the STEP-BD sample.
Table 3. Fit criteria for linear and non-linear associations between mood symptoms (YMRS, MADRS) and general functioning
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20160921045306012-0851:S0033291715000513:S0033291715000513_tab3.gif?pub-status=live)
YMRS, Young Mania Rating Scale; MADRS, Montgomery–Åsberg Depression Rating Scale; k, number of parameters; ln(L), log likelihood; BIC, Bayesian information criterion; BICa, sample size-adjusted BIC; VLMR LRT, Vuong–Lo–Mendell–Rubin likelihood ratio test.
Discussion
The present study provided consistent support for the dimensional conceptualization of depressive and manic symptoms in a large out-patient sample of individuals with bipolar disorder using three statistical frameworks (i.e. taxometrics, ITLDM, SEMM). These results, along with studies demonstrating that most bipolar patients experience impairing subsyndromal symptoms between episodes (Altshuler et al. 2002; Judd & Akiskal, 2003; Bonnin et al. 2012), argue against the DSM conceptualization of bipolar symptom phenomena as discrete episodes and are consistent with emerging dimensional psychiatric classification systems (e.g. Research Domain Criteria; Cuthbert, 2014).
The results from the present study have important research and treatment implications. They suggest that delineating mood episode status in bipolar patients may not provide additional informational value relative to dimensional quantifications of patients’ depressive and manic symptoms. As such, they suggest that using mood episode status in place of quantitative symptom severity measures in research and clinical decision making necessarily involves a loss of meaningful information and statistical power (Cohen, 1983). The results from the present study may also argue for a graded continuum of care for bipolar symptom management, whereby different levels of bipolar symptomatology are met with commensurate levels of clinical care (Basco & Rush, 2005). For example, elevated, but relatively low, levels of bipolar symptoms may call for more careful patient and doctor monitoring; elevated, moderate levels of bipolar symptoms may be met with plans for behavioral change (e.g. behavioral activation for depressive symptoms, behavioral deactivation for manic symptoms) and/or consideration of changes in medications; severe symptom levels may be treated with abortive medications or hospitalization. Further research would be needed to substantiate such a graded continuum of care, that is, to establish empirically what levels of bipolar symptom severity should be met with particular interventions (e.g. at what level of manic symptom severity does a bipolar patient require hospitalization because they present an unacceptable risk to self or others?). Developing and standardizing such a continuum based on empirical evidence could provide a valuable tool to clinicians and their patients.
Although the present study represents an important initial step in directly evaluating the validity of mood episode thresholds in bipolar patients, there are a number of potential future directions for this research. First, the present study demonstrated that relationships between manic and depressive symptoms and general functioning were linear; identifying non-linear associations between symptomatology and general functioning has been proposed as a method for generating meaningful, practical diagnostic thresholds (Kessler, 2002). However, variables other than general functioning, for example, hospitalization or treatment response, could also be considered for the generation of practical diagnostic cut-points. Second, the present study only investigated out-patients with bipolar disorder. As a result, manic symptom variability, and perhaps to a lesser extent depressive symptom variability, was necessarily restricted. Future research should consider data from both out-patient and in-patient sources. Third, the present study evaluated the validity of episode thresholds for manic and depressive symptoms separately, preventing firm conclusions about mixed episode thresholds. Past factor analytic research has demonstrated depressive symptom factors in individuals with bipolar disorder experiencing manic or mixed episodes (Cassidy et al. 1998). Unfortunately, evaluating depressive and manic symptom items simultaneously was not possible due to the complexity of the resulting statistical models. Methods for simplifying the statistical models could include investigating subsets of depressive and manic symptom items, evaluating items with a larger number of scale points (e.g. 10) per item, or analysing total and/or subset scores in place of individual item responses. Fourth, the present study did not explicitly model information regarding the duration of manic or depressive symptoms in evaluating episode thresholds; both manic and depressive symptoms were assessed using a past-week recall period. In contrast, mood episodes are defined by both symptom severity and duration (2 weeks for a depressive episode and 4 or 7 days for a hypomanic or manic episode, respectively) thresholds in the DSM-5 (American Psychiatric Association, 2013). Information regarding the duration of manic and depressive symptoms could be incorporated into future studies. Finally, the statistical methods employed in the present study are particularly suited to investigating cross-sectional phenomena; methods for directly evaluating the continuity versus discontinuity of longitudinal phenomena are not well formulated. Although evaluating the continuity of cross-sectional mood phenomena is important in its own right, it does not adequately capture the phenomenon of mood cycling that is central to bipolar disorder. Further research is needed to develop latent longitudinal statistical methods that can distinguish between dimensional and categorical longitudinal phenomena. It is unclear whether existing methods [e.g. hidden Markov modeling (Langeheine & Van de Pol, 2002), growth mixture modeling (Muthén & Muthén, 2000)] are capable of validly making this distinction.
In sum, the present study provided consistent support for the dimensional conceptualization of depressive and manic symptoms in individuals with bipolar disorder using a variety of statistical methodologies. These results argue against the validity of DSM mood episode thresholds and argue for a graded continuum of care for bipolar symptom management.
Acknowledgements
STEP-BD Data Use Certification for Public Release Dataset version 4.1 was provided on 27 April 2011 (to B.K.T., recipient principal investigator). Data used in the preparation of this article were obtained from the limited access datasets distributed from the National Institutes of Health (NIH)-supported ‘Systematic Treatment Enhancement Program for Bipolar Disorder’ (STEP-BD). This is a multisite, clinical trial studying the current treatments for bipolar disorder, including medications and psychosocial therapies. The study was supported by National Institute for Mental Health contract no. N01MH80001 to Massachusetts General Hospital and the University of Pittsburgh. The ClinicalTrials.gov identifier is NCT00012558. This paper reflects the views of the authors and may not reflect the opinions or views of the STEP-BD Study Investigators or the NIH. J.J.P. is funded by K23 AA020842.
Declaration of Interest
None.