Introduction
Major depression, which is characterised by the core symptoms of depressed mood and anhedonia (World Health Organization, 2012; American Psychiatric Association, 2013), is a leading contributor to the global burden of disease (Murray & Lopez, 1996; Bromet et al. 2011; Ferrari et al. 2013; Kessler & Bromet, 2013; Cuijpers et al. 2014). For years, the clinical presentation of depression has been debated (Baumeister & Parker, 2012; van Loo et al. 2012). Specifically, studies have sought to identify symptom-based dimensions and subtypes via statistical analysis of symptom co-occurrence using factor analysis, principal component analysis and latent class analysis (Chen et al. 2000; Aggen et al. 2005; Shafer, 2006; Carragher et al. 2009; Aggen et al. 2011; Cole et al. 2011; Mezuk & Kendler, 2012; Hybels et al. 2013; Buhler et al. 2014; Li et al. 2014a, b; Rodgers et al. 2014; Fried et al. 2016). Tightly correlated symptom sets are important as they might constitute dimensions or subtypes that imply different aetiologies and/or treatment responses. However, as summarised in a recent systematic review, these studies have failed to generate replicable results (van Loo et al. 2012).
Two methodological issues have not yet been considered, however. First, previous studies failed to disentangle two distinct sources of correlation between any two symptoms. Such correlations may be: (1) due to differences in overall severity (i.e., individuals with more severe depression score higher for all symptoms than individuals with less severe depression; hence, symptoms A and B are correlated); or (2) due to a specific profile of symptom correlations (e.g., individuals scoring high for symptom A could typically score high for symptom B too, but not necessarily for symptom C, which is more closely linked to symptom D). Therefore, it may be more appropriate to study symptom correlations adjusted for overall depression severity. If a structure underlying the symptoms exists, it should be revealed more clearly in this way.
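The distinction between the two sources of correlation can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the authors' analysis (which used R and latent-variable models): the symptom scores are continuous, severity is treated as directly observed, and the loadings (0.7) and the names `severity`, `shared_cd`, `pearson` and `partial_corr` are illustrative choices that do not come from the article.

```python
import math
import random


def pearson(x, y):
    """Plain Pearson correlation of two equally long lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """Correlation of x and y after removing the linear effect of z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


rng = random.Random(1)
n = 20000
severity = [rng.gauss(0, 1) for _ in range(n)]   # overall severity (assumed observed)
shared_cd = [rng.gauss(0, 1) for _ in range(n)]  # specific link between C and D only

# Symptoms A and B depend on severity only; C and D share an extra component.
a = [0.7 * s + rng.gauss(0, 1) for s in severity]
b = [0.7 * s + rng.gauss(0, 1) for s in severity]
c = [0.7 * s + 0.7 * u + rng.gauss(0, 1) for s, u in zip(severity, shared_cd)]
d = [0.7 * s + 0.7 * u + rng.gauss(0, 1) for s, u in zip(severity, shared_cd)]

raw_ab, adj_ab = pearson(a, b), partial_corr(a, b, severity)
raw_cd, adj_cd = pearson(c, d), partial_corr(c, d, severity)

print(f"A-B raw {raw_ab:.2f}, severity-adjusted {adj_ab:.2f}")
print(f"C-D raw {raw_cd:.2f}, severity-adjusted {adj_cd:.2f}")
```

In this toy setting the A–B correlation should largely vanish after adjustment, whereas the C–D correlation should remain clearly positive, which is the kind of specific profile that severity-adjusted correlations are meant to expose.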
Second, most studies were conducted on samples of depressed individuals. Studies that examined the dimensions of depression in general population samples, however, consistently revealed one single dimension of depression severity (Muthén, 1989; Aggen et al. 2005; Aggen et al. 2011; Cole et al. 2011; Mezuk & Kendler, 2012; Familiar et al. 2015). Thus, in the general population, depression was found to be a uni-dimensional construct. Apparently, it makes a difference whether one studies the general population or depressed individuals only, but this issue has remained unaddressed.
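Why the choice of sample might matter can be sketched with a simulated one-factor population. The code below is a hedged illustration only: it assumes nine continuous symptom scores driven by a single severity factor (loading 0.8, an arbitrary value) and defines the "depressed" subsample as the top 6% of sum scores, a cut-off chosen merely to resemble the prevalence range reported later in the article.

```python
import math
import random


def pearson(x, y):
    """Plain Pearson correlation of two equally long lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mean_pairwise_corr(rows):
    """Average correlation over all symptom pairs (36 pairs for 9 symptoms)."""
    k = len(rows[0])
    cols = [[r[j] for r in rows] for j in range(k)]
    rs = [pearson(cols[i], cols[j]) for i in range(k) for j in range(i + 1, k)]
    return sum(rs) / len(rs)


rng = random.Random(7)
n_people, n_symptoms = 10000, 9

# One-factor population: every symptom reflects the same severity factor.
population = []
for _ in range(n_people):
    s = rng.gauss(0, 1)
    population.append([0.8 * s + 0.6 * rng.gauss(0, 1) for _ in range(n_symptoms)])

# Hypothetical "depressed" subsample: the 6% with the highest sum scores.
totals = sorted(sum(row) for row in population)
cutoff = totals[int(0.94 * n_people)]
depressed = [row for row in population if sum(row) >= cutoff]

r_general = mean_pairwise_corr(population)
r_depressed = mean_pairwise_corr(depressed)
print(f"general population: {r_general:.2f}, depressed subsample: {r_depressed:.2f}")
```

Selecting cases on overall severity removes most of the variance that produced the correlations in the first place, so the subsample correlations should be much weaker even though the data-generating model is strictly uni-dimensional.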
The present study's main objective was to examine the dimensional structure of the nine symptoms of depression listed in the DSM-IV (a) when adjusting symptom correlations for overall depression severity; and (b) in general-population samples v. subsamples of currently depressed individuals. We adopted a dimensional approach, since evidence increasingly suggests that many psychiatric syndromes, including depression, are continuous and hence dimensional rather than categorical (Slade & Andrews, 2005; Goldberg et al. 2009; Prisciandaro & Roberts, 2009; Markon et al. 2011; Haslam et al. 2012; Eaton et al. 2013).
Materials and methods
Study design
We used: (a) longitudinal data from the Cohort Study on Substance Use Risk Factors (C-SURF); and (b) cross-sectional data from the U.S. National Health and Nutrition Examination Survey (NHANES). In total, we considered four samples: C-SURF baseline, C-SURF follow-up, NHANES men and NHANES women. Comparing C-SURF baseline and follow-up data permitted us to examine whether results were replicable across two time points in the same sample. Comparing the C-SURF and NHANES data allowed us to examine whether results were replicable in two different populations.
Each of the four samples was analysed twice: once in the full (general-population) sample, and once only in the subsample of participants within a current mild-to-severe depressive episode, generating eight analytical samples in total.
Participants
C-SURF is a large cohort study examining young men in Switzerland, for which details on sampling and non-response bias have been published elsewhere (Studer et al. 2013a, b). It was designed to be representative of young non-institutionalised Swiss men. The study protocol was approved by the Ethics Committee for Clinical Research at Lausanne University Medical School (protocol number 15/07), and all subjects consented to participate.
A total of 5990 men completed the baseline survey between September 2010 and March 2012. Of these, 107 (1.8%) were excluded for missing data on the depression items. Of the remaining 5883 men, 5155 (87.6%) answered all necessary items of the follow-up survey performed between January 2012 and April 2013. The mean time elapsed between baseline and follow-up was 1.3 years (standard deviation 0.2).
NHANES is a continuous cross-sectional survey released in 2-year cycles (Centers for Disease Control and Prevention (CDC). National Center for Health Statistics (NCHS)). It was designed to be representative of the non-institutionalised U.S. civilian population. NHANES study protocols were approved by the National Center for Health Statistics Research Ethics Review Board, and all participants consented.
We included data from NHANES cycles 2005–2012 for men and women aged 18 to 28 years, an age range chosen to resemble the C-SURF cohort. Men and women were analysed separately. Of the total 2371 men and 2542 women, 197 (8.3%) and 298 (11.7%), respectively, were excluded for missing data on the depression items.
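The sample flow reported for both cohorts can be re-derived with a few lines of arithmetic. The sketch below uses only the counts given in the text; the variable names are illustrative, and the follow-up percentage is computed on the baseline sample remaining after exclusions, which is the reading that reproduces the reported 87.6%.

```python
def pct(part, whole):
    """Percentage rounded to one decimal, as reported in the text."""
    return round(100 * part / whole, 1)


# C-SURF
csurf_baseline = 5990
csurf_excluded = 107
csurf_analysed = csurf_baseline - csurf_excluded
csurf_followup = 5155

print("C-SURF excluded:", pct(csurf_excluded, csurf_baseline), "%")
print("C-SURF analysed at baseline:", csurf_analysed)
print("C-SURF follow-up retention:", pct(csurf_followup, csurf_analysed), "%")

# NHANES: (total, excluded for missing depression items)
nhanes = {"men": (2371, 197), "women": (2542, 298)}
nhanes_analysed = {}
for group, (total, excluded) in nhanes.items():
    nhanes_analysed[group] = total - excluded
    print(f"NHANES {group}: excluded {pct(excluded, total)} %, "
          f"analysed {nhanes_analysed[group]}")
```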
Measures
C-SURF: Self-reported depressive symptoms were assessed via the Major Depression Inventory (WHO-MDI) (Bech et al. 2001; Olsen et al. 2003). This validated measure covers DSM-IV and ICD-10 depression symptoms over the past 14 days, using 12 items with six-point answer scales ranging from ‘never’ (0) to ‘all the time’ (5). Items were aggregated into the nine DSM-IV symptoms, as proposed previously (Bech et al. 2001) (Table 1). Subjects were classified as having ‘no’, ‘mild’, ‘moderate’ or ‘severe’ depression based on the MDI summation score (Olsen et al. 2003). For correlation and factor analyses, symptoms were dichotomised into present/absent, as per ICD-10 definitions (Bech et al. 2001) (Table 1).
Table 1. Symptoms of depression in the ICD-10-based WHO-MDI and the DSM-IV-based PHQ-9
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20180427062612658-0184:S2045796016001086:S2045796016001086_tab1.gif?pub-status=live)
ICD-10, International Classification of Diseases 10th version; DSM-IV, Diagnostic and Statistical Manual of Mental Disorders 4th edition; WHO-MDI, World Health Organization Major Depression Inventory; PHQ, Patient Health Questionnaire.
a Translation rule to combine ICD-10 symptoms into DSM-IV symptoms. The rule is to take the highest value of the relevant ICD-10 symptoms to represent the corresponding DSM-IV symptom.
b Threshold for scoring the symptom as ‘present’.
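The translation rule in footnote a (take the highest value of the relevant ICD-10 items) followed by dichotomisation can be sketched as below. The item-to-symptom mapping and the single threshold are assumptions made for illustration: Table 1 is only available as an image here, so the item names, the grouping in `DSM_FROM_MDI` and `PLACEHOLDER_THRESHOLD` are hypothetical stand-ins and should be replaced by the values in Table 1.

```python
# Hypothetical mapping of 12 MDI item keys to the nine DSM-IV symptoms.
DSM_FROM_MDI = {
    "depressed_mood": ["low_mood"],
    "loss_of_interest": ["loss_of_interest"],
    "fatigue": ["lack_of_energy"],
    "worthlessness_or_guilt": ["low_self_confidence", "guilt"],
    "suicidal_thoughts": ["life_not_worth_living"],
    "concentration_problems": ["concentration"],
    "psychomotor_change": ["restless", "slowed"],
    "sleep_problems": ["sleep"],
    "appetite_change": ["reduced_appetite", "increased_appetite"],
}

# Placeholder only: the actual per-symptom thresholds are listed in Table 1.
PLACEHOLDER_THRESHOLD = 3


def to_dsm_symptoms(items):
    """Footnote a: each DSM-IV symptom takes the highest of its ICD-10 items."""
    return {symptom: max(items[key] for key in keys)
            for symptom, keys in DSM_FROM_MDI.items()}


def dichotomise(symptom_scores, threshold=PLACEHOLDER_THRESHOLD):
    """Footnote b: score a symptom as present (1) at or above the threshold."""
    return {symptom: int(score >= threshold)
            for symptom, score in symptom_scores.items()}


# Example respondent with item scores on the 0 to 5 answer scale.
respondent = {
    "low_mood": 4, "loss_of_interest": 2, "lack_of_energy": 3,
    "low_self_confidence": 1, "guilt": 4, "life_not_worth_living": 0,
    "concentration": 2, "restless": 0, "slowed": 3, "sleep": 5,
    "reduced_appetite": 1, "increased_appetite": 0,
}
scores = to_dsm_symptoms(respondent)
present = dichotomise(scores)
print(scores)
print(present)
```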
NHANES 2005–2012: Self-reported depressive symptoms were assessed via the Patient Health Questionnaire (PHQ-9) (Kroenke et al. 2001). This validated measure covers the nine DSM-IV depression symptoms over the past 14 days (Table 1). Four answer options are provided, ranging from ‘not at all’ (0) to ‘nearly every day’ (3). Participants were classified as having ‘no’, ‘mild’, ‘moderate’ or ‘severe’ depression based on the PHQ-9 summation score (Kroenke et al. 2001). Note that the threshold score we used to denote ‘mild’ depression was termed ‘moderate’ depression by Kroenke et al. This threshold most closely resembled the threshold for ‘mild’ depression that we used in the C-SURF sample, in terms of the percentage of the maximum summation score required for the diagnosis (40% in C-SURF, 37% in NHANES). For correlation and factor analyses, symptoms were dichotomised into present/absent, as per DSM-IV definitions (Kroenke et al. 2001) (Table 1).
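The two percentages quoted above can be reproduced if one assumes the commonly cited cut-offs and score ranges. These numbers are not stated in the text and are assumptions here: an MDI total of 0 to 50 (ten scored items of 0 to 5, with the two paired items collapsed) and a ‘mild’ cut-off of 20, and a PHQ-9 total of 0 to 27 with a cut-off of 10.

```python
# Assumed score ranges and cut-offs (not given in the text above).
MDI_MAX_SCORE = 10 * 5      # ten scored items, each 0 to 5
MDI_MILD_CUTOFF = 20
PHQ9_MAX_SCORE = 9 * 3      # nine items, each 0 to 3
PHQ9_CUTOFF = 10            # labelled 'moderate' by Kroenke et al.


def percent_of_maximum(cutoff, max_score):
    """Cut-off expressed as a percentage of the maximum summation score."""
    return 100 * cutoff / max_score


mdi_pct = percent_of_maximum(MDI_MILD_CUTOFF, MDI_MAX_SCORE)
phq_pct = percent_of_maximum(PHQ9_CUTOFF, PHQ9_MAX_SCORE)
print(f"C-SURF (MDI): {mdi_pct:.0f}%  NHANES (PHQ-9): {phq_pct:.0f}%")
```

Under these assumptions the thresholds correspond to 40% and about 37% of the respective maximum scores, matching the figures in the text.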
Statistical analysis
First, we examined tetrachoric correlations of the depression symptoms. To compare the correlations of each general-population sample with those of the corresponding subsample of currently depressed subjects, we calculated the ratio of the squared correlations for each symptom pair and used Steiger's test to formally examine the hypothesis that the two correlation matrices differed (Steiger, 1980). Steiger's test sums the squared differences of the Fisher transformed correlations of the two matrices and tests this sum against the chi-square distribution. Second, we assessed the dimensionality of the depression symptoms in three steps, each step conducted separately for each of the eight samples to determine whether the results were replicable.
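The two comparison measures described above can be sketched as follows. This is a simplified illustration, not the authors' implementation: it takes the unique off-diagonal correlations as flat lists, and it weights each squared Fisher-z difference by the variance expected for two independent samples, 1/(n1 − 3) + 1/(n2 − 3), which is an assumption (the depressed subsamples in the study are nested in the full samples, and the software used by the authors may weight differently). The function names and example values are hypothetical.

```python
import math


def fisher_z(r):
    """Fisher transformation of a correlation coefficient."""
    return math.atanh(r)


def steiger_type_statistic(r1, r2, n1, n2):
    """Sum of squared Fisher-z differences, scaled by their sampling variance.

    r1, r2: lists holding the unique off-diagonal correlations of two matrices,
    in the same order. Returns the statistic and its degrees of freedom; the
    statistic is referred to a chi-square distribution with df = len(r1).
    """
    variance = 1 / (n1 - 3) + 1 / (n2 - 3)   # assumes independent samples
    statistic = sum((fisher_z(a) - fisher_z(b)) ** 2 for a, b in zip(r1, r2)) / variance
    return statistic, len(r1)


def squared_correlation_ratio(r_general, r_depressed):
    """Ratio > 1 means the pair correlates more strongly in the general sample."""
    return r_general ** 2 / r_depressed ** 2


# Made-up example with three symptom pairs.
general = [0.60, 0.70, 0.65]
depressed = [0.20, 0.10, 0.15]
stat, df = steiger_type_statistic(general, depressed, n1=5000, n2=300)
ratios = [squared_correlation_ratio(g, d) for g, d in zip(general, depressed)]
print(f"statistic = {stat:.1f} on {df} df; ratios = {[round(x, 1) for x in ratios]}")
```

A p-value would be obtained from the upper tail of the chi-square distribution with `df` degrees of freedom, which requires a statistics library and is left out of this sketch.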
Step 1: We first performed one-factor confirmatory factor analysis (CFA) and exploratory factor analysis (EFA). CFA consisted of the nine symptoms as indicators of an underlying depression factor, thereby modelling overall depression severity (model 1). With EFA, we tested one- to seven-factor models to determine which best fit the data. Both CFA and EFA were estimated using mean- and variance-adjusted weighted least squares (WLSMV) estimation, which is the standard for categorical indicators (Barendse et al. 2014; Li et al. 2014a). Model fit was evaluated via standard criteria for good model fit (Aggen et al. 2005; Li et al. 2014a): root-mean-square error of approximation (RMSEA) ≤0.05; comparative fit index (CFI) ≥0.95; and Tucker–Lewis index (TLI) ≥0.95. For EFA, the model with the lowest number of factors achieving these criteria was adopted (model 2).
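The fit criteria and the EFA selection rule from step 1 amount to a simple decision procedure, sketched below. The thresholds are those stated in the text; the function names and the fit values in the example are invented for illustration and are not results from the study.

```python
def has_good_fit(rmsea, cfi, tli):
    """Criteria from the text: RMSEA <= 0.05, CFI >= 0.95 and TLI >= 0.95."""
    return rmsea <= 0.05 and cfi >= 0.95 and tli >= 0.95


def select_efa_model(fits):
    """Return the smallest number of factors whose model meets all criteria.

    fits maps number of factors -> (rmsea, cfi, tli). Returns None when no
    model reaches the criteria.
    """
    for n_factors in sorted(fits):
        if has_good_fit(*fits[n_factors]):
            return n_factors
    return None


# Invented fit indices for one- to three-factor solutions.
example_fits = {
    1: (0.08, 0.93, 0.91),
    2: (0.04, 0.97, 0.96),
    3: (0.02, 0.99, 0.99),
}
print(select_efa_model(example_fits))
```

Choosing the smallest adequate model rather than the best-fitting one favours parsimony, since adding factors can only improve these indices or leave them roughly unchanged.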
Step 2: From model 1, we derived the modification indices for the residual symptom co-variances as indicators of symptom pairs correlated beyond the general factor (i.e., as indicators of substantial severity-adjusted symptom correlations). Modification indices estimate the degree of improvement in model fit if the corresponding parameter is included in the model (Brown & Moore, 2012). Consequently, the modification index of a residual co-variance indicates whether the model would fit better if this co-variance were included in the CFA model.
We considered a modification index ≥3.84 statistically significant (Brown & Moore, 2012). We then re-fitted the one-factor CFA, this time including the residual symptom co-variances revealed by the modification indices (model 3). The residual symptom correlations derived from this CFA model generated an estimate of symptom correlations corrected for overall depression severity. If there is a dimensional structure beyond overall depression severity, these correlations should: (a) be replicable across the samples; and (b) form interpretable symptom clusters. Because adapting CFA models based on modification indices is associated with a high risk of overfitting (MacCallum et al. 1992), we used the median of each modification index across 5000 case-based bootstrap samples.
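The bootstrap safeguard can be sketched generically: resample cases with replacement, recompute the statistic in each resample, and compare the median with the 3.84 threshold (the critical value of the chi-square distribution with one degree of freedom at α = 0.05, roughly 1.96²). Computing real modification indices requires fitting the CFA model, so the sketch below substitutes a 2 × 2 Pearson chi-square between two binary symptoms as a stand-in statistic with the same reference distribution. The data, the function names and the reduced number of resamples (500 instead of 5000) are assumptions for illustration.

```python
import random
import statistics

CRITICAL_VALUE = 3.84  # chi-square, 1 df, alpha = 0.05


def chi_square_2x2(cases):
    """Stand-in statistic: Pearson chi-square for two binary variables."""
    n = len(cases)
    a = sum(1 for x, y in cases if x == 1 and y == 1)
    b = sum(1 for x, y in cases if x == 1 and y == 0)
    c = sum(1 for x, y in cases if x == 0 and y == 1)
    d = n - a - b - c
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denominator


def bootstrap_median(cases, statistic, n_boot, rng):
    """Median of a statistic across case-based bootstrap resamples."""
    n = len(cases)
    values = []
    for _ in range(n_boot):
        resample = [cases[rng.randrange(n)] for _ in range(n)]
        values.append(statistic(resample))
    return statistics.median(values)


# Made-up data: 400 cases with a clear association between two symptoms.
cases = [(1, 1)] * 120 + [(1, 0)] * 80 + [(0, 1)] * 80 + [(0, 0)] * 120
rng = random.Random(2016)
median_stat = bootstrap_median(cases, chi_square_2x2, n_boot=500, rng=rng)
print(f"bootstrap median = {median_stat:.1f}, "
      f"substantial = {median_stat >= CRITICAL_VALUE}")
```

Using the median rather than the single-sample value makes the decision less sensitive to a few influential cases, which is the overfitting concern raised in the text.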
Step 3: Finally, we estimated a series of bifactor models that, by definition, consist of one general factor and several group factors. Each indicator variable loads simultaneously on the general factor and one of the group factors (Reise et al. 2010). Thus, bifactor models allow for estimating group factors controlled for a general factor (Reise et al. 2010) and, hence, correspond directly to our notion of assessing depression dimensions (the group factors) controlled for overall depression severity (the general factor). If there is a replicable dimensional structure underlying the depression symptoms, at least one of the bifactor models should either converge with the residual correlations revealed in step 2, or provide an alternative model that is replicable across samples.
We used two approaches to identify the group factors:
(a) We examined three theoretically-derived groupings (models 4.a1–3):
(1) Three genetic factors revealed by Kendler et al. (2013);
(2) The common distinction of a cognitive/affective factor v. a somatic factor, as defined in the systematic review by van Loo et al. (2012);
(3) The symptoms most consistently found on a single factor in the review (van Loo et al. 2012) v. the remaining symptoms;
(b) A non-rotated EFA that comprises several factors can be rotated into a bifactor structure (Jennrich & Bentler, 2011, 2012), resulting in an exploratory bifactor analysis (EBFA) that can then undergo confirmatory analysis. We derived the EBFA from the EFA calculated in step 1 and re-fitted it as a CFA model (model 4.b). However, only one EFA model revealed a sufficient number of factors to be rotated into a bifactor structure. We therefore used this bifactor model across all samples, rather than assessing a separate bifactor solution for each sample.
Bifactor models were estimated using WLSMV estimation and model fit was evaluated using the criteria defined in step 1.
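The structure of a bifactor model, as described in step 3, can be made concrete by writing out its specification. The sketch below builds a model string in lavaan's syntax (`=~` for loadings, `~~ 0*` to fix a covariance to zero), since lavaan is the package named in the article. The symptom labels and the split into two group factors are hypothetical and are not taken from the article, and fixing all factors to be orthogonal is a common bifactor convention assumed here rather than a documented choice of the authors.

```python
def bifactor_syntax(groups, general="g"):
    """Build a lavaan-style bifactor specification.

    groups maps each group-factor name to its list of indicators. Every
    indicator loads on the general factor and on exactly one group factor.
    """
    indicators = [item for members in groups.values() for item in members]
    if len(indicators) != len(set(indicators)):
        raise ValueError("each indicator must belong to exactly one group factor")

    lines = [f"{general} =~ " + " + ".join(indicators)]
    for name, members in groups.items():
        lines.append(f"{name} =~ " + " + ".join(members))

    # Assumed convention: general and group factors are mutually uncorrelated.
    factors = [general] + list(groups)
    for i, first in enumerate(factors):
        for second in factors[i + 1:]:
            lines.append(f"{first} ~~ 0*{second}")
    return "\n".join(lines)


# Hypothetical grouping of nine symptoms into two group factors.
example_groups = {
    "cogaff": ["mood", "interest", "worthless", "suicidal", "concentr"],
    "somatic": ["sleep", "appetite", "fatigue", "psychomotor"],
}
model = bifactor_syntax(example_groups)
print(model)
```

The resulting string would be passed to a CFA routine together with the dichotomised symptom data; with only nine indicators, group factors have few loadings each, which may help explain why such models can run into the identification problems reported in the Results.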
Analyses were performed using R version 3.1.2 (R Core Team, 2014), particularly the packages ‘psych’ (Revelle, 2013), ‘semTools’ (Pornprasertmanit et al. 2013) and ‘lavaan’ (Rosseel, 2012). R scripts are available at https://osf.io/a6tuw/.
Results
Participants’ baseline characteristics are summarised in Table 2. Prevalence rates for current depression of at least mild degree ranged from 4.9 to 7.6%.
Table 2. Baseline characteristics of study participants
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20180427062612658-0184:S2045796016001086:S2045796016001086_tab2.gif?pub-status=live)
M, mean; SD, standard deviation.
a Prevalence of current depression of at least mild-moderate degree.
b Average prevalence rate calculated across the NHANES cycles 2005–2012. The prevalence rates within each cycle were calculated using weighted data.
Symptom correlations
Substantial symptom correlations were revealed in the general-population samples (median correlations ranging from r = 0.55 to 0.74), whereas correlations in the depressed samples were surprisingly weak (median correlations ranging from r = 0.04 to 0.24, Table 3). The differences were pronounced: on average, correlations in the general-population samples were higher by a factor ranging between 8.4 and 30.9 across samples (Table 4). Steiger tests confirmed that all general-population sample correlation matrices differed significantly from their counterparts in the depressed samples (Table 4). Only one correlation among women (‘life not worth living’ and ‘appetite changes’) was slightly higher in the depressed sample (ratio of squared correlations = 0.8, Table 3).
Table 3. Summary of tetrachoric correlations of the nine DSM-IV depression symptoms across general-population samples and subsamples of currently depressed subjects
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20180427062612658-0184:S2045796016001086:S2045796016001086_tab3.gif?pub-status=live)
M, median; IQR, inter-quartile range.
Table 4. Comparison of tetrachoric correlations of the nine DSM-IV depression symptoms in general-population samples v. subsamples of currently depressed subjects
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20180427062612658-0184:S2045796016001086:S2045796016001086_tab4.gif?pub-status=live)
M, median; IQR, inter-quartile range; df, degrees of freedom.
a Tests the hypothesis that two correlation matrices differ from each other.
b For each symptom pair, its squared correlation in the general-population sample was divided by its squared correlation in the corresponding sample of depressed. Ratios >1.0 indicate that the correlation was higher in the general population than among depressed.
Factor analyses
Step 1 revealed that the one-factor model fit the data very well in all general-population samples. This was revealed by both CFAs and EFAs (Table 5, models 1 and 2). In contrast, in depressed samples, no replicable dimensional structure was identified. Specifically, the one-factor CFAs failed to achieve good model fit in three of four samples and the EFAs revealed different numbers of factors across samples.
Table 5. Summary of exploratory and confirmatory factor and bifactor analyses of the nine DSM-IV depression symptoms in general-population samples and subsamples of currently depressed subjects
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20180427062612658-0184:S2045796016001086:S2045796016001086_tab5.gif?pub-status=live)
CFA, confirmatory factor analysis; EFA, exploratory factor analysis; CFI, comparative fit index; TLI, Tucker–Lewis index; RMSEA, root-mean-square error of approximation.
a None of the admissible models reached the criteria for good model fit.
b Model inadmissible due to negative residual variance of at least one symptom.
Step 2 indicated that 50 of 288 (8 samples × 36 symptom pairs) possible residual co-variances (17.4%) were substantial. Including these residual co-variances in the CFAs improved the fit of all models and resulted in good-fitting models, except for the NHANES sample of depressed women (Table 5, model 3). Both positive and negative correlations were revealed, positive correlations ranging from 0.10 to 0.48 (median: r = 0.29) and negative correlations from −0.46 to −0.07 (median: r = −0.26, Table 6). However, the correlations failed to exhibit any replicable pattern across the samples, and 17 of the 50 correlations (34.0%) were not statistically significant (Table 6).
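The counts in this paragraph follow from the number of distinct symptom pairs. As a quick check, using only figures stated in the text (the variable names are illustrative):

```python
from math import comb

n_symptoms = 9
n_samples = 8

pairs_per_sample = comb(n_symptoms, 2)            # distinct symptom pairs
possible_covariances = n_samples * pairs_per_sample

substantial = 50
non_significant = 17

share_substantial = round(100 * substantial / possible_covariances, 1)
share_non_significant = round(100 * non_significant / substantial, 1)

print(pairs_per_sample, possible_covariances)
print(share_substantial, share_non_significant)
```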
Table 6. Correlations of the nine DSM-IV depression symptoms adjusted for overall depression severity as estimated by confirmatory factor analysis (model 3)
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20180427062612658-0184:S2045796016001086:S2045796016001086_tab6.gif?pub-status=live)
Note. Correlations printed in bold are statistically significant with p < 0.05.
In Step 3, no bifactor model was replicable across the samples (Table 5, models 4.a1–4.b). The most stable model was model 4.a2, which achieved good model fit in three of four NHANES samples and one C-SURF sample. Note that all models were inadmissible in at least four of the eight samples, due to negative residual variances.
A closer look at the bifactor models revealed two issues (see Supplementary material available at https://osf.io/a6tuw/). First, 18 of 24 models that were inadmissible were inadmissible because at least one of the group factors consisted of one large factor loading, with all other loadings being virtually zero, thereby leading to a model that was empirically underidentified (Kline, 2011). Furthermore, this pattern of one very large loading with otherwise negligible loadings is indicative of overfactoring (i.e., the inclusion of unnecessary factors) (Rindskopf, 1984). In the remaining six inadmissible models, the majority or all of the loadings were non-significant for at least one group factor. Second, among the admissible models, six had at least one group factor with only non-significant factor loadings, and only two models had significant factor loadings across both the general and group factors. Thus, bifactor analysis provided no evidence for any dimensional structure existing beyond the general severity factor.
Discussion
Main findings
We sought to examine the dimensions underlying the nine DSM-IV depressive symptoms in young adults while adjusting symptom correlations for overall depression severity, and while comparing general-population samples v. subsamples of currently depressed individuals. Analyses revealed three main results. First, adjusting symptom correlations for overall depression severity left little substantial correlation between the symptoms, and we failed to find any evidence to support a replicable dimensional structure when correcting symptom correlations for overall depression severity. Second, in the general-population samples, symptoms correlated substantially and were uni-dimensional. Third, among depressed individuals, symptom correlations were mostly weak and there was no evidence of any replicable dimensional structure, regardless of whether or not correlations were adjusted for overall severity.
Our finding that depressive symptoms were uni-dimensional in the general population is consistent with results from previous studies that analysed combined samples of healthy and affected individuals. These studies included general population samples from the USA and Mexico, and youths aged 5–18 years in the USA and UK (Muthén, 1989; Aggen et al. 2005; Aggen et al. 2011; Cole et al. 2011; Mezuk & Kendler, 2012; Familiar et al. 2015). Our results replicate these findings among young adults in the USA and extend them to young Swiss men. Furthermore, they resemble recent results reported by Fried et al., who found that, as samples of American and Dutch depression patients became more heterogeneous with respect to overall depression severity, average symptom correlations increased and the factor structures became simpler (Fried et al. 2016).
Our finding indicates that, within the general population, depression can be described by a single dimension of severity, the main reason being that depressed individuals form a comparably homogeneous group, relative to the large majority of individuals who are mostly or completely symptom-free (data not shown). The sizeable symptom correlations found in the general population samples mainly reflected this difference between depressed and non-depressed individuals. As such, the common set of ICD-10 and DSM-IV depression symptoms has diagnostic utility in identifying individuals suffering from depression within the general population, and the listed symptoms seem to capture the basic scope and severity of the syndrome well.
Conversely, the uni-dimensionality of depression symptoms was not present among depressed individuals and we found no evidence of any replicable dimensional structure. Our failure to uncover such a structure is consistent with a recent systematic review that failed to identify any conclusive evidence that data-driven dimensions or subtypes of depression exist (van Loo et al. 2012; see also Chen et al. 2000; Aggen et al. 2005; Shafer, 2006; Carragher et al. 2009; Aggen et al. 2011; Cole et al. 2011; Mezuk & Kendler, 2012; Hybels et al. 2013; Buhler et al. 2014; Li et al. 2014a, b; Rodgers et al. 2014; Fried et al. 2016). Furthermore, the factor structure of depression changes over time among depressed patients (Fried et al. 2016). It therefore seems unlikely that a dimensional structure underlies the symptoms of depression. Consequently, previous literature reporting and making use of depression dimensions should be considered cautiously.
The mostly weak symptom correlations among depressed individuals were particularly surprising. Symptom correlations have seldom been reported in the literature and, hence, this phenomenon seems to have gone unnoticed. Nonetheless, Cramer et al. reported average symptom correlations among American adults with a ‘dysphoric episode’ (defined as an episode with at least two depressive symptoms), and Fried et al. reported average symptom correlations among American and Dutch depression patients. Consistent with our results, these authors reported average correlations ranging from r = 0.17 to 0.23 (Cramer et al. 2012) and from r = 0.12 to 0.39 (Fried et al. 2016). Additionally, previous studies failed to detect substantial stability of depression symptoms and subtypes between successive depressive episodes (Coryell et al. 1994; Lewinsohn et al. 2003; Melartin et al. 2004; Oquendo et al. 2004). Thus, symptom correlations seem to be rather weak, both within and between depressive episodes.
That symptom correlations were so weak implies that, even if a replicable dimensional model existed, it would be based on an average correlation of r ≈ 0.20; the vast majority of symptom variance would remain unexplained, as correlation-based models cannot explain symptom variance beyond these correlations. This agrees with two recent studies that uncovered highly diverse symptom profiles among depression patients (Fried & Nesse, 2015; Zimmerman et al. 2015). For example, Fried and Nesse identified 1030 unique profiles of depression symptoms in a sample of 3703 depressed American outpatients, with the most frequent profile only occurring in 1.8% of patients (Fried & Nesse, 2015). One explanation of how such diverse profiles develop is that adverse life events and other risk factors exerted differential impacts on depressive symptoms (Keller & Nesse, 2006; Keller et al. 2007; Lux & Kendler, 2010; Fried et al. 2014; Fried et al. 2015) and appeared to change the symptoms’ correlation patterns (Cramer et al. 2012). Thus, an individual's symptom profile depends at least partially on the aetiological factors that provoked the depressive episode. Furthermore, these different aetiologies are likely to imply differential responses to various treatment options. For example, evidence indicates that depression related to negative life events and trauma is more responsive to psychotherapy than to medication, whereas depressed individuals with maladaptive personality traits may respond better to selective serotonin-reuptake inhibitors (Simon & Perlis, 2010).
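The point about unexplained variance can be made explicit with a short calculation. As a rough gauge only (pairwise shared variance is a simplification of what a factor model explains), two symptoms correlated at the average level mentioned above share

```latex
r \approx 0.20 \;\Rightarrow\; r^{2} \approx 0.04,
\qquad 1 - r^{2} \approx 0.96 ,
```

that is, about 4% of their variance, leaving roughly 96% not accounted for by the other symptom.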
Two final issues concern the recent emergence of network models as an alternative account of mental disorders (Bringmann et al. 2013; Goekoop & Goekoop, 2014; van Borkulo et al. 2014; Boschloo et al. 2015; Bringmann et al. 2015; van Borkulo et al. 2015; Beard et al. 2016). Network models are based on the premise that symptom inter-relationships reflect direct causal influences between symptoms, rather than underlying latent factors, as in the factor analysis framework. First, the exact relationship between factor and network models remains unclear (Molenaar, 2010; Ross, 2010), and various authors have disagreed with the network proponents’ critique of the latent variable approach (Belzung et al. 2010; Danks et al. 2010; Haig & Vertue, 2010; Humphry & McGrane, 2010; Markus, 2010). Most importantly, no empirical comparison of these two approaches has yet been reported (Krueger et al. 2010). Thus, how and to what degree one would draw different conclusions when applying factor analysis v. network modelling to one and the same sample is unclear. Future research needs to address this issue.
Second, our results are likely of importance to network research, since they indicate that the choice of sample type can considerably affect the strength of symptom relationships. Since networks are also based on symptom relationships, this should be an issue in network research, too. Indeed, depression-related network studies have been based on a wide variety of sample types (Bringmann et al. 2013; Goekoop & Goekoop, 2014; van Borkulo et al. 2014; Boschloo et al. 2015; Bringmann et al. 2015; van Borkulo et al. 2015; Beard et al. 2016). Even more intriguing, it was recently found that global network connectivity increased as disorder severity decreased over time (Beard et al. 2016).
Limitations
Our study had several limitations. First, it was restricted to young adults, so the results’ generalisability must be re-examined in demographically broader samples. Second, symptom lists that are more differentiated than the nine DSM-IV criteria might be required, especially considering the weak correlations we detected in our depressed samples. More differentiated symptoms might be needed to capture depression subtypes in patient samples. Note, however, that studies using more comprehensive symptom sets have thus far also failed to uncover replicable dimensions (van Loo et al. 2012). Third, we used dichotomised symptom scores to facilitate comparisons against previous research. In doing so, we may have lost some information. Future studies should evaluate finer-grained symptom scales. Fourth, step 2 of our analysis was exploratory and included multiple testing. Note, however, that we used a bootstrap procedure and replicated our analyses across different samples to safeguard against this. Finally, in contrast to subtype research using latent class analysis, a dimensional approach cannot detect subtypes that are based on only one or two symptoms (if a subtype is defined by several symptoms, however, these symptoms would be tightly correlated and, hence, emerge as a dimension). Thus, whereas our results rule out a dimensional structure of depression, there might still be subtypes of depression characterised by the presence of one or two specific symptoms. Note, however, that previous research focusing on statistically derived subtypes has also failed to reveal replicable results (van Loo et al. 2012).
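The third limitation, information loss through dichotomisation, can be illustrated with a small simulation. This is a sketch under stated assumptions, not a reanalysis of the study data: it assumes two continuous, normally distributed symptom scores with a hypothetical true correlation of 0.5 and a hypothetical cut-off at the mean. The value of rho, the sample size, the seed and all function names are illustrative choices. For a mean split of bivariate normal scores, the correlation between the dichotomised scores is known to equal (2/π)·arcsin(ρ), which is smaller than ρ.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def simulate_scores(rho, n, seed):
    """Draw n pairs of standard-normal scores with correlation rho."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        z1 = rng.gauss(0, 1)
        z2 = rng.gauss(0, 1)
        xs.append(z1)
        ys.append(rho * z1 + math.sqrt(1 - rho ** 2) * z2)
    return xs, ys


rho = 0.5  # hypothetical true correlation, chosen for illustration only
xs, ys = simulate_scores(rho, n=20000, seed=1)

# Continuous scores: correlation close to rho.
r_continuous = pearson(xs, ys)

# Dichotomise at the mean (symptom "present" if the score is above 0).
xb = [1 if x > 0 else 0 for x in xs]
yb = [1 if y > 0 else 0 for y in ys]
r_dichotomised = pearson(xb, yb)

# Theoretical value for a mean split of bivariate normal scores.
r_expected = (2 / math.pi) * math.asin(rho)

print(f"continuous:   {r_continuous:.3f}")
print(f"dichotomised: {r_dichotomised:.3f} (theory: {r_expected:.3f})")
```

In this idealised setting the dichotomised correlation is attenuated relative to the continuous one, which is why finer-grained symptom scales could recover information that binary scores discard. How large the attenuation is for real symptom data depends on the score distributions and cut-offs, which this sketch does not model.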
Implications
Given prior research findings, our results have two implications. First, caution is warranted when considering studies assessing dimensions of depression, because general population-based studies and studies of depressed individuals generate different data that can lead to different conclusions. This problem likely generalises to other models based on the symptoms’ inter-relationships (e.g., network models). Second, it appears that the two dominant aspects of depression are its overall severity and each individual's symptom profile. Whereas the overall severity aligns individuals on a continuum of disorder intensity that allows non-affected individuals to be distinguished from affected individuals, the clinical evaluation and treatment of depressed individuals should focus directly on each individual's symptom profile, since it seems to convey the most clinically relevant information.
Acknowledgements
None.
Financial Support
This work was supported by the Swiss National Science Foundation (grant number FN33CS30_148493). The funder had no role in study design, data collection or analysis, decision to publish or preparation of the manuscript.
Conflicts of Interest
None.
Ethical Standards
All procedures contributing to this work comply with the ethical standards of the relevant national and institutional committees on human experimentation and with the Helsinki Declaration of 1975 revised in 2008.
Availability of Data and Materials
The raw data of the C-SURF cohort study and the NHANES study are available at http://www.c-surf.ch/en/30.html and http://www.cdc.gov/nchs/nhanes/nhanes_questionnaires.htm, respectively.