Introduction
A key component of the current approach to the diagnosis of major depression (MD) in DSM-IV is the symptomatic or ‘A’ criteria. These nine criteria (‘depressed mood’, ‘loss of interest’, ‘appetite or weight change’, ‘hyper- or insomnia’, ‘psychomotor agitation or retardation’, ‘fatigue’, ‘feelings of worthlessness or guilt’, ‘diminished ability to think or concentrate or indecisiveness’ and ‘suicidal ideation or thoughts of death’) derive, with only modest alteration, from the list of symptoms proposed in the Feighner et al. (1972) criteria. These criteria were created by consensus of senior clinicians at Washington University in the early 1970s and were particularly influenced by a study by Cassidy et al. (1957) that analyzed symptom frequencies in manic-depressive patients as compared with those with medical illnesses (Cassidy et al. 1957; Zimmerman et al. 2006b; Kendler et al. 2009). The symptoms included in the Cassidy study were, in turn, partly based on those utilized in a 1950 report, which examined symptoms in 50 cases of melancholia (Stone & Burris, 1950).
Since then, many studies have examined the diagnostic reliability of this classification (for example, Spitzer & Fleiss, 1974; Spitzer et al. 1978; Keller et al. 1995), the differentiation between clinical and subclinical depression (for example, Kendler & Gardner, 1998; Parker et al. 1998), the validity of subtypes (Frances et al. 1990; Kendler et al. 1996; Schotte et al. 1997; Chen et al. 2000; Sullivan et al. 2002; Carragher et al. 2009) and the relationship between single symptoms and depressive subtypes with severity of MD (Faravelli et al. 1996). By contrast, we are aware of only two studies that have addressed, even indirectly, the validity of these widely used symptomatic criteria for MD (McGlinchey et al. 2006; Zimmerman et al. 2006a). Both concluded that further efforts at validation were needed. Zimmerman et al. (2006a) examined the content validity of the DSM-IV classification of MD and found the criteria were somewhat redundant. The same research group examined the ability of single depressive symptoms to differentiate between patients with and without a diagnosis of MD and found that the individual criteria performed quite differently in this regard (McGlinchey et al. 2006).
In this report, we examine the validity of the DSM-IV symptomatic criteria for MD in a sample of ∼1000 cases of MD reported in the last year in a population-based cohort of twins (Kendler & Prescott, 2006). First, we examine the ability of the individual criteria to predict uniquely a broad set of validators, including risk of MD in co-twin, demographic characteristics, lifetime co-morbidities, characteristics at index episode, prior depression history and risk for future episodes. Based on these results, we then aggregate the symptomatic ‘A’ criteria into two sets and compare their performance on the same set of validators.
Methods
Sample and measurements
Participants in this report derive from two inter-related studies in Caucasian same-sex twin pairs from the Virginia Adult Twin Study of Psychiatric and Substance Use Disorders (VATSPSUD) (Kendler & Prescott, 2006). All subjects were ascertained from the Virginia Twin Registry – a population-based register formed from a systematic review of birth certificates in the Commonwealth of Virginia. Female–female (FF) twin pairs, from birth years 1934–1974, became eligible if both members previously responded to a mailed questionnaire in 1987–1988, the response rate to which was ∼64%. The first face-to-face interview (FF1) was completed by 92% (n=2163) of the eligible twins. These twins participated in three subsequent interviews with cooperation rates ranging from 85 to 93%. Participating twin pairs were identified for the male–male and male–female (MMMF) cohort from twins with birth years from 1940–1974 initially ascertained directly from registry records. The first MMMF interview (MMMF1) was conducted by telephone with a response rate of 72% (n=6812). This sample was re-interviewed once with an 83% response rate. Zygosity was determined by discriminant function analyses using standard twin questions validated against DNA genotyping in 496 pairs (Kendler & Prescott, 1999).
MD diagnoses were based on a module in the FF1 and MMMF1, where every subject was asked whether they experienced each of the disaggregated criteria symptoms for DSM-IV MD in the year prior to the interview. Separate questions were asked for psychomotor agitation and retardation, insomnia and hypersomnia, weight loss and gain and appetite increase and decrease.
DSM-IV criteria for last-year MD were met by 217 twins from the FF1 and 798 twins from the MMMF1 interviews. The prevalence rates were 10% [95% confidence interval (CI) 8.8–11.4%] in the FF1 and 11.7% (95% CI 11.0–12.5%) in the MMMF1. Of these 1015 twins, 518 were males and 497 were females. At the time of the first interview, their ages ranged from 18 to 57 years with a mean of 34.5. There were 83 twin pairs in which both twins were diagnosed with MD, and two more pairs from two triplets who were diagnosed with MD.
For our first analysis, we created binary dummy variables for each of the nine criteria, indicating the presence of each as part of the depressive syndrome. A disaggregated criterion was counted as present when at least one of its disaggregated symptoms was reported (e.g. ‘weight loss’ for appetite/weight change). In addition, a criterion was not counted as part of the depressive syndrome when it was reported as being due to medication or illness.
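As a rough check on the prevalence figures above, the following sketch recomputes the two proportions and a 95% interval from the reported counts. The Wilson score interval is an assumption made for illustration; the text does not state which interval method was used, and the function name `wilson_interval` is hypothetical.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Return (proportion, lower, upper) using the Wilson score interval.

    The choice of the Wilson method is an assumption; the source does not
    say how its confidence intervals were computed.
    """
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return p, center - half, center + half


# Counts taken from the text: last-year MD cases / completed first interviews.
samples = {"FF1": (217, 2163), "MMMF1": (798, 6812)}

for name, (cases, n) in samples.items():
    p, lo, hi = wilson_interval(cases, n)
    print(f"{name}: prevalence {p:.1%}, 95% CI {lo:.1%} to {hi:.1%}")
```

Under these assumptions the recomputed intervals should land close to the reported ones; small differences would be expected if a different interval method was applied.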
For comparing the two symptom groups – cognitive and neurovegetative – we counted the number of endorsed symptoms separately for each group. Cognitive symptoms included ‘depressed mood’, ‘loss of interest’, ‘worthlessness/guilt’ and ‘suicidal ideation’ and neurovegetative symptoms included ‘sleep’, ‘appetite/weight’, ‘psychomotor’ changes and ‘fatigue’. ‘Trouble concentrating’ was excluded from this analysis (see Results for details).
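The coding rules described above can be sketched as follows. All item and field names are hypothetical, and applying the medication/illness exclusion to individual items is an assumption about how the rule was implemented.

```python
# Hypothetical mapping from each DSM-IV 'A' criterion to its interview items.
CRITERIA = {
    "depressed_mood": ["depressed_mood"],
    "loss_of_interest": ["loss_of_interest"],
    "appetite_weight": ["weight_loss", "weight_gain",
                        "appetite_decrease", "appetite_increase"],
    "sleep": ["insomnia", "hypersomnia"],
    "psychomotor": ["agitation", "retardation"],
    "fatigue": ["fatigue"],
    "worthlessness_guilt": ["worthlessness_guilt"],
    "trouble_concentrating": ["trouble_concentrating"],
    "suicidal_ideation": ["suicidal_ideation"],
}

# Symptom groups as defined in the text; trouble concentrating is in neither.
COGNITIVE = ["depressed_mood", "loss_of_interest",
             "worthlessness_guilt", "suicidal_ideation"]
NEUROVEGETATIVE = ["sleep", "appetite_weight", "psychomotor", "fatigue"]


def code_criteria(items, due_to_medication_or_illness=frozenset()):
    """Return a 0/1 dummy per criterion.

    A criterion is present if at least one of its items was endorsed and
    that item was not attributed to medication or illness (assumed rule).
    """
    dummies = {}
    for criterion, parts in CRITERIA.items():
        present = any(
            items.get(part, False) and part not in due_to_medication_or_illness
            for part in parts
        )
        dummies[criterion] = int(present)
    return dummies


def group_counts(dummies):
    """Return (cognitive count, neurovegetative count)."""
    cognitive = sum(dummies[c] for c in COGNITIVE)
    neurovegetative = sum(dummies[c] for c in NEUROVEGETATIVE)
    return cognitive, neurovegetative


# Illustrative respondent, not real data.
respondent = {
    "depressed_mood": True, "weight_loss": True, "insomnia": True,
    "fatigue": True, "suicidal_ideation": True, "trouble_concentrating": True,
}
dummies = code_criteria(respondent, due_to_medication_or_illness={"fatigue"})
cognitive, neurovegetative = group_counts(dummies)
print(dummies)
print("cognitive:", cognitive, "neurovegetative:", neurovegetative)
```

Keeping the nine dummies and the two group counts as separate derived variables mirrors the two sets of analyses: the dummies feed the single-symptom models and the counts feed the symptom-group models.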
The VATSPSUD includes a rich set of data about future episodes, co-twin history of MD, lifetime co-morbidities, demographic characteristics and characteristics of the index depressive episode (Kendler & Prescott, 2006). For demographic characteristics, characteristics of the index depressive episode and last-year co-morbidity with generalized anxiety disorder (GAD), the data came from the same interview wave. For future episodes, depressive episodes of the co-twin and all other co-morbidities, data were obtained from all interview waves to maximize available information.
GAD was diagnosed using the DSM-III-R criteria (APA, 1987) requiring a minimum of 1 month duration. Panic disorder was also diagnosed using the DSM-III-R criteria. We also examined a broad definition of panic (‘panic broad’), for which the only criterion was a positive response to a probe question for lifetime panic attacks (‘Thinking back over your entire life, have you ever had a spell or attack when you suddenly felt frightened or extremely uncomfortable in a situation in which you didn't expect to feel that way?’). Real danger or clear phobic stimuli were excluded. ‘Any phobia’ was diagnosed using an adaptation of DSM-III criteria (APA, 1980) requiring one or more unreasonable fears, including fears of different animals, social phobia and agoraphobia that objectively interfered with the respondent's life. Nicotine dependence was defined as a score ⩾7 on the Fagerström Tolerance Questionnaire (Fagerström, 1978) and alcohol dependence and illicit drug dependence were diagnosed using DSM-IV criteria (APA, 1994). Adult antisocial personality traits were defined as meeting three or more of the DSM-III-R (APA, 1987) ‘C criteria’ for antisocial personality disorder. Extraversion was assessed with eight and neuroticism with 12 items from the short form of the self-administered Eysenck Personality Questionnaire (Eysenck et al. 1985). For ‘co-occurring anxiety symptoms’ we used a binary variable indicating whether the respondent endorsed at least one of two anxiety symptoms for which we had separate questions: ‘felt anxious, nervous or worried’ and ‘muscles felt tense or felt jumpy or shaky inside’ lasting at least 5 days in the last year prior to the interview. ‘Chronic MD’ was defined as a depressive episode lasting 12 months or longer.
For ‘something happened before the depressive episode’, we used a question asking whether ‘something happened to make you feel that way or did the feeling just come on you “out of the blue”?’ for the worst depressive episode in the last year. ‘Seeking help’ was assessed by a question asking whether the respondent went to get help from health professionals, ministers, self-help groups or anyone else. Finally, childhood sexual abuse was defined by a positive response to the question: ‘Have you ever been sexually abused or molested?’ and when the abuse happened before the age of 16 years.
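Several of the validator definitions above are simple threshold rules, which can be sketched as below. The record field names are hypothetical and only the cut-offs stated in the text are encoded; the full diagnostic algorithms (for example for GAD or panic disorder) are not reproduced.

```python
def derive_validators(record):
    """Derive binary validators from one respondent record.

    Field names are hypothetical; thresholds follow the definitions in the text.
    """
    age_at_abuse = record.get("age_at_abuse")
    return {
        # Fagerstrom Tolerance Questionnaire score of 7 or more.
        "nicotine_dependence": int(record["ftq_score"] >= 7),
        # Three or more DSM-III-R 'C criteria' for antisocial personality disorder.
        "antisocial_traits": int(record["aspd_c_criteria_met"] >= 3),
        # Depressive episode lasting 12 months or longer.
        "chronic_md": int(record["episode_months"] >= 12),
        # At least one of the two anxiety symptoms lasting at least 5 days.
        "co_occurring_anxiety": int(
            bool(record["felt_anxious_5_days"] or record["tense_or_jumpy_5_days"])
        ),
        # Reported abuse that happened before the age of 16 years.
        "childhood_sexual_abuse": int(
            bool(record["ever_sexually_abused"])
            and age_at_abuse is not None
            and age_at_abuse < 16
        ),
    }


# Illustrative record, not real data.
example = {
    "ftq_score": 7,
    "aspd_c_criteria_met": 2,
    "episode_months": 14,
    "felt_anxious_5_days": False,
    "tense_or_jumpy_5_days": True,
    "ever_sexually_abused": True,
    "age_at_abuse": 17,
}
validators = derive_validators(example)
print(validators)
```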
Data analysis
We conducted logistic regression analyses with the LOGISTIC procedure in SAS (SAS Institute, 2005). Our approach was to compare the odds ratios (ORs) of each test to examine possible patterns of differences, with the p values indicating the general probability of results arising from chance. (For our approach to the question of multiple testing, see Limitations.) For the first set of analyses, the validator variable functioned as the dependent variable while the dummy variable representing that the specific symptom was part of the syndrome was the predictor variable. For the second set of analyses, which compared the two symptom groups, the number of cognitive or number of neurovegetative symptoms was the predictor variable. Depending on the validator variable, we fitted a binary or cumulative logit model to the data, with age and sex as covariates. In addition, when comparing the single symptoms, we included the number of positive criteria symptoms as a covariate in the model to control for the overall clinical severity of the depressive episode as indexed by the number of endorsed criteria. When we examined the two symptom groups, the symptom counts for both groups were included in the model to obtain the unique predictive power of the cognitive and the neurovegetative symptom count, controlling for the number of symptoms endorsed in the other symptom group. In addition, we tested whether the observed differences in the ORs between the two groups were significant using the TEST statement of the LOGISTIC procedure in SAS (SAS Institute, 2005).
When risk of MD in co-twin was the dependent variable, zygosity was included as covariate in the model. p Values are reported two-tailed except for risk of MD in co-twin, where we report one-tailed values, given the prior prediction of twin resemblance.
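One way to write the two model families described above is sketched below for a binary validator. The notation is illustrative and not taken from the original analysis: $Y$ is the validator, $S_j$ the dummy variable for criterion $j$, $N$ the total number of endorsed criteria, and $C$ and $V$ the cognitive and neurovegetative symptom counts. For ordinal validators the cumulative logit model replaces the left-hand side with $\operatorname{logit} P(Y \le k)$ and uses category-specific intercepts, and a zygosity term is added when the validator is MD in the co-twin.

```latex
\begin{align}
\operatorname{logit} P(Y = 1) &= \beta_0 + \beta_j S_j
  + \beta_{\mathrm{age}}\,\mathrm{age} + \beta_{\mathrm{sex}}\,\mathrm{sex}
  + \beta_N N,
  & \mathrm{OR}_j &= e^{\beta_j} \\
\operatorname{logit} P(Y = 1) &= \gamma_0 + \gamma_C C + \gamma_V V
  + \gamma_{\mathrm{age}}\,\mathrm{age} + \gamma_{\mathrm{sex}}\,\mathrm{sex},
  & H_0 &: \gamma_C = \gamma_V
\end{align}
```

In this reading, the first line corresponds to the single-symptom analyses and the second to the symptom-group analyses, with the test of $H_0$ standing in for the comparison of the two groups' ORs.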
In interpreting these results, it is important to note that the first set of analyses, which examined the individual criteria, controlled for the total number of endorsed symptoms. In these analyses, an OR <1 would indicate that compared with all subjects who endorsed n criteria, those who endorsed symptom X predicted the validator more poorly. However, in our analyses of criterion groups, controlling for number of endorsed symptoms had the undesirable effect of causing a strong negative correlation between the predictive effects of the two symptom groups. Therefore, in these analyses, where we did not control for total endorsed criteria, we would expect nearly all ORs to be greater than 1, as more symptoms tend to predict these validators. Our goal here was to compare the strength of these ORs between the two symptom groups.
Results
Logistic regression analyses of single symptoms
The first nine columns of Table 1 show the results from the logistic regression analyses of the nine A criteria with each of the 25 validator variables. ORs, 95% CI and levels of statistical significance are presented, controlling for age, sex and the total number of endorsed criteria. For the cumulative logit models, the ORs reflect the impact of a 1 s.d. change in the dependent variable.
Table 1. Comparison of major depression (MD) criteria symptoms
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20160921043524214-0417:S0033291709992157:S0033291709992157_tab1.gif?pub-status=live)
a One-tailed p values, regression model with zygosity as additional covariate;
n, sample size, differences in sample sizes throughout validators due to missing data; BL, binary logit model; CL, cumulative logit model, covariates included in the model: age, sex and criteria symptom count; xxx, no valid regression model; OR, odds ratio; CI, confidence interval.
* p<0.1, shaded cells are significant at the p<0.05 level, ** p<0.05, *** p<0.01.
† p<0.001, ‡ p<0.0001.
In total, 36 of the 225 analyses (16%) were statistically significant (p<0.05), far in excess of chance expectations (Feild & Armenakis, 1974). Loss of interest was the symptom with the fewest (only one) and suicidal ideation was the symptom with the most significant results (nine). The means (s.d.) of the ORs for the nine criteria across all the analyses ranged from 1.99 (1.18) for depressed mood to 1.17 (0.14) for trouble concentrating.
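To give a sense of what "far in excess of chance expectations" means numerically, the sketch below computes the expected number of significant results and an exact binomial tail probability. It assumes the 225 tests are independent, which is a simplification (the symptoms and validators are correlated), so the tail probability is only indicative.

```python
import math


def binomial_tail(k, n, p):
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(
        math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1)
    )


n_tests = 9 * 25   # nine criteria by 25 validators
alpha = 0.05       # nominal significance level
observed = 36      # significant results reported in the text

expected = n_tests * alpha
tail = binomial_tail(observed, n_tests, alpha)

print(f"expected by chance: {expected:.2f} of {n_tests}")
print(f"P(at least {observed} significant | independence) = {tail:.2e}")
```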
These results reveal a pattern of relationships between the criteria and the validators that was unexpectedly complex and variable. Of the diverse set of findings, nine were noteworthy.
(1) Regarding co-morbidities with anxiety disorders, psychomotor agitation/retardation predicted co-morbidity with GAD and was strongly positively associated with prominent anxiety symptoms, whereas appetite/weight change was negatively associated with the anxiety symptoms. In addition, suicidal ideation positively predicted fully syndromal panic disorder.
(2) With regard to co-morbidities with substance use disorders and adult antisocial personality traits, suicidal ideation positively predicted risk for illicit drug dependence and adult antisocial personality traits, while appetite or weight changes were negatively associated with both these validator variables. Yet, only worthlessness/guilt predicted co-morbidity with alcohol dependence and nicotine dependence, while fatigue was negatively associated with nicotine dependence.
(3) None of the nine A criteria was significantly related to a new episode of MD or a diagnosis of MD in the co-twin. However, examining non-significant findings (p>0.05), we observed a heterogeneous pattern of relationships for these validators. For example, loss of interest showed a positive but depressed mood a negative association with risk for future episodes. Problems sleeping and fatigue tended to be positively related and depressed mood and suicidal ideation negatively related to the risk for MD in the co-twin.
(4) Significant sex differences emerged for four of the criteria; depressed mood, appetite/weight changes and fatigue significantly predicted being female, while psychomotor agitation/retardation was associated with being male.
(5) The criteria varied substantially in their relationships to other demographic characteristics. For example, age differences were found for two criteria: depressed mood and psychomotor agitation/retardation had a significant positive correlation with current age. In addition, four symptoms were associated with education, with sleep changes and fatigue being associated with more, and psychomotor agitation/retardation and suicidal ideation associated with fewer, years of education. Also, depressed mood was associated with lower and trouble concentrating with higher family income.
(6) Regarding the two personality traits – neuroticism and extraversion – feeling worthless/guilty and suicidal ideation predicted higher neuroticism scores, while appetite/weight change and sleep changes were associated with lower neuroticism scores. In addition, worthlessness/guilt was associated with a lower extraversion score.
(7) Only suicidal ideation predicted a longer duration of the index episode and chronic depression, while fatigue was positively associated with shorter episodes and negatively associated with ‘chronic depression’ (defined as lasting ⩾12 months).
(8) Loss of interest was associated with episodes precipitated by a particular life event (‘something happened’), while suicidal ideation was associated with experiencing the depressive episode ‘out of the blue’.
(9) None of the symptoms was significantly associated with the number of prior depressive episodes, while only sleep problems predicted (older) age of first onset of MD.
Logistic regression analyses of the two symptom groups
A review of the results obtained at the individual symptom level reveals a tendency for predictions from neurovegetative symptoms (most typically appetite/weight, sleep problems, psychomotor changes and fatigue) to differ qualitatively from those obtained for the ‘cognitive/emotional’ symptoms (hereafter ‘cognitive’) of sad mood, loss of interest, guilt and suicidal ideation. For example, a significantly elevated risk for alcohol, nicotine and illicit drug dependence and antisocial traits is associated with symptoms of guilt or suicidal ideation, whereas a significantly decreased risk is associated with symptoms of fatigue or appetite changes. Levels of neuroticism are positively associated with guilt or suicidal ideation but negatively associated with appetite and sleep changes. Fatigue predicts shorter and suicidal ideation longer depressive episodes. Similar trends that do not reach significance are seen for co-morbidity with panic disorder (positive with suicidal ideation and negative with appetite and sleep problems) and alcoholism (positive with guilt and negative with appetite and sleep problems), help-seeking (positive with suicidal ideation and negative with psychomotor changes and fatigue) and age at onset (positive with sleep problems and negative with depressed mood, loss of interest and suicidal ideation). While this trend was far from uniform (and did not appear to involve trouble concentrating in either group), we judged it sufficiently prominent to warrant further investigation.
We therefore examined, in our sample of individuals with MD in the last year, the ability of the sum of the symptoms endorsed in these two groups to predict the same set of validators we examined for the individual symptoms. Table 2 shows the results of these analyses.
Table 2. Comparison of cognitive and neurovegetative criteria count (logistic regression)
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20160921043524214-0417:S0033291709992157:S0033291709992157_tab2.gif?pub-status=live)
n, Sample size, differences due to missing data; BL, binary logit model; CL, cumulative logit model, covariates included in the model: sex, age and symptom count of the other symptom group; MD, major depression; OR, odds ratio; CI, confidence interval.
Cognitive symptoms: depressed mood; loss of interest; feeling worthless/guilty; suicidal ideation.
Neurovegetative symptoms: appetite/weight change; sleep; psychomotor agitation/retardation; fatigue.
Not counted: trouble concentrating.
a One-tailed p values, zygosity included as covariate in the model.
Two-tailed p values:
* p<0.1, ** p<0.05, *** p<0.01, † p<0.001, ‡ p<0.0001.
ORs, 95% CI and levels of statistical significance are presented, controlling only for age, sex and the number of endorsed criteria symptoms of the other symptom group. We also controlled for zygosity when the ‘MD diagnosis in co-twin’ was the dependent variable. For the cumulative logit models, the ORs of 1 s.d. increase of the dependent variable are reported in the table.
Strikingly, for 17 of the 25 validator variables, the predictive power of the two symptom groups was significantly different from one another. Out of these results, eight were noteworthy.
(1) The number of endorsed cognitive criteria better predicted co-morbidities with other disorders than did the number of neurovegetative criteria, and these differences were significant except for GAD and our broad definition of panic.
(2) The number of cognitive criteria better predicted both higher neuroticism scores and lower extraversion scores and these differences were also significant.
(3) Regarding characteristics of the index episode, the cognitive criteria count significantly better predicted the duration of the episode, chronic MD (defined as a depressive episode lasting 12 months or longer) and seeking help, while the number of neurovegetative criteria better predicted that the depressive episode occurred ‘out of the blue’.
(4) The cognitive symptom count also significantly better predicted the number of prior episodes.
(5) The two symptom scores did not predict age at interview while the neurovegetative criteria count better predicted being female; this difference was significant.
(6) Regarding the other demographic characteristics, the cognitive criteria count was associated with fewer years of schooling and lower income; these differences were also significant.
(7) The cognitive criteria count significantly better predicted childhood sexual abuse.
(8) We found no significant differences between the two criteria counts regarding the prediction of a new depressive episode and an MD diagnosis in the co-twin.
Discussion
The aim of our analysis was to examine carefully the validity of the individual DSM-IV symptomatic or ‘A’ criteria for MD. In a large epidemiological sample of individuals who reported episodes of MD in the last year, we began by comparing the nine criteria on a wide range of potentially validating characteristics, none of which played any role in the diagnostic process. Our analyses focused on potential similarities and differences in the patterns of association of the nine criteria symptoms with these characteristics. In these analyses, we controlled for the number of endorsed A criteria so that an OR >1 meant that the presence of this criterion predicted the validator more strongly than expected for individuals with the same number of endorsed symptoms without that particular criterion. An OR <1 meant that the presence of this criterion predicted the validator more poorly than expected for individuals with the same number of endorsed symptoms without that particular criterion.
These results revealed an unexpected degree of heterogeneity in the performance of the individual criteria. Indeed, each of the nine criteria had a relatively distinct pattern of findings with regard to our set of validator variables, suggesting a surprising degree of evidence for ‘covert heterogeneity’ within the MD syndrome. The relationships we observed support the notion of heterogeneity of the classification of MD discussed in the literature (e.g. Merikangas et al. 1994; Kendler et al. 1996; Chen et al. 2000; Carragher et al. 2009; Hybels et al. 2009).
To further characterize this heterogeneity, we also examined an intermediate level between the MD diagnosis and the single criterion by comparing the number of endorsed criteria separately for two symptom groups – cognitive and neurovegetative – through the same set of validator variables. Of note, in recent exploratory factor analyses of depressive symptoms run with the entire VATSPSUD sample, we found a good statistical fit for a two-factor model with one factor reflecting cognitive and the other neurovegetative symptoms. When we compared these two symptom groups, we did not control for the total number of endorsed symptoms. Rather, we specifically examined whether the predictive power differed between the two groups of criteria. As our results show, some of this heterogeneity can be represented by the differences between the number of endorsed cognitive and neurovegetative criteria, with these differences being significant for most of the validator variables. In particular, co-morbidities with other psychiatric disorders and the personality traits – higher neuroticism and lower extraversion – were most strongly associated with the cognitive criteria group, while we found significant sex differences (predicting being female) for the number of endorsed neurovegetative criteria.
Some of our individual findings have precedents in the literature. For example, Keller et al. (2007) found that thoughts of self-harm were not associated with adverse life events, which is partly reflected in our finding that suicidal ideation predicted the depressive episode being experienced as coming ‘out of the blue’. Although sex differences for psychomotor changes and fatigue have already been reported for our sample by Khan et al. (2002), our approach was probably not sensitive enough to also show the sleep differences that Khan et al. reported in their study of matched twins. Corresponding to our results, Angst and Dobler-Mikola (1984) found that females report weight change more often when diagnosed with MD. Their finding of females reporting more feelings of worthlessness/guilt is not supported by our results. In addition, other studies found no sex differences for expressed depressive symptoms (Middeldorp et al. 2006). Furthermore, some associations between substance use disorders and psychomotor agitation and appetite/weight loss were reported in the literature (for example, Balazs et al. 2006; Leventhal et al. 2008), but they are not comparable with our results because of the use of disaggregated symptoms in these studies.
The question of homogeneity versus heterogeneity of MD, including the distinction of ‘types’ of depression by symptom groups or causes (endogenous versus reactive), has a long history in psychiatry (for the homogeneity/unitary position, see Mapother, 1926; Lewis, 1934; Kendell, 1969; Akiskal & McKinney, 1975; for the heterogeneity/binary position, see, for example, Gillespie, 1929; Kiloh & Garside, 1963; Eysenck, 1970). Because of a lack of empirical support for a clear binary concept of MD, the DSM-III and DSM-IV definitions were based on the unitary position, accompanied by the hope of finally settling these debates (see Parker, 2000; van Praag, 2000). However, the validity of the current DSM-IV classification of MD has been questioned (see e.g. Parker, 2005; Zimmerman et al. 2006b). The amount of heterogeneity that we found when comparing the performance of the criteria symptoms supports these concerns. Yet, our results indicate that only some of this heterogeneity can be captured by a simple binary structure of cognitive versus neurovegetative symptoms.
Our results also have implications for the basic structure of DSM diagnostic categories. Beginning with DSM-III, ‘polythetic’ diagnostic models formed the basis for DSM diagnoses. Such models implicitly assume an approximate equivalence of the individual criteria – that is, that each criterion plays a similar and roughly interchangeable role in the diagnostic process. Our results question this assumption for the MD symptomatic criteria. In fact, we found that different symptoms or groups of symptoms of MD are associated with different clinical characteristics.
While each DSM revision has been justifiably proud of the increasing role of empirical findings in making nosologic decisions, in fact most of the diagnostic categories and the diagnostic criteria they contain have been accepted for historical rather than strictly empirical reasons (Kendler & Zachar, 2008). Given the deep clinical wisdom and experience contained in these traditions, this is not an illogical approach. However, these results provide some challenge to this approach and, at a minimum, suggest that detailed psychometric evaluation of these historically defined diagnostic entities is overdue.
Finally, we find that cognitive–emotional criteria are rather consistently stronger predictors of our validators than are the neurovegetative criteria. In contrast, common clinical teaching highlights the neurovegetative or ‘biological’ symptoms as the ‘core’ features of depression. These results are broadly consistent with the work of Beck and colleagues, which emphasizes the primacy of cognitive changes in the etiology of MD (Beck & Alford, 2008).
Limitations
These results should be interpreted in the light of four potentially important methodological concerns. First, our sample is limited to white twins born in the Commonwealth of Virginia and these results may or may not extrapolate to other samples. Second, the nature of our validator variables made it difficult to account formally, in most of our analyses, for the non-independence of observations in our twin data. However, only ∼17% of our data result from twin pairs and by exploring formal corrections for the binary logit models we found no substantial effects. Third, given the number of tests we performed, some of the significant findings are surely a result of chance effects. To correct formally for multiple testing, a strict Bonferroni correction (see Abdi, 2007) would require setting the p value to around 0.0002. This would surely be too conservative, because of the substantial correlations between a number of symptoms and between some of the validators. Furthermore, our goal here is not so much to evaluate strictly every single hypothesis; rather, we advocate taking an experiment-wide approach, asking whether the broad pattern of relationships between symptoms and validators is likely to be a chance effect. As we demonstrate, this is very unlikely to be the case, as the total number of significant findings far exceeds that expected under the null hypothesis. While we cannot be confident that any one of our findings is not due to chance effects, the probability that the entire pattern of results arose solely from false positives due to multiple testing is extremely low.
Fourth, we created the two symptom groups based partly on our initial results and partly on conceptual and clinical grounds. Only the decision to drop ‘trouble concentrating’ was based on empirical results from our single symptom analysis. Other clusters of symptom groups could possibly have revealed other structures. However, the distinction between cognitive and neurovegetative symptoms has long precedence in the depression literature (Koenig et al. 1993; Beck & Alford, 2008; Carragher et al. 2009) and has been empirically supported by factor analyses conducted subsequently in this sample.
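For the multiple-testing point raised in the third limitation, the Bonferroni threshold quoted there can be reproduced with the short calculation below. It assumes a family of 225 tests (nine criteria by 25 validators) at a nominal level of 0.05; the variable names are illustrative.

```python
n_criteria = 9
n_validators = 25
alpha = 0.05

# Bonferroni: divide the nominal level by the number of tests in the family.
n_tests = n_criteria * n_validators
bonferroni_threshold = alpha / n_tests

print(f"{n_tests} tests, per-test threshold = {bonferroni_threshold:.5f}")
```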
Conclusion
In individuals meeting DSM-IV criteria for MD in the last year in an epidemiological sample, the individual symptomatic criteria showed strongly varying patterns of relationships with a broad set of clinically relevant validators including demographic characteristics, co-morbidities, characteristics of depressive episodes and personality traits. This pattern is partly reflected by a dichotomy between cognitive and neurovegetative symptoms. Therefore, regarding the characteristics represented by endorsed symptoms, the current DSM-IV definition of MD represents a construct of substantial heterogeneity. These results challenge our understanding of MD as a homogeneous categorical entity. Hopefully, future research will determine whether a distinction between cognitive and neurovegetative symptoms is clinically relevant to the etiology, course and treatment of MD.
Acknowledgements
This work was supported by the American Psychiatric Association and NIH grants MH-0828 and MH/DA/AA 49492.
Declaration of Interest
None.