Introduction
Depressive disorders are common mental health afflictions. In the United States alone, lifetime morbid risk of major depressive disorder is 29.9% and the 12-month prevalence is estimated to be between 6.6% and 10.3% (Kessler et al. 2003, 2005, 2012; Reeves et al. 2011). Psychological treatments for depression are effective (Cuijpers et al. 2008) and desirable to patients (Priest et al. 1996; Brody et al. 1997; Bedi et al. 2000; Churchill et al. 2000; Dwight-Johnson et al. 2000). Cognitive behavioral therapy (CBT) is the most studied psychotherapy for the treatment of depression and carries the strongest body of evidence for its effectiveness (Dobson, 1989; Butler et al. 2006; Cuijpers et al. 2013). Increasingly, the delivery of CBT through different media has been investigated, including via telephone (Mohr et al. 2005), internet websites (Andersson & Cuijpers, 2009), and bibliotherapy (Cuijpers, 1997). While CBT is effective when delivered across many different media, improvement is not inevitable in any of them.
A vast literature on baseline predictors of outcome for CBT for depression exists. Most predictors tend to be prognostic indicators, or factors that identify patients who may do better in treatment in general. Positive outcomes are often found among patients with: (1) lower symptom severity (Jarrett et al. 1991; Sotsky et al. 1991; Shapiro et al. 1994; Thase et al. 1994; Agosti & Ocepek-Welikson, 1997; Persons et al. 1999; Hamilton & Dobson, 2002; Coffman et al. 2007); (2) shorter current episode duration (Rush et al. 1977; Sotsky et al. 1991; Hamilton & Dobson, 2002); (3) absence of family history of depression (Sotsky et al. 1991); (4) older age of initial onset (Jarrett et al. 1991; Sotsky et al. 1991; Agosti & Ocepek-Welikson, 1997; Hamilton & Dobson, 2002); (5) lower number of previous episodes (Sotsky et al. 1991; Hamilton & Dobson, 2002; Bockting et al. 2006; Fournier et al. 2009); and (6) absence of co-morbid conditions (Reich et al. 1995; Gelhart & King, 2001; Driessen & Hollon, 2010).
Identifying prognostic variables is difficult and applying them to clinical practice introduces additional challenges. Statistical techniques for identifying moderators are often underpowered to uncover these relationships in individual studies (Brown et al. 2013). Furthermore, most studies explore only a few variables that investigators selected a priori. Such findings can show only that the selected variable was meaningful, not whether another variable would have been more useful. Practically, clinicians need to know which factors matter most when judging prognosis. However, attempting to predict outcomes from multiple predictors might lead to classification issues, especially in cases of high multicollinearity between predictors. Classification and Regression Tree (CART) analysis is a data-mining technique that uses recursive binary partitioning to select optimal splits of predictor variables to obtain increasingly homogeneous groups with respect to the outcome (Breiman et al. 1984; King & Resick, 2014). The benefits of this method over more commonly used analyses for predicting outcomes include its interpretability (i.e. the trees express if-then conditions that are often considered easy to interpret), its capacity to include many possible predictor variables, and its robustness to violations of the normality and linearity assumptions to which typical regression models are bound. Thus, CART is well suited to exploratory analyses using a large number of variables.
In exploring predictors of treatment response to CBT, it is important to understand them in the context of different forms of delivery beyond face-to-face (FtF-CBT). The most prominent alternative delivery medium for CBT is the telephone (T-CBT). For the treatment of depression, T-CBT and FtF-CBT produce similar changes in depression severity at treatment completion (Mohr et al. 2012), and produce similar therapeutic alliance ratings from both the patient and therapist perspectives (Stiles-Shields et al. 2014b). As the use of the telephone to provide psychotherapy becomes more widespread (Novotney, 2011), it will be important to understand whether any subset of patients might be more likely to benefit from a particular approach. A number of studies have examined the usefulness of patient variables, such as marital status, employment status, severity of baseline depression, presence of personality disorders, stress, reactance, and internalizing v. externalizing coping to differentially predict responsiveness to various forms of psychotherapy and pharmacotherapy (Beutler et al. 1991; Dimidjian et al. 2006; Fournier et al. 2008). In addition, people with a diagnosis of co-morbid anxiety at baseline experience less benefit from T-CBT, relative to FtF-CBT (Stiles-Shields et al. 2014a). Thus, it is worth exploring whether predictors of outcome are similar across T-CBT and FtF-CBT.
The primary aim of this study was to use CART analyses to explore patient predictors of response to CBT for depression in T-CBT and FtF-CBT. An exploratory aim of the study was to examine whether patient predictors vary by treatment delivery method (i.e. telephone v. face-to-face).
Method
This study is a secondary analysis of data from a randomized controlled trial comparing the efficacy and retention rates of T-CBT and FtF-CBT in a cohort of 325 depressed participants (Mohr et al. 2012).
Participants
Recruitment of participants occurred from November 2007 to December 2010 from primary-care clinics located in an academic medical center in the Chicago area.
Participants were eligible for randomization if they met criteria for major depressive disorder, had a minimum score of 16 on the Hamilton Depression Rating Scale (HAMD), were at least 18 years of age, spoke English, and were able to participate in face-to-face or telephone therapy. Exclusion criteria included having visual or hearing impairments preventing participation, meeting criteria for depression of an organic etiology or a severe psychiatric disorder, reporting severe alcohol or substance abuse, meeting criteria for dementia, exhibiting severe suicidality (i.e. plan and intent), receiving or planning to receive individual psychotherapy, or having initiated antidepressant pharmacotherapy in the previous 10 days.
In compliance with the University's Institutional Review Board, participants were sent a consent form. Research staff reviewed the consent with them over the phone and participants were given an opportunity to ask questions. Consent forms were signed and returned prior to baseline interviews.
Treatments
Participants were randomized to either T-CBT or FtF-CBT, stratified by antidepressant status and research study therapist (n = 9) by a blinded statistician. The treatment delivery medium was the only experimental factor to vary between the two groups, with both treatments using the same CBT protocol (Beck, 1995) adapted and validated for use over the phone (Mohr et al. 2005). To control for therapist effects, PhD-level psychologists acted as therapists for both conditions. All therapists received training and supervision from the Beck Institute for Cognitive Behavioral Therapy. All therapy sessions were recorded and 8% were randomly selected and rated by the supervisor on the Cognitive Therapy Rating Scale (Vallis et al. 1986) for fidelity. Further details of therapist training and fidelity are reported elsewhere (Mohr et al. 2012).
FtF-CBT participants were seen in the Preventive Medicine Clinic at Northwestern University, whereas T-CBT was conducted exclusively via the telephone. Participants in the T-CBT condition received instructions to conduct the telephone sessions in a private, safe, and distraction-free environment. All participants received eighteen 45-min sessions, with two sessions weekly for the first 2 weeks, followed by 12 weekly sessions, and two final booster sessions over 4 weeks. Participants also received a client workbook that explained CBT concepts and provided worksheets for topics including behavioral activation, cognitive restructuring, and social support. Optional chapters addressed common co-morbidities, such as anxiety management, relaxation training, assertiveness training, anger management, and insomnia.
Assessment
CART analyses included measures of depression severity, measured at baseline and end of treatment (week 18). Self-reported depression severity was measured using the Patient Health Questionnaire-9 (PHQ-9; Kroenke & Spitzer, 2002), which has high internal consistency (Cronbach's alphas were 0.75–0.91 for this trial) and face validity (Corson et al. 2004). Interviewer-based depression severity was evaluated using the 17-item HAMD (Hamilton, 1960). Bachelor-level research assistants, who were trained and supervised by a licensed clinical psychologist, administered the HAMD. To ensure inter-rater reliability, one audiotape of the HAMD assessment was randomly selected every 1–2 weeks for calibration ratings with all evaluators. The mean intraclass correlation was 0.96.
In addition to measures of depression severity, bachelor-level research assistants administered the Mini International Neuropsychiatric Interview (MINI; Sheehan et al. 1997) at baseline over the telephone. This semi-structured diagnostic interview diagnosed any DSM-IV-TR co-morbid conditions (APA, 2000). Baseline data entered into the CART analysis included age, sex, ethnicity, race, marital status, education, employment status, household income, antidepressant medication status, trauma and abuse history (Wolfe & Kimberling, 1997), presence of a co-morbid anxiety disorder from the MINI (Sheehan et al. 1997; Stiles-Shields et al. 2014a), the HAMD total score, the PHQ-9 total score, the Insomnia Severity Index total score (ISI; Doghramji, 2006), the Life Experiences Survey total score (LES; Sarason et al. 1978), the Generalized Anxiety Disorder-7 total score (GAD-7; Spitzer et al. 2006), the Alcohol Use Disorders Identification Test total score (AUDIT; Babor et al. 1992), the Positive and Negative Affect Schedule total score (PANAS; Watson et al. 1988), Medical Outcomes Study 36 total and subscale (Vitality, Physical Functioning, Bodily Pain, General Health Perceptions, Physical Role Functioning, Emotional Role Functioning, Social Role Functioning, Mental Health) scores (SF-36; Brazier et al. 1992), the Brief Symptom Inventory total and subscale (Somatization, Obsessive-Compulsive, Interpersonal Sensitivity, Depression, Anxiety, Anger-Hostility, Phobic Anxiety, Paranoid Ideation, Psychoticism) scores (Lehman et al. 2012), the Coping Self-Efficacy Scale total score (CSE; Chesney et al. 2006), the Life Stressors and Social Resources total score (LISRES-A; Moos et al. 1988), the Nijmegen Motivation Questionnaire-2 total score (NML-2; Keijsers et al. 1999), Outcome Expectations Questionnaire – Patient Version, Perceived Barriers to Psychotherapy total score (PBP; Mohr et al. 2006), Social Provisions Scale total score (SPS; Russell & Cutrona, 1984), Scale for Interpersonal Behavior total score (SIBS; Arrindell & van der Ende, 1985), Perceived Stress Scale total score (PSS; Cohen et al. 1983), and Apathy Evaluation Scale total score (AES; Marin et al. 1991). Treatment assignment was also included to explore whether predictors would vary as a function of treatment delivery medium.
Data analysis
CART analyses use recursive partitioning algorithms to find optimal ‘splits’ of any variable to separate responders from non-responders. For each systematic split or step, one or more rules are assessed to determine how or whether to proceed down the tree. This process results in a visualization that is reminiscent of an inverted tree, with a single root at the top that leads to branches, ending in non-branching leaves at the bottom. Each split or stop in a branch is referred to as a node. Each node denotes a predictor variable critical for that decision point and reports the share of the sample affected by that variable for the prediction (for a more detailed description of CART analyses in psychological treatment research, see King & Resick, 2014).
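As a rough illustration of how a single 'split' is chosen, the sketch below scans candidate thresholds on one predictor and keeps the one that gives the lowest weighted Gini impurity in the two resulting groups. It is a minimal Python sketch, not the rpart code used in this study; the function names and the toy scores are hypothetical and are not trial data.

```python
from typing import List, Tuple


def gini(labels: List[int]) -> float:
    """Gini impurity of a list of 0/1 labels (1 = responder)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_split(values: List[float], labels: List[int]) -> Tuple[float, float]:
    """Return (threshold, weighted impurity) of the best binary split.

    Cases with value < threshold go to the left group, the rest to the right.
    If no split improves on the unsplit node, the threshold is NaN.
    """
    n = len(values)
    best_threshold, best_impurity = float("nan"), gini(labels)
    for threshold in sorted(set(values))[1:]:
        left = [y for x, y in zip(values, labels) if x < threshold]
        right = [y for x, y in zip(values, labels) if x >= threshold]
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / n
        if impurity < best_impurity:
            best_threshold, best_impurity = threshold, impurity
    return best_threshold, best_impurity


# Hypothetical toy data (not from the trial): one predictor score per person
# and whether that person responded to treatment.
scores = [80, 90, 95, 100, 104, 110, 120, 130]
responded = [0, 0, 1, 0, 1, 1, 1, 1]
threshold, impurity = best_split(scores, responded)
print(threshold, impurity)
```

A full tree repeats this search over every predictor at every node and then recurses into the two resulting groups, which is why later splits depend on earlier ones.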
CART analyses were conducted with all baseline covariates. Treatment response was defined as end of treatment scores below 16 and 10 for the HAMD and PHQ-9, respectively. The cut-off of 16 on the HAMD was chosen because it produces a sample comparable to that of a PHQ-9 cut-off of 10, which is consistent with the MacArthur recommendations for referrals to psychotherapy at the cut-off for mild depressive symptoms (The MacArthur Foundation's Initiative on Depression and Primary Care, 2004). The cut-offs for both measures create a sample of roughly 2/3 response, which is consistent with response rates for the treatment of depression with CBT (Driessen & Hollon, 2010).
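The response definition can be written as a simple dichotomization. The following Python sketch uses the two cut-offs stated above; the function name and the example scores are hypothetical and serve only to show where the boundary falls.

```python
from typing import List

HAMD_CUTOFF = 16  # response: end-of-treatment HAMD below 16
PHQ9_CUTOFF = 10  # response: end-of-treatment PHQ-9 below 10


def is_responder(end_score: float, cutoff: float) -> bool:
    """A participant responds when the end-of-treatment score is below the cut-off."""
    return end_score < cutoff


# Hypothetical end-of-treatment HAMD scores, not trial data.
hamd_end: List[float] = [8, 15, 16, 22]
hamd_response = [is_responder(score, HAMD_CUTOFF) for score in hamd_end]
response_rate = sum(hamd_response) / len(hamd_response)
print(hamd_response, response_rate)
```

Note that a score exactly at the cut-off (16 on the HAMD) counts as non-response under this definition.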
The trees were fit using the rpart package in R version 3.0.1 (R Core Team, 2013), and pruned with a complexity parameter set at the largest value that was within a standard error of the minimum cross-validated error.
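The pruning rule described above is often called the one-standard-error rule. The Python sketch below shows the selection logic only, under the assumption that a table of complexity parameters with cross-validated error and its standard error is already available; the field names loosely mirror the `CP`, `xerror`, and `xstd` columns of an rpart complexity table, and the numbers are invented for illustration.

```python
from typing import List, NamedTuple


class CpRow(NamedTuple):
    cp: float      # complexity parameter
    xerror: float  # cross-validated error
    xstd: float    # standard error of the cross-validated error


def one_se_cp(table: List[CpRow]) -> float:
    """Largest cp whose cross-validated error is within one SE of the minimum."""
    best = min(table, key=lambda row: row.xerror)
    limit = best.xerror + best.xstd
    return max(row.cp for row in table if row.xerror <= limit)


# Hypothetical complexity table, not output from the study.
table = [
    CpRow(0.200, 1.00, 0.08),
    CpRow(0.050, 0.80, 0.07),
    CpRow(0.020, 0.74, 0.07),
    CpRow(0.010, 0.72, 0.07),
    CpRow(0.005, 0.75, 0.07),
]
chosen_cp = one_se_cp(table)
print(chosen_cp)
```

Choosing the largest qualifying value favors the smallest tree whose cross-validated error is statistically indistinguishable from the best one, which guards against overfitting.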
For increased understanding and validation of the findings from the CART analyses, Random Forest analyses were also run (Breiman, 2001). Random Forests is a method of generating many trees and aggregating their results. Each tree is constructed using a bootstrap sample of the dataset, and each node in these trees is split using the best among a subset of predictors randomly chosen at that node. An estimate of the error rate for Random Forests is obtained by predicting the data not in the bootstrap sample at each bootstrap iteration [referred to as ‘out of the bag’ (OOB) data]. The OOB data are also used to determine variable importance, which is measured by examining how much the OOB prediction error increases when the data for that variable are permuted, with all other variables left unchanged (Breiman, 2001; Liaw & Wiener, 2002). Random Forests were run using the randomForest package in R version 3.0.1 (R Core Team, 2013) to obtain the OOB error rate and variable importance for the HAMD and PHQ-9 prediction trees.
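The two ideas in this paragraph, the out-of-bag cases left over by a bootstrap draw and the rise in error when one variable is permuted, can be sketched in a few lines of Python. This is a simplified illustration rather than the randomForest implementation: a real forest computes the importance tree by tree on each tree's own OOB cases and averages the result, whereas here a single hypothetical stand-in classifier and invented data are used.

```python
import random
from typing import Callable, List, Tuple

Row = List[float]
Classifier = Callable[[Row], int]


def bootstrap_split(n: int, rng: random.Random) -> Tuple[List[int], List[int]]:
    """Draw n indices with replacement; indices never drawn are out of bag."""
    in_bag = [rng.randrange(n) for _ in range(n)]
    drawn = set(in_bag)
    out_of_bag = [i for i in range(n) if i not in drawn]
    return in_bag, out_of_bag


def error_rate(predict: Classifier, rows: List[Row], labels: List[int]) -> float:
    """Share of cases the classifier gets wrong."""
    wrong = sum(1 for row, label in zip(rows, labels) if predict(row) != label)
    return wrong / len(rows)


def permutation_importance(predict: Classifier, rows: List[Row],
                           labels: List[int], column: int,
                           rng: random.Random) -> float:
    """Increase in error after shuffling one column, other columns unchanged."""
    baseline = error_rate(predict, rows, labels)
    shuffled = [row[column] for row in rows]
    rng.shuffle(shuffled)
    permuted = [row[:column] + [value] + row[column + 1:]
                for row, value in zip(rows, shuffled)]
    return error_rate(predict, permuted, labels) - baseline


def stand_in_classifier(row: Row) -> int:
    """Hypothetical fitted model that looks only at the first column."""
    return int(row[0] >= 104)


# Invented data: column 0 drives the label, column 1 is noise.
rng = random.Random(0)
rows = [[80.0, 1.0], [90.0, 0.0], [100.0, 1.0],
        [110.0, 0.0], [120.0, 1.0], [130.0, 0.0]]
labels = [0, 0, 0, 1, 1, 1]

in_bag, out_of_bag = bootstrap_split(len(rows), rng)
importance_used = permutation_importance(stand_in_classifier, rows, labels, 0, rng)
importance_ignored = permutation_importance(stand_in_classifier, rows, labels, 1, rng)
print(out_of_bag, importance_used, importance_ignored)
```

A variable the model never uses has an importance of exactly zero in this sketch, because shuffling it cannot change any prediction.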
Results
Participants
Baseline participant demographic and clinical characteristics are displayed in Table 1. Among the 325 participants entered into the trial, there were no significant differences in demographics across treatment groups.
FtF-CBT, Face-to-Face cognitive behavioral therapy; T-CBT, telephone cognitive behavioral therapy; HAMD, Hamilton Depression Rating Scale; PHQ-9, Patient Health Questionnaire-9; GAD-7, Generalized Anxiety Disorder Questionnaire-7; PANAS, Positive and Negative Affect Schedule; LES, Life Experiences Survey; NML, Nijmegen Motivation; SPS, Social Provisions Scale; CSE, Coping Self-Efficacy Scale; AUDIT, Alcohol Use Disorders Identification Test; PBP, Perceived Barriers to Psychotherapy; SIBS, Scale for Interpersonal Behavior; PSS, Perceived Stress Scale; AES, Apathy Evaluation Scale; SF-36, Medical Outcomes Study 36.
Predictors of treatment response and non-response based on the HAMD
The treatment response rate based on HAMD < 16 was 66.2% (49.5% for T-CBT and 50.5% for FtF-CBT). The CART model, pruned using the prune command, generated a tree based on the outcome of dichotomized HAMD scores at end of treatment. Fig. 1 displays the pruned tree. The model predicted that 231 participants would be treatment responders; 197 (85.3%) of these were accurately predicted. Of the 56 participants who were predicted to be treatment non-responders, 48 (85.7%) were accurately predicted.
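The two accuracy figures for the HAMD tree are the share of correct predictions within each predicted group. A minimal Python check using the counts reported above (the helper name is hypothetical):

```python
def percent_correct(correct: int, predicted: int) -> float:
    """Percentage of a predicted group that was classified correctly."""
    return round(100.0 * correct / predicted, 1)


# Counts reported for the HAMD tree.
responder_accuracy = percent_correct(197, 231)
non_responder_accuracy = percent_correct(48, 56)
print(responder_accuracy, non_responder_accuracy)  # 85.3 85.7
```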
Variables (and scores) that predicted response included CSE (⩾ 104), baseline depression severity (HAMD < 23, PHQ-9 < 17), physical functioning (PF ⩾ 27), social support (SPS < 65), education level [some high school, general education diploma (GED), or bachelor's/master's degree], and employment status (employed or unemployed). Variables (and scores) that predicted non-response included CSE (< 104), baseline depression severity (HAMD ⩾ 23, PHQ-9 ⩾ 17), PF (< 27), social support (SPS ⩾ 65), education level (some college or professional degree), and employment status (disability or retired). No other variables, including treatment assignment to FtF-CBT or T-CBT, were related to response.
Predictors of treatment response and non-response based on the PHQ-9
The treatment response rate based on PHQ-9 < 10 was 73.8% (50.2% for T-CBT and 49.8% for FtF-CBT). The CART model, pruned using the prune command, generated a tree based on the outcome of dichotomized PHQ-9 scores at end of treatment. Fig. 2 displays the pruned tree. The model predicted that 240 participants would be treatment responders; 204 (85.0%) of these were accurately predicted. Of the 46 participants predicted to be treatment non-responders, 39 (85.0%) were accurately predicted.
Variables (and scores) that predicted response included CSE (⩾ 104), baseline depression severity (PHQ-9 < 17, HAMD < 26), education level (some college, bachelor's degree, or professional degree), and being male. Variables (and scores) that predicted non-response included CSE (< 104), baseline depression severity (PHQ-9 ⩾ 17, HAMD ⩾ 26), education level (some high school, GED or master's degree/PhD), and being female. No other variables, including treatment assignment to FtF-CBT or T-CBT, were related to response.
Variables of importance
The Random Forest analyses yielded an OOB error rate of 30.8% for both the HAMD and PHQ-9 models. The top values of variable importance for both the HAMD and PHQ-9 are presented in Table 2. The top variables of importance for the HAMD Random Forests were education, CSE, interviewer-based assessment of depression through the HAMD, age, and self-reported anxiety through the GAD-7. The top variables of importance for the PHQ-9 Random Forests were CSE, self-report depression through the PHQ-9, interviewer-based assessment of depression through the HAMD, education, and self-reported anxiety through the GAD-7.
HAMD, Hamilton Depression Rating Scale; PHQ-9, Patient Health Questionnaire-9; CSE, Coping Self-Efficacy Scale; GAD-7, Generalized Anxiety Disorder Questionnaire-7.
Discussion
Demographic and psychological characteristics at baseline of participants receiving outpatient T-CBT and FtF-CBT for depression accurately identified 85.3% and 85.0% of treatment responders and 85.7% and 85.0% of treatment non-responders with response defined by the HAMD and PHQ-9, respectively. CSE, baseline depression severity, and education consistently predicted outcome using both the HAMD and PHQ-9. Social support, PF, and employment emerged as predictors only for the HAMD, and sex predicted response on the PHQ-9. Treatment delivery method (i.e. telephone or face-to-face) and presence of co-morbid anxiety did not impact the prediction of outcome. These findings were supported through the Random Forests top variables of importance, including CSE, baseline depression, and education.
Participants' baseline score on the CSE, which measures a person's confidence in his or her ability to cope effectively with situations that are appraised to be stressful (Chesney et al. 2006), was the primary predictive branch of the CART trees for response to treatment, regardless of how depression was measured. This measure alone predicted positive response outcomes for 41.3% and 40.2% of the total sample for the HAMD and PHQ-9, respectively. The value identified, surprisingly equivalent for both the HAMD and PHQ-9 (CSE = 104), is indicative of a moderate level of CSE according to previously established norms (Chesney et al. 2006), meaning that those with moderate levels of CSE or higher are more likely to respond relative to those with lower levels. To our knowledge, CSE has not been directly investigated or reported as a predictor of outcome in CBT for depression. However, studies investigating related concepts have shown findings consistent with these. For example, resourcefulness (Simons et al. 1985) as well as increased stressful life events (Fournier et al. 2009) have both been found to be prescriptive predictors of which patients benefit more from CBT compared to antidepressant medication.
The CSE taps a person's confidence in his or her ability to cope effectively with situations that are appraised to be stressful (Chesney et al. 2006). CBT often requires patients to confront difficult or stressful thoughts or situations through cognitive restructuring, exposure, behavioral experiments, and other strategies (Beck, 1995; Simos & Hofmann, 2013). Patients who come into therapy with high levels of CSE may be more willing and able to tolerate the distress created by these strategies and more likely in general to persevere and succeed. This suggests that patients with low CSE may achieve better results if greater focus on enhancing CSE is provided earlier in treatment, which may give patients more confidence in discussing and facing difficult life situations. Further research is needed to cross-validate these findings and explore their treatment implications.
Notably, the CART analyses found that, for both the PHQ-9 and HAMD outcomes, moderate to high CSE was the optimal split predicting response, with no additional covariates necessary. For those with low CSE, two constructs provided additional predictive value consistently across both models: depression and education. The finding that education consistently predicts response is at odds with the available evidence indicating that education is not predictive of outcome in CBT (Jarrett et al. 1991; Hamilton & Dobson, 2002). Indeed, no gradient was identified in the current analyses: a master's degree was grouped with some high school, GED, and college in one branch, whereas some college education and professional degrees were grouped in another. Furthermore, some of the educational categories (e.g. some college, professional degree) predict response for one depression outcome, and non-response in the other.
By contrast, the finding that baseline depressive symptom severity contributes to the prediction of treatment response is consistent with and extends previous findings (Jarrett et al. 1991; Sotsky et al. 1991; Shapiro et al. 1994; Thase et al. 1994; Agosti & Ocepek-Welikson, 1997; Persons et al. 1999; Hamilton & Dobson, 2002; Coffman et al. 2007). However, severity of depressive symptoms was only valuable as a predictor of treatment response among those with low CSE, suggesting that greater severity of symptoms only exerts its negative influence among those who do not have confidence in their coping skills to manage stress and distress. This finding points to the utility of CART analyses in uncovering potentially complex relationships among predictor variables.
Other variables provided predictive value, although not consistently across the models. In the HAMD model, among those with lower CSE, more depressed individuals and those with higher perceived social support were at greatest risk for non-response. This finding is inconsistent with the literature on the influence of social support on outcomes for treatment of depression (George et al. 1989); however, given that social support was not a top variable of importance in the Random Forests, we are hesitant to over-interpret these findings.
Among individuals with low CSE and lower depressive severity, poorer PF predicted non-response in the HAMD model. Among those with both low CSE and higher depressive severity, social support and employment status further differentiated responders from non-responders. However, the grouping of employment (disabled and retired v. employed and unemployed) suggested that the variable might be acting as a surrogate for PF. Sex entered only the PHQ-9 model, with women with low CSE and higher depressive severity at greater risk of non-response than their male counterparts. However, the inconsistency of these variables across the models suggests that the findings may not be reliable.
Randomization to T-CBT or FtF-CBT did not appear in the CART analyses as a predictor variable. This is consistent with the findings of the parent trial, which found no difference in depression severity at post-treatment across the two treatment arms (Mohr et al. 2012). Thus, this investigation found no support for any prescriptive variables; it appears that the characteristics that predict response held regardless of the modality used. This might be due to the fact that the treatments were quite similar and the only difference in modality was whether the patient was in the room with the therapist or over the phone. Prescriptive variables might be more important for treatments that differ more markedly in mechanism of action such as psychotherapy v. medication. However, for a given type of treatment (e.g. CBT), prescriptive variables might be more valuable for media that differ more markedly, such as face-to-face v. internet CBT.
A number of other variables that the literature has identified as predictors of CBT outcome, such as stress, motivation for treatment, and co-morbid anxiety, were included in the CART analyses, yet they did not emerge as predictors in this sample. It may be that CSE and depressive symptom severity account for most of the variance, leaving little room for other variables, which might, on their own, have some predictive power. Specifically, CSE may be associated with traits such as anxiety, and studies examining anxiety may not have accounted for coping, as it has not previously been identified as a potential confounder. This demonstrates the potential strength of CART analyses to identify new relationships one might not have predicted that could be more important than previously established relationships. It also demonstrates the clinical utility of CART analyses, as the interaction options provided by trees map more closely to intuitive clinical decision making and are a further step towards personalized medicine.
There are several limitations and caveats that should be considered in interpreting these results. First, as a secondary, exploratory analysis, these findings should be viewed with caution until they are replicated or refuted. Second, this trial examined CBT for depression; it is unclear how these results generalize to other forms of psychotherapy, other mental health conditions, and other treatment delivery media. Third, while the sample was ethnically diverse, participants were fairly well educated. The small number of participants with lower levels of education may be partly responsible for the inconsistent findings with respect to education and certainly limits generalizability to less-educated populations. Fourth, we used response criteria based on the level of symptoms used for referral or initiation of treatment. For the HAMD, this was identical to entry criteria. While patients overall showed a strong response to treatment, it is possible that a few patients may only have moved 1 or 2 points on the HAMD to reach response criteria. Furthermore, these decision trees may not generalize to other criteria, such as full remission of symptoms. Finally, although CART analysis is an effective data-mining tool, there are some disadvantages to these models. Trees minimize total variability in the data, both population and sample variability. Without adequate pruning, trees are often overfit, in that they minimize sample variability. Additionally, misclassification errors may build: since each branch depends on a previous one, misclassification errors in early branches will continue down the tree. CART is a form of data mining; thus, these findings should be validated in another dataset before they are generalized more broadly. Nevertheless, a strength of CART analyses is that they can identify new relationships one might not have predicted that could be more important than previously established relationships. Additionally, a sensitivity analysis using Random Forests supported the findings of the CART analyses. Thus, although CART is an exploratory method, it has the potential to identify relationships that can be investigated in subsequent studies.
To the best of our knowledge, this is the first study to use exploratory CART analyses to evaluate patient characteristics that predict depression outcomes among patients receiving T-CBT or FtF-CBT for the treatment of major depressive disorder. The findings of the present study indicate that depressed patients with moderate to high CSE are likely to do well with CBT, regardless of other baseline characteristics. Among those with poor CSE, lower levels of depressive symptom severity also indicate a likely positive response. Those with low CSE and high depressive symptom severity were consistently found to be the most at risk of non-response. There was also a suggestion that in this group of low CSE, depressed individuals with low social support may be at risk for non-response. While these findings should be confirmed in future research, they point to the possibility of improving outcomes by enhancing CSE early in treatment as a strategy of mitigating potential treatment non-response to improve the overall impact of CBT.
Acknowledgements
This research was supported by grants from the National Institute of Mental Health R01 MH059708 and R01 MH095753 (PI: Mohr); K08 MH102336 (PI: Schueller); and F31 MH106321 (PI: Stiles-Shields).
Declaration of Interest
None.