Introduction
It is generally acknowledged that care for depression could be improved because the delivery and uptake of antidepressant medication (ADM) and evidence-based psychotherapies are often suboptimal (Simon, 2002; Bijl et al. 2003; National Institute for Health and Clinical Excellence, 2011; Piek et al. 2011, 2012). Improvement of care is more likely to come from changes in the way care is provided than from adding new treatment options (Katon & Unutzer, 2006).
Currently, the standard approach in which mental health care is delivered to patients is called matched care. In this approach the patient is referred to a certain therapist or therapy. The therapy choice is based (matched) on patients' characteristics and preferences. As a result, the treatment may vary (e.g. ADM or different types of psychotherapy) as well as the setting (primary care, mental health care, online therapy, group therapy, individual therapy) and the provider [e.g. general practitioner (GP), nurse, psychological wellbeing practitioner, psychologist, psychiatrist]. A major problem with this model at present is our lack of clear prognostic determinants with which to match patients to the available treatments. It has been argued that some patients receive too much treatment (Lovell & Richards, 2000), whilst others receive too little, as those lucky enough to be given treatment utilize highly scarce resources to the detriment of many others who receive little or nothing.
An alternative approach is called ‘stepped care’. Within the last 10 years and in the context of international concern regarding the cost and prevalence of common mental health problems, stepped care has been recommended as a means to increase access and efficiency of mental health care (Andrews et al. 2006; National Institute for Health and Clinical Excellence, 2009). In stepped care models, the default position is that patients start with an evidence-based treatment of low intensity as a first step. Progress is monitored systematically and those patients who do not respond adequately will step up to a subsequent treatment of higher intensity (Bower & Gilbody, 2005). Low-intensity treatments are usually defined as those treatments that require less time from a professional than a conventional treatment (Bennett-Levy et al. 2010). However, intensity may also mean the time required of patients, cost, and therapists' level of expertise and it is possible for treatments to differ in one but not all of these dimensions. Patients, for example, may themselves spend similar amounts of time undertaking high- or low-intensity treatments that require a different amount of time from a professional.
Whilst the concept of intensity readily applies to psychological therapies, it is difficult to characterize pharmacological and, perhaps, physical treatments as intensive or otherwise. Given the widespread use of pharmacotherapy alongside psychological treatment for depression, it is perhaps unsurprising that the term ‘stepped care’ is also used to refer to treatment that is not organized in order of increased intensity; at each ‘step’ patients switch or add treatments of different modalities (pharmacological, psychological) – patients may start with intensive psychological therapy (Araya et al. 2003; Katon et al. 2004; Ell et al. 2008).
In practice, self-help treatments (through books or the Internet) are often used as a first step in stepped care. The effectiveness of self-help for depression, guided by a mental health worker but still of less intensity than traditional psychological therapy, has been demonstrated convincingly (Gellatly et al. 2007; Andrews et al. 2010; Cuijpers et al. 2010; Richards & Richardson, 2012). Therefore, the assumption of stepped care is that for most patients the low-intensity treatment will be sufficient and only a few will need a higher-intensity treatment, thereby making better use of scarce and expensive resources such as therapist time. Many depression treatment guidelines have endorsed this stepped care principle, e.g. the English NICE guideline (National Institute for Health and Clinical Excellence, 2009; National Collaborating Centre for Mental Health, 2010) and the Dutch multidisciplinary guideline (Spijker et al. 2010). This has also led to the implementation of stepped care in routine practice. The most notable initiative in this respect is the Improving Access to Psychological Therapies (IAPT) programme (www.iapt.nhs.uk), for which stepped care underpins the organizational structure.
The question remains how much evidence there is for the effectiveness of stepped care. Does stepped care really deliver similar or better patient outcomes compared with other systems? Although observational data from the first year of English IAPT services show that recovery rates were higher in services making use of the full range of low- and high-intensity treatments in stepped care systems (Clark, 2011), no systematic review of randomized trials has been published yet. Therefore, our aim in this study was to conduct a systematic review and meta-analysis of studies investigating the effectiveness of stepped care for depression.
Method
Search strategy
We carried out a comprehensive literature search in PubMed, PsycINFO, EMBASE and the Cochrane Central Register of Controlled Trials. We combined terms indicative of depression with those of stepped care, e.g. for Medline we used (depression [MESH] OR depressive disorder [MESH] or mood disorders [MESH]) AND (stepped [all fields] AND care [all fields]). We searched all literature up to April 2012 without any language restrictions and followed up identified protocol papers published before April 2012 to determine if the researchers had subsequently published their findings before May 2013. Two independent researchers (A.v.S. and J.H.) reviewed all abstracts and titles of retrieved references for eligibility. We retrieved the full papers for all references that had been judged as potentially eligible and the full papers were examined independently by two of the research team (A.v.S., J.H., D.A.R.). In the case of disagreement the paper was discussed with the third reviewer until a consensus was achieved. We also checked the reference lists of the included papers and a recent meta-analysis on collaborative care (Archer et al. 2012).
Inclusion criteria
We used the following inclusion criteria: (1) the study had to be a randomized controlled trial; (2) aimed at adults; (3) with a Diagnostic and Statistical Manual, 4th revision (DSM-IV) depressive disorder identified through a diagnostic interview, or with depressive symptoms established by scoring above a cut-off on a depression questionnaire; and (4) investigating ‘stepped care’ as one of the randomized trial groups. Stepped care had to include psychological therapy and was defined as the availability of more than one psychological treatment of different intensities and/or the availability of more than one treatment modality (pharmacological and psychological). We defined the intensity of psychological treatments with respect to the time to deliver; non-psychological (pharmacological) treatments were not characterized in this respect. We did not require treatments to be organized in a hierarchy of low to high intensity. Decisions about stepping up had to be based on a systematic clinical evaluation undertaken by a clinician or through questionnaire assessment, done at a pre-specified time interval and with an explicit aim to determine the next treatment step. We included studies in which only a proportion of patients were depressed, for example studies including patients with a common mental health disorder and a subgroup of patients specifically diagnosed with depression. We allowed both physical and psychiatric co-morbidity. Studies were included regardless of their setting or control group.
Data extraction
We coded the following general characteristics of the studies: year of publication, country, randomization level (patient or cluster), the way depression or depressive symptoms were established (e.g. diagnostic interview or scoring above a cut-off on a questionnaire), possible co-morbidity as an inclusion criterion (e.g. cancer patients, diabetes), age, and total number of patients included in the study. The stepped care interventions were coded as follows: number of steps, the content of the interventions in the different steps, criteria to step up, and total duration of the programme. Two independent assessors coded each study and differences were discussed among the review team until consensus was reached.
Quality assessment
We assessed the validity of the studies using the criteria as suggested by the Cochrane Handbook (Higgins & Green, 2011): adequate sequence generation, concealment of allocation, blinding of outcome assessors, adequate handling of incomplete outcome data, selective reporting of data and other potential threats to validity. Two reviewers (A.v.S., J.H.) conducted the quality assessment independently of each other.
Meta-analyses
We calculated between-group effect sizes (Cohen's d) for all individual studies. The effect size represents the difference between two groups in number of standard deviations (Hedges & Olkin, 1985; Lipsey & Wilson, 1993; Cooper & Hedges, 1994). To calculate between-group effect sizes we used the available statistics as published in the papers [means and standard deviations, mean difference score and 95% confidence interval (CI), or proportions of patients improved or recovered]. When more than one outcome was reported (e.g. more than one depression questionnaire or more than one cut-off score) we performed a sensitivity analysis. We pooled the effects using (a) the highest reported effect sizes for all studies, (b) the lowest reported effect sizes for all studies and (c) the average or combined effect size for all studies.
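As an illustration of how a between-group effect size can be derived from the kinds of statistics listed above, the following minimal sketch computes Cohen's d from means and standard deviations, and from proportions recovered. The function names and the input numbers are hypothetical, and the logit conversion for proportions is an assumption: the text does not state which conversion was applied.

```python
import math


def cohens_d_from_means(mean_ctrl, sd_ctrl, n_ctrl, mean_trt, sd_trt, n_trt):
    """Standardized mean difference using the pooled standard deviation.

    Positive values mean lower symptom scores in the treatment group.
    """
    pooled_var = ((n_ctrl - 1) * sd_ctrl ** 2 + (n_trt - 1) * sd_trt ** 2) / (
        n_ctrl + n_trt - 2
    )
    return (mean_ctrl - mean_trt) / math.sqrt(pooled_var)


def cohens_d_from_proportions(p_trt, p_ctrl):
    """Convert proportions recovered to d through the log odds ratio.

    Assumed conversion (logit method): d = ln(OR) * sqrt(3) / pi.
    """
    odds_ratio = (p_trt / (1 - p_trt)) / (p_ctrl / (1 - p_ctrl))
    return math.log(odds_ratio) * math.sqrt(3) / math.pi


# Hypothetical inputs, not taken from any included trial.
d_means = cohens_d_from_means(14.0, 6.0, 100, 12.0, 6.0, 100)
d_props = cohens_d_from_proportions(0.50, 0.40)
print(f"d from means: {d_means:.2f}, d from proportions: {d_props:.2f}")
```

Both routes express the group difference in standard deviation units, which is what allows studies reporting different statistics to be pooled.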
To calculate the individual effect sizes as well as the pooled mean effect size we used the computer program Comprehensive Meta-analysis version 2.2.046 for Windows, developed for support in meta-analysis (www.metaanalysis.com). As we expected considerable heterogeneity, we calculated pooled effect sizes with the random-effects model. However, we first tested heterogeneity under the fixed-effects model using the statistics I² and Q. I² describes the variance between studies as a proportion of the total variance. A value of 0% indicates no observed heterogeneity, and larger values show increasing heterogeneity, with 25% as low, 50% as moderate, and 75% as high heterogeneity. The statistical significance of the heterogeneity is tested with the Q statistic. A significant Q value rejects the null hypothesis of homogeneity. We mark all results in which p < 0.05.
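The heterogeneity statistics and random-effects pooling described above can be sketched as follows. This is not the Comprehensive Meta-analysis implementation; it assumes the DerSimonian-Laird estimator of between-study variance, and the function name and input values are hypothetical.

```python
import math


def random_effects_pool(effects, variances):
    """Return Q, I-squared (%), tau-squared and the random-effects pooled estimate."""
    weights = [1.0 / v for v in variances]
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum(weights)

    # Q: weighted squared deviations from the fixed-effects estimate.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1

    # I-squared: share of total variation attributed to between-study variance.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # DerSimonian-Laird estimate of the between-study variance (assumed estimator).
    c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    re_weights = [1.0 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    se = math.sqrt(1.0 / sum(re_weights))
    return {
        "q": q,
        "df": df,
        "i2": i2,
        "tau2": tau2,
        "pooled": pooled,
        "ci": (pooled - 1.96 * se, pooled + 1.96 * se),
    }


# Hypothetical effect sizes and sampling variances for three studies.
result = random_effects_pool([0.1, 0.5, 0.9], [0.01, 0.01, 0.01])
print(result)
```

With these widely spread hypothetical effects, I² falls above the 75% threshold that the text labels as high heterogeneity, and the random-effects confidence interval is correspondingly wider than a fixed-effects interval would be.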
In addition, we performed subgroup analyses. In these analyses we tested whether there were significant differences between the effect sizes in different categories of studies. We used the mixed-effects model, which pools studies within subgroups with the random-effects model, but tested for significant differences between subgroups with the fixed-effects model. Lastly, publication bias was tested by inspecting the funnel plot, and by Duval and Tweedie's trim-and-fill procedure, which yields an estimate of the effect size after publication bias has been taken into account (as implemented in Comprehensive Meta-analysis; Duval & Tweedie, 2000).
Results
Inclusion of studies
We retrieved 61 papers for eligibility after screening 438 references (Fig. 1). We excluded 47 of the 61 that did not fulfill our inclusion criteria. In total, we included 14 studies on stepped care for depression (see Table 1) [Unutzer et al. 2002 (study no. 13); Araya et al. 2003 (study no. 2); Katon et al. 2004 (study no. 10); Ell et al. 2008 (study no. 7); Van ‘t Veer-Tazelaar et al. 2009 (study no. 14); Bot et al. 2010 (study no. 3); Davidson et al. 2010 (study no. 4); Ell et al. 2010 (study no. 8); Patel et al. 2010 (study no. 11); Seekles et al. 2011 (study no. 12); Apil et al. 2012 (study no. 1); Dozeman et al. 2012 (study no. 6); Davidson et al. 2013 (study no. 5); Huijbregts et al. 2013 (study no. 9)]. In one trial (study no. 3), only part of the results were published and we contacted the authors to obtain the (unpublished) research protocol and additional data.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20160921045007533-0049:S0033291714000701:S0033291714000701_fig1g.gif?pub-status=live)
Fig. 1. Flowchart of studies included in the meta-analysis on stepped care for depression. RCT, Randomized controlled trial.
Table 1. Characteristics of randomized controlled trials comparing stepped care for depression with usual care
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20160921045007533-0049:S0033291714000701:S0033291714000701_tab1.gif?pub-status=live)
MINI, Mini International Neuropsychiatric Interview; GP, general practitioner; MDD, major depressive disorder; ADs, antidepressants; CES-D, Center for Epidemiological Studies Depression Scale; BDI, Beck Depression Inventory; ACS, acute coronary syndrome; NS, not specified; PCP, primary care physician; PHQ-9, Patient Health Questionnaire-9; SCID, Structured Clinical Interview for DSM disorders; SCL, Symptom Checklist; CIS-R, Clinical Interview Schedule – Revised; GHQ, General Health Questionnaire; K10, Kessler Psychological Distress Scale; CIDI, Composite International Diagnostic Interview.
a Not included in quantitative meta-analysis.
b Age inclusion and exclusion criteria ‘not specified’.
c No particular feature of usual care described.
d Oncologists may have attended a depression treatment didactic session by the study psychiatrist at the start of the study and yearly after and may have been informed of patients' depression status although it is unclear whether these features applied to patients in the enhanced usual care group.
e Total n in this trial is 2796 but we only used the depressed subsample in our meta-analysis.
We included 10 of the 14 studies in our quantitative meta-analyses on the treatment of depression in which outcomes were expressed as the reduction of depressive symptoms. One treatment trial was excluded from this analysis because the authors did not report post-treatment data but only long-term follow-up. The three remaining trials were aimed at prevention of depression, either as indicated prevention (studies 6 and 14) or as relapse prevention (study no. 1) with the incidence of depressive disorders as the main outcomes. Given that it is not useful to pool results from treatment and prevention we excluded the prevention trials from our quantitative meta-analyses.
Characteristics of the 14 included treatment and prevention studies
The 14 studies included a total of 5194 patients of whom 2560 were randomized to stepped care and 2634 to a control condition. For the 10 studies included in the quantitative meta-analyses the total number of included patients was 4580 with 2243 in the stepped care arms and 2337 in the control conditions (Table 1).
Of the trials, 12 were patient-randomized (studies 1–8, 10 and 12–14), and two were cluster-randomized (studies 9 and 11); six were conducted in the USA (studies 4, 5, 7, 8, 10 and 13), six in The Netherlands (studies 1, 3, 6, 9, 12 and 14), one in Chile (study no. 2) and one in India (study no. 11). Participants were recruited mainly from primary care (studies 2, 9–11 and 12–14), or secondary care (studies 3–5 and 7). All studies compared stepped care with usual care, either standard (studies 1–6, 9, 10 and 12–14) or ‘enhanced’ (studies 7, 8 and 11).
Of the treatment trials, five (studies 3–5, 8 and 10) included patients scoring above a cut-off on a self-rated depression questionnaire only [two also used the core symptoms of major depressive disorder (MDD)] while five others (studies 6, 9 and 11–13) performed diagnostic interviews to include patients with MDD (one also included minor depression, and two also included dysthymia). The three prevention trials (studies 1, 6 and 14) used a diagnostic interview to exclude patients with MDD. Of the studies, six were aimed at depressive symptoms among patients with either co-morbid acute coronary syndrome (studies 4 and 5), cancer (study no. 7) or diabetes mellitus (studies 3, 8 and 10) and five trials, including the three prevention studies, were specifically aimed at older adults (studies 1, 3, 6, 13 and 14).
Characteristics of the stepped care interventions
We found considerable between-study heterogeneity in numbers of steps (two, three or four), types of treatments offered at each step, and duration of the total intervention (between 3 and 12 months; Table 2).
Table 2. Characteristics of the stepped care interventions for depression
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20160921045007533-0049:S0033291714000701:S0033291714000701_tab2.gif?pub-status=live)
CWD, Coping with Depression; GP, general practitioner; CES-D, Center for Epidemiological Studies Depression Scale; PE, psycho-education; HAMD, Hamilton Depression Rating Scale; ADs, antidepressants; CBT, cognitive–behavioural therapy; PST, problem-solving treatment; PHQ-9, Patient Health Questionnaire-9; CDCS, cancer depression clinical specialist; SCL, Symptom Checklist; ADM, antidepressant medication; DCM, depression care manager; IPT, interpersonal psychotherapy; IDS, Inventory of Depressive Symptomatology; HADS-A, Hospital Anxiety and Depression Scale-Anxiety; WSAS, Work and Social Adjustment Scale; MDD, major depressive disorder.
a ‘Providers’ includes the role of all health care professionals involved in the stepped care intervention except for professionals who cared for patients ‘on referral’.
Of the studies, seven (studies 4, 5, 7–10 and 13), six of which were US trials, were based on the ‘IMPACT’ model and used problem-solving treatment (PST) and ADM as the core of the intervention. The IMPACT intervention is primarily a collaborative intervention in which a dedicated team works together to provide optimal depression care, meeting our inclusion criteria as a stepped care approach because patients were evaluated at predetermined time intervals according to defined improvement criteria and care was adjusted or augmented if the patient did not improve sufficiently. Treatments were provided according to patients' needs and preferences. In all seven ‘IMPACT’ studies and one other involving both psychological treatment (psycho-education) and ADM (study no. 2), there was no progression of increasing therapeutic intensity.
In contrast, care was delivered in the other six trials (studies 1, 3, 6, 11, 12 and 14) through steps of increasing intensity. Of these six studies, five started with watchful waiting although two studies (studies 12 and 14) only included patients after the watchful waiting period while the other three (studies 1, 3 and 6) included watchful waiting as part of their stepped care model. The first therapeutic component included psycho-education or bibliotherapy alone or combined, offered either as self-help (with online, telephone or face-to-face support), in a group, or as individual sessions. The next step in these six studies varied widely and included psychological therapy [cognitive–behavioural therapy, life review, interpersonal psychotherapy (IPT), PST, Coping with Depression course] (studies 1, 3, 6, 12 and 14) or a psychological therapy (IPT) combined with ADM (study no. 11). The last step typically consisted of referral to specialists, a GP or mental health services. Only two of those six studies that used steps of increasing intensity were included in the quantitative meta-analysis (studies 11 and 12). As mentioned above, one study was excluded because of unavailability of post-test data (study no. 3), and the three other trials were aimed at (relapse) prevention (studies 1, 6 and 14).
In 12 studies more than one healthcare professional was involved in stepped care (studies 1, 2 and 4–13) including nurses (studies 1, 2, 4–6, 10, 12 and 13), psychiatrists (studies 4, 5, 7–11 and 13), GPs (studies 2, 5, 8, 9, 11 and 13), social workers (studies 2, 4, 7 and 8), psychologists (studies 4, 5, 12 and 13) and relatively less qualified staff [residential home staff (study no. 6), an assistant patient navigator (study no. 8), lay health counsellor (study no. 11) and study researcher (study no. 1)]. In two studies, treatment was provided by one healthcare professional: a nurse or psychologist (study no. 3) or a nurse only (study no. 14). No details are available for external professionals providing treatment after referral outside the core stepped care team.
Patient progress was assessed using one (studies 1–7, 9–11, 13 and 14), two (study no. 8) or three (study no. 12) self-rated instruments. In five studies the decision to ‘step up’ was contingent on patients' score relative to a specific cut-off on the Hamilton Depression Rating Scale (study no. 2), the Center for Epidemiological Studies Depression Scale (CES-D) (studies 1 and 14), the Patient Health Questionnaire-9 (PHQ-9) (study no. 7) or the Hospital Anxiety and Depression Scale, the Inventory of Depressive Symptomatology and the Work and Social Adjustment Scale (study no. 12). In five studies the decision to ‘step up’ was dependent on improvement (relative to baseline or the last assessment) on the PHQ-9 (studies 4, 5, 10 and 13) or the CES-D (study no. 6). In all, three studies used a combination of improvement and a specific cut-off on the CES-D (study no. 3), PHQ-9 (study no. 9) or the PHQ and Symptom Checklist (study no. 8). In one study (study no. 11) improvement was assessed by health counsellors following application of the General Health Questionnaire with no further detail specified.
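The three families of step-up rules described above (an absolute cut-off, improvement relative to an earlier score, or both) can be expressed as simple predicates. The sketch below is illustrative only: the function names, the thresholds in the usage lines, and the choice to require both conditions in the combined rule are assumptions, since the trials' exact criteria are not given here.

```python
def step_up_by_cutoff(score, cutoff):
    """Step up when the current symptom score is still at or above the cut-off."""
    return score >= cutoff


def step_up_by_improvement(reference_score, current_score, min_relative_improvement):
    """Step up when improvement relative to baseline (or the last assessment) is too small."""
    if reference_score <= 0:
        return False  # nothing to improve on; treat as no step-up
    improvement = (reference_score - current_score) / reference_score
    return improvement < min_relative_improvement


def step_up_combined(reference_score, current_score, cutoff, min_relative_improvement):
    """Assumed combination: step up only if both criteria indicate non-response."""
    return step_up_by_cutoff(current_score, cutoff) and step_up_by_improvement(
        reference_score, current_score, min_relative_improvement
    )


# Hypothetical questionnaire scores and thresholds, not taken from any trial.
print(step_up_by_cutoff(12, cutoff=10))
print(step_up_by_improvement(20, 15, min_relative_improvement=0.5))
print(step_up_combined(20, 8, cutoff=10, min_relative_improvement=0.5))
```

Writing the rules this way makes explicit that a cut-off rule ignores where the patient started, whereas an improvement rule ignores how symptomatic the patient still is.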
Quality of the included studies
In one study (study no. 3) we rated all quality criteria as either unclear or at high risk of bias and in a second (study no. 1) we rated five of the six criteria as unclear or at high risk of bias. For the remaining 12 studies quality on most criteria was high. The description of randomization sequence generation was adequate but four of these 12 studies did not clearly report methods of allocation concealment (studies 4, 10, 11 and 14). No studies were able to blind patients or clinicians but all studies used assessors to measure outcomes who were unaware of the randomization status of the patients or used self-report. Post-intervention study drop-out ranged between 8.0% (study no. 5) and 49.6% (study no. 3) and one study (study no. 9) was rated at high risk of bias with respect to handling incomplete outcome data. All studies used intention-to-treat analyses. Of the 12 studies, three were at high risk of other biases because of the potential for contamination between trial arms (studies 6, 8 and 13) or because patients were recruited in different ways in the intervention and control groups (study no. 9).
Effects of stepped care
Most of the studies used more than one depression outcome measure, so we averaged the between-group differences from the various measures as a single combined-measures effect size for each study (Table 3). We found an overall post-intervention effect size of d = 0.38 (95% CI 0.18–0.57). We also examined the post-test effect sizes from the measure with the highest effect size for each study (d = 0.42, 95% CI 0.22–0.62) and repeated this with the measure producing the lowest effect sizes (d = 0.33, 95% CI 0.13–0.52). All effect sizes were significantly in favour of stepped care.
Table 3. Meta-analysis, and subgroup analysis, of 10 studies examining the effects of stepped care for depression compared with care as usual: effect sizes – Cohen's d
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20160921045007533-0049:S0033291714000701:S0033291714000701_tab3.gif?pub-status=live)
N comp, Number of comparisons; CI, confidence interval; n.a., not applicable.
* p < 0.01.
The stepped care interventions varied in duration between 3 and 12 months. We used the combined-measures effect size to examine outcomes at different time points. The effects were d = 0.57 at 2 to 4 months (95% CI 0.21 to 0.94), d = 0.34 at 6 months (95% CI 0.20 to 0.48), d = 0.43 at 9 to 12 months (95% CI 0.20 to 0.65) and d = 0.26 at 18 months (one study only). All effects were significantly in favour of the stepped care intervention with the exception of the 18-month result. Heterogeneity, as indicated by I², was high for the post-intervention effect sizes as well as for the effect sizes at the different time points. From Fig. 2 it can be observed how the 6-month effect sizes varied between the different studies. To examine this heterogeneity we performed subgroup analyses.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20160921045007533-0049:S0033291714000701:S0033291714000701_fig2g.gif?pub-status=live)
Fig. 2. Effects of stepped care versus care as usual (6-month outcomes). Std diff, Standardized difference; CI, confidence interval; BDI, Beck Depression Inventory; IDS, Inventory of Depressive Symptomatology.
Subgroup analysis and publication bias
We analysed the association of the 6-month outcomes (overall d = 0.34; Table 3) with the following variables: country in which the study was performed (USA, Netherlands, or other), treatment based on the IMPACT protocol (yes or no), stepped care treatment using progressive intensity (yes or no), physical health co-morbidity (present or absent) and diagnostic status at inclusion (diagnosis assessed or not). The effect of the eight studies on stepped care models without progressive intensity was significantly higher (d = 0.41) than those of the two studies examining stepped care models with progressive intensity (d = 0.07, p < 0.01). None of the remaining variables were significantly related to the effect size. Even though not statistically significant (p = 0.63), the effect size for the two Dutch studies was lower (d = 0.18) than for those conducted in the USA (d = 0.38) or other countries (d = 0.44).
We found no indication of publication bias in our funnel plot on the 6-month outcomes or in Duval and Tweedie's trim-and-fill procedure. No studies needed to be imputed.
Effects of stepped care intervention for depression: four studies excluded from the quantitative analyses
The treatment study of Bot et al. (2010) (study no. 3) only provided 2-year follow-up data for the complete cases (49.6%) and reported no difference between the groups (d = –0.12, 95% CI –0.62 to 0.39). Both of the trials on indicated prevention showed results in favour of stepped care (studies 6 and 14). One (study no. 6) demonstrated 12-month MDD rates of 6.5% in the intervention group and 14.1% in the control group (incidence rate ratio = 0.46, 95% CI 0.17–1.21). The other (study no. 14) demonstrated 12-month prevalence rates of combined MDD and anxiety disorders of 11.6% in the intervention group and 23.8% in the control group (incidence rate ratio = 0.49, 95% CI 0.24–0.98). The pooled rate ratio of the two studies was 0.48 (95% CI 0.27–0.83, I² = 0). The study on relapse prevention (study no. 1) reported no difference in the 12-month MDD incidence rate between stepped care and care as usual.
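The pooling of the two prevention trials can be approximated from the reported rate ratios and confidence intervals alone. The sketch below assumes inverse-variance weighting on the log scale, with standard errors recovered from the width of each 95% CI; the text does not state the exact procedure used, so this is a plausibility check rather than a reconstruction of the original analysis. Function names are hypothetical.

```python
import math


def log_rr_and_se(rate_ratio, ci_low, ci_high, z=1.96):
    """Recover the log rate ratio and its standard error from a reported 95% CI."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    return math.log(rate_ratio), se


def pool_rate_ratios(estimates, z=1.96):
    """Inverse-variance pooling of (rate_ratio, ci_low, ci_high) tuples on the log scale."""
    logs, weights = [], []
    for rate_ratio, ci_low, ci_high in estimates:
        log_rr, se = log_rr_and_se(rate_ratio, ci_low, ci_high, z)
        logs.append(log_rr)
        weights.append(1.0 / se ** 2)
    pooled_log = sum(w * y for w, y in zip(weights, logs)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return (
        math.exp(pooled_log),
        math.exp(pooled_log - z * pooled_se),
        math.exp(pooled_log + z * pooled_se),
    )


# Rate ratios and 95% CIs as reported for studies 6 and 14.
reported = [(0.46, 0.17, 1.21), (0.49, 0.24, 0.98)]
pooled_rr, pooled_low, pooled_high = pool_rate_ratios(reported)
print(f"pooled rate ratio {pooled_rr:.2f} ({pooled_low:.2f} to {pooled_high:.2f})")
```

Under these assumptions the pooled estimate comes out at roughly 0.48, in line with the reported value; the upper confidence limit differs slightly from the published one, which is expected when working from rounded inputs and a possibly different pooling method. Because the two estimates are nearly identical, a random-effects model would give the same result here.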
Discussion
We identified 14 trials on stepped care for depression, 10 of which could be used in a meta-analysis of treatment outcomes. Stepped care has a moderate effect on depression (d = 0.34 at 6 months and d = 0.38 post-intervention). Stepped care interventions based on progressive treatment intensity performed worse (n = 2, d = 0.07) than those without a clear intensity order (n = 8, d = 0.41, p < 0.01). Most trials were of good quality. The stepped care interventions were extremely heterogeneous, with different numbers of steps, different treatment components, different duration of the steps, different rules about stepping up and different professionals involved.
Even though we demonstrated that stepped care is effective, the effect sizes were modest. Meta-analyses have demonstrated higher effect sizes (Cohen's d between 0.42 and 0.88) for self-help interventions, which are usually considered as a first step in stepped care (Gellatly et al. 2007; Andrews et al. 2010; Richards & Richardson, 2012; Bower et al. 2013). However, the majority of the trials on self-help have been performed in population samples rather than in clinical samples. Even though baseline severity of symptoms does not seem to be associated with the effect of self-help interventions (Bower et al. 2013), there might be other differences between clinical and population samples that might account for differences in effects.
The stepped care 6-month effect size (d = 0.34) was similar to the one found in the Cochrane review on collaborative care (Archer et al. 2012). [Collaborative care may include a broad range of interventions, settings and providers; defining characteristics are that a team of health care professionals is responsible for providing the ‘right’ care at the ‘right’ time and that there is a structured management plan which includes scheduled patient follow-ups (Bower et al. 2006; Gunn et al. 2006).] This finding may not be surprising given that six out of the 10 studies (studies 2, 7, 8, 10, 11 and 13) included in our meta-analysis were also included in the meta-analysis of collaborative care.
In stepped care the primary focus is on psychological interventions of different intensity. However, as we noted in our introduction, it is unclear how medication management, which might be offered with significant support from case managers, fits into stepped care programmes. Since medication management is an important treatment option in depression care, we decided to include it in our definition of stepped care (i.e. the availability of more than one treatment modality: medication and psychotherapy). This choice led to the inclusion of several of the collaborative care trials, although the majority of these were also described as stepped care (studies 2, 7, 8, 10 and 11), and of three other studies (studies 4, 5 and 9) in which stepped care was not defined by a progressive increase in treatment intensity. Our definition is debatable. Others may choose to review or conduct future research on stepped care in line with how it was originally conceived, and findings based on one definition of stepped care may not generalize to the other. Future research may be required to compare stepped care defined by a progressive increase in treatment intensity with stepped care that is not.
We compared the results of the eight studies without a hierarchy in treatment intensity with the two studies that did provide ‘true’ stepped care with increasing treatment intensity. This comparison demonstrated that the ‘true’ stepped care studies performed significantly worse. This suggests that it might be better to match the first treatment to the patient's need than to offer a low-intensity treatment regardless of the patient's clinical profile. However, we think that this conclusion would be premature, for three reasons. First, the results of ‘true’ stepped care are based on two studies only. Second, seven of the eight studies without increasing intensity were based on the IMPACT protocol, and those seven IMPACT studies did not show better results than the three non-IMPACT studies. In other words, the difference in results between the two subgroup analyses (IMPACT versus non-IMPACT, and increasing intensity versus no increasing intensity) was actually based on one study with a very high effect size (study no. 2). Third, the two studies aiming at (indicated) prevention of depression both offered ‘true’ stepped care and demonstrated very large effects (almost halving the incidence of depression). In conclusion, we think that more ‘true’ stepped care studies need to be performed before we can reach a definite conclusion. Moreover, it is important to look not only at treatment studies but also at prevention studies, especially as it has been argued that prevention contributes most to reducing the global burden of depression (Cuijpers et al. 2012). This and other key areas for future research are summarized in Appendix 1.
The central tenet of stepped care is that for many patients the first (low-intensity) treatments are sufficient and relatively few patients need to step up. This means that similar (or better) patient outcomes could be achieved at lower cost. In the current meta-analyses only a limited number of trials provided data on the proportion of patients recovered after the first treatment. The data that were available were hard to interpret since the definition of adequate recovery varied between the studies, as did the duration of the steps, the number of patients dropping out of treatment and the number of patients not reporting health status. We also do not know how many patients needed to step up or what percentage of them actually took up this second step. This is important information because within stepped care there is a risk that patients do not start a second, higher-intensity treatment after failure of the first. To improve reporting on clinical trials of stepped care for depression, we have identified data that are important to include (Appendix 2); including these would maximize the value of subsequent systematic reviews.
We did demonstrate that better outcomes were reached with stepped care than with care as usual. However, the question is whether or not care as usual is the best comparator. One could argue that care as usual is similar to matched care since this is the current dominant treatment approach. However, all the trials used an active approach to find and select patients. In four trials it was reported that the GP was informed about the diagnostic status of the patients in the control group, while the other studies refrained from informing the GP or did not report how they handled this. This suggests that care as usual probably more closely resembled ‘no care’. In other words, we demonstrated that stepped care is better than doing nothing. The ideal test, against true matched care or against high-intensity care for all patients, has not been performed yet. We identified five (Dutch) protocol papers on stepped care (Braamse et al. 2010; Krebber et al. 2012; Pommer et al. 2012; Van der Weele et al. 2012; Van Dijk et al. 2013); none compared stepped care with matched care or with intensive psychological treatment for all.
The remaining assumption of stepped care is that it reduces health care costs. Six out of the 10 studies included in the meta-analyses published a separate paper on the cost-effectiveness of their (collaborative) stepped care programme (Katon et al. 2005; Araya et al. 2006; Simon et al. 2007; Van ‘t Veer-Tazelaar et al. 2010; Buttorff et al. 2012; Hay et al. 2012; Ladapo et al. 2012). The results of the studies performed in Chile and India are hard to generalize to the Western world. The remaining four (US) papers report either savings or incremental costs that are offset by the health gains. This means that there is an indication that stepped care interventions might indeed be more cost-effective. However, because stepped care has not been compared with either matched care or high-intensity care, final conclusions about cost-effectiveness cannot be drawn.
Our study has several limitations. The first is the limited number of studies, which made it especially hard to perform subgroup analyses. In this respect, the five protocol papers on stepped care are relevant, indicating that there is considerable clinical trials work in progress. Second, both the stepped care interventions and the samples included in the studies (countries, with or without co-morbidity, age, definitions of depression, etc.) varied greatly. This may limit the generalizability of our findings. A strength of this study is that it is the first to systematically describe all the available evidence with respect to stepped care, which is regarded in many countries as the preferred way to offer depression care. Furthermore, most of the studies were of good quality.
Although many guidelines recommend stepped care, there is currently only limited evidence to suggest that it should be the dominant model of treatment organization compared with alternative systems. Consistent with a previous observational study (Richards et al. 2012), we found considerable variety in the implementation of stepped care (with respect to the number and duration of treatment steps, treatments offered, professionals involved and criteria to step up) and only one significant difference between subgroups of studies (progressive intensity, yes/no), which requires further research. Hence, it was not possible to identify any optimal component of stepped care or to suggest a preferred model for delivery that may be associated with increased effectiveness. It was also not possible to determine with any certainty the relative effectiveness of stepped care models defined by combined treatment modalities (psychological and pharmacological) compared with those defined by progressive intensity of psychological treatment. The balance of costs, effectiveness and acceptability has not been investigated and further research is needed to determine if stepped care really should have such prominence in treatment guidelines. The first stage of such a research programme should be a fully powered clinical trial of stepped psychological treatment versus high-intensity treatment to test both the non-inferiority hypothesis and the potential cost advantages of stepped versus more intensive treatment.
Appendix 1. Key areas of future research on stepped care
(1) Appropriately powered, non-inferiority randomized controlled trial of stepped care for depression and/or other disorders defined by a progressive increase in treatment intensity compared with a single-step high-intensity psychological treatment; cost-effectiveness and process analysis of the above to be included.
(2) Pilot research into defining (a) stepping criteria (algorithm) for stepped care and (b) stratification criteria for matched care, leading to an appropriately powered, non-inferiority randomized controlled trial of stepped care for depression and/or other disorders defined by a progressive increase in treatment intensity compared with a matched care control.
(3) Appropriately powered, non-inferiority randomized controlled trial of stepped care for depression defined by progressive intensity of psychological treatment versus stepped care defined by combined treatment modalities (psychological and pharmacological).
(4) Following more published trials, an updated systematic review of stepped care to help identify (via subgroup analysis) optimal components of stepped care.
(5) Additional randomized controlled trials to compare stepped care with other treatment for the prevention of depression.
Appendix 2. Recommended reporting standards on stepped care
Data to include in the report of a clinical trial on stepped care for depression:
Number of patients in stepped care and control group(s)
Drop-out prior to step 1 and between steps (n, %)
People discharged from treatment at each step (n, %)
People stepping up to subsequent steps (n, %)
For each step:
Treated, n
Health care professionals involved
Training and education provided to deliver clinical protocols
Treatment received:
- n patients in receipt
- dose, e.g. n sessions of psychological therapy (mean, s.d.)
- duration, e.g. n weeks (mean, s.d.)
- Drop-out of treatment during specific step (n, %)
- Patient outcomes at the end of each treatment step
- n patients' health status assessed
- depressive symptoms (mean, s.d., n in analysis)
- n, % recovered or improved with definition of recovery/improvement specified
Stepping criteria:
Measure
Frequency and time-frame of assessment
Definition of improvement/recovery required to end treatment or to step up
For the control group:
Treated, n
Treatment received (detail as above)
Treatment drop-out (n, %)
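As an illustration of how the patient-flow items in this checklist could be tallied for a single step, the following minimal Python sketch reports each count with its percentage and derives the number of patients who neither dropped out, were discharged, nor started the next step. The class name `StepReport`, its field names and the example numbers are hypothetical; they are not taken from any of the reviewed trials.

```python
from dataclasses import dataclass


@dataclass
class StepReport:
    """Patient flow for one treatment step (hypothetical field names)."""
    step: int
    entered: int      # patients who started this step
    dropped_out: int  # dropped out of treatment during this step
    discharged: int   # recovered or improved and discharged at this step
    stepped_up: int   # started the next, higher-intensity step

    def pct(self, count: int) -> float:
        """Percentage of patients who entered the step, to one decimal."""
        if self.entered == 0:
            return 0.0
        return round(100.0 * count / self.entered, 1)

    def not_accounted_for(self) -> int:
        """Patients who neither dropped out, were discharged nor stepped up."""
        return self.entered - self.dropped_out - self.discharged - self.stepped_up

    def summary(self) -> dict:
        """Return (n, %) pairs for each flow item."""
        lost = self.not_accounted_for()
        return {
            "step": self.step,
            "treated_n": self.entered,
            "drop_out": (self.dropped_out, self.pct(self.dropped_out)),
            "discharged": (self.discharged, self.pct(self.discharged)),
            "stepped_up": (self.stepped_up, self.pct(self.stepped_up)),
            "not_accounted_for": (lost, self.pct(lost)),
        }


# Made-up numbers, for illustration only.
step1 = StepReport(step=1, entered=200, dropped_out=30, discharged=90, stepped_up=60)
print(step1.summary())
```

Reporting the last item explicitly makes visible how many patients were eligible to step up but did not start the next treatment.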
Declaration of Interest
None.