Introduction
Depression is a major health problem causing substantial disability and set to become the second largest cause of global disability by 2020 (WHO, 2001). In the UK depression and anxiety are estimated to cost the economy £17bn in lost output with a £9bn impact on the Exchequer through benefit payments and lost tax receipts (Layard, 2006). Only one in four depressed people receive effective pharmacological treatment and less than 10% a talking therapy (Singleton et al. 2001). Bower & Gilbody (2005) have identified four types of organizational strategies to improve this situation: training and the use of guidelines for general practitioners and primary care staff; ‘consultation liaison’, whereby specialist mental health practitioners advise on the care of individual patients in primary care; ‘collaborative care’, an enhanced form of consultation liaison which also includes a case manager to deliver care and liaise between GP, specialist and patient; and ‘replacement referral’ which refers to the deployment of specialists in secondary or primary care to whom GPs can refer.
A systematic review (Gilbody et al. 2003) of 36 such organizational intervention studies concluded that effective strategies require complex interventions at the ‘systems level’, consisting of: (1) a multi-professional approach to patient care; (2) a structured patient management plan; (3) scheduled patient follow-ups; and (4) enhanced inter-professional communication (Wagner et al. 1996; Gunn et al. 2006). The most effective systems-level intervention in this review was ‘collaborative care’ (Von Korff & Goldberg, 2001; Simon, 2006).
Although collaborative care improves outcomes over usual care (Katon et al. 1999; Wells et al. 2000; Unutzer et al. 2002), two recent systematic reviews found small to medium mean effect sizes of 0.24 (95% CI 0.17–0.32) (Gilbody et al. 2006) and 0.40 (95% CI 0.20–0.60) (Gensichen et al. 2005). The effects associated with individual studies varied significantly, reflecting variation in the content of these ‘complex’ interventions (MRC, 2000). Further, most of the studies originated from the USA (Gilbody et al. 2006).
Although there have been calls for the implementation of collaborative care in the UK (Simon, 2006), these have not been supported by UK clinical guidelines (National Institute for Health and Clinical Excellence, 2004) and may be premature given that it is not known exactly which models of collaborative care work best, and whether the model will generalize to the UK. In other areas of mental health such as assertive community treatment (Killaspy et al. 2005), the adoption of complex interventions based on international/US data before UK evaluation has resulted in ineffective UK service developments.
We adopted the phased approach (Campbell et al. 2000) recommended by the Medical Research Council (MRC) for investigating complex interventions (MRC, 2000). We developed a UK-specific collaborative care intervention for depression, based on analysis of ‘active ingredients’ in published interventions (Gilbody et al. 2006) and in-depth qualitative research with stakeholders (Richards et al. 2006a). We then tested it in an exploratory Phase II randomized controlled trial.
Cluster-randomized controlled trials are recommended for testing systems-level interventions such as collaborative care (Ukoumunne et al. 1999), since patient-randomized trials may be vulnerable to contamination. Patients in the control group may be influenced by system-level changes such as advice from specialists and changes to the process of care. Contamination in a patient-randomized trial may result in underestimating the real effect size of collaborative care. However, cluster-randomized trials require larger patient samples and often greater resources. Our Phase II trial, therefore, used an unusual design, nesting patient-level randomization within a cluster-randomized controlled trial to investigate the presence and magnitude of contamination (Fig. 1). We report the results of that randomized controlled trial here, the first UK trial of collaborative care and an early test of the utility of the MRC's complex-interventions framework.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160708210040-37858-mediumThumb-S0033291707001365_fig1g.jpg?pub-status=live)
Fig. 1. Investigating contamination in a trial of a complex organizational intervention.
Ethical approval was given by the South West Multi-site Research Ethics Committee.
Objectives
The objectives were:
(1) To estimate an effect size for a UK-specific collaborative care protocol.
(2) To determine whether cluster or patient randomization would be the most appropriate design for a Phase III trial.
Method
General practice sites from four primary-care trusts (PCTs) in the northern UK were randomly allocated to treatment or cluster control conditions, stratified by PCT. Almost all practices had a deprivation index higher than the UK national average and a number were from areas where black and minority ethnic groups were strongly represented. Patients in the treatment cluster group were then individually randomized to either collaborative care or usual care control. Allocation was by a remote computer-generated number sequence concealed from researchers and conducted independently after patients were enrolled in the study by research interviewers. The randomization team at the trials unit informed patients, GPs and, where appropriate, case managers, of participant allocation. This created three study groups (cluster-randomized controls, individually randomized intervention patients, and individually randomized control patients). Fig. 2 shows the CONSORT diagram. To reduce the possibility of recruitment bias, GPs were given no information about the allocation of their practice.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160708210040-82921-mediumThumb-S0033291707001365_fig2g.jpg?pub-status=live)
Fig. 2. CONSORT diagram.
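The nested allocation described above can be sketched in a few lines of Python. This is a minimal illustration only, not the trials unit's actual procedure: the function name, the group labels, the even split of practices within each PCT, and the use of simple (unblocked) patient-level randomization are all assumptions made for the example.

```python
import random


def allocate(practices_by_pct, patients_by_practice, seed=0):
    """Two-stage allocation sketch (hypothetical helper, not the trial's code).

    Stage 1: within each PCT stratum, shuffle the practices and assign the
             first half to the treatment cluster, the rest to cluster control
             (with an odd number of practices the extra one goes to control).
    Stage 2: patients in treatment-cluster practices are individually
             randomized; patients in cluster-control practices all receive
             usual care.
    """
    rng = random.Random(seed)

    practice_arm = {}
    for pct, practices in practices_by_pct.items():
        shuffled = list(practices)
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        for index, practice in enumerate(shuffled):
            practice_arm[practice] = (
                "treatment cluster" if index < half else "cluster control"
            )

    patient_group = {}
    for practice, patients in patients_by_practice.items():
        for patient in patients:
            if practice_arm[practice] == "cluster control":
                patient_group[patient] = "cluster-randomized control"
            else:
                # Assumption: simple randomization with equal probability.
                patient_group[patient] = rng.choice(
                    ["collaborative care", "patient-randomized control"]
                )
    return practice_arm, patient_group


# Hypothetical practices and patients, for illustration only.
example_practices = {"PCT 1": ["P1", "P2"], "PCT 2": ["P3", "P4"]}
example_patients = {p: [f"{p}-patient{i}" for i in range(3)]
                    for ps in example_practices.values() for p in ps}
arms, groups = allocate(example_practices, example_patients, seed=42)
```

The three labels produced at the patient level correspond to the three study groups: cluster-randomized controls, individually randomized intervention patients, and individually randomized control patients.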
Sample
We recruited patients from primary care aged >18 years diagnosed as depressed by a GP, confirmed by a score of ⩾5 on the depression section of the Structured Clinical Interview for DSM-IV (SCID; Spitzer et al. 1992) undertaken by trained research assistants. We excluded patients with post-natal, bereavement or physical causes for their depression. We only included patients with a newly identified episode of major depression, defined as a current episode of GP-initiated treatment of not more than 1 month's duration. We excluded patients reporting active suicidal plans and those with a primary drug or alcohol dependence. A standard power calculation based on detecting treatment effects is the conventional approach to determining the sample size for a clinical trial. Our Phase II study, however, was designed to inform the power calculation for a definitive Phase III trial: its results were to be used together with estimates of the ‘expected treatment effect’ (MRC, 2000) from published studies to estimate a plausible effect size for that sample size calculation. Recruiting 32 patients in each of the patient-randomized groups within the treatment cluster would have given us a 95% confidence interval width of half an effect size, allowing us to test for an effect of 0.25 in either direction. Therefore, we aimed to recruit 144 patients in total between the intervention and control clusters, 32 of the extra patients compensating for the design effect, the remainder to account for any attrition.
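The precision reasoning above can be illustrated with two standard formulas: the large-sample standard error of a standardized mean difference for two groups, and the design effect for clustered data, 1 + (m − 1) × ICC. The sketch below is an assumption-laden illustration rather than the authors' calculation; the function names, the mean cluster size and the ICC passed to `design_effect` are hypothetical.

```python
import math


def smd_ci_half_width(n_per_group, d=0.0, z=1.96):
    """Approximate 95% CI half-width for a standardized mean difference.

    Uses the usual large-sample variance for two equal groups:
    (n1 + n2) / (n1 * n2) + d^2 / (2 * (n1 + n2)).
    """
    n1 = n2 = n_per_group
    variance = (n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2))
    return z * math.sqrt(variance)


def design_effect(mean_cluster_size, icc):
    """Variance inflation due to clustering: 1 + (m - 1) * ICC."""
    return 1 + (mean_cluster_size - 1) * icc


# 32 patients per group, as in the text.
half_width = smd_ci_half_width(32)

# Hypothetical cluster size and ICC, chosen only to show the arithmetic.
inflation = design_effect(mean_cluster_size=11, icc=0.05)
inflated_n_per_group = math.ceil(32 * inflation)
```

With 32 patients per group this approximation gives a half-width of about 0.49, which appears consistent with the quoted "half an effect size" if that figure is read as the distance from the point estimate to either confidence limit.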
Intervention
Experimental
We developed a UK-specific collaborative care protocol in the modelling phase of our complex-interventions trial (Gilbody et al. 2006; Richards et al. 2006a), which met the four criteria for an organizational, systems-level intervention (Wagner et al. 1996; Gunn et al. 2006): (1) a multi-professional approach to patient care provided by a case manager working with the GP under weekly telephone supervision from specialist mental health medical and psychological therapies clinicians; (2) a structured management plan of medication support and behavioural activation – a structured cognitive-behaviourally based, depression-specific psychological intervention which has equivalent efficacy to other more complex CBT interventions (Dimidjian et al. 2006; Cuijpers et al. 2007) but is simpler to use and thus more suitable for collaborative care (Jacobson et al. 1996; Martell et al. 2001), with no other interventions permitted for the duration of the trial; (3) scheduled patient follow-ups via a maximum of ten scheduled contacts over a period of 3 months, predominantly using the telephone; and (4) enhanced inter-professional communication through patient-specific written feedback to GPs via electronic records and personal contact.
Case managers were a mix of professionals (nurse, counsellor and occupational therapist) and para-professionals (graduate primary-care mental health workers) all of whom received 2 days of protocol-specific training in addition to their existing clinical training and 30–45 min of supervision per week for the duration of the trial.
Control
Usual care management of depression by patients' GPs, including access to secondary services, and to best practice guidance published in local NHS depression protocols in the trial localities.
Outcome measures
The primary outcome was symptoms of depression as measured by the Patient Health Questionnaire-9 (PHQ-9; Kroenke et al. 2001). Secondary outcomes were the Clinical Outcomes in Routine Evaluation – Outcome Measure (CORE-OM; Barkham et al. 2001), measuring general wellbeing, and the Short Form Health Survey (SF-36 v.2; Ware et al. 2000), measuring health-related quality of life. All assessments were completed at baseline and 3 months post-randomization by trained assessors blind to participant allocation.
Analysis
We aimed to determine a point estimate of the effect size of collaborative care, specific to the UK primary-care setting. We conducted analysis of covariance accounting for baseline imbalances in depression scores and clustering within the units of randomization using the Huber–White sandwich estimator (White, 1980) within Stata 8 (Stata Corp., College Station, TX, USA). We used an intention-to-treat approach to examine the mean differences between the three groups and the associated confidence intervals and calculated coefficients to represent the difference between the cluster-randomized controls and the intervention group in follow-up outcome measures. For the main analysis of the effectiveness of the intervention, we calculated the standardized effect size (mean difference divided by the pooled standard deviation) between the intervention and cluster-randomized control groups. We examined the degree of contamination by comparing the coefficients of individually randomized and cluster-randomized control groups. We examined clustering of outcomes within practices by calculating the intra-class correlation coefficient (ICC).
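Two of the quantities named above, the standardized effect size and the ICC, can be sketched in plain Python. This is an illustration under stated assumptions, not a reproduction of the Stata analysis: it omits the covariance adjustment and the sandwich estimator, the sign convention (control minus intervention, so that lower symptom scores in the intervention group give a positive value) is chosen for the example, and the one-way ANOVA estimator of the ICC is one common choice that the text does not confirm was the one used. All function names and data are hypothetical.

```python
import math
from statistics import mean, variance


def pooled_sd(group_a, group_b):
    """Pooled standard deviation of two independent groups."""
    na, nb = len(group_a), len(group_b)
    pooled_var = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / (
        na + nb - 2
    )
    return math.sqrt(pooled_var)


def standardized_effect_size(control, intervention):
    """Mean difference divided by the pooled standard deviation.

    Assumed sign convention: positive when the intervention group has
    lower (better) symptom scores than the control group.
    """
    return (mean(control) - mean(intervention)) / pooled_sd(control, intervention)


def anova_icc(clusters):
    """One-way ANOVA estimator of the intra-class correlation coefficient.

    `clusters` is a list of lists of outcome scores, one list per practice.
    Negative estimates are truncated to zero, a common convention.
    """
    k = len(clusters)
    sizes = [len(c) for c in clusters]
    n_total = sum(sizes)
    grand_mean = sum(sum(c) for c in clusters) / n_total

    ss_between = sum(n * (mean(c) - grand_mean) ** 2 for n, c in zip(sizes, clusters))
    ss_within = sum((x - mean(c)) ** 2 for c in clusters for x in c)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)

    # Adjusted mean cluster size for unequal clusters.
    m0 = (n_total - sum(n * n for n in sizes) / n_total) / (k - 1)
    return max(0.0, (ms_between - ms_within) / (ms_between + (m0 - 1) * ms_within))


# Hypothetical follow-up scores, for illustration only.
effect = standardized_effect_size(control=[10, 12, 14], intervention=[6, 8, 10])
icc = anova_icc([[9, 11, 10], [14, 12, 13], [8, 10, 12]])
```

The ICC function assumes at least two practices and some variation in the scores; it does not guard against the degenerate case in which every score is identical.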
Results
We recruited 114 patients from February 2005 until March 2006: 41 to the intervention group, 38 to the patient-randomized control group and 35 to the cluster-randomized control group (Fig. 2). Table 1 details the sample characteristics. The average number of case manager/patient contacts was 6.46 (s.d.=1.69), taking a mean time per patient of 191.13 min (s.d.=70.68).
Table 1. Baseline characteristics of sample
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160708210040-72757-mediumThumb-S0033291707001365_tab1.jpg?pub-status=live)
PHQ-9, Patient Health Questionnaire-9; SCL-20, Hopkins Symptoms Checklist Depression Scale; CORE-OM, Clinical Outcomes in Routine Evaluation – Outcome Measure; SF-36 MCS, Short Form Health Survey mental component score; SF-36 PCS, Short Form Health Survey physical component score; EQ5D, EuroQol 5 Dimension Scale.
We found an effect size on PHQ-9 depression symptoms of 0.63 (95% CI 0.18–1.07) for the intervention compared to the cluster control (Table 2). We found the intervention to be more effective than the cluster control on the CORE-OM (0.45, 95% CI 0.11–1.01) and the mental component score of the SF-36 (0.67, 95% CI 0.19–1.16) but not more effective on the physical component score of the SF-36 (0.11, 95% CI −0.49 to 0.72). No adverse events were reported in any group.
Table 2. Follow-up scores, coefficients of difference between intervention and patient-randomized control group with cluster controls, 95% CI and p values of the difference
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160708210040-28828-mediumThumb-S0033291707001365_tab2.jpg?pub-status=live)
PHQ-9, Patient Health Questionnaire-9; CORE-OM, Clinical Outcomes in Routine Evaluation – Outcome Measure; SF-36 MCS, Short Form Health Survey mental component score; SF-36 PCS, Short Form Health Survey physical component score.
Evidence for substantial contamination was observed, as there was less difference in PHQ-9 depression outcomes between the intervention group and patient-randomized control group (−2.99, 95% CI −7.56 to 1.58, p=0.186) than between the intervention group and cluster-randomized controls (coefficient −4.64, 95% CI −7.93 to −1.35, p=0.008). The ICC for our primary outcome was 0.06 (95% CI 0.00–0.32).
Discussion
We found a moderate to large effect (Cohen, 1988; Lipsey & Wilson, 2001) of collaborative care, an effect which would be considered clinically significant under the guidelines for depression produced by the National Institute for Health and Clinical Excellence (2004), the first time this has been demonstrated in the UK. This effect is greater than that determined by systematic reviews (Gensichen et al. 2005; Gilbody et al. 2006) and equates to a mean difference between treated and usual care patients of approximately 5 points on the PHQ-9. Five points is the difference between symptoms of mild or moderate/severe intensity and between symptoms of moderate/severe and severe intensity. Furthermore, change in PHQ-9 scores achieved by the intervention patients from baseline to follow-up equates to a clinical shift of almost two categories of depression severity.
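As a rough illustration of how a 5-point difference relates to severity categories, the sketch below maps PHQ-9 totals to the widely used cut-offs of 5, 10, 15 and 20. The band labels and the helper name are assumptions brought in for the example and may not match exactly the categories intended in the paragraph above; the scores used are hypothetical.

```python
def phq9_severity(score):
    """Map a PHQ-9 total (0-27) to a severity band using cut-offs 5/10/15/20."""
    if not 0 <= score <= 27:
        raise ValueError("PHQ-9 totals range from 0 to 27")
    if score < 5:
        return "minimal"
    if score < 10:
        return "mild"
    if score < 15:
        return "moderate"
    if score < 20:
        return "moderately severe"
    return "severe"


# Hypothetical scores: a 5-point difference between two patients.
usual_care_band = phq9_severity(17)
collaborative_care_band = phq9_severity(12)
```

Because most of these bands are 5 points wide, a 5-point difference usually places two patients in adjacent bands, although this does not hold at the top of the scale, where the highest band spans 8 points.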
The optimal design for a full Phase III trial has also been clarified by the phased approach. We detected clear evidence of contamination, which has not always been accounted for in previous collaborative care studies (Gilbody et al. 2003). The individually randomized controls were closer to the intervention patients than to the cluster-randomized controls. Although the contamination is striking, its precise mechanism is unclear, but may relate to the sharing of information between case manager and GP. Examination of this mechanism may prove fruitful for the further development of interventions. As in any cluster-randomized trial, however, unmeasured differences between doctors could also explain some differences in outcomes, although we were careful to stratify cluster randomization by our four clinical sites to protect against this source of bias.
The main limitation of this study is the relatively small numbers in what was a Phase II trial. Consequently the results have wide confidence intervals around the mean and the effect size we obtained requires confirmation in a full trial. Although we did control for baseline depression severity as a covariate in our analysis, small numbers also prevented us from balancing potentially important variables such as ethnicity, marital status and gender in our randomization, which may or may not have influenced our results. Further, although there were no differences in consultations with GPs between groups, we do not have full detailed information on what constituted usual care in the control groups, which may have varied substantially and affected our results. However, Phase II trials are an important stage in carefully developing and testing new interventions and these results lend support to the utility of the MRC's complex-interventions research framework (MRC, 2000), which provided a logical and systematic structure to help us in the process of designing and testing collaborative care in the UK. For example, during our development work, we found four previous UK studies of collaborative care which in contrast to our results produced no or inconclusive effects (Wilkinson et al. 1993; Blanchard et al. 1995; Mann et al. 1998; Peveler et al. 1999). These studies were early trials in the development of collaborative care, and had not used the systematic framework to develop their interventions; our review indicates they had used suboptimal intervention ingredients (Gilbody et al. 2006).
Although the principle of carefully phased intervention development is an effective way to think about designing interventions and is supported by our results, the framework is not prescriptive and lacks close detail. Our specific approach is only one of many methodological possibilities.
The research implications are that a fully powered Phase III cluster-randomized trial should be the next step of the MRC's complex-interventions phased approach (Campbell et al. 2000; MRC, 2000) to investigating this complex intervention. Such a design will provide the best protection against both over- and underestimating the real effect size of collaborative care in the UK and will allow us to achieve a better balance of baseline demographic characteristics. A parallel qualitative investigation to this trial (Richards et al. 2006b) has shown the clinical procedures to be acceptable to patients, mental health workers and GPs. If such a trial were to confirm the effect size of our Phase II trial results, we would have evidence to enable the NHS to substantially improve the organization of its care for depressed patients in primary care and to assist primary-care providers to deliver an effective model of enhanced depression service within the GP contract.
Acknowledgements
This trial was funded by MRC grant no. G03000677; ID: 68073, International Standard RCT no.: ISRCT63222059. The researchers worked independently of the research funder.
Declaration of Interest
None.