Schizophrenia and bipolar disorder (BD) are among the top twenty causes of disability in the world, with an estimated 15.2 million years lived with disability (YLD) due to schizophrenia and 9.9 million YLD due to BD globally (2013 estimates) (1). The rate of all-cause mortality in schizophrenia and BD, over 15-year follow-up, is 2.08 and 1.77 times greater than the general population, respectively (Reference Hayes, Miles, Walters, King and Osborn2). Additionally, the premature mortality gap between individuals with schizophrenia and BD, and the general population is growing (Reference Hayes, Miles, Walters, King and Osborn2). The majority of premature deaths are linked to physical illness, such as cardiovascular and metabolic disease (Reference Hayes, Miles, Walters, King and Osborn2–Reference Wahlbeck, Westman, Nordentoft, Gissler and Laursen8). Review evidence points to a link between premature mortality and morbidity and long-lasting negative health behaviors (Reference Millier, Schmidt and Angermeyer7). A wide range of other factors also affect quality of life, including cognitive impairment, discrimination, stigma, social exclusion, and reduced opportunities for employment and education (7; 9–19). The impact on caregivers' quality of life and time is also substantial (Reference Millier, Schmidt and Angermeyer7).
As well as the humanistic burden, there is a large economic burden associated with schizophrenia and BD. Review evidence points to annual costs attributed to schizophrenia of between US$94 million and US$102 billion for different countries across the world (Reference Chong, Teoh and Wu20). At least half (50-85 percent) of these were indirect costs, such as productivity loss (e.g., absenteeism from work) or informal care (Reference Chong, Teoh and Wu20). Total costs for people with schizophrenia and BD are estimated to reach £14.7 billion by 2026, with 57 percent of these associated with lost earnings (Reference McCrone, Dhanasiri, Patel, Knapp and Lawton-Smith21).
Typically, the first-line therapeutic option for schizophrenia or BD is pharmacological (e.g., antipsychotic medication and/or mood stabilizers). However, some individuals do not adhere to, or actively decline, medication for a variety of reasons, and symptoms may be unresponsive to medication or require further support (Reference Kennedy, Altar, Taylor, Degtiar and Hornberger22). Psychological therapies, plus usual care (typically pharmacological treatments), can improve symptoms, and increase quality of life and functioning in people with schizophrenia and BD (Reference McDonagh, Dana and Selph23–Reference Swartz and Swanson26). Guidelines suggest that psychological therapies should be part of management strategies that are tailored to individual needs (27;28).
People with schizophrenia and BD comprise the majority of the population with severe mental illness; 94 percent of severe mental health service users in the United Kingdom have a diagnosis of either schizophrenia or BD (Reference Reilly, Planner and Hann29). Available literature highlights the economic and patient burden of schizophrenia and BD, demonstrating the need for effective treatment in this patient group. Constraints on health and social care funding require that existing resources are allocated efficiently; economic evaluation provides a useful tool to support decision making for these patient groups.
Most systematic reviews of the cost-effectiveness of treatments for psychosis have focused on pharmacological therapies. Three previous reviews that included psychological therapies were identified for schizophrenia or BD (Reference Amos30–Reference Abdul Pari, Simon, Wolstenholme, Geddes and Goodwin32). Amos et al. (2012) focused on early intervention (EI) for psychosis (Reference Amos30). Desmedt et al. (2016) evaluated the cost-effectiveness of integrated care models for people with chronic diseases, which included six studies on schizophrenia (Reference Desmedt, Vertriest and Hellings31). Finally, Abdul Pari et al. (2014) focused on management strategies (including pharmacological management) for BD (Reference Abdul Pari, Simon, Wolstenholme, Geddes and Goodwin32). While these reviews are valuable, they contain little recent evidence and are somewhat limited in scope, either by intervention or population group. Therefore, a more up-to-date literature search and comprehensive synthesis of the evidence is required to support evidence-based practice and research in the field.
The aim of this review was to determine the robustness of the current evidence base for economic evaluations of psychological interventions for schizophrenia or BD, and to identify any gaps in this evidence base.
Methods
The review protocol was published on the online PROSPERO international register of systematic reviews (CRD42017056579) (Reference Shields, Davies, Buck, Elvidge and Hayhurst33).
Search Strategy
Searches were initially performed in August 2015 and were updated in January 2017 and November 2018. Searches of the OVID Medline, EMBASE, and PsycINFO databases were restricted to English-language publications from 2000 onward (to maximize relevance to current practice). The NHS EED database was included in the initial search only; later searches excluded this database because new papers were not added after 2015. Search terms included terms specific to economic evaluation, the population of interest, and psychological therapy. Strategies and terms varied according to the database design. Free-text and standardized (MeSH) subject terms were used. Strategies were pilot tested to ensure all studies already known to the authors were retrieved. The full search strategies are provided in Supplementary Table 1.
Selection of Studies
Inclusion criteria are outlined in Table 1. Studies not meeting these criteria were excluded during the screening process.
Table 1. Inclusion Criteria for Systematic Review of Psychological Interventions for Schizophrenia and Bipolar Disorder
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20220224144507564-0146:S0266462319000448:S0266462319000448_tab1.gif?pub-status=live)
Independent screening was undertaken at two stages (first of abstracts and titles, then of full papers) by four reviewers (G.E.S., D.B., K.P.H., J.E.). A fifth reviewer (L.M.D.) resolved disagreements. The primary reason for exclusion was recorded at both stages.
Data Extraction and Quality Appraisal
The NHS EED quality checklist for data extraction and critical appraisal was adapted to develop a predefined data extraction form and the CHEERS checklist was used to support critical appraisal (34;35). Data extraction and quality assessment included information on study methodology, results, limitations, evidence gaps, and risk of bias. Data extraction was completed by one reviewer (D.B. papers published until 2016; L.M.D. papers published from 2016 onward) with 20 percent of data extraction checked by a second reviewer (G.E.S.).
Review findings are presented by means of narrative synthesis. As expected, and typical of economic evaluations, included studies and interventions were highly heterogeneous, limiting the usefulness of any quantitative synthesis (e.g., meta-analysis).
Cost values were converted into 2017 U.S. dollars, using the price index for each country and the purchasing power parity conversion factor, to facilitate comparison between studies set in different countries (36;37).
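The conversion described above can be sketched as follows. This is a minimal illustration only, assuming that costs are first inflated to 2017 prices in the local currency and then divided by a purchasing power parity (PPP) factor expressed as local currency units per U.S. dollar; the function name `to_2017_usd` and all input values are hypothetical and are not taken from the included studies or from the sources cited for the conversion.

```python
def to_2017_usd(cost_local, index_price_year, index_2017, ppp_2017):
    """Convert a local-currency cost to 2017 U.S. dollars.

    Assumed two-step approach (not confirmed by the review text):
      1. inflate the cost from its price year to 2017 using the
         country's price index;
      2. divide by the 2017 PPP conversion factor, expressed as
         local currency units per U.S. dollar.
    """
    if index_price_year <= 0 or index_2017 <= 0 or ppp_2017 <= 0:
        raise ValueError("price indices and PPP factor must be positive")
    cost_2017_local = cost_local * (index_2017 / index_price_year)
    return cost_2017_local / ppp_2017


# Hypothetical inputs: a cost of 1,000 local currency units reported in a
# year with price index 90, a 2017 index of 108, and a PPP factor of 0.75.
example_cost_usd = to_2017_usd(1000.0, 90.0, 108.0, 0.75)
print(round(example_cost_usd, 2))
```

Applying inflation before the PPP conversion is one common ordering; converting first and then inflating with a U.S. index is an alternative that can give different values.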
Results
In total, 4,412 articles were identified through database searches; 3,864 remained after excluding duplicates. Primary screening of abstracts and titles reduced this to 232 papers for full text review. Twelve papers, specific to schizophrenia or BD, were identified and included in the review (Figure 1).
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20220224144507564-0146:S0266462319000448:S0266462319000448_fig1g.gif?pub-status=live)
Figure 1. Identification of studies.
An overview of the key study characteristics is provided in Table 2.
Table 2. Overview of Studies
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20220224144507564-0146:S0266462319000448:S0266462319000448_tab2.gif?pub-status=live)
ACT, assertive community treatment; CEA, cost-effectiveness analysis; CBT, cognitive behavioural therapy; CUA, cost-utility analysis; EQ-5D, EuroQoL 5-Dimension questionnaire; GAF, global assessment of functioning; NR, not reported; IMT, integrated multidisciplinary team-based treatment approach; PANSS, positive and negative symptom scale; QLS, quality of life scale; SD, standard deviation; SFS, social functioning scale; WAIS-III, Wechsler Adult Intelligence Scale-III working memory.
Critical Appraisal
An overview of the quality of reporting (using the CHEERS checklist) of the economic evaluations is provided in Supplementary Table 2.
Population
According to trial recruitment criteria, the majority of study populations were predominantly made up of participants with schizophrenia (38–47). Two studies focused on participants with BD (48;49). Study participants were predominantly male and between 23 and 46 years old (Table 2). Note that this review included studies of populations with an author-reported diagnosis of schizophrenia or BD; the exact diagnostic criteria could vary across studies.
There was inconsistency in how duration of illness was defined. Three studies reported duration of illness: Barton et al. (2009) reported a mean of 4.8 years, van der Gaag et al. (2011) 10.14 to 11.02 years, and Priebe et al. (2016) a median of 11 years (range, 7 to 18 years) (39;42;46). Patel et al. (2010) reported that over half of patients had been in contact with psychiatric services for at least 10 years (41). McCrone et al. (2010) reported the proportion of patients with a first episode of psychosis (86 percent in the intervention group versus 71 percent in the control group) (40). Two studies reported the duration of untreated illness: Karow et al. (2012) reported means of 167 and 182 weeks in the intervention and control arms, respectively, and Rosenheck et al. (2016) a median of 74 weeks (44;47). One study reported the age of onset (22 years) (Reference Crawford, Killaspy and Barnes43), while another focused on the mean number of previous hospitalizations (6.3 in the intervention arm, 5.1 in the control arm) (Reference Lam, McCrone, Wright and Kerr49). Camacho et al. (2017) reported that over half of participants had 20 or more previous bipolar episodes (Reference Camacho, Ntais and Jones48). In two studies, duration of illness was not reported at all (38;45). These differences in quantifying duration of illness demonstrate the challenges in comparing study populations and drawing conclusions related to subgroups or specific populations. The observed variation in study populations (some of which is unclear due to differences in reporting) means that we are unable to differentiate between subgroups. Therefore, we considered the twelve studies as a whole.
Intervention and Comparator
The most common intervention was CBT (6/12 studies) (38;39–42;49). The inclusion of CBT is clearly associated with publication date; the six oldest studies evaluate CBT as an intervention whereas the six more recent studies consider alternative psychological interventions. Four studies considered interventions focusing on multidisciplinary provision of care (40;44;45;47). Three other intervention types were identified; Crawford et al. (2012) evaluated art therapy, Priebe et al. (2016) evaluated body psychotherapy (which uses movement and the body as a form of treatment) and Camacho et al. (2017) evaluated group psychoeducation which aims to enhance people's understanding of their condition (43;46;48). “Standard care” was the most common comparator (9/12) (38;40–45;47;49). Standard care definitions varied across studies, but common themes included usual access to secondary mental health services and pharmacological treatment. The exact components of standard care (e.g., specific medications and doses) were not typically reported; however, these are likely to be reported in the trial publications. One study did not describe standard care at all (Reference Patel, Knapp and Romeo41). This variation reduces the generalizability of each study.
Measure of Health Benefit
Six of the included articles presented a CUA (39;43;44;46–48), typically using the generic EQ-5D questionnaire to derive utilities and estimate quality-adjusted life-years (QALYs) (39;43;44;46;48). Rosenheck et al. (2016) applied a published mapping algorithm to estimate utility weights from Positive and Negative Symptom Scale (PANSS) scores using standard gamble and visual analogue scales (47;50).
The ten cost-effectiveness analyses (CEAs) used eight different measures of health benefit. Three used the Global Assessment of Functioning (GAF), a scoring system for severity of illness in psychiatry (38;43;45). Other psychiatric measures included full vocational recovery (Reference McCrone, Craig, Power and Garety40), the Social Functioning Scale (Reference van der Gaag, Stant, Wolters, Buskens and Wiersma42), and PANSS negative scale scores (Reference Priebe, Savill and Wykes46). One study used a subset of the Wechsler Adult Intelligence Scale-III (WAIS-III) focusing on working memory (Reference Patel, Knapp and Romeo41). Another used the generic Quality of Life Scale (Reference Rosenheck, Leslie and Sint47). The studies focusing on participants with BD conducted a CEA using the cost per days free of bipolar episodes (Reference Lam, McCrone, Wright and Kerr49) and the cost to avoid one relapse or to gain a relapse-free year (Reference Camacho, Ntais and Jones48). The heterogeneity in outcome measures used means that it is difficult to determine an overall, comparative, estimate of cost-effectiveness. This is a common issue with CEAs, as there are no agreed threshold levels of cost per disease-specific outcome (Reference Shiroiwa, Sung and Fukuda51–Reference Claxton, Martin and Soares53).
Included Costs
The most common perspective taken within included studies was that of the healthcare provider (7/12) (Reference Barton, Hodgekins and Mugford39,Reference Karow, Reimer and König44,Reference Priebe, Savill and Wykes46–Reference Lam, McCrone, Wright and Kerr49). Types of costs differed across studies. One study considered costs of schizophrenia-related health care only, excluding assertive community treatment and all other costs. This was the most limited inclusion of costs in the identified studies (Reference Karow, Reimer and König44). Other studies captured intervention, inpatient, outpatient, and residential day service costs. Most included primary and community care, medication, and social workers (10/12). Less commonly included costs were criminal justice services (3/12) (40;41;43), patient out-of-pocket costs (3/12) (38;39;42), social security benefits (e.g., sick pay) (Reference Patel, Knapp and Romeo41), lost wages (Reference van der Gaag, Stant, Wolters, Buskens and Wiersma42), and informal care (all 1/12) (Reference van der Gaag, Stant, Wolters, Buskens and Wiersma42). Three studies discounted future costs (38;43;45). However, this was not necessary in most studies due to the short time horizons (<1 or 2 years). Two studies appeared to collect certain costs but did not report them. The first described the inclusion of productivity losses but later excluded these because fewer than 5 percent of participants were employed at baseline (Reference Crawford, Killaspy and Barnes43). The second detailed that medication data were collected but not costed. It is unlikely that this omission has an important bearing on cost-effectiveness conclusions, as interventions are unlikely to affect medication use within the short study timeframe (Reference Priebe, Savill and Wykes46).
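As background to the point on discounting, the standard present-value calculation can be sketched as below. This is an illustration only: the function name `present_value`, the 3.5 percent rate, and the annual cost stream are hypothetical assumptions, and the discount rates actually applied in the included studies are not stated in this review.

```python
def present_value(costs_by_year, rate):
    """Discount a stream of annual costs to present value.

    costs_by_year[0] is the cost incurred in the first year and is left
    undiscounted; the cost in year t is divided by (1 + rate) ** t.
    """
    if rate <= -1:
        raise ValueError("discount rate must be greater than -1")
    return sum(cost / (1 + rate) ** t for t, cost in enumerate(costs_by_year))


# Hypothetical 5-year cost stream and an assumed annual rate of 3.5 percent.
annual_costs = [4000.0, 3000.0, 3000.0, 2500.0, 2500.0]
discounted = present_value(annual_costs, 0.035)
undiscounted = sum(annual_costs)
print(round(undiscounted - discounted, 2))  # effect of discounting on the total
```

Over a 1-year horizon every cost falls in the first year, so the discounted and undiscounted totals are identical, which is consistent with discounting being unnecessary for short follow-up.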
Risk of Bias
All economic evaluations were conducted using data collected in trials. One trial was nonrandomized, increasing the risk of selection bias (Reference Karow, Reimer and König44). The remaining evaluations were part of randomized trials, generally regarded as robust evidence. Over half reported that assessors were blinded, but blinding of clinicians and participants was not possible due to the intervention types (38;41;43;45;46;48;49). The remaining studies did not report blinding (39;40;42;44;47). While blinding is important in reducing study bias, it is accepted that blinding is more challenging in nonpharmacological trials (Reference Boutron, Tubach, Giraudeau and Ravaud54). All studies showed that arms were similar at baseline, with no significant differences that may confound results. Six studies confirmed that they were not powered to detect differences in cost-effectiveness (41–43;45–47); the remainder did not report this information. Two studies applied a complete case analysis (40;45), two did not report how they handled missing data (41;47), and eight imputed missing data using various techniques (38;39;42–44;46;48;49). Eight studies explicitly adjusted for a variety of baseline demographic data (38–41;43;46;48;49).
Follow-up ranged from 6 months to 5 years (median, 18 months). Only two studies were over 2 years in duration.
Study Results
The key results from included studies are presented in Table 3. It is important to note that these results reflect varying time horizons.
Table 3. Key Study Outcomes
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20220224144507564-0146:S0266462319000448:S0266462319000448_tab3.gif?pub-status=live)
CEAC, cost-effectiveness acceptability curve; CUA, cost-utility analysis; GAF, global assessment of functioning; MANSA, Manchester Short Assessment of Quality of Life; NR, not reported; PANSS, positive and negative symptom scale; QALY, quality-adjusted life-year; QLS, quality of life scale; SD, standard deviation; SFS, social functioning scale; WAIS-III, Wechsler Adult Intelligence Scale-III working memory.
a For the study by Camacho et al. (2017) (Reference Camacho, Ntais and Jones48), the trial results are presented here as they reflect the primary analysis.
All included studies found empirical clinical improvements in the intervention arm. Seven noted that this difference was statistically significant, including five of six CBT studies and half of all team-based interventions (two of four) (38;40–42;44;47;49). Barton et al. (2009) also considered a CBT intervention but did not describe whether the change was statistically significant (Reference Barton, Hodgekins and Mugford39). One of the team-based interventions (early interventions for first-episode psychosis) was found to cause no significant change in health (Reference Hastrup, Kronborg and Bertelsen45). Two “experimental” treatments (art therapy and body psychotherapy) did not have a statistically significant impact on health (43;46). The third (psychoeducation) did not report statistical significance; however, the 95 percent CI for the trial analysis did not cross zero for QALYs, indicating a statistically significant result (Reference Camacho, Ntais and Jones48). Camacho et al. (2017) reported that the within-trial economic analysis demonstrated a net benefit of group psychoeducation (vs group peer support) on three health benefit measures (QALYs, relapses avoided, and relapse-free years). The authors used an economic model to explore the cost-effectiveness of group psychoeducation compared with treatment as usual. This also found a net improvement in QALYs for group psychoeducation. However, the 95 percent confidence intervals crossed zero, suggesting this difference was not statistically significant (Reference Camacho, Ntais and Jones48).
Incremental costs were highly uncertain. All but two studies noted that the impact of the intervention on overall costs was not statistically significant (38;40–47;49). The two remaining studies did not report any statistical significance testing (39;48); one of these did show that the 95 percent CI crossed zero, indicating incremental costs were not statistically significant (Reference Camacho, Ntais and Jones48). The study by Hastrup et al. (2013) stands out as an outlier, reporting the highest intervention saving of $35,864 per patient, but with the longest follow-up period (5 years) (Reference Hastrup, Kronborg and Bertelsen45). Six studies conducted some form of sensitivity analysis on costs, including omitting medication costs, varying discount rates, and assuming therapy groups were run by volunteers (38;39;43;45–47). None of these sensitivity analyses indicated statistically significant differences in incremental costs. Because they focused on costs only, key drivers of cost-effectiveness are unknown.
Incremental cost-effectiveness ratio (ICER) results across studies are challenging to summarize due to differences in the chosen measure of health benefit. Seven studies reported cost savings due to the intervention (38;40;41;44–46;49); therefore, the intervention was dominant in cost-effectiveness terms (health improving and cost saving). Six studies reported the ICER using QALYs, ranging from dominant to $87,562 per QALY. Studies reported the likelihood of cost-effectiveness relative to a specified cost-per-outcome threshold, and this raises one of the key limitations of the studies: recommended thresholds only exist for QALYs (52;55;56). Three studies presented the percentage likelihood of cost-effectiveness conservatively, assuming that the decision maker would not be prepared to pay anything additional for an improvement in health (38;40;41). Given the lack of thresholds for the majority of health benefits, it is left to decision makers to consider how much they would be prepared to pay for specific health gains.
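The dominance logic described above can be illustrated with a short sketch. It assumes the conventional definition of the ICER as incremental cost divided by incremental health benefit; the function name `classify_icer`, the returned labels, and the input values are hypothetical and do not reproduce any included study's results.

```python
def classify_icer(delta_cost, delta_effect):
    """Classify an intervention versus its comparator.

    Returns a (label, ratio) pair. The ratio is only returned where a
    cost per unit of health benefit is meaningful; it is None otherwise.
    """
    if delta_effect > 0 and delta_cost < 0:
        # Health improving and cost saving: no ratio is needed.
        return ("dominant", None)
    if delta_effect < 0 and delta_cost > 0:
        # Health reducing and cost increasing.
        return ("dominated", None)
    if delta_effect == 0:
        # Division by zero: the ratio is undefined.
        return ("undefined", None)
    return ("trade-off", delta_cost / delta_effect)


# Hypothetical incremental results (cost in dollars, effect in QALYs).
print(classify_icer(-500.0, 0.02))   # cost saving with a health gain
print(classify_icer(1000.0, 0.05))   # extra cost for a health gain
```

A ratio in the "trade-off" case can only be judged against a willingness-to-pay threshold for the same outcome measure, which is why results based on disease-specific measures are hard to interpret.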
Discussion
All included studies showed health benefits attributed to the respective intervention, generally to a significant degree, although incremental costs were much more uncertain, as no studies identified a significant impact on costs. The majority of studies, therefore, concluded that psychological interventions, including CBT, are cost-effective for the treatment of people with schizophrenia. Only two studies were identified for BD, with different conclusions regarding cost-effectiveness. A key limitation of the identified literature is that many studies used arbitrary thresholds for cost-effectiveness and no study reported being powered to detect differences in costs, so there is some subjectivity around these conclusions.
Heterogeneity across studies makes comparisons challenging, in particular, the use of different measures of health outcome. The variation in outcome measures is likely to be partly due to the lack of a recommended outcome measure. It has been argued that generic measures of health, such as the EQ-5D, may be insensitive to psychosis symptoms and that a symptom-based measure should be considered (Reference Connell, O'Cathain and Brazier57–Reference Brazier59), although others refute this (Reference Subramaniam, Abdin and Vaingankar60–Reference Hayhurst, Palmer, Abbott, Johnson and Scott62). One review noted that while there have been many patient-reported outcome measures (PROMs) developed in psychosis, methodological quality is limited and different measures focus on different aspects (e.g., treatment satisfaction, quality of life, quality of the therapeutic relationship) (Reference Reininghaus and Priebe63). This multi-faceted approach to PROMs makes it hard to identify which outcome is the most appropriate. In addition, the choice of outcome measure is likely to be affected by the specific objectives of the intervention, which may focus on one aspect of the illness. There is a continuing debate regarding whether clinical recovery is aligned with patient experience, and thus whether it is truly meaningful to the individual (Reference Macpherson, Pesola and Leamy64).
A further complication is that there are no clear decision-making thresholds for most measures used in studies. Where there is no agreed threshold, the percentage likelihoods of cost-effectiveness produced by studies cannot be meaningfully compared.
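To show why a percentage likelihood of cost-effectiveness depends on the chosen threshold, the sketch below computes it from a set of paired incremental cost and effect estimates using net monetary benefit (threshold × incremental effect − incremental cost). This is a generic illustration under stated assumptions, not the method of any included study; the function name `prob_cost_effective` and the four replicate values are invented for the example.

```python
def prob_cost_effective(replicates, threshold):
    """Share of replicates with positive net monetary benefit.

    replicates: list of (delta_cost, delta_effect) pairs, for example
                from bootstrap resampling of trial data.
    threshold:  willingness to pay per unit of health benefit.
    """
    if not replicates:
        raise ValueError("at least one replicate is required")
    favourable = sum(
        1 for delta_cost, delta_effect in replicates
        if threshold * delta_effect - delta_cost > 0
    )
    return favourable / len(replicates)


# Hypothetical replicates (incremental cost, incremental effect).
replicates = [(-100.0, 0.01), (200.0, 0.02), (500.0, 0.01), (-50.0, -0.005)]

# With a threshold of zero, only cost-saving replicates count as favourable.
print(prob_cost_effective(replicates, 0.0))
print(prob_cost_effective(replicates, 60000.0))
```

The same replicates give different probabilities at different thresholds, so percentages reported against unlike or arbitrary thresholds are not directly comparable.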
The review found limited evidence regarding long-term differences in health and cost outcomes with psychological interventions in this population. Longer-term trials, and modeling studies extrapolating trial data over longer durations, would be useful to resolve this evidence gap.
While the evidence generally supports the use of psychological therapies in this population, decision/policy makers wishing to use the evidence would need to consider whether the results can be generalized to their setting, such as the applicability of standard care in the trial and the age of the population within the trial. They would also have to determine an acceptable willingness to pay for specific health gains. Additionally, there are psychological interventions that have not yet been evaluated for cost-effectiveness, such as mindfulness, for which clinical evidence is heterogeneous and of limited quality (Reference Langer, Schmidt and Mayol65). No studies were identified that looked at varying medication and introducing psychological treatment concurrently, which would be interesting as these are likely to be used in combination with one another. As more evidence becomes available, this review will need to be updated.
Our findings are similar to previous reviews, suggesting that, although the results are predominantly in favor of psychological therapies within this group, there are issues of generalizability, uncertainty, and a paucity of long-term data (Reference Amos30–Reference Abdul Pari, Simon, Wolstenholme, Geddes and Goodwin32).
This review is subject to several limitations. It was restricted to English language articles and did not include unpublished literature. Searching the gray literature and unpublished reports may have been more likely to identify studies with inconclusive or negative cost-effectiveness results (Reference Bell, Urbach and Ray66). Additionally, studies have found that economic data are more susceptible to publication bias when compared with clinical data (Reference Thorn, Noble and Hollingworth67). However, a search of the gray literature was outside the scope and resources for this review. Nevertheless, our review did identify published studies with inconclusive results (e.g., demonstrated by the reporting of nonsignificant cost outcomes) and negative results, which may mitigate the impact of publication bias. Our setting type did not include online interventions, although no studies were screened out because of this. The growth of online therapies in mental health has so far focused on adults with depression and anxiety. However, it is likely that future studies will consider the role of technology in the treatment of severe mental illness, meaning that the scope of future reviews will need to be expanded (Reference Lal and Adair68). Finally, the review included only psychological therapies, as existing reviews typically focused on pharmaceuticals. However, it may be useful to view findings for each intervention type side by side.
The review highlights several important considerations for future research. Longer-term evidence, from randomized controlled trials and/or economic modeling studies, is needed to assess all important differences in health and cost outcomes from psychological interventions for schizophrenia and BD. Future studies should consider the comparability and ease of interpretation of their cost-effectiveness results; decision makers are unlikely to be able to draw firm conclusions from an evidence base comprising such varied measures of health benefit. Generic measures of benefit, such as QALYs, can be easily compared across studies and even disease groups. Furthermore, QALYs have well-defined thresholds against which to base decisions regarding cost-effectiveness. Sensitivity analyses of clinical data, which can identify key drivers of cost-effectiveness, were generally lacking in the included studies. This form of analysis can characterize decision uncertainty and also guide future data collection. Finally, most studies were conducted in the United Kingdom and Europe. Research findings from a wider range of geographical settings are needed to ensure that decision makers have evidence that is generalizable to their jurisdiction.
This study presents independent research funded by the National Institute for Health Research (NIHR) under its Programme Grants for Applied Research program (grant reference no. RP-PG-0611-20004). The views expressed are those of the author(s) and not necessarily those of the NHS, the NIHR or the Department of Health.
Supplementary material
The supplementary material for this article can be found at https://doi.org/10.1017/S0266462319000448
Conflicts of interest
The authors declare that they have no conflicts of interest.