Introduction
Psychological trauma is common in the general population, with lifetime prevalence between 40% and 90% (Breslau et al. 1991, 1998; Norris, 1992; Resnick et al. 1993; Kessler et al. 1995; Bernat et al. 1998). Many individuals develop psychological symptoms in the aftermath of a traumatic experience, commonly referred to as post-traumatic stress disorder (PTSD) symptoms. These may include re-experiencing the traumatic event, avoiding stimuli associated with the traumatic event, and increased arousal. Although many survivors of psychological trauma do not satisfy DSM-III/IV criteria for PTSD diagnosis, they may still be severely impaired and at increased risk of suicide (Stein et al. 1997; Marshall et al. 2001; Zlotnick et al. 2002). Trauma survivors, with or without a formal diagnosis of PTSD, often develop chronic symptoms (Kessler et al. 1995; Koren et al. 2001; Mayou et al. 2002; Perkonigg et al. 2005; Breslau, 2009) and contribute considerably to health-care costs (Walker et al. 1999).
A variety of psychological interventions have been suggested to treat PTSD symptoms. Some are based on aetiological models and propose specific intervention components to relieve symptoms, such as exposure to trauma-related stimuli (Foa et al. 1991) or working through cognitions associated with the trauma (Ehlers et al. 2005). Other psychological interventions are based on components that are not unique to any particular psychological intervention and are therefore often described as ‘common’ or ‘non-specific’, for example empathy in supportive therapies or attention effects in relaxation treatments. Several standard meta-analyses have been designed to determine which psychological interventions are most promising for patients with PTSD symptoms, but they have been unable to come to a definite conclusion (Bradley et al. 2005; Benish et al. 2008; Bisson & Andrew, 2007).
There are two possible sources of evidence for a comparison of the effectiveness of two interventions A and B. The first is a direct within-trial comparison of interventions A and B (direct evidence). The second is a comparison of results from trials that compare either of the two interventions A and B with a common third intervention C (indirect evidence). Although direct, within-trial comparisons of active psychological interventions are the gold standard for establishing relative effectiveness, for many psychological interventions such comparisons are rare or even non-existent. Therefore, both sources of evidence are useful for a comprehensive evaluation of the relative effectiveness of interventions A and B.
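The logic of indirect evidence can be illustrated with a minimal sketch of a simple adjusted indirect comparison, in which the A versus B effect is obtained as the difference between the A versus C and B versus C effects and the variances add. This is a frequentist illustration only and not the Bayesian model used in this study; the function name and all numbers are hypothetical.

```python
import math

def indirect_comparison(es_ac, se_ac, es_bc, se_bc):
    """Indirect estimate of A versus B from A-vs-C and B-vs-C results.

    Assumes the two sets of trials are independent and share the
    common comparator C (illustrative assumption).
    """
    es_ab = es_ac - es_bc                      # difference of the two effects
    se_ab = math.sqrt(se_ac**2 + se_bc**2)     # variances of independent estimates add
    return es_ab, se_ab

# Hypothetical effect sizes (s.d. units) against a common comparator C
es_ab, se_ab = indirect_comparison(es_ac=-1.0, se_ac=0.20, es_bc=-0.6, se_bc=0.25)
ci_low, ci_high = es_ab - 1.96 * se_ab, es_ab + 1.96 * se_ab
print(f"indirect A vs B: {es_ab:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")
```

Because the variances add, an indirect estimate is always less precise than either of the two comparisons it is built from, which is one reason to combine it with direct evidence where available.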
Two of the above-mentioned meta-analyses took two approaches to determining the relative effects of different psychological interventions. One research group (Bisson & Andrew, 2007) conducted separate meta-analyses for each direct comparison of psychological interventions that occurred in the literature (direct evidence). They included trials that compared the different interventions head-to-head, or with control interventions, and compared effects across single meta-analyses (informal indirect evidence). When effects of single meta-analyses were equal, they assumed that those interventions were equally effective even when interventions had not been directly compared within a single trial. Heterogeneity remained unexplained in some cases, and this limited interpretation of their results. By this method, they concluded that trauma-focused cognitive behavioral therapy (CBT) and eye movement desensitization and reprocessing (EMDR) were more effective than other psychological interventions, such as supportive therapy (ST) and psychodynamic therapy (PT).
A second group (Benish et al. 2008) included only trials that compared different types of psychological interventions head-to-head (direct evidence) and pooled them in one meta-analysis. They restricted the range of psychological interventions to those they classified as ‘intended to be therapeutic’. This classification included psychological interventions such as PT, but excluded ST or stress management (SM). The latter interventions are typically used to control for non-specific intervention effects and the authors therefore did not consider them to be ‘intended to be therapeutic’. The authors interpreted the absence of between-trial heterogeneity of effect sizes as an indicator that different psychological interventions were equally effective and concluded that all interventions that were ‘intended to be therapeutic’ were equally effective.
Other meta-analyses that considered only direct evidence for the effectiveness of particular types of psychological interventions found no difference in the effectiveness of EMDR and trauma-focused CBT (Seidler & Wagner, 2006), EMDR and exposure-based therapy (ET) (Davidson & Parker, 2001), CBT, ET and cognitive therapy (CT) (Mendes et al. 2008) and ET, CBT, EMDR and CT (Powers et al. 2010).
The results of recent meta-analyses leave us with a patchwork of findings based on direct evidence and informal indirect evidence. Although the equivalent effectiveness of trauma-focused CBT, ET, CT and EMDR seems to have been established, the effectiveness of other psychological interventions (including PT, ST and SM) has not yet been ascertained. Residual heterogeneity complicates the interpretation of previous findings in many cases (Davidson & Parker, 2001; Bradley et al. 2005; Bisson & Andrew, 2007; Powers et al. 2010).
To overcome the limitations of the available comparisons and informal indirect evidence, we used network meta-analysis, a methodological approach that integrates trials comparing a variety of psychological interventions head-to-head or with a control condition (Lumley, 2002; Lu & Ades, 2004; Salanti et al. 2008; Cipriani et al. 2009). In network meta-analyses, the information available from within-trial comparisons of interventions A and B is combined with indirect comparisons of A and B derived from trials that compare either of the two interventions with a common comparator C (either a third psychological intervention or a control condition). Network meta-analysis has previously been used to investigate the effectiveness of pharmacological treatments for depression (Cipriani et al. 2009) and mania (Cipriani et al. 2011), and in the evaluation of psychological interventions for depression (Barth et al. 2013).
Meta-analyses on psychological interventions for PTSD used different approaches to classify such interventions. As a result, the number and definition of categories vary across meta-analyses. Some researchers used a large number of categories to capture differences between interventions according to their theoretical backgrounds (e.g. differentiating between mainly cognitive interventions, primarily exposure-based interventions and a mixture of cognitive and behavioral elements; Bradley et al. 2005; Mendes et al. 2008; Watts et al. 2013). Other authors suggested summarizing interventions with different theoretical backgrounds into broad categories (e.g. trauma-focused CBT including CBT, CT and ET; Bisson & Andrew, 2007). Finally, some authors even argued against categorizing interventions at all (Benish et al. 2008; Wampold et al. 2010) because differences in their effectiveness are only small. We chose an approach that allowed us to include a large number of direct comparisons between psychological interventions in the network (e.g. from dismantling studies such as Resick et al. 2008). At the same time we aimed at limiting the number of categories to a reasonable number (e.g. no differentiation between different types of exposure such as in vivo and in sensu).
To examine whether different approaches to classifying psychological interventions affect meta-analytic results, we successively reduced the number of intervention categories and looked at possible changes in effect sizes, heterogeneity statistics and model fit.
As the quality of primary studies is known to be a potential threat to the validity of meta-analyses (Jüni et al. 2001; Matt & Cook, 2009; Cuijpers et al. 2010b), we assessed the influence of study quality (Wood et al. 2008; Nüesch et al. 2009), sample size (Nüesch et al. 2010) and type of outcome assessment (Cuijpers et al. 2010a). We also controlled for the presence of a formal PTSD diagnosis in our analyses.
Method
Literature search
The literature search was based on a comprehensive initiative to build a database of references to clinical trials that had investigated the effectiveness of any psychological intervention for adults with PTSD symptoms. We searched bibliographic databases relevant for the field of psychotherapy (EMBASE, Medline, PsycINFO, Cochrane Controlled Trials Register and PSYNDEX) by combining key words and text words related to psychological interventions, randomized trials and PTSD (see online Appendix 1 for the search strategies used). We also checked the reference lists of relevant systematic reviews and meta-analyses (van Etten & Taylor, 1998; Bradley et al. 2005; Benish et al. 2008; Bisson & Andrew, 2008; Cloitre, 2009). The search was performed in January 2011 for trials published between 1980 and 2010.
Selection of trials
We included randomized trials in adults with full or subclinical PTSD that compared specific psychological interventions head-to-head (e.g. CBT compared with ET), against wait-list (WL), or against another control intervention using only non-specific intervention components such as therapist alliance, general attention or empathy (e.g. ST). Other potential control interventions, such as standard care involving pharmacological intervention or the use of pill placebos, were not eligible. Patients were considered to have subclinical PTSD if they had experienced at least one psychological trauma according to DSM-IV criteria and reported subsequent PTSD symptoms. We included both veteran and civilian samples. For a specific psychological intervention to qualify, it had to be implemented at the individual level (rather than as group, family or couples therapy), include face-to-face contact between the patient and the therapist (as opposed to telephone or internet-based interaction between patient and therapist), consist mainly of verbal communication, and directly address the trauma or subsequent PTSD symptoms. Trials had to be published as full journal articles; there were no language restrictions. We contacted the authors if the available information was not sufficient to determine inclusion of the trial. Seven investigators (H.G., T.M., one Ph.D. student and four M.Sc. students) determined eligibility according to a structured manual. Eligibility of a random sample of 200 references was determined by all seven investigators; the κ statistic for the coding of a reference as clearly included, clearly excluded or unclear based on the title and the abstract was 0.73. There were no disagreements about whether a reference could be excluded based on title and abstract. In ambiguous cases, a decision was made by consensus between H.G., T.M. and a senior researcher (J.B.) based on the full text.
Outcome measures
The prespecified primary outcome was severity of PTSD symptoms after the intervention or at a maximum of 1 month after the intervention was terminated, measured with a validated scale. We preferred data from scales that assessed symptoms according to DSM-III/IV diagnostic criteria over (sub)scales that focused on only one symptom cluster. When more than one outcome measure was reported, we extracted the measure that ranked highest in a predefined hierarchy. The most frequently used scales were given precedence. Self-rated PTSD symptoms were preferred to observer-rated PTSD symptoms, and results from intention-to-treat (ITT) analyses that included all randomized patients took precedence over results from analyses that excluded patients. Both self-rated outcome assessment and ITT analyses have been shown to result in conservative effect estimates (Nüesch et al. 2009; Cuijpers et al. 2010a).
Data extraction and coding
We classified interventions according to eight prespecified categories: WL, SM, ST, ET, CT, EMDR, CBT, and other psychological interventions (OPIs; see online Appendix 2 for descriptions of interventions). This classification was based primarily on the treatment descriptions in the published study reports. For further analyses we combined single categories. First, we reduced the number of intervention categories that relied on cognitive behavioral components: from three individual categories (i.e. CBT, CT and ET) to two (i.e. CBTc and ET), to one broad CBT category (CBTb); EMDR, OPI, ST, SM and WL remained unchanged. Second, we further reduced the number of categories to three, with all interventions that were based on specific intervention components for PTSD in one category of specific psychological interventions (i.e. CBT, EMDR, CT, ET and OPI), and interventions that were used as control for non-specific intervention components that are common to all psychological interventions as the second category (i.e. ST and SM), and WL as the third category.
Studies were classified according to their adherence to DSM-III/IV criteria for PTSD during patient inclusion. Studies in which at least 80% of patients satisfied DSM-III/IV criteria for PTSD were considered to adhere to DSM-III/IV criteria for PTSD during patient inclusion.
We assessed concealment of treatment allocation (Jüni et al. 2001; Wood et al. 2008), the reporting of ITT data (Nüesch et al. 2009) and the type of outcome assessment (Cuijpers et al. 2010a). Concealment of allocation was considered adequate if the investigators responsible for patient selection did not suspect which treatment was next before allocation. Analyses were considered adequate if all recruited patients were analysed in the group to which they were originally allocated, regardless of intervention received (ITT principle). Analyses were considered inadequate if data were insufficient to calculate the effect size (ES) based on the ITT sample. Outcome assessment was classified as self-rated when the patients used self-rating scales for outcome assessment, and as observer-rated when some other person was involved in data collection, for example through clinical interviews.
When necessary, means and measures of dispersion of clinical outcome data were approximated from figures in the reports. All trial data were extracted in duplicate on a standardized form (Epidata 3.1, The Epidata Association, Denmark) by two out of five investigators (H.G. or T.M. and three M.Sc. students). All investigators were trained with a manual in a 2-day training workshop. Disagreements were resolved after they had been reviewed by a third investigator. The median κ across all extracted clinical and methodological characteristics was 0.79 (range 0.62–0.97).
Statistical analysis
For each treatment arm, we standardized mean values at the end of treatment using the pooled standard deviation (s.d.) across arms within each trial. If s.d.s were not provided, we calculated them from standard errors (s.e.s), confidence intervals (CIs) or other measures as described elsewhere (Follmann et al. 1992; Reichenbach et al. 2007).
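The conversions described above can be sketched as follows, assuming that s.e.s and CIs refer to an arm mean and that a normal quantile is adequate for the CI (for small arms a t quantile would be more appropriate). The function names and the example trial are hypothetical, and the exact procedures of the cited references may differ in detail.

```python
import math
from statistics import NormalDist

Z_975 = NormalDist().inv_cdf(0.975)  # about 1.96

def sd_from_se(se, n):
    """s.d. of an arm recovered from the s.e. of its mean."""
    return se * math.sqrt(n)

def sd_from_ci(lower, upper, n, z=Z_975):
    """s.d. of an arm recovered from a 95% CI around its mean."""
    return math.sqrt(n) * (upper - lower) / (2 * z)

def pooled_sd(sds, ns):
    """Pooled s.d. across all arms of one trial."""
    numerator = sum((n - 1) * sd**2 for sd, n in zip(sds, ns))
    return math.sqrt(numerator / sum(n - 1 for n in ns))

def standardized_means(means, sds, ns):
    """End-of-treatment arm means expressed in pooled s.d. units."""
    sp = pooled_sd(sds, ns)
    return [m / sp for m in means]

# Hypothetical two-arm trial: intervention mean 20, wait-list mean 30
std = standardized_means(means=[20.0, 30.0], sds=[10.0, 10.0], ns=[20, 20])
es = std[0] - std[1]  # negative values favour the intervention
print(f"standardized difference: {es:.2f}")
```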
For the network meta-analysis, we used an extension of Bayesian random effects models for comparisons of multiple interventions (Smith et al. 1995; Lu & Ades, 2004). It considers all included comparisons between interventions, while completely preserving randomization within each trial and accounting for correlation between multiple comparisons within a trial with more than two treatment arms (Cooper et al. 2006). Pooled effect sizes (ESs) were derived from the median of the posterior distribution of the difference in standardized mean values of two treatments. Negative ESs indicate that the experimental intervention had a beneficial effect and may be interpreted as described elsewhere (Cohen, 1988), with −0.20 s.d. units representing a small, −0.50 a moderate and −0.80 a large difference between interventions. Corresponding 95% credibility intervals (CrIs) were derived from the 2.5th and 97.5th percentiles of the posterior distribution. The between-trial heterogeneity τ² was also estimated from the median of the corresponding posterior distribution. τ represents the s.d. of the underlying distribution from which the included trials are assumed to be a random sample. Based on our definition of small, moderate and large differences between interventions, we interpreted τ² as follows: τ² = 0.01 [(0.2/2)²] was considered to represent low heterogeneity, τ² = 0.0625 [(0.5/2)²] moderate heterogeneity and τ² = 0.16 [(0.8/2)²] high heterogeneity between studies. τ² has been shown to be independent of the number of studies and the number of patients included in the meta-analysis (i.e. no increase with large numbers of studies or large sample sizes; Rücker et al. 2008).
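As a minimal sketch of these summaries, the following shows how a posterior median and 95% CrI could be read off a set of posterior draws, and how the heterogeneity benchmarks follow from the ES benchmarks. The draws would in practice come from the sampler; the helper names and the toy input are hypothetical.

```python
import statistics

def tau2_benchmark(es_benchmark):
    """Heterogeneity benchmark: the square of half the ES benchmark."""
    return (es_benchmark / 2) ** 2

# Benchmarks for low, moderate and high heterogeneity
TAU2_BENCHMARKS = {
    "low": tau2_benchmark(0.2),
    "moderate": tau2_benchmark(0.5),
    "high": tau2_benchmark(0.8),
}

def summarize_posterior(draws):
    """Median and 2.5th/97.5th percentiles of posterior draws."""
    # n=40 yields cut points in steps of 2.5%; first and last bound the 95% CrI
    cuts = statistics.quantiles(draws, n=40, method="inclusive")
    return statistics.median(draws), cuts[0], cuts[-1]

# Toy input standing in for posterior draws of a parameter
median, lower, upper = summarize_posterior(list(range(101)))
print(median, lower, upper)
```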
The consistency of the network was determined by comparing effect estimates derived by a meta-analysis including only direct comparisons with the indirect effect estimates derived by a network meta-analysis excluding the respective direct comparison. This procedure was applied to all existing pair-wise comparisons in the analysis dataset. Goodness-of-fit of the model was assessed with Q–Q plots.
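The comparison of direct and indirect estimates can be sketched with a normal approximation: the two estimates are treated as independent, and the interval around their difference is checked for overlap with zero. This is an illustrative simplification rather than the Bayesian procedure applied here, and the function name and numbers are hypothetical.

```python
import math

def inconsistency(es_direct, se_direct, es_indirect, se_indirect, z=1.96):
    """Difference between direct and indirect estimates with a 95% interval."""
    diff = es_direct - es_indirect
    se_diff = math.sqrt(se_direct**2 + se_indirect**2)
    return diff, diff - z * se_diff, diff + z * se_diff

# Hypothetical direct and indirect estimates for one pair-wise comparison
diff, low, high = inconsistency(-0.50, 0.30, -0.90, 0.40)
no_evidence_of_inconsistency = low <= 0 <= high
print(f"difference {diff:.2f} (95% interval {low:.2f} to {high:.2f})")
```

An interval that includes zero indicates only that inconsistency was not detected; with few trials per comparison, such intervals are wide and the check has limited power.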
To determine whether the network of evidence was affected by small-study effects, we drew contour-enhanced funnel plots (Peters et al. 2008) and added lines representing predicted intervention effects derived from random effects meta-regression using the s.e. as the explanatory variable. Then we assessed funnel plot asymmetry with a regression test (Egger et al. 1997).
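A regression test for funnel plot asymmetry can be sketched as an ordinary least-squares regression of the standardized effect (ES divided by its s.e.) on precision (1 divided by the s.e.), where an intercept far from zero suggests asymmetry. The sketch below is a plain-Python illustration with hypothetical data and is not the Stata routine used for the analyses.

```python
import math

def egger_regression(effects, ses):
    """Regress ES/s.e. on 1/s.e.; return (intercept, slope, se_intercept)."""
    x = [1 / se for se in ses]                       # precision
    y = [es / se for es, se in zip(effects, ses)]    # standardized effect
    n = len(x)
    if n < 3:
        raise ValueError("at least three trials are needed")
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    resid_var = sum((yi - intercept - slope * xi) ** 2
                    for xi, yi in zip(x, y)) / (n - 2)
    se_intercept = math.sqrt(resid_var * (1 / n + x_bar ** 2 / sxx))
    return intercept, slope, se_intercept

# Hypothetical trials in which less precise trials report larger benefits
ses = [0.10, 0.20, 0.30, 0.40, 0.50]
effects = [-0.5 - 1.0 * se for se in ses]
intercept, slope, se_intercept = egger_regression(effects, ses)
print(f"intercept {intercept:.2f}, slope {slope:.2f}")
```

In this constructed example the slope recovers the underlying effect and the non-zero intercept reflects the built-in small-study effect; with real data the intercept would be compared with its standard error using a t distribution with n − 2 degrees of freedom.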
To determine whether estimated intervention effects were affected by trial characteristics, we performed stratified analyses by including an interaction term of treatment and trial characteristics as covariates in the network meta-analysis. We considered the following characteristics: adequate concealment of allocation, ITT analysis performed, trial size, type of outcome assessment and adherence to DSM-III/IV criteria for PTSD. p values for interaction effects between trial characteristics and intervention effects were estimated from the posterior distribution of covariates. These p values can be interpreted in the same way as traditional p values for interaction (Altman & Bland, 2003). We used two cut-offs for trial size. The first was based on the median of 19 patients per trial arm observed in included studies, and distinguished between very small trials with an average of 19 patients or less per arm, and trials with 20 patients or more. The second cut-off distinguished between small to moderate trials with an average of 59 patients or less per arm, and trials with 60 patients or more. This trial size yields more than 90% power to detect a moderate to large ES of −0.60 s.d. units at a two-sided α = 0.05.
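The power statement for the second cut-off can be checked with a normal approximation for a two-arm comparison of means with equal arm sizes. This is a sketch under that assumption (an exact calculation would use the non-central t distribution and gives a slightly lower value); the function name is hypothetical.

```python
import math
from statistics import NormalDist

def power_two_arm(es, n_per_arm, alpha=0.05):
    """Approximate power of a two-sided test for a standardized difference."""
    normal = NormalDist()
    z_crit = normal.inv_cdf(1 - alpha / 2)
    # Expected z statistic: ES divided by sqrt(2 / n) for equal arm sizes
    ncp = abs(es) * math.sqrt(n_per_arm / 2)
    return normal.cdf(ncp - z_crit) + normal.cdf(-ncp - z_crit)

power_large = power_two_arm(es=-0.60, n_per_arm=60)   # second cut-off
power_median = power_two_arm(es=-0.60, n_per_arm=19)  # median arm size
print(f"60 per arm: {power_large:.2f}; 19 per arm: {power_median:.2f}")
```

Under this approximation, 60 patients per arm gives power just above 0.90, whereas a trial of median size (19 per arm) has less than 50% power to detect the same difference.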
Finally, we investigated whether different approaches to classifying psychological interventions affect meta-analytic results. We successively combined single interventions into broader categories. This was implemented through a fully Bayesian strategy including the network with all psychological interventions, but effect parameters were restricted to be equal for comparisons of interventions between groups (e.g. the specific and the non-specific psychological interventions) and zero for comparison of interventions within the same group (e.g. interventions within the CBTb category). The network with effects for every psychological intervention was compared with a network based on the group-wise effects through goodness-of-fit using the deviance information criterion (DIC; Spiegelhalter et al. 2002). We used Stata releases 11 and 12 (StataCorp LP 2005, USA) and WinBUGS version 1.4.3 (MRC Biostatistics Unit 2007, UK) for all analyses.
Results
Initially, we identified 1311 references in our literature search and found 341 to be potentially eligible (Fig. 1). Sixty-six trials met our criteria and were included in the initial network meta-analysis (see online Appendix 7 for references and online Appendix 3 for a detailed description of each trial). The 66 trials had 155 arms that qualified for the analysis, with a median of 19 patients per arm (range 6–143), and a total of 4190 randomized patients. All but one trial was published in English. Thirty-two trials were conducted in the USA (48%). The median year of publication was 2003 (range 1989–2010). The most frequently evaluated specific psychological interventions were CBT in 31 trials (47%), ET in 23 trials (35%), EMDR in 20 trials (30%), followed by OPI in 14 trials (21%) and CT in six trials (9%). As control groups, WL was used in 37 trials (56%), followed by ST in 11 trials (17%) and SM in seven trials (11%) as non-specific control interventions (see Fig. 2 for the network of evidence).
Initial network meta-analysis
All 66 trials contributed to the overall network meta-analysis. Table 1 presents ESs of all interventions compared with WL. The five interventions, all based on specific psychological components, were associated with large ESs between −1.10 and −1.37. The ESs of ST and SM as non-specific control interventions were moderate (ES = −0.62 and −0.58 respectively). However, the only significant difference between two psychological interventions was that EMDR outperformed ST. Fig. 3a presents an overview of pair-wise comparisons (ESs with 95% CrIs) of all interventions. The τ² estimate of 0.30 suggested very large heterogeneity. There was no evidence of network inconsistency: although direct and indirect effect estimates differed in a range of moderate to large ESs for some comparisons, all CIs overlapped zero (see online Appendix 5).
Notes to Table 1: PTSD, post-traumatic stress disorder; CBT, cognitive behavioral therapy; CT, cognitive therapy; CrI, credibility interval; EMDR, eye movement desensitization and reprocessing; ES, effect size; ET, exposure therapy; ITT, intention-to-treat; OPI, other psychological intervention; τ², variability between trials; SM, stress management; ST, supportive therapies; n.e., not estimated (CrIs were larger than 20 standard deviation (s.d.) units).
a The p value indicates whether the difference between subgroups is significant.
Bold font indicates that the ES was statistically significant.
Exploration of variation between trials
Network meta-analyses that were stratified according to different characteristics of the included trials showed that the adherence to DSM-III/IV criteria for PTSD during patient inclusion was the most relevant moderator, with p = 0.01 (Table 1). ESs were larger in trials that adhered to DSM-III/IV during patient inclusion (with the smallest ES of −0.76 for SM and the largest ES of −1.58 for EMDR) and ESs were smaller in trials that included patients with subclinical PTSD (with the smallest ES of −0.10 for ST and the largest ES of −0.87 for CBT).
Examination of the funnel plot of all psychological interventions compared with WL indicated asymmetry with missing trials in areas of non-significance, even though the corresponding regression test was only borderline positive (p = 0.053; online Appendix 6).
Heterogeneity remained high in all strata except for trials with adequate concealment of allocation and trials with a large trial size. A large amount of heterogeneity between effect estimates from individual trials complicated the interpretation of results in most subgroups of trials. Therefore, we highlight the results of the two subgroups of studies with moderate heterogeneity, which allow clearer conclusions to be drawn.
Trials with adequate concealment of allocation
In the 10 trials with adequate concealment of allocation, CBT was used as the specific psychological intervention in five trials, EMDR in one trial, CT in four trials, ET in four trials and OPI in two trials. WL was used as control intervention in four trials and ST in one trial. The test of interaction was non-significant (p = 0.98). The five interventions, all based on specific psychological components, had large ESs. The ES of ST as a non-specific control intervention was moderate. The only significant difference between two psychological interventions was that EMDR outperformed ST (see online Appendix 4). Table 1 shows that the differences between ESs in the adequately versus inadequately concealed trials were only small. Between-trial heterogeneity was moderate (τ² = 0.04).
Large-sized trials
In the seven large-sized trials, CBT was used as the specific psychological intervention in five trials, ET in three trials, and traumatic incident reduction (TIR) therapy classified as OPI in one trial. WL was used as control intervention in five trials and ST in two trials. ESs of CBT and ET compared with WL were smaller than in the initial network meta-analysis (ES = −0.86 and −0.75 respectively) whereas the ES of ST as one of the non-specific control interventions did not change (ES = −0.61). However, the lower number of included trials reduced the precision of the estimates and the ES for ST was no longer statistically significant. For TIR therapy classified as OPI, the benefit was smaller (ES = −0.37) but 95% CrIs of all psychological interventions under investigation were largely overlapping: pair-wise comparisons showed that none of the three specific psychological interventions was superior to ST (online Appendix 4). Between-trial heterogeneity was moderate (τ² = 0.08).
Reduction of the number of intervention categories
Table 2 shows the results from different models in which the number of intervention categories was successively reduced. Combining single categories into broader categories did not affect either ESs or heterogeneity statistics. In all models we found no superiority of any specific psychological intervention over any other specific psychological intervention (Fig. 3b, c). However, we found evidence for the superiority of specific over non-specific psychological interventions. When all specific interventions were summarized into one broad category and compared to the non-specific psychological interventions, we found a moderate superiority of −0.55 (95% CrI −0.83 to −0.28, τ² = 0.29) of specific over non-specific psychological interventions. The goodness-of-fit of either modelling all interventions separately or using the grouping was comparable (DICs between 102.53 and 104.72) with a slight decrease in the models with fewer categories (Table 2).
Notes to Table 2: CBT, cognitive behavioral therapy; CBTb, broad CBT category; CBTc, CBT with focus on cognitions; CrI, credibility interval; CT, cognitive therapy; DIC, goodness-of-fit using the deviance information criterion; EMDR, eye movement desensitization and reprocessing; ES, effect size; ET, exposure therapy; OPI, other psychological intervention; SM, stress management; ST, supportive therapies; τ², variability between trials; WL, wait-list.
Discussion
Main findings
Our network meta-analysis of different psychological interventions in patients with PTSD symptoms, which integrated direct and indirect evidence, suggests that different specific psychological interventions for treating PTSD have similar benefits. We found no evidence for the benefit of differentiating between different types of specific psychological interventions with respect to their effectiveness. We found evidence for the presence of small-study bias (i.e. the overestimation of intervention effects in trials with small to moderate trial size). Effect estimates were based on robust evidence with respect to a reasonable number of trials and the size of the trials only for psychological interventions that are based on cognitive behavioral intervention components. Our results from large-sized trials, however, indicate that STs that are based on non-specific intervention components might be equally effective as psychological interventions developed specifically for the treatment of PTSD symptoms (i.e. CBT and ET). It is important to note, however, that interventions in the ST category may vary considerably in their actual content. Therefore, the conclusion that ST, CBT and ET may be equally effective for patients with PTSD symptoms seems premature and deserves more examination.
Strengths and weaknesses of the study
This study has several strengths. We performed an extensive literature search (Egger et al. 2003). To minimize bias and transcription errors, data extraction was performed electronically and independently by two investigators (Egger et al. 2001). Components used for quality assessment are validated and reported to be associated with bias (Jüni et al. 2001; Wood et al. 2008; Nüesch et al. 2009). Our network meta-analysis integrated all available evidence on the effectiveness of psychotherapy from direct and indirect comparisons into one analysis while fully preserving randomization.
This research has some limitations. Like standard meta-analysis, network meta-analysis assumes the included trials are drawn from the same population. This assumption implies, first, that heterogeneity between ESs of individual trials is small, and second, that direct and indirect effect estimates do not differ significantly (i.e. no inconsistency between directly and indirectly estimated ESs). If present, both heterogeneity and inconsistency complicate the interpretation of results. Although we found no evidence for inconsistency in any analysis, heterogeneity was reduced to a moderate amount only in some subsets of trials (i.e. trials with adequate concealment of allocation and large-sized trials).
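The two assumptions named above can be illustrated with a minimal sketch. It is not the model used in this study: it assumes a single closed loop of three interventions (A, B, C), uses the simple adjusted indirect comparison (often attributed to Bucher) rather than a full network model, and quantifies heterogeneity with a fixed-effect I² statistic. All function names and input values are hypothetical.

```python
import math

def indirect_estimate(d_ab, se_ab, d_ac, se_ac):
    """Indirect effect of C relative to B via the common comparator A.

    d_ab: effect of B relative to A; d_ac: effect of C relative to A.
    Variances add because the two sets of trials are independent.
    """
    d_bc = d_ac - d_ab
    se_bc = math.sqrt(se_ab ** 2 + se_ac ** 2)
    return d_bc, se_bc

def inconsistency(d_direct, se_direct, d_indirect, se_indirect):
    """Difference between direct and indirect estimates with a z-test."""
    diff = d_direct - d_indirect
    se = math.sqrt(se_direct ** 2 + se_indirect ** 2)
    z = diff / se
    # Two-sided p value under a standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return diff, se, z, p

def i_squared(effects, ses):
    """I² (in %) from Cochran's Q with inverse-variance weights."""
    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0

if __name__ == "__main__":
    # Hypothetical standardized mean differences, for illustration only.
    d_ind, se_ind = indirect_estimate(0.5, 0.3, 0.8, 0.4)
    print(inconsistency(0.2, 0.25, d_ind, se_ind))
    print(i_squared([0.2, 0.5, 0.9], [0.2, 0.25, 0.3]))
```

A small inconsistency difference relative to its standard error, together with a low I², would be in line with the assumption that the trials are drawn from the same population. A full network meta-analysis estimates these quantities jointly across all comparisons.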
We included only published trials in our analysis. In view of the skewed funnel plot, we suggest that including unpublished material would probably have resulted in smaller estimated benefits than those observed in our study.
We did not analyse the possibly moderating effect of additional characteristics of the patient sample, such as chronicity of the symptoms, the type of trauma or whether the sample comprised veterans or civilians. There is some evidence that such characteristics moderate relative effects between specific and non-specific psychological interventions for PTSD (Gerger et al. 2013). Nor did our study control for researcher allegiance bias (Munder et al. 2012), which may have introduced bias in effect estimates. Because comparison-level data such as allegiance cannot be considered in network meta-analysis, it is likely that researcher preferences influenced the intervention effects found in this study to some extent.
Finally, the association of ESs with the adherence to DSM criteria for PTSD during patient inclusion is based on aggregated data available on the level of trials rather than on individual patient data. This approach is susceptible to the ecological fallacy (Thompson & Higgins, 2002).
Comparison with other studies
Our findings of large ESs of specific psychological interventions and moderate ESs of non-specific psychological interventions are in line with findings from previous meta-analyses (Bradley et al. 2005; Bisson & Andrew, 2007). Our results confirmed previous findings that the effects of psychological interventions for PTSD are likely to be overestimated (Watts et al. 2013). However, our results extend previous findings because our analyses suggest that such small-study bias was present particularly in specific psychological interventions for PTSD but not in non-specific psychological interventions. This finding is based on exploratory analyses, however, and needs further examination. The finding that STs may be as effective as CBT and ET from our analysis of large-sized trials is confirmed by results from a recent meta-analysis on present-centred therapy, one of the STs in our meta-analysis (Frost et al. 2014). The authors found only a small and non-significant ES difference between present-centred therapies and specific psychological interventions for PTSD on PTSD symptoms.
Classifying interventions into mutually exclusive categories is a common procedure in meta-analyses that examine the effectiveness of different types of interventions. In previous meta-analyses we found various numbers of categories in addition to variation in the labelling of individual categories (e.g. Bradley et al. 2005; Bisson & Andrew, 2007; Watts et al. 2013). To reduce between-trial heterogeneity, Watts et al. (2013) subsequently subdivided their initial classification into more specific intervention types. Such an approach has the disadvantage that it results in a large number of intervention categories. As a consequence, the number of single meta-analyses that need to be conducted increases while the number of trials summarized in one meta-analysis decreases. By contrast, our network meta-analyses included all trials at the same time. We started with a larger number of categories and reduced the number of categories (i.e. number of nodes in the network) in subsequent analyses. We found no evidence that classifying specific psychological interventions into single categories according to their underlying theoretical background would reduce heterogeneity or provide a better model fit. This finding can be seen as confirmation of the conclusion drawn by Benish et al. (2008) and Wampold et al. (2010), who deny the worth of classifying specific psychological interventions into categories. For the initial classification of psychological interventions, we adhered to the labels given by the authors of primary studies in most cases. However, we cannot rule out the possibility that interventions within the same category differed as to their content, which might have contributed to heterogeneity.
The association of intervention benefits with methodological quality varied for different types of psychological interventions. This inconclusive pattern is in line with previous investigations (Bradley et al. 2005; Bisson & Andrew, 2007). The results from our analyses are difficult to interpret because a substantial amount of heterogeneity remained unexplained in the stratified analyses. This may be related to the generally unsatisfactory methodological quality and the predominantly small sample size of included trials. In addition, we calculated interaction effects and the corresponding p values based on a model that assumes the same interaction effect for all comparisons. A more flexible modelling approach would have allowed differential interactions to be detected for different comparisons, but data were too scarce for such an approach.
Conclusions
Our network meta-analysis suggests that patients with a formal diagnosis of PTSD and those with subclinical PTSD symptoms benefit from different psychological interventions. Those patients with a formal diagnosis, however, may benefit more from both specific and non-specific psychological interventions. We did not identify any intervention that was consistently superior to many or most other specific psychological interventions. Thus, we agree with the conclusion of Watts and colleagues that ‘factors, such as access, acceptability and patient preference should exert strong and appropriate influence over the choice of treatment’ (Watts et al. 2013, p. e547). Given the availability of effective treatment options and the severity of the disorder, the use of WL controls seems unethical and should be avoided in clinical trials. The effectiveness of EMDR and those interventions summarized in the OPI and ST categories in our analyses seems promising, but robust evidence from large trials with high study quality is lacking to date. In the future, large-sized trials should compare these promising interventions with CBT and ET, for which robust evidence is available, to expand available treatment options for PTSD.
Supplementary material
For supplementary material accompanying this paper visit http://dx.doi.org/10.1017/S0033291714000853.
Acknowledgements
We thank T. Barmettler for help with the literature search and trial retrieval, A. Volz, L. Trösch, R. Zurkinden and T. Tonia for help in the screening process, and H. Schmidt, J. Kummer and K. Abel for help with data extraction. We also thank K. Tal for providing editorial assistance with the manuscript.
J.B. and P.J. received a grant (no. 105314-118312/1) for this work from the Swiss National Science Foundation.
Declaration of Interest
None.