Introduction
The subjective and yet almost unmistakable quality of anger has been defined by psychologists as a feeling related to a cognitive appraisal of perceived wrongdoing and associated with the action tendency to undo or correct that wrongdoing (Smedslund, 1993). Angry feelings tend to be unpleasant and unwelcome for both recipient and respondent, but it is when they become maladaptive and detrimental to individual wellbeing and/or social harmony that professional assistance is required. There is now substantial evidence, for example, that trait anger, chronic hostility, anger expression, and acute anger episodes are all positively correlated with a number of adverse health outcomes, including cardiovascular disease and atherosclerosis (Williams, 2010). Angry arousal has also been shown to be associated with elevated pain intensity (Burns et al., 2015), with anger/hostility also implicated in a range of unhealthy appetitive behaviours such as overeating, problem drinking, and smoking (Izawa and Nomura, 2006). No less important, however, are the mental health concomitants of anger, with dysregulated anger identified as a key diagnostic criterion in at least five different psychiatric disorders (Fernandez and Johnson, 2016). Finally, and perhaps most obviously, anger arousal is considered to be a key antecedent to both aggressive and violent behaviour (Chereji et al., 2012; Day and Vess, 2013).
At least since the 1980s, the predominant approach to the treatment of problematic anger has been cognitive behavioural therapy (CBT), with meta-analyses of early outcome studies arriving at favourable conclusions about efficacy (e.g. Beck and Fernandez, 1998). These have been reinforced by the findings of a number of subsequent reviews, in relation to treatment for children and adolescents (Sukhodolsky et al., 2004), adults (DiGiuseppe and Tafrate, 2003), children with special education needs (Ho et al., 2010), adults with intellectual disabilities (Nicoll et al., 2013), non-institutionalized adults (Del Vecchio and O'Leary, 2004), and adult male offenders (Henwood et al., 2015). With evidence of the efficacy of CBT now replicated across studies and cross-validated across populations, attention has turned to the many differences in design and methodology within this field.
In this study, a rapid evidence assessment (REA) approach was adopted to evaluate design and methodological patterns in studies of CBT for anger. REA is an emerging form of systematic review that imposes restrictions on the search parameters in order to produce results that allow the timely communication of recent research findings (Burton et al., 2007). It examines specific questions/hypotheses and it does so in a way that is reproducible (Varker et al., 2015). Unlike meta-analysis, effect sizes are not computed, but empirically based conclusions are integrated in a way that has practical implications for clinical practice and public policy (e.g. Broom, 2016; Garrett et al., 2014; Phelps et al., 2017).
Our focus in this REA is on empirical evaluations of CBT for anger that have been published in the English language since the beginning of this century. First, we sought to identify different types of studies, according to a system for classifying research designs. Second, we were interested in how the construct of problematic anger has been operationalized and whether social desirability bias (SDB) was accounted for when self-report measures were employed. This, in itself, is an important issue that has the potential to directly influence the validity of evaluation studies in light of a series of research reports that identify significant negative correlations between scores on measures of self-reported anger and measures of SDB (e.g. Delamater and McNamara, 1987; Day et al., 2012). Third, we sought to investigate whether studies assessed readiness for treatment in the participants. As in other areas where psychotherapy has been offered (e.g. substance dependence), participants’ stage of motivation may have a bearing on their response to treatment and this may be more critical in the case of anger than for other negative emotions (Howells and Day, 2003). Finally, we revisited the over-arching question that runs through the literature in this field – that of treatment outcome. Because this is an REA rather than a meta-analysis, outcome was evaluated in terms of statistical significance rather than effect sizes; furthermore, clinically reliable change was also examined.
Method
In keeping with a shorter timeframe for REA reviews, the following inclusion criteria were applied: studies published in the English language, studies published in peer-reviewed academic journals from the year 2000 (since when there has been a surge of new studies awaiting synthesis), studies with quantitative rather than purely narrative information, and studies that implemented cognitive behavioural therapy for anger in participants who were 18 years or older in any setting (e.g. criminal justice, health and social services). Studies were also included if anger was a secondary target of treatment.
A three-step search strategy was used. First, a limited search was undertaken using PsycINFO in order to analyse text words contained in the titles and abstracts of articles, as well as specific index terms used to describe articles. Keywords used in the initial search included: (i) anger, aggression, violence, anger disorders, anger dysfunction, anger and dysregulation; in combination with (ii) therapy, psychotherapy, cognitive behavioural therapy, treatment, counselling, intervention, regulation, program, management, treatment outcome, treatment effectiveness, treatment efficacy, improvement, change. Both US and UK variations in spelling were tolerated. Second, all identified keywords and index terms were used in searching multiple databases. The databases that were searched spanned PsycINFO, MEDLINE, CINAHL, Criminal Justice Abstracts, Social Work Abstracts, and the Cochrane Database. All searches were then screened for any material that met the inclusion criteria above. Finally, the reference lists of all relevant articles were hand-searched for any additional studies that might be relevant.
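The keyword combination described above can be illustrated with a minimal sketch. It assumes that terms within each group were joined with OR and the two groups with AND, and that spelling variation was handled by adding explicit US variants; neither detail is reported in the text, and the function names are hypothetical rather than part of any database interface.

```python
# Keyword groups copied from the search description in the Method section.
ANGER_TERMS = [
    "anger", "aggression", "violence", "anger disorders",
    "anger dysfunction", "anger and dysregulation",
]
TREATMENT_TERMS = [
    "therapy", "psychotherapy", "cognitive behavioural therapy",
    "treatment", "counselling", "intervention", "regulation", "program",
    "management", "treatment outcome", "treatment effectiveness",
    "treatment efficacy", "improvement", "change",
]

# Assumption: UK/US spelling differences are covered by explicit variants.
SPELLING_VARIANTS = {"behavioural": "behavioral", "counselling": "counseling"}


def with_variants(term):
    """Return the term followed by any US-spelling variant of it."""
    variants = [term]
    for uk, us in SPELLING_VARIANTS.items():
        if uk in term:
            variants.append(term.replace(uk, us))
    return variants


def or_group(terms):
    """Join quoted terms (and their variants) into one OR group."""
    expanded = []
    for term in terms:
        for variant in with_variants(term):
            if variant not in expanded:
                expanded.append(variant)
    return "(" + " OR ".join(f'"{v}"' for v in expanded) + ")"


def build_query():
    """Combine group (i) and group (ii) with a single AND (an assumption)."""
    return or_group(ANGER_TERMS) + " AND " + or_group(TREATMENT_TERMS)


if __name__ == "__main__":
    print(build_query())
```

The resulting string would still need to be adapted to the field tags and truncation syntax of each individual database.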
To rank order studies according to research design, the Maryland Scientific Methods Scale (SMS; Farrington et al., 2002; Sherman et al., 2002, 2006) was applied. According to the SMS, five levels can be distinguished: (1) studies in which correlations between an intervention and outcome are reported; (2) studies that used a pre- and post-test design for a target group only (no control group); (3) studies that compared pre- and post-test measures for experimental and comparison groups; (4) studies that controlled for confounding variables and in which pre- and post-test measures have been compared between experimental and control groups; and (5) studies in which comparisons between experimental and control groups occurred after randomizing participants to groups (randomized controlled trials, RCTs). The assessment method was then classified in one of four categories: (i) questionnaire; (ii) self-monitoring; (iii) observation; and (iv) interview. Within the questionnaire category, the specific anger-assessment instrument was identified. Finally, measures of readiness and SDB were also identified, with specific mention of the instrument used.
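As an illustration of the five SMS levels just described, the ranking can be expressed as a simple decision rule. This is a sketch only: the `StudyDesign` fields and the `sms_level` function are hypothetical names introduced here, and the coding actually applied in this review may have relied on additional judgement that a four-flag rule cannot capture.

```python
from dataclasses import dataclass


@dataclass
class StudyDesign:
    """Hypothetical coding of the design features named in the SMS levels."""
    pre_post: bool            # outcome measured before and after treatment
    comparison_group: bool    # a comparison or control group was included
    controls_confounds: bool  # confounding variables were controlled for
    randomized: bool          # participants were randomized to groups


def sms_level(design):
    """Map design features to an SMS level from 1 (lowest) to 5 (RCT)."""
    if design.comparison_group and design.randomized:
        return 5
    if design.comparison_group and design.pre_post and design.controls_confounds:
        return 4
    if design.comparison_group and design.pre_post:
        return 3
    if design.pre_post:
        return 2
    # Only an association between intervention and outcome is reported.
    return 1


if __name__ == "__main__":
    rct = StudyDesign(pre_post=True, comparison_group=True,
                      controls_confounds=False, randomized=True)
    single_group = StudyDesign(pre_post=True, comparison_group=False,
                               controls_confounds=False, randomized=False)
    print(sms_level(rct), sms_level(single_group))
```

Checking randomization first reflects the description of level 5, which depends on random assignment rather than on statistical control of confounds.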
Results
The initial search yielded a total of 5313 hits (3698 after duplicates were removed). A manual review of the titles according to the inclusion criteria resulted in 476 potentially relevant studies. This was followed by a manual review of abstracts, with 50 studies identified and the full-text articles subsequently accessed. Of these, four were excluded because the dependent measures were not explicitly related to anger. Two were excluded because of ambiguity regarding the cognitive behavioural content of the treatment, one was excluded because no statistical analyses of data were completed due to the small number of participants, and one was excluded because it was the full version of a report that was subsequently published in an academic journal. The final collection of 42 studies dating from the years 2000–2017 was then coded and analysed within the framework of rapid evidence assessment (Table 1).
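The screening counts reported above can be checked with a few lines of arithmetic. The sketch below uses only the figures given in this paragraph; the variable names and the wording of the exclusion labels are ours.

```python
# Counts taken directly from the Results paragraph above.
initial_hits = 5313
after_duplicates_removed = 3698
retained_after_title_review = 476
full_text_accessed = 50

# Full-text exclusions, with the reasons paraphrased from the text.
exclusions = {
    "dependent measures not explicitly related to anger": 4,
    "ambiguous cognitive behavioural content": 2,
    "no statistical analysis (small number of participants)": 1,
    "full version of a report later published in a journal": 1,
}

duplicates_removed = initial_hits - after_duplicates_removed
total_excluded = sum(exclusions.values())
final_included = full_text_accessed - total_excluded

if __name__ == "__main__":
    print("Duplicates removed:", duplicates_removed)
    print("Excluded at full text:", total_excluded)
    print("Final number of studies:", final_included)
```

The eight full-text exclusions account for the difference between the 50 articles accessed and the 42 studies retained.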
Table 1. Summary of studies

Within groups refers to intervention groups only. RCT, randomized controlled trial; PP, pre–post design; TAU, treatment as usual; NTC, no-treatment control; WLC, wait-list control. STAXI, State-Trait Anger Expression Inventory; TAS, Trait Anger Scale; PI, Provocation Index; DPI, Dundee Provocation Inventory; DAS, Driving Anger Scale; PACS, Profile of Anger Coping Skills; BAAQ, Brief Anger Aggression Questionnaire; WARS, Ward Anger Rating Scale; ACI, Anger Control Inventory; WAKS, Watt Anger Knowledge Scale; SAM, Short Anger Measure; SCQ, Stages of Change Questionnaire (Anger); STRS, Serin Treatment Readiness Scale (regarding anger); ADS, Anger Disorders Scale.
On the first issue of SMS rankings, 25 of the 42 outcome studies were designed as RCTs, and thus attained the highest ranking of 5. As can be seen from Fig. 1, about a quarter of the studies used comparison groups but without random assignment (ranking of 4). The remaining studies were distributed more or less equally between rankings of 3 and 2, with no study relegated to the lowest ranking of 1. The vast majority (36 studies) utilized questionnaires to assess change. Some of these studies also incorporated other methods but, overall, self-monitoring, experience sampling, and behavioural observation were relatively rare. Of the questionnaires employed, the STAXI and/or its successor, the STAXI-2, were by far the most common (k = 28). As shown in Fig. 2, several other psychometrically sound instruments (e.g. Anger Disorders Scale and Profile of Anger Coping Skills) were each used in no more than two studies, while nine studies used miscellaneous anger assessment questionnaires.

Figure 1. Study design ranking based on the Maryland Scientific Methods Scale (SMS)

Figure 2. Anger assessment questionnaires used in studies
With regard to readiness for treatment, six studies alluded to issues of ‘motivation’ or ‘desire’ to change (especially studies by the Deffenbacher group; Deffenbacher et al., 2000, 2002). The determination of motivation in these studies seems to have been based largely on clinical judgement. Only three studies actually measured readiness to change: Howells et al. (2005) used the Stages of Change Questionnaire (SCQ) and Serin Treatment Readiness Scale (STRS); Heseltine et al. (2010) used the STRS; and Henwood et al. (2016) used the Anger Readiness to Change Questionnaire (ARC-Q). Similarly, only a handful of studies recognized the risk of SDB (e.g. Gonzalez-Prendes and Hernandez Jozeforwicz-Simbeni, 2009), and this was actually measured and controlled for in only one study (Howells et al., 2005). Finally, with regard to the outcome of treatment, all but one of the 42 studies [that of Howells et al. (2005), which was a study within a correctional population] found statistically significant effects of CBT on anger. Of the 21 studies that went on to further evaluate clinically significant/reliable change, all but one (Howells et al., 2005) reported clinically significant levels of change.
Discussion
This REA of the design and methodology of studies applying CBT for anger reveals several trends and norms in this field of research this century. First, we note that the RCT design has come to be a sort of gold standard in treatment evaluation research, thus earning a top ranking of 5 in the Maryland SMS system. This type of design was utilized in well over half the studies published on CBT for adult anger. The choice of control group, however, did vary across studies. Some used a ‘no-treatment’ control group, while others used a ‘wait-list’ control group. More than a third of studies utilized quasi-experimental designs involving other comparison groups (e.g. ‘treatment as usual’), perhaps indicating the practical difficulties in providing ‘simulated’ or ‘sham’ treatment to a control group in an applied setting. Recruiting a wait-list control group may also pose particular challenges for participants who are keen to receive treatment without delay.
A key observation is that self-report questionnaires continue to be the most common method of measuring treatment change. This is perhaps unsurprising given that the phenomenology of anger identifies it as a subjective feeling communicated from a first person perspective (Smedslund, 1993). What is somewhat unexpected, however, is how few studies attempted to correlate client self-report with observational data or even self-monitored behaviours. There are several different ways that anger can be assessed, including: angry experience (e.g. trait anger scales); angry behaviour (e.g. self-reported anger expression scales or observed anger in role plays); and somatovisceral changes (e.g. galvanic skin response, high blood pressure or increased heart rate). Methodologies such as ecological momentary assessment (EMA; Kirchner and Shiffman, 2013; Stone and Shiffman, 1994) can also help shed light on the domain specificity of anger by capturing what transpires in naturalistic settings; with collateral input from significant others, EMA can extend the external validity of self-report data obtained in clinical settings. However, multiple assessment methods are rarely employed. Of the self-report questionnaires, the STAXI and its successor, the STAXI-2, remain well-entrenched. Yet, rival instruments such as the Novaco Anger Scale (Novaco, 2003) (and its predecessor) have an equally long history. In the clinical context, a number of newer instruments also draw attention to a variety of facets of anger experience and expression that are not captured by the STAXI (cf. review by Fernandez et al., 2015). The Anger Disorders Scale, for example, permits a number of dysfunctional anger profiles to be identified, but is rarely used in evaluation research.
Related to this, the fundamental issue of SDB in self-report is rarely considered, even though anger is more socially unacceptable than depression or anxiety and, therefore, more prone to impression management. Several studies acknowledge this issue, yet only one (Howells et al., 2005) formally assessed and corrected for SDB in relation to anger. It is interesting to note that this is the only study that reported neither a statistically significant nor a clinically significant improvement following treatment. Such a finding is arguably due to the correction for SDB. Conversely, the significant findings reported by most studies in this field may be a product of ‘uncorrected’ SDB. The importance of measuring SDB is demonstrated in van de Mortel's (2008) review of questionnaire-based studies listed on CINAHL in 2004 and 2005. Although the topic of that study extended beyond anger, only 31 (0.2%) studies measured SDB. Of these, almost half reported that social desirability influenced the results. Negative associations between SDB and anger in particular have also been demonstrated (e.g. Delamater and McNamara, 1987; Welte and Russell, 1993), with self-deception and impression management contributing to under-reporting of anger (Dutton and Hemphill, 1992). However, there was also evidence of SDB diminishing over the course of treatment in the Dutton and Hemphill study as participants learned about appropriate anger expression. It has also been suggested that SDB is particularly influential in individuals whose attendance in programs is mandated or encouraged by family, friends, or employers (Gonzalez-Prendes and Hernandez Jozeforwicz-Simbeni, 2009).
Another covariate of interest is the treatment motivation of the client. Also called ‘readiness’, this has often been conceptualized within the framework of the transtheoretical stages of change model (Prochaska and DiClemente, 1982). With the advent of a readiness measure (the ARC-Q in 2003), it has been possible to assign anger treatment-seekers to precontemplation, contemplation, and action stages of change. Yet, only three studies so far this century have availed themselves of this methodology while several others appear to have speculated or exercised clinical judgement when evaluating the impact of readiness/motivation on treatment outcome.
On the whole, the evidence about treatment outcome is highly encouraging, with statistically significant reduction in anger reported in all but one of the 42 identified studies, and clinically reliable change reported in all but one of the 21 studies where it was evaluated. Putting aside the SDB issue broached earlier, this means that the outcomes were probably not related to chance and they are non-trivial in a practical sense. As noted by Hanson and Wallace-Capretta (2000), even if pre- and post-treatment scores are significantly different (statistically), behavioural change is unlikely when post-treatment scores remain within the clinical range. The extent to which an individual has improved or deteriorated following participation in an intervention is important for subsequent decisions about program effectiveness, and may also have important implications for that individual's future. Therefore, it is important that judgements about change in individual behaviour over the course of treatment are based on valid and reliable assessments. With the development of scientific methods to assess clinically significant changes in individuals (e.g. the Reliable Change Index; Jacobson and Truax, 1991), it is important that studies go beyond questions of ‘what is statistically significant?’ to ‘what is clinically meaningful?’.
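For readers unfamiliar with the Reliable Change Index, the following sketch shows the commonly cited Jacobson and Truax (1991) computation, in which the pre- to post-treatment difference is divided by the standard error of the difference. The function names are ours, and the scores, standard deviation, and reliability coefficient in the usage example are invented for illustration; they are not taken from any study in this review.

```python
import math


def reliable_change_index(pre, post, sd_pre, reliability):
    """Return (post - pre) divided by the standard error of the difference.

    sd_pre is the pre-treatment standard deviation of the measure and
    reliability is its reliability coefficient (e.g. test-retest).
    """
    if not 0.0 <= reliability < 1.0:
        raise ValueError("reliability must be in the range [0, 1)")
    if sd_pre <= 0:
        raise ValueError("sd_pre must be positive")
    # Standard error of measurement of the instrument.
    se_measurement = sd_pre * math.sqrt(1.0 - reliability)
    # Standard error of the difference between two scores.
    s_diff = math.sqrt(2.0 * se_measurement ** 2)
    return (post - pre) / s_diff


def is_reliable_change(rci, criterion=1.96):
    """A change is conventionally deemed reliable if |RCI| exceeds 1.96."""
    return abs(rci) > criterion


if __name__ == "__main__":
    # Hypothetical client: an anger score falls from 30 to 22 on a scale
    # with SD = 6 and reliability = 0.85 (illustrative values only).
    rci = reliable_change_index(pre=30, post=22, sd_pre=6, reliability=0.85)
    print(round(rci, 2), is_reliable_change(rci))
```

A reliable change in this sense indicates only that the difference is unlikely to reflect measurement error; whether the post-treatment score has left the clinical range is a separate question, as the Hanson and Wallace-Capretta point above makes clear.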
To conclude, we have undertaken a systematic review by rapid evidence assessment of studies published this century on CBT for anger in adults. In doing so, we have uncovered certain trends and norms in design, noted patterns in the operationalization and methodology of these studies, and charted new trajectories for the future of this field. As a whole, the RCT has fast become the prevailing design and the STAXI is the dominant measure of anger. The vast majority of these studies evidenced statistically significant outcomes and about half addressed and reported clinically reliable change, although this picture is open to further elaboration as additional methodological variables come under investigation in the future.
Acknowledgements
The authors thank the various students in our respective laboratories who assisted with data extraction. Special thanks to Dr Yilma Woldgabreal for statistical assistance.
Funding: This study was funded in part by a grant from the Dean, College of Liberal and Fine Arts, University of Texas at San Antonio.
Ethical statement: The research reported in this manuscript was conducted in accordance with the Ethical Principles of Psychologists and Code of Conduct as set out by the American Psychological Association. No ethical approval was required by the Institutional Review Board, because the research involved reviewing existing publications rather than any direct interaction with human subjects.
Conflicts of interest: The authors have no conflicts of interest with respect to this publication.