INTRODUCTION
Various deficits in executive functioning have been identified in the eating disorder literature, including deficits in working memory (Kemps et al. 2006) and response inhibition (Southgate, 2005). Set shifting, the ability to move back and forth between multiple tasks, operations, or mental sets (Miyake et al. 2000), is a major component of executive functioning. Problems in set shifting may manifest either as cognitive inflexibility (e.g. concrete and rigid approaches to problem solving and stimulus-bound behaviour) or response inflexibility (e.g. perseverative or stereotyped behaviours).
Three recent theoretical papers have suggested that a problem with set shifting may be one of the risk factors for developing an eating disorder (Southgate et al. 2005; Tchanturia et al. 2005; Steinglass & Walsh, 2006), which may be linked to compulsive traits, rigidity and perfectionism (Tchanturia et al. 2002, 2004a, b). There is also a suggestion that set shifting may be part of the eating disorder endophenotype, as deficits in set shifting have been found in both the affected and unaffected members of sister pairs (Holliday et al. 2005).
The aim of this systematic review and meta-analysis was to collate and summarize the literature on set-shifting ability in people with eating disorders.
METHOD
The ‘QUOROM statement’ for meta-analyses was followed.
Searching
Papers were located using the electronic databases PsycINFO, Medline and Web of Science, by additional hand searches through reference lists and specialist eating disorder journals, and through direct contact with academic institutions with an interest in this area. Journals were searched up to December 2005. The search keywords were: neuropsychology, set shifting, flexibility, rigidity, mental flexibility, cognitive rigidity, perseveration, wisconsin card sorting test, trail making test, brixton, haptic, catbat, eating disorder, anorexia nervosa, and bulimia nervosa. No date restrictions were applied to the selection of literature.
Any study employing the set-shifting tasks Trail Making Test (TMT), Wisconsin Card Sorting Test (WCST), Brixton task, Haptic Illusion, CatBat task, or the set-shifting subtest of the Cambridge Neuropsychological Test Automated Battery (CANTAB) with an eating disorder population was eligible for inclusion. All selected tasks require shifting between mental sets and strategies, although the specific operations involved may differ.
The Trail Making task (Kravariti et al. 2003)
Participants numerically connect numbered circles on a page in a ‘dot-to-dot’ fashion (trail A), and then alternately link numbers and letters, i.e. 1–A–2–B–3–C (trail B). This task can be administered using pen and paper (Reitan, 1958) or, more recently, using a computerized version (which includes an alphabetic sequence task). Time taken to complete trail B (switching task) is the measure of set-shifting ability.
Wisconsin Card Sorting Test (WCST; Computer version 4 Psychological Corporation)
Subjects are instructed to match stimulus cards with one of four category cards; a single red triangle, two green stars, three yellow crosses and four blue circles. The sorting rule changes unpredictably during the course of the task. The WCST can be administered manually by a clinician or using a computer program. The number of perseverative errors is used as a measure of set-shifting ability.
The Brixton task (Burgess & Shallice, 1997)
Participants are asked to predict the movements of a blue circle on a computer screen, which changes location after each response. Occasionally, the pattern of movement changes and the participant has to abandon the old concept in favour of a new one. The total number of errors is a measure of set-shifting ability.
The Haptic Illusion task (Uznadze, 1966; Tchanturia et al. 2001)
This is a perceptual, set-shifting task using three wooden balls. After 15 trials with balls of different sizes in their hands (with eyes closed – fixation stage), participants judge the relative size of two, same-sized balls (critical stage). The number of trials where illusions are experienced (same-sized balls perceived as different sizes) is a measure of perceptual inflexibility.
The CatBat task (Eliava, 1964; Tchanturia et al. 2002)
Participants are asked to fill in missing letters in a written short story. In the first part of the story the context requires a ‘C’ (for cat), then the context changes and ‘B’ (for bat) becomes most appropriate. The number of perseverative errors (‘C’ where ‘B’ is appropriate) is the measure of set-shifting ability.
CANTAB IDED set-shifting subtest (Downes et al. 1989)
The Cambridge intra-extra dimensional (IDED) set shift consists of stimuli (colour-filled shapes and white lines) that appear in four rectangles on a computer screen. The subject must learn the correct stimuli for selection, based on audio and visual feedback. After six correct trials (maximum 50 trials) subjects move to the next stage and the rule shifts. Total number of errors was used as the measure of set-shifting ability.
Selection
A total of 22 studies were selected following the above search criteria. Upon inspection of the full manuscripts, three of these papers were excluded (Fox, 1981; Ferraro et al. 1997; Bayless et al. 2002), as raw data (mean and standard deviation) was not presented and was unavailable from the authors on request. A further four papers were excluded as they did not contain a healthy control group, and therefore the effect size could not be calculated (Touyz et al. 1986; Lauer et al. 2002; Kitabayashi et al. 2004; Frieling et al. 2005). A total of 15 papers are included in the systematic review. One of the selected papers was in a foreign-language journal (Koba et al. 2002), and another was initially in review (Steinglass et al. 2006); however, by the time the current paper was submitted, that study had been published.
Data abstraction
Descriptive statistics (mean, standard deviation and sample size) for eating disorder and control groups were extracted from the papers. If this data was missing it was requested from the author.
Quantitative data synthesis
Analyses were carried out in Stata 9.1 (StataCorp, College Station, TX, USA) using the user-contributed commands for meta-analyses metan (Bradburn et al. 1998) and metabias (Steichen, 1998).
The mean difference in scores between eating disorder and healthy control groups was standardized by calculating Cohen's d, the difference between the two raw means divided by the pooled standard deviation (Rosenberg et al. 2000). The standard error of each study's standardized effect size was calculated from the estimated effect and the sizes of the two groups using the method of Cooper & Hedges (1994), which is implemented in metan.
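The calculation of Cohen's d described above can be sketched in a few lines. This is a minimal illustration rather than the analysis code used in the review: the function names and the example numbers are hypothetical, and the standard-error formula is one common large-sample approximation that is assumed here and may differ slightly from the exact expression implemented in metan.

```python
import math


def cohens_d(mean_ed, sd_ed, n_ed, mean_hc, sd_hc, n_hc):
    """Difference between the two raw means divided by the pooled SD.

    A positive value means the eating disorder (ED) group scored higher
    than the healthy control (HC) group, e.g. more errors or a longer time.
    """
    pooled_var = ((n_ed - 1) * sd_ed ** 2 + (n_hc - 1) * sd_hc ** 2) / (n_ed + n_hc - 2)
    return (mean_ed - mean_hc) / math.sqrt(pooled_var)


def d_standard_error(d, n_ed, n_hc):
    """Approximate standard error of d from the effect and the group sizes.

    Assumption: large-sample approximation
    var(d) = (n1 + n2) / (n1 * n2) + d^2 / (2 * (n1 + n2)).
    """
    n_total = n_ed + n_hc
    return math.sqrt(n_total / (n_ed * n_hc) + d ** 2 / (2 * n_total))


# Hypothetical summary statistics, not taken from any reviewed study.
d = cohens_d(mean_ed=10.0, sd_ed=2.0, n_ed=20, mean_hc=8.0, sd_hc=2.0, n_hc=20)
se = d_standard_error(d, n_ed=20, n_hc=20)
```

Only the mean, standard deviation and sample size of each group are needed, which is why these three descriptive statistics were the items extracted from each paper.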
Cohen's d effect sizes are defined as negligible (⩾−0·15 and <0·15), small (⩾0·15 and <0·40), medium (⩾0·40 and <0·75), large (⩾0·75 and <1·10), very large (⩾1·10 and <1·45) and huge (⩾1·45).
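Reading each band as a half-open interval (lower bound inclusive, upper bound exclusive), the labelling can be expressed as a small lookup. The function name is hypothetical, and the handling of values below −0·15, which the listed bands do not cover, is an assumption of this sketch.

```python
def classify_effect_size(d):
    """Map a Cohen's d value to the descriptive label of its band."""
    # (lower bound inclusive, label), checked from the largest band downwards
    bands = [
        (1.45, "huge"),
        (1.10, "very large"),
        (0.75, "large"),
        (0.40, "medium"),
        (0.15, "small"),
        (-0.15, "negligible"),
    ]
    for lower, label in bands:
        if d >= lower:
            return label
    # Assumption: effects below -0.15 fall outside the listed bands.
    return None


# Pooled effect sizes quoted in the Results section of this review.
labels = {task: classify_effect_size(d) for task, d in
          [("TMT", 0.36), ("WCST", 0.62), ("Haptic", 1.05), ("CatBat", 0.45)]}
```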
A meta-analysis was conducted for the TMT, WCST, CatBat and Haptic tasks (comparing eating disorder and healthy control groups). The four meta-analyses were conducted in the following way. The standardized effects of set-shifting ability for each task were pooled using a random-effects model, which assumes, in addition to within-group variability, that the mean effects differ across studies (between-study heterogeneity). Random-effects models produce wider confidence intervals and are more conservative than fixed-effects models, but are regarded as more realistic because of the variety of case mix and settings (Everitt, 2003). The assumption of homogeneity of true effect sizes was assessed formally using Cochran's Q test of homogeneity. However, this test is not very powerful with small sample sizes, so I², a measure of inconsistency that is independent of sample size, was also calculated [I²=(Q−df)/Q; Higgins et al. 2003].
Research with statistically significant results is potentially more likely to be submitted and published than studies with non-significant results. The presence of such a publication bias among the included studies was assessed informally by visual inspection of funnel plots [a plot of a study's precision (1/standard error) against effect size] and formally by its statistical analogue, Begg's adjusted rank test (Begg & Mazumdar, 1994), and Egger's test (Egger et al. 1997), which are implemented in metabias.
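The pooling and heterogeneity steps can be illustrated with a short sketch. It assumes the DerSimonian and Laird moment estimator for the between-study variance, a common choice for random-effects models that is not confirmed by the text as the estimator used; the function names and input values are hypothetical. I² follows the formula quoted above and is left untruncated, so it can be negative when Q is smaller than its degrees of freedom.

```python
import math


def i_squared(q, df):
    """Inconsistency measure I^2 = (Q - df) / Q, as quoted in the text."""
    return (q - df) / q if q > 0 else 0.0


def random_effects_pool(effects, variances):
    """Pool standardized effects with a random-effects model.

    Assumption: between-study variance (tau^2) is estimated with the
    DerSimonian-Laird method of moments.
    """
    weights = [1.0 / v for v in variances]            # inverse-variance weights
    sum_w = sum(weights)
    fixed = sum(w * d for w, d in zip(weights, effects)) / sum_w
    # Cochran's Q: weighted squared deviations from the fixed-effect mean
    q = sum(w * (d - fixed) ** 2 for w, d in zip(weights, effects))
    df = len(effects) - 1
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # Random-effects weights add tau^2 to each study's own variance
    re_weights = [1.0 / (v + tau2) for v in variances]
    pooled = sum(w * d for w, d in zip(re_weights, effects)) / sum(re_weights)
    se = math.sqrt(1.0 / sum(re_weights))
    return {"pooled": pooled, "se": se, "q": q, "df": df,
            "tau2": tau2, "i2": i_squared(q, df)}


# Hypothetical effect sizes and variances for three studies.
result = random_effects_pool([0.2, 0.5, 0.8], [0.04, 0.04, 0.04])
```

As a check on the formula, the WCST values reported in the Results, Q=3·73 with 4 degrees of freedom, give (3·73−4)/3·73, which rounds to −0·07.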
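The regression behind Egger's test can be sketched as follows: each study's standard normal deviate (effect divided by its standard error) is regressed on its precision (1/standard error), and an intercept far from zero suggests funnel-plot asymmetry. This is an illustrative ordinary least-squares version with hypothetical names and inputs, not the metabias implementation, and it stops at the t statistic because converting it to a p value needs a t distribution that is outside the Python standard library.

```python
import math


def egger_regression(effects, standard_errors):
    """Regress effect/SE on 1/SE and return the intercept diagnostics."""
    x = [1.0 / s for s in standard_errors]                    # precision
    y = [d / s for d, s in zip(effects, standard_errors)]     # standard normal deviate
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    # Residual variance with n - 2 degrees of freedom
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se_intercept = math.sqrt(rss / (n - 2) * (1.0 / n + x_bar ** 2 / sxx))
    t_stat = intercept / se_intercept if se_intercept > 0 else None
    return {"intercept": intercept, "slope": slope,
            "se_intercept": se_intercept, "t": t_stat, "df": n - 2}


# Hypothetical data: the same effect in every study gives a zero intercept.
symmetric = egger_regression([0.5, 0.5, 0.5, 0.5], [0.1, 0.2, 0.3, 0.4])
# Hypothetical data: larger effects in less precise studies push it above zero.
asymmetric = egger_regression([0.2, 0.5, 0.9, 1.4], [0.1, 0.2, 0.3, 0.4])
```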
Because of a small sample size, we only present an average standardized effect size weighted by the inverse of the variance for the Brixton task.
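An inverse-variance weighted average of this kind gives more weight to more precise estimates. A minimal sketch, with a hypothetical function name and made-up inputs:

```python
import math


def inverse_variance_mean(effects, variances):
    """Average of effect sizes, each weighted by 1 / its variance."""
    weights = [1.0 / v for v in variances]
    mean = sum(w * d for w, d in zip(weights, effects)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))   # standard error of the weighted mean
    return mean, se


# Hypothetical values: the more precise first estimate dominates the average.
avg, avg_se = inverse_variance_mean([0.1, 0.5], [0.01, 0.04])
```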
Study characteristics
All studies employed an experimental cross-sectional design. All samples included an anorexia nervosa (AN) and healthy control population, with four studies also including bulimia nervosa (BN) patients (Jones et al. 1991; Tchanturia et al. 2001, 2004a; Murphy et al. 2002), and three also including AN recovered or weight-restored patients (Jones et al. 1991; Tchanturia et al. 2002, 2004b). Additionally, Murphy et al. (2002) included OCD patients, and Holliday et al. (2005) included a healthy sister comparison; these results are not explored here. Little information on diagnosed co-morbidity was given, although a number of studies reported histories of substance abuse (e.g. Jones et al. 1991) and depression (Jones et al. 1991; Thompson, 1993; Ohrmann et al. 2004; Fowler et al. 2005). (See Tables 1 and 2 for further information regarding co-morbidity.)
Table 1. Demographic and effect size comparison of set-shifting tasks: anorexia nervosa versus healthy control groups

AN, Anorexia nervosa; HC, healthy control; BMI, body mass index; IBW, ideal body weight; n.r., not reported; Anx, anxiety; Dep, depression; OCD, obsessive–compulsive disorder; SA, substance abuse.
Table 2. Demographic and effect size comparison of set-shifting tasks: recovered/weight-restored anorexia nervosa versus healthy control groups, and bulimia nervosa versus healthy control groups

BMI, Body mass index; IBW, ideal body weight; ANrec, Anorexia nervosa recovered; HC, healthy control; NR, Not reported; Anx, anxiety; Dep, depression; OCD, obsessive–compulsive disorder; SA, substance abuse.
The case mix studied showed wide variation: age, body mass index (BMI), diagnosis and duration of illness were noted for each sample, in order to assess clinical heterogeneity. ‘Recovered AN’ were classified as those who had maintained a stable BMI of 19–24 for a minimum of 1 year (Tchanturia et al. 2002, 2004b). ‘Weight restored AN’ were classified as those who had maintained weight for a minimum of 6 months (mean=47·1 months, s.d.=31·6; Jones et al. 1991). ‘Broad AN’ were classified as those groups where not all participants fulfilled the criteria for AN on weight. It was not possible to note how many cases of AN had previous episodes of BN, or vice versa (an exception being Koba et al. 2002, where 50% of participants also had BN symptoms). Results from these subpopulations were kept separate in the analysis. (See Tables 1 and 2 for age, BMI, duration of illness, and co-morbidity information for each sample.)
RESULTS
Trail Making test (TMT)
The TMT was the most commonly employed measure (Witt et al. 1985; Jones et al. 1991; Thompson, 1993; Kingston et al. 1996; Mathias & Kent, 1998; Murphy et al. 2002; Tchanturia et al. 2004a, b; Holliday et al. 2005; Steinglass et al. 2006). A meta-analysis of TMT shifting performance revealed a small pooled standardized mean difference of 0·36 (see Fig. 1). There was no evidence of heterogeneity [χ²(13)=16·68, p=0·21] between the studies, i.e. between AN and BN and with different states of severity and recovery. The effect sizes across studies were found to be consistent (I²=0·11). Begg's funnel plot suggests that little publication bias was present (see Fig. A2 in online Appendix) and both Begg's and Egger's tests for publication bias were non-significant (p=0·91, p=0·97 respectively). Analysis for correction of publication bias (trim-and-fill method) revealed little difference in the results, therefore uncorrected data is presented here.

Fig. 1. Forest plot for TMT meta-analysis. ■, Anorexia nervosa (AN); □, AN Recovered; AN Broad and Bulimia nervosa are plotted with their own symbols (not reproduced here).
Wisconsin Card Sorting Test (WCST)
Five papers (Thompson, 1993; Fassino et al. 2002; Koba et al. 2002; Ohrmann et al. 2004; Steinglass et al. 2006) employed the WCST with an AN population. The meta-analysis of WCST perseverative errors produced a pooled standardized mean difference of 0·62 (see Fig. 2). There was no evidence for heterogeneity [χ²(4)=3·73, p=0·44] (including one study with broad criteria) or publication bias (Begg's p=0·81; Egger's p=0·64) (see Fig. A3 in the online Appendix), and effect size was consistent across studies (I²=−0·07).

Fig. 2. Forest plot for WCST meta-analysis. ■, Anorexia nervosa (AN); □, AN Recovered; AN Broad and Bulimia nervosa are plotted with their own symbols (not reproduced here).
The only paper employing the WCST with a BN population was not included in the meta-analysis as raw data was unavailable (Ferraro et al. 1997); however, the authors noted a significant deficit in BN compared to healthy control performance on this task. The BN group also displayed significantly more variance in their scores than controls.
Brixton task
Three published studies employed the Brixton task (Tchanturia et al. 2004a, b; Holliday et al. 2005), all of them from our research group. No meta-analysis was calculated for the Brixton task, as there were only four data-points across studies and eating disorder groups. An average standardized effect size of 0·21 was calculated for the Brixton task. Effect size varied widely across the samples employing this task (see Tables 1 and 2). The only group in which the confidence interval did not overlap with zero was people acutely ill with AN (Tchanturia et al. 2004b).
Haptic Illusion
The Haptic Illusion is another measure that has only been used by our research group. A meta-analysis of errors on this task yielded a large pooled standardized mean difference of 1·05 (see Fig. 3). There was no evidence for heterogeneity [χ²(7)=7·11, p=0·42] between BN or AN samples, or in AN samples with broad criteria or weight recovery. Evidence of publication bias was found (Begg's p=0·03, Egger's p=0·01); however, this is within the 95% confidence interval limits (see Fig. A3 in the online Appendix), and the trim-and-fill method did not predict any change in the data. Also, given the large overall effect size, it can be concluded that this finding is reliable. Effect size was consistent across studies (I²=0·016).

Fig. 3. Forest plot for Haptic task meta-analysis. ■, Anorexia nervosa (AN); □, AN Recovered; AN Broad and Bulimia nervosa are plotted with their own symbols (not reproduced here).
CatBat task
The CatBat task is the third measure that has only been used by our research group (Tchanturia et al. 2002, 2004a, b; Holliday et al. 2005). A meta-analysis of CatBat performance revealed a medium pooled standardized mean difference of 0·45 (see Fig. A1 in the online Appendix). Heterogeneity was non-significant. No evidence for publication bias was found (Begg's p=0·71, Egger's p=0·47).
CANTAB set shifting
Only one study was found that employed the CANTAB IDED shifting subtest in an eating disorder population (Fowler et al. 2005). No difference was found between 25 AN and 25 healthy control participants, with a small effect size of 0·17.
DISCUSSION
This paper reviewed 15 studies that administered at least one of six neuropsychological set-shifting tasks in eating disorder populations. A consistent deficit in set-shifting ability was found across diagnoses, state of illness and most of the set-shifting measures. It was possible to conduct a meta-analysis of studies for TMT, WCST, Haptic, and CatBat tasks. The pooled effect size varied between tasks, from small (TMT B), to medium (WCST and CatBat task), to large (Haptic task). The Brixton task has been less widely used and may only show an effect in the acute state. The set-shifting subtest of the CANTAB was used once and had an effect size close to negligible.
The limited amount of data from the recovered/weight-restored subgroups of AN suggested that the deficit in set shifting in some tests (TMT, Haptic, CatBat) remains as a trait and might be a candidate endophenotype. Further research is required to investigate this possibility. Likewise, the available data from people with BN was restricted, but suggests that the deficit in set shifting measured with TMT, CatBat task, and Haptic task is similar to that of AN whereas the Brixton task showed no effect.
There appears to be a difference in the potency of the set-shifting tasks that have been used. The Haptic task clearly has the highest, most consistent effect sizes. It is interesting to note that this was the only task employed where set shifting was measured perceptually. Grunwald et al. (2001a, b) assessed Haptic performance in AN (before and after weight recovery) and control women by asking them to reproduce a tactile pattern they had traced with their fingers. The drawings of AN women were of considerably poorer quality regardless of state of illness, suggesting a general deficit in somatosensory or Haptic processing. It is possible that, as these Haptic tasks are initially more difficult, administering a Haptic task as a measure of set shifting served to magnify effect size, thereby producing the larger effect sizes seen for the Haptic task in this paper.
From the limited data available, the deficit in set shifting is found across eating disorder diagnoses. Interestingly, this deficit is also found in other psychiatric conditions. We searched for meta-analysis/systematic review/review and ADHD, psychiatric, manic depressive, psychosis, etc. and found that a set-shifting deficit is not specific to the eating disorder population. In a systematic review of cognitive deficits including set shifting in euthymic patients with bipolar disorder, effect sizes were between 0·5 and 0·8 (Robinson et al. 2006) and in adult ADHD effect size was 0·65. Set-shifting abnormalities have also been found in the first-degree relatives of people with bipolar I disorder (Clark et al. 2005) and with schizophrenia (Snitz et al. 2006). Effect size for the WCST was small for people with OCD (Henry, 2006), who display a larger effect size on the CANTAB IDED shift (Chamberlain et al. 2006). Thus, it appears that weak set shifting is an endophenotype that broadly increases the risk of many forms of psychiatric illnesses.
A number of limitations in the current literature have been identified throughout this review. First, the majority of studies employ small samples. Second, we excluded longitudinal datasets as, to our knowledge, there is no evidence that these tasks are reliable if used repeatedly. Third, the Haptic, Brixton, and CatBat tasks have been employed only by our research group to date. Replication by other research groups among differing samples is required in order to validate the findings presented here. Finally, it is unfortunate that only one study employed the CANTAB IDED shifting task. An explanation for the small effect size of this task could be found in the case mix of the Fowler et al. (2005) study. This population differs markedly from the other populations presented here, as it is an adolescent group with a short duration of illness. This is relevant because the diagnosis of AN is unstable in the early phase, as many cases recover or evolve into BN. These uncertainties exemplify the need for further work in this area.
Nevertheless, this research underpins new forms of treatment in which the set-shifting deficit is a candidate trait for cognitive remediation therapy. Training the patient to adopt a more flexible thinking style in approaching everyday tasks could also encourage flexible thinking around eating disorder pathology. Such an intervention is currently being trialled in our group, and case studies thus far are promising (Davies & Tchanturia, 2005; Tchanturia et al. in press).
This review highlights a consistent set-shifting deficit in the eating disorder population. Many interesting questions remain. Are these traits linked to Axis I and II co-morbidity such as obsessive–compulsive traits? Are these traits also present in first-degree relatives of those with eating disorders and might they be an endophenotype? Are these traits linked to a specific genotype? To what degree do these traits become exaggerated in the acute phase of the illness? How do these general risk factors for psychopathology interact with other variables to produce eating disorder psychopathology? Do these traits affect outcome? Can these traits be a useful focus for treatment? We suggest that research examining this potential endophenotype or intermediate phenotype may lead to interesting new developments in the field.
ACKNOWLEDGEMENTS
We thank David Roberts for his assistance with figure design, and Michiko Nakazato for her assistance with foreign correspondence.
DECLARATION OF INTEREST
None.
NOTE
Supplementary information accompanies this paper on the Journal's website (http://journals.cambridge.org).
APPENDIX

Fig. A1. Forest plot for CatBat Task meta-analysis.

Fig. A2. Begg's funnel plot (assessing publication bias) for TMT meta-analysis.

Fig. A3. Begg's funnel plot (assessing publication bias) for Haptic meta-analysis.