Introduction
Existing literature suggests a significant role of gender in bipolar disorder (BD) (APA, 1994) as it impacts on clinical course and outcome. Although not all studies report gender differences (Kessing, 2004; Baldassano et al. 2005; Kawa et al. 2005), it generally appears that female BD patients are at increased risk for depression (Curtis, 2005; Suominen et al. 2009), rapid cycling and dysphoric or mixed mania (Arnold, 2003; Berk & Dodd, 2004), while the risk of alcohol and substance abuse may be higher for males (Frye et al. 2003; Friedman et al. 2005; Suominen et al. 2009). Some studies report that BD men have an earlier age of onset (Robb et al. 1998; Kennedy et al. 2005) and worse psychosocial outcome than BD women (Robb et al. 1998; Raymont et al. 2003; Friedman et al. 2005). In contrast, one study investigating clinical features in BD-II patients reported earlier age of onset for women and more Axis I co-morbidity compared with men (Benazzi, 2006).
It is now widely accepted that cognitive abnormalities are a key feature of BD. Most studies have focused on patients with BD type I (BD-I) (APA, 1994) and have reported widespread cognitive impairment during acute episodes of either polarity (Quraishi & Frangou, 2002). Additionally, a number of quantitative meta-analyses (Robinson et al. 2006; Torres et al. 2007; Arts et al. 2008; Bora et al. 2009; Stefanopoulou et al. 2009) have provided evidence for trait deficits of medium to large effect size in verbal memory and learning, concept formation and perseveration and in response inhibition during inter-episode intervals. To our knowledge, only one previous study has directly addressed the issue of gender differences in cognition. Barrett et al. (2008) examined letter fluency, spatial working memory, planning and cognitive set shifting in 12 male and 14 female BD out-patients compared with healthy controls. Male patients performed worse than female patients in measures of spatial working memory, indicative of poor retention of visuospatial information. However, male patients in this study were both older and more symptomatic than female patients. The potential effect of gender was also explored in the meta-analysis by Arts et al. (2008), who reported smaller effect sizes for concept formation and perseveration in studies with higher male:female ratios.
Therefore, existing information regarding gender effects on cognitive function in BD is both incomplete and contradictory. The aim of this study was to expand the available evidence base by examining whether gender influences cognitive function in key domains known to differentiate BD patients from healthy controls. We therefore assessed 86 BD-I patients and compared them in terms of gender on general intellectual ability, memory, concept formation and response inhibition. To minimize potential referral bias and the confounding effect of symptoms, all patients were recruited from a secondary care catchment area-based psychiatric service and were evaluated when in remission.
Methods
Data for the current analysis were drawn from the Maudsley Bipolar Disorder Project cohort (Donaldson et al. 2003; Raymont et al. 2003; Frangou, 2005; Frangou et al. 2005). Details of the sample recruitment and assessment are also outlined below.
Subjects
Patients
Patients were recruited from the out-patient secondary care services of the South London and Maudsley NHS Trust, based on the following inclusion criteria: diagnosis of BD-I according to DSM-IV criteria (APA 1994); aged between 18 and 70 years and meeting DSM-IV criteria for remission in the preceding 6 months; on stable medication (same type and dose) for at least 3 months.
Healthy participants
Healthy individuals without a personal or family history of psychiatric disorders were recruited via advertisement in the local newspaper and were matched to patients on age, gender and education.
Exclusion criteria for both groups included: (a) current or lifetime history of substance abuse or dependence as defined in the DSM-IV; (b) concomitant medical disorders; (c) history of head injury with loss of consciousness.
The study was approved by the Ethics Committee of the Institute of Psychiatry and was conducted in accordance with the Code of Ethics of the World Medical Association (Declaration of Helsinki) for experiments involving humans. Written informed consent was obtained from all participants after complete description of the protocol.
Assessments
All assessments took place on the same day. The same trained psychiatrist and psychologist conducted the clinical and cognitive assessments respectively of all participants.
Clinical assessment
The diagnosis of BD-I or absence of any diagnosis (in healthy participants) was confirmed using the Structured Clinical Interview for Axis I DSM-IV disorders (First et al. 1996). The Brief Psychiatric Rating Scale (BPRS; Overall & Gorham, 1962) was used to assess levels of psychopathology in BD-I patients and healthy participants as it allows for evaluation of non-pathological experiences. Depressive and manic symptoms were rated using the Hamilton Depression Rating Scale (Hamilton, 1960) and Young Mania Rating Scale (YMRS; Young et al. 1978) respectively. The general impact of illness severity was assessed with the Global Assessment of Functioning (GAF), where higher scores indicate better function (APA, 1994). Demographic data systematically obtained from all participants included gender, age, years of education, paternal and personal (best ever) socio-economic status, current employment, marital status and ethnicity.
Information about patients' past psychiatric history (hospitalization, polarity of first episode, age at first depressive or manic episode, number of previous episodes), current treatment, duration of illness and age of onset was collected from personal interviews with patients, their carers and medical notes. Age of onset was defined as the age when patients first met full DSM-IV criteria for an affective episode of either polarity. Medication on the day of neuropsychological testing was also recorded.
Cognitive assessment
Although data from the Maudsley Bipolar Project were available on multiple cognitive domains (Donaldson et al. 2003; Frangou et al. 2005), this analysis focused on performance on cognitive tasks where the largest effect size between BD patients and controls has been noted in meta-analytic studies (Arts et al. 2008; Bora et al. 2009; Stefanopoulou et al. 2009) to maximize the likelihood of detecting a gender effect (if present) and to minimize potential type I errors. Thus, we analysed data from the Wechsler Adult Intelligence Scale, Revised (WAIS-R; Wechsler, 1981), the Wechsler Memory Scale, 3rd edition (WMS-III; Wechsler, 1998), the Hayling Sentence Completion Task (HSCT; Burgess & Shallice, 1997) and the Wisconsin Card Sorting Test (WCST; Heaton, 1981).
The WAIS-R yielded an estimate of Full Scale Intelligence Quotient (FSIQ) as a measure of current general intellectual ability.
The WMS-III focuses on immediate, delayed and working memory, tested in two modalities, auditory and visual, and in two task formats, recall and recognition. Five outcome variables from the test were considered: auditory and visual immediate memory, auditory and visual delayed memory and auditory recognition delayed memory.
The HSCT has two sections, in each of which the examiner reads out 15 sentences with the last word missing. In the first section (response initiation) the sentence must be completed with a single word that is contextually predicted (e.g. He posted the letter without a … stamp). In the second section (response inhibition), the word given has to be totally unconnected (e.g. He posted the letter without a … banana). Errors occur when participants fail to suppress the predicted word (category A errors) or supply a semantically connected word (category B errors). The main outcome measures used here were the overall scaled score, total category A error score and total category B error score. The WCST (36 cards computerized version) requires subjects to sort a deck of cards on the basis of a series of unknown categories. Feedback is given after each match to enable identification of the correct matching rule. The card-sorting category changes after a number of correct responses. Number of categories achieved and number of perseverative errors are considered measures of rule discovery and cognitive set shifting respectively.
Statistical analysis
Group comparisons in demographic and clinical variables and FSIQ were undertaken using Student's t test or Pearson's χ2 as appropriate. Subsequently, three multivariate analyses of variance (MANOVA) were conducted, one for each neuropsychological test. For the WCST, the number of perseverative errors and categories achieved were used as dependent variables; for the HSCT, the overall score and the number of category A and category B errors; and for the WMS-III, the scores for auditory immediate and delayed memory, visual immediate and delayed memory and auditory recognition delayed memory. For all three MANOVA, diagnosis (patient versus control) and gender (male versus female) were used as independent factors while FSIQ and total BPRS scores were used as covariates. Post-hoc Bonferroni pairwise comparisons were conducted only when the overall model showed a significant effect of diagnosis or a gender by diagnosis interaction. The relationship between clinical variables (number of episodes, age of onset, GAF score) and any significant findings from the above analyses was explored using Pearson's correlation coefficients.
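For orientation, the design described above can be written as a two-factor multivariate linear model with two covariates. This is a sketch in our own notation, not a specification taken from the analysis itself: the symbols below are illustrative assumptions, and the multivariate test statistic used (e.g. Wilks' lambda or Pillai's trace) is not stated in the text.

```latex
% y_i: vector of dependent variables for participant i on one test
%      (e.g. WCST: perseverative errors, categories achieved)
% d(i): diagnosis of participant i (patient or control)
% g(i): gender of participant i (male or female)
\mathbf{y}_{i} = \boldsymbol{\mu}
  + \boldsymbol{\alpha}_{d(i)}
  + \boldsymbol{\beta}_{g(i)}
  + (\boldsymbol{\alpha\beta})_{d(i)\,g(i)}
  + \boldsymbol{\gamma}_{1}\,\mathrm{FSIQ}_{i}
  + \boldsymbol{\gamma}_{2}\,\mathrm{BPRS}_{i}
  + \boldsymbol{\varepsilon}_{i}
```

Under this reading, the effects of diagnosis, gender and their interaction correspond to tests of the α, β and (αβ) terms respectively, with FSIQ and BPRS entering through the γ coefficients.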
Results
Demographic and clinical features
Data from 86 patients with BD-I (50 female) and 46 healthy controls (25 female) were examined in this study. The mean age of patients was 46.75 (11.19) years and that of healthy participants was 42.65 (11.30). There was no group difference in age [t=–0.41, degrees of freedom (df)=130, p=0.66], gender distribution (χ2=0.17, df=1, p=0.67) or educational level (χ2=0.53, df=2, p=0.76). In addition, there was no significant age difference between male and female participants, regardless of diagnostic status (F=3.0, df=1, 128, p=0.83) nor was there a gender by diagnostic status interaction (F=0.39, df=1, 128, p=0.53). Clinical and medication details of the patients are shown in Table 1.
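As an illustration of the Pearson's χ2 comparison of gender distribution, the statistic can be recomputed from the counts given above (50 of 86 patients and 25 of 46 controls were female). This is a minimal sketch, not the original analysis code: the function name is hypothetical, and it assumes no continuity correction was applied. Under that assumption the result lands within about 0.01 of the reported χ2 and p values.

```python
import math


def pearson_chi2_2x2(table):
    """Pearson chi-square (no continuity correction) for a 2x2 table.

    Hypothetical helper for illustration; returns (chi2, p_value) for df = 1.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # For df = 1 the upper-tail probability equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Rows: patients, controls; columns: female, male (counts from the text).
counts = [[50, 86 - 50], [25, 46 - 25]]
chi2, p_value = pearson_chi2_2x2(counts)
print(f"chi2 = {chi2:.3f}, df = 1, p = {p_value:.3f}")
```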
Table 1. Demographic and clinical details of patients with bipolar disorder

Values are mean (s.d.).
a p=0.01; all other p values were >0.08.
The mean BPRS score of healthy controls was 25.77 (3.30), and the difference from patients' scores just failed to reach statistical significance (t=1.80, df=125, p=0.07). Within the patient group there were no gender differences in age of onset, polarity of first episode, duration of illness, number of episodes or hospitalizations, Hamilton Depression Rating Scale, YMRS and GAF total scores (all p values greater than 0.08) but there were more women with a history of psychosis during mood episodes (χ2=6.12, df=1, p=0.01). No differences were found in the type of medication prescribed at the time of testing (all p>0.33).
Cognitive data
Participants' scores on each of the cognitive tests used are shown in Table 2.
Table 2. Cognitive test performance in controls and bipolar disorder (BD) type I patients

Values are mean (s.d.).
Patients and controls did not differ statistically in FSIQ (t=1.55, df=125, p=0.12).
We did not find an effect of diagnosis (F=2.17, df=2, 74, p=0.12) or gender (F=0.35, df=2, 74, p=0.70) or gender by diagnosis interaction (F=0.51, df=2, 74, p=0.60) for WCST categories and perseverative errors. There was a significant effect of FSIQ (F=8.13, df=2, 74, p=0.001) but not of BPRS (F=0.19, df=2, 74, p=0.82). For HSCT, there was a significant effect of diagnosis (F=4.62, df=3, 74, p=0.005) but no effect of gender (F=1.17, df=3, 74, p=0.15) or gender by diagnosis interaction (F=0.84, df=3, 74, p=0.47). In addition, there was no effect of FSIQ (F=1.96, df=3, 74, p=0.12) nor of BPRS (F=0.10, df=3, 74, p=0.95). For the WMS-III, we found a significant effect of diagnosis (F=4.12, df=6, 69, p=0.001) and gender (F=2.81, df=6, 69, p=0.05) and of their interaction (F=2.02, df=6, 69, p=0.05). There was a significant effect of FSIQ (F=8.39, df=6, 69, p<0.0001) but not of BPRS (F=2.0, df=6, 69, p=0.06).
At the univariate level, the gender by diagnosis interaction was significant for immediate visual and auditory memory and auditory delayed memory (p<0.04) but not for visual delayed memory (p=0.75) or auditory recognition delayed memory (p=0.22). Post-hoc Bonferroni tests revealed that female BD-I patients did not differ from female controls in any WMS-III variables (p>0.90 for all tests). In contrast, male BD-I patients performed worse than male controls in auditory (p=0.002) and visual (p=0.002) immediate memory, while the difference in auditory delayed memory (p=0.04) was at the margin of statistical significance. Male BD-I patients performed worse than female patients in immediate memory (p=0.05) and auditory delayed memory (p=0.04), but neither finding would survive Bonferroni correction.
Abnormalities in immediate memory test performance may reflect either encoding or retrieval problems. In order to define more accurately the profile of deficits emerging from the above analysis, we calculated two further WMS-III variables, namely, the subjects' learning slope and the retention score from the verbal paired associates subtest, which are considered measures of encoding. A MANOVA with these two measures as dependent variables, diagnosis (patients versus controls) and gender as fixed factors, and FSIQ and BPRS as covariates revealed an effect of FSIQ (F=6.01, df=2, 73, p=0.004) and diagnosis (F=2.01, df=2, 73, p=0.05) but no effect of BPRS (F=0.51, df=2, 73, p=0.60) or gender by diagnosis interaction (F=1.47, df=2, 73, p=0.23).
To reduce the number of correlations, we only used patients' scores on immediate memory (which is a composite of the visual and auditory immediate memory scores) in examining the effect of age of onset, number of episodes and GAF scores. These analyses were performed separately for male and female patients. A total of six correlations were performed and the threshold for significance was set to p<0.008. In men and women, the WMS-III immediate memory score correlated positively with GAF scores (men: r=0.53, p=0.001; women: r=0.36, p=0.01) and age of onset, the latter only in men (r=0.56, p=0.001) and not in women (r=0.11, p=0.44).
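The significance threshold above is consistent with a Bonferroni adjustment for six tests. The sketch below assumes a nominal alpha of 0.05, which the text does not state explicitly (0.05/6 is approximately 0.0083, in line with the stated p<0.008). Only the four p values reported in the text are included; the remaining two correlations are not reported and are not filled in here.

```python
ALPHA = 0.05   # assumed nominal alpha; not stated in the text
N_TESTS = 6    # six correlations, as stated in the text

threshold = ALPHA / N_TESTS

# p values exactly as reported in the text, keyed by (group, clinical variable).
reported = {
    ("men", "GAF"): 0.001,
    ("women", "GAF"): 0.01,
    ("men", "age of onset"): 0.001,
    ("women", "age of onset"): 0.44,
}

# A correlation counts as significant only if its p value is below the threshold.
significant = {key: p < threshold for key, p in reported.items()}

print(f"Bonferroni threshold: {threshold:.4f}")
for (group, variable), passed in significant.items():
    label = "significant" if passed else "not significant"
    print(f"{group}, {variable}: p = {reported[(group, variable)]} -> {label}")
```

On this reading, the correlation with GAF in women (p=0.01) does not pass the corrected threshold, whereas both correlations reported for men do.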
Discussion
In this study, gender by diagnosis interactions in BD-I were present in immediate memory, both auditory and visual, but not in general intellectual ability, concept formation and perseveration or response inhibition. Male patients performed worse in immediate memory compared with healthy male participants, while this was not the case for female BD-I patients. Compared with female BD-I patients, males underperformed in immediate memory and auditory delayed memory. Immediate memory correlated with patients' overall level of functioning and this association reached statistical significance for men. This finding underscores the importance of memory function in the outcome of BD and resonates with similar findings already reported (Martinez-Aran et al. 2004; Mur et al. 2009). However, since male and female patients in our sample did not differ in their overall GAF scores, our data also support existing evidence that cognitive function is not a unique contributing factor to outcome (Mur et al. 2009).
In the WMS-III, immediate memory scores are thought to reflect episodic memory encoding and effortful retrieval. Memory encoding refers to the processes involved in transforming the experience of an event into a memory trace that is accessible to conscious recollection. Effortful or explicit retrieval reflects conscious recollection. It is often difficult to differentiate encoding from retrieval abnormalities since successful retrieval depends on the efficacy of encoding. The absence of a significant gender by diagnosis effect on the learning slope and percentage retention indicates that retrieval rather than encoding deficits are greater in BD-I men than women. This is further supported by the finding of gender differences in delayed auditory memory in BD.
Neurocognitive studies are limited when considering the possible neural underpinnings of observed cognitive changes. However, cognitive tasks similar to those employed here have been used in functional imaging studies, making it possible and, we would argue, useful to incorporate neuroimaging findings in discussing our results. The evidence to date suggests that encoding and retrieval engage prefrontal and medial temporal cortices. Functional imaging studies suggest that effective encoding and retrieval are associated with increased activation in the hippocampus and surrounding structures (Kelley et al. 1998; Eldridge et al. 2005) and in dorsal and ventral prefrontal cortex (Fletcher & Henson, 2001). Frontotemporal dysregulation has been previously noted in remitted BD patients (Deckersbach et al. 2006; Robinson et al. 2009) and our data suggest that this impairment may also be modulated by gender.
Neither female nor male patients were completely asymptomatic or medication free at the time of assessment. Although the lack of group differences in medication status and level of psychopathology is encouraging with regard to the validity of our findings, the theoretical possibility that these factors may contribute to our results cannot be discounted. Both patient groups were comparable in terms of age of onset, duration of illness, number of episodes or hospitalizations and GAF scores. It is therefore unlikely that our findings relate to potential differences in illness severity between men and women with BD-I.
In summary, our results support the notion that gender may modulate the degree of immediate memory dysfunction in BD and its impact on overall level of function.
Acknowledgements
Wellcome Trust (067427/z/02/z) Senior Fellowship in Basic Biomedical Science to V. Kumari.
Declaration of interest
None.