Published online by Cambridge University Press: 01 July 2004
In the present study, the correspondence between clinician-assessed and self-reported neurocognitive performance was contrasted with scores obtained from psychometric neuropsychological tests in 148 psychiatric in-patients. Results revealed that self-reported cognitive functioning was strongly associated with depressive symptomatology but was only poorly related to psychometric neurocognitive performance, particularly in schizophrenia. After illness denial was controlled for, the overall association between subjective and objective test performance was slightly increased but still failed to reach significance in six out of eight analyses. In approximately 20% to 40% of all cases, clinicians judged memory performance to be normal despite substantial impairment revealed by neuropsychological test results (attention parameters: 7–51%). Since (ecological) validity and reliability have been demonstrated for many neurocognitive paradigms, the present results question the validity of non-psychometric neurocognitive assessment and call for a complementation of clinical judgment with neurocognitive assessment. Reasons for decreased sensitivity of self-reported and clinician-assessed neurocognitive functioning are discussed. (JINS, 2004, 10, 623–633.)
Neurocognitive dysfunctions, such as problems with attention and memory, are evident in many psychiatric disorders (Austin et al., 2001; Bates et al., 2002; Nixon & Phillips, 1999; Moritz et al., 2002a). However, except for Korsakoff's syndrome, dementia, and developmental disorders for which the establishment of severe neurocognitive disturbances is obligatory for diagnosis (e.g., World Health Organization, 1993), the systematic assessment and consideration of neurocognitive disturbances for treatment and rehabilitation of mentally ill patients has been neglected until recently.
Within the last decade a large body of empirical evidence has indicated that neurocognitive disturbances are important determinants of functional outcome variables such as psychosocial and work functioning (e.g., Brekke et al., 1997; Gold et al., 2002; Green, 1996). In a meta-analysis, Green (1996; see also Green et al., 2000, for an update) has demonstrated that memory dysfunction strongly predicts functional outcome in schizophrenia (see also Velligan et al., 2000). In addition, there is an increasing recognition of the role of neuropsychological dysfunction in a number of treatment-related variables such as insight (e.g., Chen et al., 2001; Rossell et al., 2003) and coping skills (Wilder-Willis et al., 2002). Neurocognitive dysfunction may also exert a negative impact on medication compliance (Donohoe et al., 2001), as several psychotropic agents, particularly benzodiazepines (e.g., Rammsayer et al., 2000; Tonne et al., 1995) and anticholinergic medication (e.g., Nishiyama et al., 1998), share known potential adverse effects on neurocognition. When such side-effects remain unnoticed by clinicians, drug discontinuation is a possible consequence, especially if the patient believes that the adverse side-effects outweigh the benefits of drug treatment.
Although it is acknowledged that psychometric neurocognitive assessment (e.g., Auditory Verbal Learning Test, Wisconsin Card Sorting Test) is the gold standard for determining the presence and extent of cognitive dysfunctions (Lezak, 1995), this type of assessment is clearly not routine in all psychiatric facilities. For a number of cognitive tests ecological validity has been established (i.e., association of impaired test scores with real-life problems; e.g., Helmstaedter et al., 1998, for the Auditory Verbal Learning Test), and necessary psychometric requirements have been met. The aim of the present study was to explore the convergence of information obtained from psychometric tests with both the clinician's and the patient's perspective. To date, the evidence regarding the correspondence between objective neuropsychological functioning and self-report complaints in psychiatric patients is mixed (e.g., van den Bosch & Rombouts, 1998; Cuesta et al., 1996), while studies conducted with non-psychiatric clinical populations have consistently failed to find such a relationship (Kopelman et al., 1998; Lannoo et al., 1998; Newman et al., 1989). Horner and colleagues (1999) reported an association between neurocognitive complaints and indices of depression and vulnerability to stress in 86 patients entering a substance abuse program, whereas no significant relationship emerged between subjective and psychometric test performance. Two recent studies have investigated the relationship between neurocognitive test performance and clinical ratings using psychopathological rating scales (Scale for the Assessment of Negative Symptoms, SANS; Positive and Negative Syndrome Scale, PANSS). Harvey et al. (2001) have come to the conclusion that clinical rating scales are a poor proxy for the assessment of cognitive dysfunction. In their analysis, the negative scale of the PANSS was an even better indicator for test performance than the cognitive subscale of the PANSS.
While the study conducted by Vadhan et al. (2001) was somewhat more optimistic regarding the validity of clinical judgment, again only modest associations occurred between clinical ratings (SANS attention subscale) and neurocognitive performance. As both studies employed clinical rating scales that contain explicit anchors for rating neurocognition (e.g., counting backwards, detection of similarities between two objects), the relationship between objective assessment and clinical judgment may have been even over-estimated.
For the present study, we compiled estimates of neurocognitive performance from the three most relevant sources of information: subject, clinician, neurocognitive psychometric test results. For this purpose, we administered standard neurocognitive tests to a large sample of psychiatric inpatients and required clinicians, as well as patients, to rate the neurocognitive status on a questionnaire.
The goal of the study was to gain insight into whether “clinical eye” and subjective evaluation are useful heuristics to classify patients as impaired versus unimpaired. Knowledge of convergence amongst the three perspectives and the elucidation of possible judgmental biases may help to provide estimates for the reliability and validity of non-psychometric assessment. Detection of discrepancies may guard against false-positive or false-negative diagnostic inferences. Although the primary focus of this study was on the memory domain (verbal learning as assessed with two tasks, prospective memory, remembering a name), psychomotor speed, selective attention, and divided attention were also subject to investigation.
One hundred and forty-eight in-patients, who were consecutively referred to the clinical neuropsychology unit of the university hospital for psychiatry and psychotherapy in Hamburg (Germany), took part in the present investigation. Participating individuals underwent a two-hour battery of neuropsychological tests, which is part of our hospital's routine in-patient assessment. Subsequent to testing and scoring, a written report documenting individual test results in relation to normative data (neurocognitive impairments as well as spared functions and resources) and advice for further treatment in cases where dysfunctions were detected was forwarded to the doctors in charge. Prior to referral to our unit, clinicians in charge were required to specify the diagnostic rationale (e.g., suspected memory problems, questionable ability of the patient to operate motor vehicles) and to assess the neurocognitive status of the patient by means of a short questionnaire (see Appendix). Prior to testing, patients completed a questionnaire that covered a number of attention and memory problems.
Representative of the core population of our hospital, approximately one third of the sample was diagnosed with a schizophrenia spectrum disorder (n = 53; 36%) and another third with a depressive illness (n = 46; 31%). Twenty-three patients (15%) suffered from an anxiety disorder and 13 patients (9%) were admitted because of substance abuse (mostly alcohol abuse). The remaining patients (9%) were classified otherwise (e.g., suspected dementia). Since we were interested in a representative clinical population, standard exclusion criteria as frequently used in basic research (e.g., presence of brain damage and drug/alcohol abuse) were not applied.
The primary method of evaluating the neurocognitive level of patients was through comparisons with normative scores derived from large population samples, as documented in the test manuals. To provide an additional estimate of baseline performance, we recruited a healthy control group (n = 33). Control participants were sought from various sources (e.g., college students, nurses from other hospitals) via word-of-mouth and advertisements. Controls were screened for absence of any psychiatric or neurological disorders. One control subject was later excluded because of a suspected episode of past depression. Characteristics of patients and controls are presented in Table 1.
Participants were administered a large battery of neurocognitive tasks. Along with additional instruments employed for specific diagnostic questions the following fixed set of tasks was administered:
Auditory Verbal Learning Test (AVLT; Lezak, 1995). The AVLT was administered in its German version (Heubrock, 1992). In this task, the experimenter reads a list of 15 words (List A), which the participant is requested to repeat in any order. After List A has been presented five times, the subject is asked to reproduce words from a newly presented list (List B). Following this, the subject is instructed to recall the words from List A without renewed presentation. Thirty minutes later, the subject is again asked to repeat the words from List A. During the free recall periods, the experimenter records the number of correctly repeated words, the number of response repetitions, and the number of intrusions. Normative scores for the German version of the task are available for different age ranges (Verbaler Lern- und Merkfähigkeitstest; Helmstaedter et al., 2001). A T-score lower than or equal to 40 was considered indicative of impairment (i.e., at least 1 SD below the mean of the normative sample). The AVLT allows for the computation of several memory parameters (see Lezak, 1995). Learning was measured by the sum of correctly recalled words on Trials 1 to 5. Long-term memory was assessed by the number of correctly reproduced words after the 30-min delay.
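The T-score cut-off described above can be illustrated with a minimal sketch. It assumes the conventional T-score scaling (mean 50, SD 10) and a simple linear conversion from a normative mean and standard deviation; test manuals often provide age-banded lookup tables instead, and the normative values below are hypothetical, not taken from the AVLT manual.

```python
def t_score(raw, norm_mean, norm_sd):
    """Convert a raw score to a T-score (mean 50, SD 10) against a normative sample."""
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    return 50.0 + 10.0 * (raw - norm_mean) / norm_sd


def is_impaired(t, cutoff=40.0):
    """Cut-off from the text: T <= 40, i.e. at least 1 SD below the normative mean."""
    return t <= cutoff


# Hypothetical normative values for one age band (illustration only).
norm_mean, norm_sd = 12.0, 2.5
for raw in (13, 9.5, 8):
    t = t_score(raw, norm_mean, norm_sd)
    print(raw, round(t, 1), is_impaired(t))
```

A raw score exactly 1 SD below the normative mean maps to T = 40 and therefore counts as impaired under this rule.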
Rivermead Behavioural Memory Test (RBMT; Wilson et al., 1992). The RBMT is a test battery that assesses diverse aspects of everyday memory, such as remembering a name, prospective memory, short-term and long-term prose recall, and orientation. The tasks have been validated in a number of studies (e.g., Perez & Godoy, 1998). For the present study, we were especially interested in name learning (Items 1 and 2), prospective memory (Items 3 and 4), and verbal learning (prose recall; Item 6). Differentiation of impaired versus unimpaired patients was based on the screening scores provided in the test manual. We also computed normative scores for the RBMT total score (screening scores range from 0 to 12; scores lower than 10 designate impairment).
Test d2 (Brickenkamp, 1978). Test d2 is a letter cancellation test that taps selective attention/concentration. In this task, the subject is instructed to cross out the letter d whenever it is accompanied by two small lines; d's with more or fewer than two lines and any stimuli containing the character p serve as distractors. Subsequent to a practice trial, 14 rows with target and distractor stimuli are presented. The subject is given 20 s to complete each row. The test is scored for errors and number of crossed out stimuli within the allotted time. For the present study, we calculated normative scores for the parameter total minus errors. This measure subtracts erroneous responses from the total amount of correct responses. Age-adjusted normative scores were derived from a large population sample (Brickenkamp, 1978). Validity of the test has been confirmed with correlational studies employing construct-related tasks (Brickenkamp, 1978). Scores were considered impaired when subjects performed at least one standard deviation below the mean of the age-equivalent group.
Divided attention subtest of the Testbatterie zur Aufmerksamkeitsprüfung (TAP; Zimmermann & Fimm, 1994). For this test, the participant has to perform two tasks concurrently. The space bar has to be pressed whenever asterisks form a rectangle on a 4 × 4 dot matrix (optical target), and whenever two tones of the same frequency are repeated (acoustic target; for most trials, a high and a low tone alternate). One hundred optical and 200 acoustic trials are presented. There are 16 targets for each modality. Norm values derived from a large sample of participants are available for median reaction times and number of omissions. Number of omissions was taken as the main dependent variable in the present investigation. Scores were considered impaired when subjects performed at least 1 standard deviation below the mean.
Trail Making Test (TMT; Reitan, 1992). Psychomotor retardation was assessed with the TMT Part A (Reitan, 1992; Adult Version). In this task, the subject has to connect encircled numbers in ascending order as quickly as possible. Part B assesses set-shifting and requires alternation between numbers and letters, again in ascending order. For the present study, subjects were divided into impaired and unimpaired performers by means of the norm values described in Reitan (1992).
The self-report questionnaire consists of 76 items with a focus on everyday cognitive problems. We complemented the FEDA (Fragebogen erlebter Defizite der Aufmerksamkeit; Questionnaire for Self-Experienced Deficits of Attention; Zimmermann et al., 1991), a questionnaire that taps attentional difficulties, with a set of self-constructed items. Items had to be endorsed on a 5-point Likert scale (very frequently, frequently, sometimes, seldom, never). We only allocated items from the self-report scale to a specific domain when complaints could not easily be explained by other problems (e.g., “Sometimes I have to read whole paragraphs twice in order to get the meaning”; such difficulties may reflect attention difficulties, problems with abstract-logical thinking or dyslexia). Long-term memory was tapped with seven items (e.g., “You forget important conversations you have just had”). Prospective memory was tapped with seven items (e.g., “You entirely forget to do something you have promised or planned”). Selective attention was tapped by 10 items (e.g., “I have problems concentrating on things that interest me (e.g., a movie)”). Divided attention was covered by four items (e.g., “I cannot use the telephone and observe something at the same time”). Nine items measured psychomotor retardation (e.g., “I need double the time that others do to fulfill a job”). One item each measured the forgetting of medication intake (“You forget the intake of new medication”) and the remembering of names (“You are poor at learning new names”). A pilot study conducted by Zimmermann and colleagues (1991) has confirmed satisfactory internal consistency and validity of the original scale (see also Theml & Romero, 1991). Additionally, analyses of internal consistency for the present sample revealed satisfactory to excellent reliability (verbal memory: α = .86; prospective memory: α = .87; selective attention: α = .91; divided attention: α = .72; psychomotor retardation: α = .87).
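The internal-consistency coefficients reported above are Cronbach's α values. As a minimal sketch of how such a coefficient is obtained, the following computes α from an item-by-respondent matrix using the standard formula α = k/(k − 1) · (1 − Σ item variances / variance of the total score). The responses are hypothetical and the 0 to 4 numeric coding of the 5-point scale is an assumption; the original data and coding are not given in the text.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item ratings."""
    n_items = len(responses[0])
    if n_items < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    if any(len(row) != n_items for row in responses):
        raise ValueError("every respondent must answer every item")
    # Sum of the sample variances of the individual items.
    item_var = sum(variance(column) for column in zip(*responses))
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in responses])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return n_items / (n_items - 1) * (1 - item_var / total_var)


# Hypothetical ratings for a four-item subscale (0 = never ... 4 = very frequently).
ratings = [
    [4, 3, 4, 3],
    [2, 2, 3, 2],
    [0, 1, 0, 1],
    [3, 3, 4, 4],
    [1, 0, 1, 1],
]
alpha = cronbach_alpha(ratings)
print(round(alpha, 2))
```

Values close to 1 indicate that the items of a subscale vary together across respondents.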
Prior to referral to our unit, doctors in charge were asked to complete a questionnaire comprising 22 items assessing the neurocognitive status of their patient. Items had to be endorsed on a 4-point Likert scale regarding the presence and severity of neurocognitive problems in their patients (i.e., problem definitely present, problem probably present, problem probably absent, problem definitely absent). The items cover a variety of neurocognitive domains including memory (e.g., immediate and delayed memory, forgetting of drug intake), attention (selective and divided), spatial processing, speed of information processing, and orientation. One item each was constructed for divided attention, memory for names, and medication intake (see Appendix). Selective attention, disorientation, prospective memory, and verbal long-term memory were each assessed with two items.
Prior to neurocognitive testing, patients completed the Paranoid Depression Scale (PDS) developed by von Zerssen and Koeller (1976). This self-report questionnaire assesses depressive (16 items, e.g., “I often cry”) and psychotic symptoms (16 items, e.g., “People permanently control and spy on me”). The scale also contains a subscale measuring illness denial that covers common somatic and psychological problems (8 items; e.g., “At times, I have had a cold”). The rationale of the latter scale is to reveal tendencies to downplay or deny minor health problems. All items had to be endorsed on a 4-point Likert scale (absolutely, mostly, somewhat, not at all).
Two strategies for data analysis were adopted. First, we correlated raw scores obtained from the psychometric neurocognitive tests with the corresponding clinician-assessed and self-report items. Subsequently, we divided patients according to scores from psychometric measures, clinical ratings, and subjective assessment into patients with and without cognitive problems. Patients 1 standard deviation or more below the mean of the norm population on Test d2 (Brickenkamp, 1978), TAP (Zimmermann & Fimm, 1994) and AVLT (Helmstaedter et al., 2001) were considered impaired. For the RBMT and TMT, we divided subjects according to cut-off scores described in the corresponding manuals (Reitan, 1992; Wilson et al., 1992). Clinician-rated impairment was considered present if the clinician judged a neurocognitive symptom to be definitely present or probably present. Patients were considered unimpaired if the clinicians endorsed the problem to be probably absent or definitely absent. Patients were considered subjectively impaired if their scores on the corresponding FEDA subscale were at least 1 standard deviation below the scores of the healthy control group. Correspondence was assessed via cross-tables.
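The classification step can be sketched as follows. This is an illustration under stated assumptions, not the authors' analysis code: psychometric scores are assumed to be available as z-scores relative to the norm population, the function names are hypothetical, and the six cases are invented. The significance test applied to the cross-tables is not specified in the text and is therefore not shown.

```python
from collections import Counter


def clinician_impaired(rating):
    """Dichotomise the 4-point clinician rating as described in the text."""
    return rating in {"definitely present", "probably present"}


def psychometric_impaired(z):
    """Impaired if at least 1 SD below the normative mean (z <= -1)."""
    return z <= -1.0


def cross_table(pairs):
    """Count the four cells of a 2x2 table from (test, rater) impairment pairs."""
    counts = Counter(pairs)
    return {
        "both_impaired": counts[(True, True)],
        "missed_by_rater": counts[(True, False)],
        "rater_only": counts[(False, True)],
        "both_unimpaired": counts[(False, False)],
    }


def agreement(table):
    """Proportion of cases in which both sources reach the same decision."""
    return (table["both_impaired"] + table["both_unimpaired"]) / sum(table.values())


def miss_rate(table):
    """Proportion of test-defined impairments that the rater did not detect."""
    impaired = table["both_impaired"] + table["missed_by_rater"]
    return table["missed_by_rater"] / impaired


# Hypothetical cases: (z-score on a test, clinician rating of the same domain).
patients = [
    (-1.4, "probably absent"),
    (-0.2, "definitely absent"),
    (-1.1, "definitely present"),
    (0.3, "probably present"),
    (-2.0, "probably absent"),
    (0.8, "definitely absent"),
]
pairs = [(psychometric_impaired(z), clinician_impaired(r)) for z, r in patients]
table = cross_table(pairs)
print(table, agreement(table), miss_rate(table))
```

The same table can be built for self-report by replacing the clinician rule with the FEDA criterion (at least 1 SD worse than the healthy control group).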
Samples did not differ on any sociodemographic characteristics except for age. In comparison with the healthy control group, patients were impaired on all neurocognitive measures, except for the number of false responses in the TAP divided attention subtask and remembering the surname in the RBMT (see Table 1). Results remained essentially unchanged when age, a known confound of neurocognitive results, was treated as covariate (i.e., status of significance remained identical). According to norm values for the RBMT total score, only 33% of the patients displayed intact memory (screening score: 10–12) relative to 82% of the healthy controls. Forty percent of the psychiatric sample were slightly impaired (screening score 7–9), 24% were moderately impaired (screening score 3–6), and 3% were heavily impaired (screening score 0–2). A total of 35–72% of the patients displayed performance problems according to RBMT subscale screening scores and normative scores of the AVLT. With the exception of prospective memory, where almost half of all healthy subjects achieved less than 1 screening score in one or both prospective memory tasks (48%), memory disturbance was evident in less than every 6th healthy subject (9–15%). Approximately every 3rd patient displayed impairment on the d2 (healthy controls 3%). According to normative scores for the TAP, 39% of the patient sample exhibited impairment (healthy controls 3%). Impairment on the TMT–A was evident in 46% of the patients when normative values (Reitan, 1992) were applied (see Table 2), while again only 3% of the healthy subjects displayed malperformance.
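The RBMT severity bands used in the preceding paragraph can be written as a small lookup. This is a minimal sketch: the band boundaries are those quoted in the text, while the function name is a hypothetical choice made for illustration.

```python
def rbmt_band(screening_score):
    """Map an RBMT total screening score (0-12) to the severity band used in the text."""
    if not 0 <= screening_score <= 12:
        raise ValueError("RBMT screening score must lie between 0 and 12")
    if screening_score >= 10:
        return "intact"
    if screening_score >= 7:
        return "slightly impaired"
    if screening_score >= 3:
        return "moderately impaired"
    return "heavily impaired"


for score in (12, 9, 6, 2):
    print(score, rbmt_band(score))
```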
Except for prospective memory, patients' subjective reports and psychometric assessment did not correlate at all (see Table 2). Similarly, self-reported and clinician-rated memory functioning correlated significantly for only one index (learning names). Significant correlations emerged between clinician-rated memory performance and the corresponding neurocognitive tests for all memory measures (r = .25 to .37, at least p < .01). However, when patients were divided into impaired and unimpaired groups according to “clinical eye” and neurocognitive performance, correspondence was markedly lowered (see Table 3). The correspondence between impairment (categorized data) according to subjective versus neurocognitive assessment failed to reach significance for any memory task, including prospective memory, for which significant correlations were obtained. Depending on the task applied, clinicians failed to identify 20–43% of the dysfunctions revealed through cognitive testing; the corresponding rate for self-report evaluation ranged from 17% to 45% (see Table 4).
Since we had no objective indicator for measuring regular intake of medication, we could only correlate clinical ratings and self-reported problems. A non-significant relationship emerged (r = .14; p = .08).
A modest correlation emerged between Test d2 scores and clinical ratings (r = .29; p < .005). Again, cross-table statistics indicated independence of decisions. Neither test score nor clinical rating corresponded with self-reported selective attention problems (r < .20; NS). In 7% of the cases, clinicians judged selective attention as normal, whereas patients achieved scores in the impaired range. On the other hand, dysfunction claimed by clinicians was not verified through neurocognitive tests in 46% of the cases. Accordingly, convergence was obtained in only 47% of the cases (p > .2).
There was no correspondence between any of the measures tapping divided attention. For this task, approximately one-third (31%) of the dysfunctions indicated by test performance remained undetected by the clinician, and about one-fifth (18%) remained undetected by patients.
The correlation between self-reported slowing and time needed on TMT–A reached significance. This corresponds to the results obtained with the categorized data. In 45% of the cases, clinician-assessed slowing and normative scores on the TMT–A disagreed; for clinician ratings, neither the correlation nor the classification approach yielded significant results.
Finally, we calculated the correspondence between psychometric and non-psychometric assessment for overall impairment. In approximately 4 out of 5 cases, where at least one neurocognitive deficit was detected by psychometric assessment, clinicians (86%) and patients (81%) judged at least one neurocognitive domain to be dysfunctional (irrespective of whether the disturbed domains were the ones suspected by the clinician/patient to be dysfunctional). In 8.1% of the cases clinicians judged at least one neurocognitive problem to be present, while none of the psychometric tests signaled a dysfunction (patients: 8.5%).
To gain insight into whether psychiatric sub-samples differ with respect to their self-awareness of neurocognitive dysfunction, correlations between FEDA subscales and neurocognitive test performance were recalculated for the schizophrenic, depressive, and anxiety patient groups separately (the remaining groups were too small to allow meaningful conclusions). For the depressive [prospective memory: r = .32 (p = .002), selective attention: r = .62 (p < .001), psychomotor slowness: r = .47 (p < .001)] and anxiety group (prospective memory: r = .65, p = .003; psychomotor slowness: r = .52, p < .05; divided attention: r = .54, p < .05) three out of eight correlations yielded significance, whereas for the schizophrenic sample none of the correlations achieved significance (prospective memory: r = .05, correlation significantly different from anxiety patients; selective attention: r = .01, correlation significantly different from depressive patients; psychomotor slowness: r = .18, correlation significantly different from anxiety patients and at trend level from depressive patients; divided attention: r = .02, correlation significantly different from anxiety patients). The self-report and clinical ratings did not correspond for any of the domains in the three groups.
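The text does not state which test was used to compare correlations between diagnostic groups. One common choice for two independent samples is Fisher's r-to-z transformation, sketched below as an assumption rather than as the authors' procedure. The group sizes are the full diagnostic group sizes reported for the sample (23 anxiety, 53 schizophrenia spectrum); the number of cases actually entering each correlation may have been smaller.

```python
import math


def compare_independent_correlations(r1, n1, r2, n2):
    """Fisher r-to-z test for the difference between two independent correlations.

    Returns the z statistic and the two-tailed p value.
    """
    if min(n1, n2) <= 3:
        raise ValueError("each sample needs more than three cases")
    z1, z2 = math.atanh(r1), math.atanh(r2)
    standard_error = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / standard_error
    # Two-tailed p value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# Prospective memory: anxiety group (r = .65) vs. schizophrenia group (r = .05).
z, p = compare_independent_correlations(0.65, 23, 0.05, 53)
print(round(z, 2), round(p, 3))
```

Under these assumed group sizes the difference exceeds the conventional two-tailed .05 threshold, in line with the statement that the two correlations differ significantly.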
In a subsidiary analysis, we inspected whether patients with depression differ from those with schizophrenia in their biases to detect neurocognitive disturbances. It was expected that patients with depression display a tendency for false-positive judgments (i.e., subjective impairment without neurocognitive correlate), while schizophrenic patients tend to over-estimate their neurocognitive status (i.e., “objective” impairment without subjective recognition). Our assumption was essentially confirmed for depressive patients, who showed a markedly enhanced false-positive rate on most parameters except for prospective memory (long-term memory, AVLT: 26.2% false-positive vs. 4.8% false-negative; long-term memory, RBMT: 21.4% vs. 19%; prospective memory: 7.1% vs. 45.2%; learning names: 31.0% vs. 23.8%; psychomotor speed: 15.2% vs. 6.5%; selective attention: 35.5% vs. 6.5%; divided attention: 32.1% vs. 10.7%). For schizophrenic patients, the difference between false-positive and false-negative judgments was attenuated in comparison with depressive patients (long-term memory, AVLT: 21.6% false-positive vs. 27.5% false-negative; long-term memory, RBMT: 27.5% vs. 19.6%; prospective memory: 10.4% vs. 54.2%; learning names: 20.8% vs. 18.8%; psychomotor speed: 32.1% vs. 15.1%; selective attention: 28.9% vs. 22.2%; divided attention: 30% vs. 25%).
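To make the direction of the judgmental bias easier to read, the percentages reported above can be tabulated as the difference between false-positive and false-negative rates per domain. The following sketch only rearranges the figures from the text; the variable names and the difference score itself are illustrative choices, not measures used by the authors.

```python
# (false-positive %, false-negative %) per domain, copied from the text.
rates = {
    "depressive": {
        "long-term memory (AVLT)": (26.2, 4.8),
        "long-term memory (RBMT)": (21.4, 19.0),
        "prospective memory": (7.1, 45.2),
        "learning names": (31.0, 23.8),
        "psychomotor speed": (15.2, 6.5),
        "selective attention": (35.5, 6.5),
        "divided attention": (32.1, 10.7),
    },
    "schizophrenic": {
        "long-term memory (AVLT)": (21.6, 27.5),
        "long-term memory (RBMT)": (27.5, 19.6),
        "prospective memory": (10.4, 54.2),
        "learning names": (20.8, 18.8),
        "psychomotor speed": (32.1, 15.1),
        "selective attention": (28.9, 22.2),
        "divided attention": (30.0, 25.0),
    },
}

# Positive values: more false-positive than false-negative self-judgments.
bias = {
    group: {domain: round(fp - fn, 1) for domain, (fp, fn) in domains.items()}
    for group, domains in rates.items()
}

for group, domains in bias.items():
    for domain, difference in domains.items():
        print(f"{group:14s} {domain:26s} {difference:+.1f}")
```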
When correlations between expert ratings and neurocognitive measures were re-run for the depressive and schizophrenic samples separately, three out of eight correlations turned out to be significant in schizophrenic patients (name learning: r = .29, p < .05; prospective memory: r = .34, p < .05; delayed story recall: r = .36, p < .01), and three out of eight correlations were significant or nearly significant in the depressive group (selective attention: r = .36, p < .05; delayed verbal learning: r = .30, p = .05; learning curve: r = .28, p = .07).
In a final set of analyses, we investigated whether either psychotic or depressive psychopathology exerted any impact on neurocognitive performance. In addition, the influence of illness denial on the association between test performance and self-reported neurocognitive complaints was investigated. Not surprisingly, as the PDS and the FEDA are self-report measures, all correlations between the two instruments' subscales achieved significance. Of interest, pairwise comparisons revealed that depression (r = .35–.66) and illness denial (r = .30–.52) correlated more strongly than paranoid behavior (r = .18–.38) with all of the subscales. None of the correlations between the PDS subscales and the neurocognitive measures achieved significance (p > .1).
When illness denial was controlled for in a partial correlation, the association between delayed recall in the AVLT and the corresponding FEDA subscale reached trend level (r = .15, p = .09), whereas for divided attention (r = .19, p = .05) and selective attention (r = .24, p = .01) the association between test measures and subjective performance now achieved significance (for depression and psychotic symptoms no moderator effect emerged).
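A first-order partial correlation of this kind can be computed from the three pairwise correlations. The sketch below uses the standard formula; the input values are hypothetical and chosen only to show how removing the variance shared with a third variable (here, illness denial) can raise an association that looks weak in the zero-order correlation. The zero-order correlations with illness denial are not reported in the text in a form that would allow the published values to be reproduced.

```python
import math


def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation between x and y with z partialled out of both."""
    denominator = math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    if denominator == 0:
        raise ValueError("x or y is perfectly correlated with z")
    return (r_xy - r_xz * r_yz) / denominator


# Hypothetical values: x = test performance, y = self-report, z = illness denial.
r_partial = partial_correlation(r_xy=0.10, r_xz=0.20, r_yz=-0.40)
print(round(r_partial, 2))
```

When the control variable is unrelated to both measures, the partial correlation equals the zero-order correlation.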
Similar to studies conducted with non-psychiatric populations (e.g., Kopelman et al., 1998; Lannoo et al., 1998; Newman et al., 1989), the present results suggest that psychiatric patients' memory and attention complaints show little correspondence to objective neurocognitive measures. While at least modest convergence between both sources of information was obtained for anxiety and depressive patients in terms of psychomotor speed, prospective memory and attention, neurocognitive complaints by schizophrenic patients were not verified by objective performance. This finding accords with the clinical observation that patients with schizophrenia display a fundamental lack of self-awareness or insight.
The results clearly suggest that patients' neurocognitive self-evaluation does not represent a reliable source for estimating individual neurocognitive functioning, particularly in schizophrenia. In a subsidiary analysis, we found that depressive patients exhibited a judgmental bias in reporting neurocognitive complaints that were unverified by objective assessment (particularly in the domain of attention), which is in agreement with the result that subjective depressive symptoms are strongly correlated with subjective but not objective neurocognitive complaints. In schizophrenic patients, no such marked false-positive bias was observed.
Despite significant correlations between clinical judgment and psychometric tests in six out of eight cognitive domains, subsequent analyses on the correspondence between classificatory decisions (impaired vs. not impaired) revealed a somewhat disappointing pattern of results. For memory parameters, 20–43% of the dysfunctions identified by neuropsychological tests were not confirmed by clinicians. Disagreement between the two sources ranged from approximately one-third of all cases (RBMT prose recall: 37%) to over one-half (Test d2: 53%). Since the employed neurocognitive tasks are not only face-valid but also fulfill basic psychometric properties, this modest correspondence between psychometric tests and questionnaire data clearly indicates that judgment of neurocognitive status in psychiatric patients should not rest on self-report complaints or “clinical eye” decisions alone. Nevertheless, in 4 out of 5 cases, both clinicians and patients detected at least one dysfunctional neurocognitive domain in patients with dysfunctional psychometric scores (these domains were, however, not necessarily the same). Therefore, while the clinicians and the patients are poor at estimating the specific area and extent of neurocognitive impairment, they show some sensitivity for overall impairment. It also needs to be emphasized that the discrepancy between objective phenomena and clinical judgment is by no means confined to the domain of neurocognition, but has long been described for other aspects of medical and psychopathological assessment (e.g., Meehl, 1997).
Although the correspondence between psychometric and non-psychometric judgment was better in clinicians than in patients, in four out of seven analyses clinicians did not adequately detect neurocognitive impairment that was indicated by test performance. A number of factors may contribute to the apparently decreased sensitivity of clinical judgment to detect neurocognitive deficits. First, many clinicians are reluctant to ask patients to recall information previously exchanged because such patronizing behavior may undermine the therapeutic alliance. Second, the clinician is often entirely absorbed with the treatment of core psychopathological symptoms; thus, cognitive dysfunctions as well as comorbid psychiatric disorders (Zimmerman & Mattia, 1999) and somatic problems (see Le Fevre, 2001; Strome, 1989) are often overlooked. Finally, neurocognitive dysfunction is often concealed by patients because of embarrassment about forgetfulness and other neurocognitive problems. In line with the latter argument, the association between self-report and objective neurocognition was moderated by illness denial. When the tendency to conceal common complaints was controlled for, the association between self-report and objective measures achieved significance for selective and divided attention, and a trend emerged for delayed recall in the AVLT. Yet, the impact of illness denial was small and large proportions of test variance remain unexplained.
Because of the only modest relations between clinical judgment and test performance (see Harvey et al., 2001; Vadhan et al., 2001, for similar results), the present findings emphasize the need for clinical judgments to be complemented with additional information (see also Vadhan et al., 2001). If a discrepancy between psychometric and self-report or clinical rating measures occurs, the reason for such inconsistent inferences should be identified since divergent results may bear clinical relevance. Besides the possible confounds of clinical judgment just outlined, impaired performance undetected by a clinician may, for example, originate from successful coping strategies adopted by the patient which help him/her master everyday life (e.g., writing down appointments or new names).
Although psychometric assessment is clearly the method of choice for determining a patient's neurocognitive status, clinicians should by no means slavishly rely on normative scores but should carefully consider contextual factors that might contribute to abnormal test scores. For example, some patients display poor test performance despite normal cognitive functioning because of test anxiety, poor motivation, vision and/or hearing difficulties, or compensation neurosis (i.e., malingered neurocognitive problems). Such potential confounds need to be documented in final reports, and diagnostic conclusions should either be postponed until re-assessment or be drawn very cautiously. Despite the poor correspondence between the patients' perspective and both clinical ratings and psychometric test performance, we want to emphasize that, since subjective cognitive dysfunction has been found to be a strong predictor of poor symptomatic outcome in first-episode schizophrenics (Moritz et al., 2000, 2002b), neurocognitive dysfunction reported by patients should be taken seriously even if unverified by clinicians or psychometric tests. For example, if the patient attributes his/her self-reported dysfunction to medication, this clearly represents a risk factor for medication non-compliance, especially in view of studies showing that self-reported cognitive deficits correlate with conventional neuroleptic dosage (Krausz et al., 2000; Moritz et al., 2002c). Clearly, more research needs to be devoted to the question of what subjective cognitive complaints mean. In line with a large body of research, depression in particular was found to be related to self-report measures of neurocognition (e.g., Antikainen et al., 2001; Giovagnoli et al., 1997; Moore et al., 1997; Newman et al., 1989; Smith et al., 1996; Wagle et al., 1999).
Depending on the domain tested, 35–72% of the patients in the present study displayed neurocognitive impairment. The greatest disturbance (72%) was found for prospective memory, a function that has received little attention so far (the screening scores in the RBMT manual might, however, be overly strict, as many healthy subjects also received impaired scores). Once neurocognitive problems are detected, there are a number of strategies for dealing with such dysfunctions in psychiatric patients. With regard to memory problems, clinicians should repeat essential information frequently, regularly ensure that patients grasp the core contents of therapy, give the most essential information in written form and, where appropriate, involve relatives in the setting. Additionally, there is evidence that cognitive remediation programs are efficacious for some patients (see Kurtz et al., 2001, for a review and meta-analysis). Also, clinicians may want to evaluate whether medications that are potentially harmful to memory, such as benzodiazepines and anticholinergic agents, are necessary or whether their dosage could at least be reduced (see Moritz et al., 2003). In any case, the presence of memory and other neurocognitive problems should not be disregarded as a minor problem given their possible impact on (medication) compliance, insight, treatment, and functional outcome (see Introduction).
Finally, we would like to mention a few shortcomings of the present study. First, cognitive dysfunction, as rated by clinicians, was measured with only a few items for which reliability has not yet been sufficiently determined. Further, the self-report and the clinical rating scales were composed of different items. The present approach was adopted because we wanted to rely on a published self-report questionnaire (FEDA) whose items cover private everyday situations which are not suitable for clinician rating. In addition, the low correspondence between test-assessed and clinician-rated memory performance for binary judgments (impaired vs. unimpaired) may partially stem from the decreased sensitivity of categorized data.
Nevertheless, we think that these limitations cannot fully explain why different sources of information regarding diagnostic decisions did not converge. There is also reason to believe that the present study may have even over-estimated the correspondence between clinical judgment and neurocognitive testing, since clinicians were explicitly asked to assess the neurocognitive status of patients prior to referral. Thus, otherwise overlooked cognitive symptoms might have caught the clinician's attention.
While the present study examined the correspondence between the three sources of information (subject, clinician, test) for attention and memory, we have not provided any clinical or self-report estimates of executive functioning or spatial processing. Although we believe that the interrelation will be as poor as for the other domains, this certainly needs direct empirical testing. However, adequate items are harder to determine for these domains than for memory and attention, since these functions bear greater complexity and, in the case of executive functioning, are less well defined.
The authors would like to thank Deike Heeren for help with data collection. We also thank Carrie Cuttler, Monika Sölva, Lisa Huschka, Gabriele Weineck and three anonymous reviewers for helpful comments on an earlier version of the manuscript.