Published online by Cambridge University Press: 28 January 2005
Patients with mild to moderate Alzheimer's disease and normal controls were tested on a retrograde amnesia test with semantic content (Neologism and Vocabulary Test, or NVT), consisting of neologisms to be defined. Patients showed a decrement as compared to normal controls, pointing to retrograde amnesia within semantic memory. No evidence for a gradient within this amnesia was found, although one was present on an autobiographic test of retrograde amnesia that had a wider time scale. Several explanations for these results are presented, including one that suggests that extended retrograde amnesia and semantic memory deficits are in fact one and the same deficit. (JINS, 2005, 11, 40–48.)
Memory deficits are central to Alzheimer's disease (AD), and are among the first signs of the affliction (APA, 1994; Brandt & Rich, 1995; Deweer et al., 2001). These deficits are typically assessed with standardized tests in which patients are required to study some material, and are subsequently tested on it. Such tests mostly reflect anterograde episodic memory (Spaan et al., 2003). AD patients show large deficits on these tests (Lambon Ralph et al., 2001; Spaan, 2003; Thompson et al., 2002) but anterograde episodic memory is not the only type of memory affected. AD patients also perform poorly on tests of semantic memory such as verbal fluency tests and confrontation naming tasks (Kazui et al., 2003; Salmon et al., 1992), and tasks that tap semantic priming (Brandt & Rich, 1995). Naming deficits have even been suggested to be present in preclinical stages of AD (Petersen et al., 1999).
In addition, patients with Alzheimer's disease typically develop retrograde amnesia as indexed by a loss of autobiographical memories (Kopelman et al., 1989), a loss of dated public knowledge (Beatty et al., 1988; Kopelman et al., 1989; Leplow et al., 1997), and an inability to recognize famous faces (Thompson et al., 2002). In many patient groups with amnesia and in experimental animals, retrograde amnesia conforms to Ribot's law, stating that recent memories are more vulnerable to brain damage than remote memories (Kim & Fanselow, 1992; Ribot, 1881; Squire, 1992). This law results in a temporal gradient, referred to as the Ribot gradient, in which patients show great deficits on test items measuring memory for recent memories, and smaller ones on items measuring remote memories. Such a Ribot gradient has also been found in patients with Alzheimer's disease, although it tends to be shallower than that in other groups and is not found consistently (Beatty et al., 1988; Brown, 2002; Deweer et al., 2001; Kopelman et al., 1989).
Semantic memory tests such as confrontation naming are, by their nature, tests of retrograde memory: the semantic memories queried have all been acquired a long time before testing, and most will have been acquired long before the onset of the disease. However, the time at which these semantic memories were acquired is not known. It is therefore unclear whether semantic memory deteriorates across the board, or whether the disease preferentially affects more recent semantic memories, in line with Ribot's law.
Recently, tests have been developed that make assessment of temporal gradients in semantic memory loss possible. Verfaellie et al. (1995) constructed a test consisting of neologisms, words that had entered the language recently. Five neologisms were chosen from each five-year period since 1960, and had to be defined by the patient. Verfaellie et al. (1995) found that patients with Korsakoff's disease performed worse on this test than alcoholic controls. Moreover, their performance suggested a temporal gradient, with their knowledge of neologisms from recent periods being worse than that of neologisms from remote periods. Although a learning deficit may have contributed to these problems (the meaning of some recent neologisms may never have been stored), the fact that the deficits extended over more than thirty years suggests that genuine loss of remote memories also plays a role. The test thus seemed to detect graded retrograde amnesia for unambiguously semantic material.
Meeter and de Wilde (2001) have constructed a similar test for the Dutch population. The Neologism and Vocabulary Test contains neologisms that entered the language in the seventies, eighties and nineties, such as “Viagra” for the nineties, “mouse pad” for the eighties, and “intercity train” for the seventies. These neologisms have to be defined by the patient. In the current study, we used this test to investigate semantic memory deficits in a group of AD patients and age-matched controls. In addition, we studied the correlation between the test and other cognitive capacities.
Two groups of older adults participated in this study. The first group consisted of 16 patients visiting outpatient geriatric or neurological departments, and diagnosed with mild to moderate probable Alzheimer's disease (AD group). The second group consisted of 15 normal older adults, matched to the first group on age and educational level (normal control, NC group).
Patients were recruited through two academic hospitals, and one general hospital. All patients underwent a comprehensive neuropsychological examination, with a subgroup also undergoing magnetic resonance imaging (MRI) of the brain. Final diagnosis was made in a consensus meeting where all the available clinical data and the results of the ancillary investigations were reviewed. A diagnosis of probable AD was based upon the National Institute of Neurological and Communicative Disorders and Stroke—Alzheimer's Disease and Related Disorders Association (NINCDS—ADRDA) criteria (McKhann et al., 1984). Patients were excluded if they were younger than 65; if their score on the Mini Mental State Examination (MMSE, Folstein et al., 1975) was below 15; or when they had a somatic, psychiatric, or neurological disease other than Alzheimer's that could lead to cognitive dysfunction. Brain damage unrelated to AD, visual impairment, speech dysfunction, intellectual disability, insufficient level of Dutch language, and a stay outside the Netherlands since 1970 of longer than one year were also grounds for exclusion.
Normal controls were recruited via senior citizen organizations, and were matched to patients with respect to age and education. Exclusion criteria for normal controls were the same as those for patients with Alzheimer's disease, with the reporting of subjective memory complaints as an additional ground for exclusion.
The AD group consisted of ten men and six women, while the NC group was made up of six men and nine women. The two groups did not differ in age (on average 75.3 years for the AD group, 76.3 for the NC group), nor in their level of educational attainment. On the seven-point scale customary in the Netherlands (Heslinga et al., 1983), the AD group's mean educational level was 4.3, while that of the NC group was 4.6. These means translate to about ten years of formal schooling. Average MMSE score for Alzheimer's patients was 23.9 (SD = 3.9; range 18–29). Six patients (37.5%) used rivastigmine. All patients except one were diagnosed in the two years before testing.
Central to the study was the Neologism and Vocabulary Test (NVT, Meeter & de Wilde, 2001), a test for semantic retrograde amnesia. In addition, we included a standard test of episodic anterograde amnesia, a test of episodic retrograde amnesia, and tests to measure language, executive, and general cognitive function. Premorbid intelligence was also estimated.
The NVT consists of neologisms that entered the language at different times. Patients are read 44 words that they have to define. Eleven are neologisms that, according to etymological dictionaries, entered Dutch in the 1970s, 11 are neologisms from the 1980s, and 11 from the 1990s. The remaining 11 words are baseline words that were included to check for a general decline in vocabulary knowledge, unspecific for recent periods. These reference words were matched on recall probability—matching on other characteristics was impossible as neologisms do not feature in published word lists (e.g., CELEX, the largest Dutch corpus from which frequency norms were derived, was closed in 1990). Examples of items are “afkicken” (Dutch for “kick a drug habit”, 1970s), “walkman” (1980s), “Viagra” (1990s), and “rugzak” (backpack, reference word).
A word is counted as correctly defined when the participant has shown a hint of knowing the word in his or her definition. If the patient provides a faulty definition or none at all, the correct definition is presented with three lures for a recognition trial; lures were chosen to have plausible links to the target word. The low threshold in counting a word as correct was chosen so as to minimize the influence of general intelligence on the test. For example, all answers that included either the word “radio” or “cassette” were counted as correct for the “walkman” item. Lures for this item were, in translation, “someone who lays roofs”, “powerful walky-talky”, and “son of gods in German mythology”, which were presented along with the correct definition (given as “portable cassette player with head phone”).
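The two-stage scoring rule described above can be illustrated with a short sketch. It is not the scoring procedure actually used for the NVT: the accepted keywords and the answer options for the "walkman" item come from the text, but the function names, the data layout, and the case-insensitive substring match are assumptions made for the example.

```python
# Hypothetical sketch of the two-stage NVT scoring described in the text.
# Only the "walkman" keywords and answer options come from the article;
# the matching rule (case-insensitive substring) is an assumption.

WALKMAN = {
    "keywords": ["radio", "cassette"],
    "correct": "portable cassette player with head phone",
    "lures": [
        "someone who lays roofs",
        "powerful walky-talky",
        "son of gods in German mythology",
    ],
}


def score_open(definition, item):
    """Return True if the free definition contains any accepted keyword."""
    text = definition.lower()
    return any(keyword in text for keyword in item["keywords"])


def score_item(definition, recognition_choice, item):
    """Score one item: open format first, recognition only after a failure.

    Returns a pair (open_correct, recognition_correct); recognition_correct
    is None when no recognition trial was needed.
    """
    if score_open(definition, item):
        return True, None
    return False, recognition_choice == item["correct"]
```

In this sketch an answer such as "a small radio you carry around" passes the open format, whereas an empty answer triggers the recognition trial, which is then scored separately.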
The item pool of the NVT was constructed using dictionaries of neologisms. Items were first sifted by eliminating those that were not answered correctly by all participants in a small pilot group. Slightly more neologisms than ended up in the final test were presented to a stratified sample of 70 adults, whose age ranged from 33 to 88 (stratification was done with respect to age, sex, and level of education). Items answered correctly in their open format by more than 90% of participants were included in the test. This resulted in the present test, on which the normal controls in the sample scored 90% correct. There were no differences in the score on the different periods in the test, and no gender, age, or education effects were found (de Wilde, 2001).
To test episodic anterograde memory, the eight-word list-learning test was used (Lindeboom & Jonker, 1989). For this test, eight unrelated words are read to the subject five times. Immediately after each presentation recall is tested, with the total number of words recalled after the five trials being used as the score. In addition, recall was again tested after a delay of 10 min followed by a recognition test in which the eight words were intermixed with eight distracters (Schmand, 1997).
As a test of episodic retrograde amnesia, the Dutch adaptation of the Autobiographical Memory Interview (AMI, Kopelman et al., 1989, 1990; Meeter & Murre, 2003) was partially administered. This test consists of “personal semantic” questions (precise questions about factual information), and nine “incident” questions, in which respondents must generate anecdotes from different periods of their life. Only these last questions were put to the participants in this study because of time considerations. These nine questions, though referring to precisely dated events, are grouped into three broader periods: childhood (0–18 years), young adulthood (18–32), and recent time periods (last 5 years). Scoring of each generated incident is on a three-point scale, and based on the descriptive richness of the account of an incident and its specificity in time and place.
The Dutch version of the National Adult Reading Test (NART, Nelson & O'Connell, 1978) was used as an estimator of premorbid intelligence. The NART consists of a list of words with irregular spelling that must be read aloud. The number of correctly pronounced words can be used to estimate intelligence quotient (IQ). This estimate is relatively stable, even after cerebral damage (Bright et al., 2002). However, recent evidence suggests that it declines in dementia, and that it may thus underestimate premorbid IQ in patients with more severe dementia (Cockburn et al., 2000; Schmand et al., 1998). Formulas have been proposed for correcting NART-based IQ estimates with help of MMSE scores, but here we will report raw NART scores.
The meander appeals to cognitive flexibility and self-monitoring (Lindeboom & Jonker, 1989; Luria, 1966). The meander consists of an alternating line pattern printed on a sheet of paper, and the subject is asked to continue this line pattern on the sheet with a pencil. Impairment is seen in perseverations (difficulty in switching) and in stereotypical behavior. This task may be sensitive to prefrontal dysfunction (Lezak, 1995). For the present study, we used a simplified scoring system in which lines generated by participants were classified as either correct or incorrect. Correct answers consist of an alternating pattern without errors.
To assess language disorders, we included two subtests of a screening instrument for aphasia. “Sentence construction” is a test in which participants have to construct a sentence to describe each of ten pictured events (Deelman et al., 1981). Sentences are scored as incorrect if they contain errors in syntax or semantics. In a second task, object naming, participants had to provide the names of 18 pictured objects (Deelman et al., 1981). Scores on the Mini Mental State Examination (MMSE, Folstein et al., 1975) were also used in the study.
All tests were administered in one session except the MMSE, which was administered separately by hospital staff. Test administration took place at the participant's home, and lasted about one hour. Tests were administered in the following order: eight-word list-learning test, Autobiographical Memory Interview items, delayed recall and recognition for the eight-word list-learning test, Neologism and Vocabulary Test, meander, NART, and language subtests. When participants showed signs of tiredness, a break was scheduled. After completion of the study, all participants were informed of its results.
Table 1 presents the mean scores of the AD and NC groups on each test in the battery. We compared the scores of the two groups with one-sided tests, correcting the degrees of freedom for inequality of variance where necessary. No difference was found for the NART, t(28.96) = 0.93, p > .1, or for the meander, χ²(1) = .168, p > .1, while on the object naming task all normal controls had a perfect score, making analysis impossible. The AD group performed worse on all other tests. On the eight-word test, patients with Alzheimer's disease showed deficits in immediate recall, t(29) = 6.84, p < .001, in delayed recall, t(23.9) = 6.73, p < .001, and in recognition, t(23.9) = 4.54, p < .001. They also had lower scores than normal controls on the sentence construction task, t(29) = 3.65, p = .001.
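Fractional degrees of freedom such as t(28.96) are what one would expect if the correction for unequal variances was the Welch-Satterthwaite approximation. The text does not name the correction, so this is an assumption. The sketch below shows how such degrees of freedom arise; the scores are invented for illustration and are not data from this study.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error contributed by each group.
    se_a = variance(sample_a) / n_a
    se_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(se_a + se_b)
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (n_a - 1) + se_b ** 2 / (n_b - 1))
    return t, df


# Illustrative, invented scores; NOT the data of the present study.
controls = [30, 31, 29, 32, 30, 31, 28, 33]
patients = [22, 15, 27, 10, 25, 18, 30, 12]
t_value, df_value = welch_t(controls, patients)
```

When the two groups have equal variances and equal sizes, the approximation returns the usual n1 + n2 - 2; the more the variances differ, the further the degrees of freedom drop below that value.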
Large group differences were found on the two tests of retrograde amnesia. In a period-by-group ANOVA, we found main effects of group on both the AMI, F(1,29) = 72.4, p < .001, and the NVT, F(1,29) = 24.9, p < .001. On the latter test, effects of demographic variables were also analyzed. No effect was found of age, r = .06; p > .1, of sex, t(25.97) = .34, p > .1, of educational attainment, F(5,25) = 1.54, p > .1, or, within the patient group, of the taking of rivastigmine, t(7.11) = .82, p > .1.
A clear gradient was found on the AMI (Figure 1a), with an interaction between group and period, F(1.64,47.5) = 4.271, p = .026, indicating stronger deficiencies for the AD group on more recent periods. The main effect of period was also significant, F(1.64,47.5) = 14.8, p < .001, with worse performance for recent periods.
On the NVT, a main effect of period was also present, F(1.67,48.4) = 29.4, p < .001, favoring remote periods. No interaction between group and period was found, F(1.67,48.4) = 1.22, p > .1, suggesting that there was no gradient on the NVT. The picture changed, however, when reference words were taken into the analysis as a fourth period. Main effects of both group, F(1,29) = 26.7, p < .001, and period, F(3,87) = 50.3, p < .001, were still present, but now the interaction between group and period was also significant, F(3,87) = 4.54, p = .005. This suggests that although patients with Alzheimer's disease did not have a specific deficit on one period of the NVT, their knowledge of neologisms was impaired relative to that of old vocabulary.
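The fractional degrees of freedom reported for the three-period analyses are consistent with a sphericity correction (for example, Greenhouse-Geisser) applied to the nominal values. The text does not state which correction was used, so the following is only a consistency check under that assumption; the function name is hypothetical, and epsilon is back-computed from the reported effect degrees of freedom.

```python
def corrected_df(n_levels, n_participants, n_groups, epsilon):
    """Scale the nominal within-subject df by a sphericity estimate epsilon."""
    df_effect = n_levels - 1
    df_error = (n_levels - 1) * (n_participants - n_groups)
    return epsilon * df_effect, epsilon * df_error


# 3 periods, 31 participants in 2 groups: nominal df are (2, 58).
# Epsilon is back-computed from the reported effect df (an assumption).
ami_df = corrected_df(3, 31, 2, epsilon=1.64 / 2)
nvt_df = corrected_df(3, 31, 2, epsilon=1.67 / 2)
```

Under this assumption the error degrees of freedom come out close to the reported 47.5 and 48.4. With four periods and no correction, the nominal values are (3, 87), which matches the analysis that includes the reference words.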
Controls scored lower on the neologisms than on the reference words, suggesting that neologisms were, in the population from which our sample was taken, more difficult than reference words. The lower score of the AD group on neologisms might thus have been a generic difficulty with defining challenging words, not related to time of introduction of a word. To control for this possibility, we divided reference words and neologisms into brackets on the basis of the number of normal controls that defined them correctly. We identified nine reference words and ten neologisms that were defined correctly by all participants in the NC group, and two reference words and 12 neologisms that were defined correctly by between 99% and 75% of participants in the NC group. We then calculated the proportion of these words that participants in the AD group defined correctly (Figure 2), and tested whether the AD group still showed evidence of a selective impairment on neologisms relative to reference words within these brackets. Patients indeed had a lower score on neologisms relative to reference words for those words that all normal controls had defined correctly, t(15) = 2.74; p = .008 (left panel in Figure 2). They were also impaired on neologisms relative to reference words for those words that between 75% and 99% of normal controls had answered correctly, t(15) = 1.98; p = .033.
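The bracketing step can be sketched as follows. This is an illustration of the logic rather than the analysis code used in the study: the function names and the data layout are assumptions, and per-item accuracies are not reported in the text.

```python
def bracket_items(control_correct, n_controls):
    """Split items into brackets by the share of controls answering correctly.

    control_correct maps item name -> number of controls who defined it.
    """
    brackets = {"100%": [], "75-99%": []}
    for item, n_correct in control_correct.items():
        share = n_correct / n_controls
        if share == 1.0:
            brackets["100%"].append(item)
        elif share >= 0.75:
            brackets["75-99%"].append(item)
        # Items below 75% fall outside both brackets and are left out.
    return brackets


def proportion_correct(patient_answers, items):
    """Proportion of the given items that one patient defined correctly."""
    return sum(patient_answers[item] for item in items) / len(items)
```

With 15 controls, the 75-99% bracket corresponds to items defined correctly by 12 to 14 controls. Computing `proportion_correct` per patient, separately for neologisms and reference words within a bracket, yields the paired scores that a within-patient comparison needs.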
Internal consistency of the NVT, as measured in the AD group, was .78 for the test as a whole. Analysis of subtests yielded Cronbach's alphas of .62 for the 1970s, .48 for the 1980s, .50 for the 1990s, and .31 for the reference words. These values were marginally higher for the multiple-choice version (e.g., .82 for the test as a whole). Reliability was thus satisfactory for the test as a whole, but not for all subtests separately.
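For readers unfamiliar with the statistic, Cronbach's alpha can be computed from a participants-by-items matrix of item scores as in the minimal sketch below. The function name and the example matrix are illustrative assumptions; the scores are invented and are not taken from the NVT.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Cronbach's alpha for a participants x items matrix of item scores."""
    n_items = len(scores[0])
    item_columns = list(zip(*scores))
    # Sum of the variances of the individual items.
    item_var_sum = sum(pvariance(col) for col in item_columns)
    # Variance of each participant's total score.
    total_var = pvariance([sum(row) for row in scores])
    return n_items / (n_items - 1) * (1 - item_var_sum / total_var)


# Invented 0/1 scores for four participants on three items.
example = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
alpha = cronbach_alpha(example)
```

Because alpha generally rises with the number of items, 11-item subtests can be expected to show lower values than the 44-item test as a whole, even when the items are of comparable quality.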
Correlations between the different tasks were analyzed separately in each group, and again in both groups taken together. As a note of caution, given the small group sizes only large correlations (>.49) were significant in group analyses.
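The threshold mentioned above can be recovered from the group sizes. The sketch below assumes a two-tailed test at the .05 level, which the text does not state explicitly; the critical t values are taken from standard t tables.

```python
import math


def critical_r(n, t_critical):
    """Smallest |r| that is significant, given the critical t for df = n - 2."""
    df = n - 2
    return t_critical / math.sqrt(t_critical ** 2 + df)


# Two-tailed critical t values at alpha = .05, taken from standard t tables.
r_ad = critical_r(16, 2.145)  # AD group, df = 14
r_nc = critical_r(15, 2.160)  # NC group, df = 13
```

Under these assumptions the critical value is roughly .50 for the AD group and slightly higher for the smaller NC group, in line with the caution above.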
In the AD group, the NVT was not significantly correlated with measures of either anterograde or retrograde amnesia (see Table 2). The test did correlate with the NART, which has a semantic component to it (see Discussion). The reference words correlated with both object naming and the MMSE, suggesting that this section was more sensitive to broad damage in basic skills. In the normal control group, the NVT correlated with the NART and, in the multiple-choice version, with the AMI and the meander (see Table 3). In the two groups taken together, the NVT correlated strongly with episodic memory tests, both the anterograde eight-word test and the retrograde AMI (see Table 4). The correlation with the NART was also evident in this analysis.
The present study shows that AD patients exhibit retrograde amnesia for unambiguously semantic material. Their ability to define neologisms was significantly impaired as compared to normal controls, and it was more impaired than their ability to define reference words. An analysis of subgroups of words suggested that time and not generic difficulty was the important factor in this deficit.
In the NVT, neologisms are ordered into decades by the year that they entered the language. These years are only indicative of when memories for the word could have started to be formed: the date that a word enters a language is not the date that most people first experience it. Moreover, words usually do not leave the language, so that memories for the word can continuously have formed since the moment the word entered the language to the onset of amnesia. Nevertheless, Verfaellie et al. (1995) reported a gradient in the remote memory of patients with Korsakoff's disease using a neologisms test. Gradients can thus be observed with such retrograde amnesia tests. Presumably, a memory deficit that disproportionately affects recent memories would affect neologisms that have been encountered only recently, while sparing those that were also stored in remote time periods.
Here, no gradient could be established within the NVT, though one was found in episodic retrograde amnesia. The inconsistent findings are in line with the literature, as gradients are sometimes found in patients with Alzheimer's disease (Beatty & Salmon, 1991; Beatty et al., 1988; Brown, 2002; Kopelman et al., 1989; Moscovitch, 1982), and sometimes not (Leplow et al., 1997; Thompson et al., 2002; Wilson et al., 1981). It should be noted that the periods tested in the NVT all fall between the ‘recent’ period on the AMI and the other two periods. The time range of the NVT may thus have been too small to find a gradient.
The present findings also highlight the importance of semantic memory disturbances in Alzheimer's disease. This is consistent with previous findings with other semantic memory tasks (Petersen et al., 1999; Salmon et al., 1992). The finding that patients with Alzheimer's disease do not benefit as much from cueing as normal controls (Cahn et al., 1995) has also been interpreted as indicative of semantic memory problems.
An interesting finding is the strong correlation between the NVT and the NART. Recent evidence suggests that NART scores decline as dementia progresses, and that it may thus underestimate premorbid IQ in patients with more severe dementia (Cockburn et al., 2000; Schmand et al., 1998). The decline of the NART-based estimate of IQ was related to deterioration of semantic memory as reflected in verbal abstraction and category fluency (Schmand et al., 1998). The current result corroborates that finding, suggesting that the NART is indeed sensitive to declines in semantic memory. It does not exclude an alternative explanation, however, namely that semantic memory tests are sensitive to crystallized intelligence as purportedly measured by the NART (the fact that the NVT and NART also correlate within the normal control group supports this alternative). Indeed, tests of retrograde amnesia typically correlate with intelligence (Kapur et al., 1998).
In many ways, it is not surprising that retrograde amnesia should extend to semantic memory. Nearly all tests of semantic memory are tests of remote memory in the sense that they measure old memories. Moreover, all remote memory tests contain semantic items, at least following some definitions of semantic memory. If semantic memory is “the component of long term memory which represents our knowledge of objects, facts and concepts, as well as words and their meanings” (Garrard et al., 1997), then all tests of public knowledge are tests of semantic memory (as they query facts about the news), and many tests of autobiographical memory contain semantic items (the so-called “personal-semantic” questions, Kazui et al., 2003; Kopelman et al., 1989). Nevertheless, some semantic memories are more semantic than others. Semantic memories are “usually overlearned and not temporally specific” (Garrard et al., 1997). Knowledge of news events need not be overlearned, and is to some extent temporally specific. Although some “personal-semantic” questions in the AMI lack this specificity (such as items in which subjects must generate addresses at which they lived), others are specific about time (such as the many questions about a specific wedding in the “young adult” section of the AMI, Kopelman et al., 1989). Lexical knowledge is the prototype of semantic memory; tests that use word meanings as their material, such as the NVT, are therefore unequivocally semantic. The current results thus seem to point to the existence of semantic retrograde amnesia.
Semantic retrograde amnesia is only one explanation for the present results. Three alternative accounts can be envisioned. Since AD has an insidious onset, the deficits may reflect not retrograde but anterograde amnesia. To explain why AD patients also show deficits on the most remote neologisms, one might claim that words need constant repetition in order to remain in lexical memory. Thus recent anterograde amnesia might produce a loss of words learned a long time before. This explanation seems unlikely, however, as several studies have suggested that knowledge enters a state of permastore after some four years (Bahrick, 1984, 1992; Conway et al., 1991).
A second alternative would be that word definition is a task relying partly on episodic memory. Although at first glance this seems unlikely, a similar hypothesis has recently been advanced for the quintessentially semantic task of category fluency: category members may often be generated with help of episodic memories (Hayes et al., 2001).
A third explanation would question whether there is a neuropsychological dissociation in remote memory between episodic and semantic memories. In a recent review, Squire and Zola (1998) argued that the case for such a dissociation is not as strong as is commonly thought. Episodic amnesia combined with intact semantic knowledge may in most cases be a dissociation between a learning deficit leading to an inability to form new episodic memories, and intact old memories. Although patients with normal semantic learning in the face of episodic amnesia have also been reported (e.g., Vargha-Khadem et al., 1997), Squire and Zola (1998) argue that residual episodic memory may explain the acquisition of semantic memories in these patients. Patients with semantic memory deficits and intact episodic memory are at most very rare. It was for a time thought that patients with semantic dementia presented with such a pattern (Graham & Hodges, 1997; Hodges et al., 1992; Snowden et al., 1996), but recently it has become clear that episodic memory is by no means intact in semantic dementia (Graham et al., 2000; Westmacott et al., 2001).
It may be that extended retrograde amnesia is in fact a semantic memory deficit: this is the purport of one view on temporally graded retrograde amnesia, semantization (Cermak, 1984; Meeter & Murre, 2004; Rosenbaum et al., 2001). In this view, episodic memories become semantized with time, which is equivalent to losing their temporal and contextual specificity. This would predict that extensive retrograde amnesia could only occur in the presence of semantic memory deficits, and that loss of semantic and remote episodic memories would stretch back in time to the same extent. The current results do not speak to this conjecture, as the episodic retrograde amnesia test used had a different time scale from that of the NVT.
The authors thank Dr. B. Schmand (AMC, Amsterdam) and Drs. C. van der Kloet-Quak (Kennemer Gasthuis, DEO, Haarlem, the Netherlands) for their help in recruiting participants. This work was supported by grants to the third author by the Stichting Alzheimer en Neuropsychiatriefonds, Amsterdam.