INTRODUCTION
Older adults are often referred to neuropsychologists when there is concern regarding memory difficulties. Although an evaluation includes a broad range of cognitive abilities, neuropsychologists assessing older adults pay special attention to the assessment of memory. Clinicians are often asked to determine if there is a deficit in memory ability on formal testing and whether this decline is attributable to the early onset of a degenerative process. The performance on memory measures can be a key component of identifying a degenerative disease in the very early stages, especially when there has not been the opportunity for a serial assessment to determine progressive decline.
Several descriptive terms associated with formal criteria have been put forth to identify persons with early impairments who do not meet a diagnosis of dementia, including age-associated cognitive decline (AACD; Levy, 1994), mild cognitive disorder (MCD; World Health Organization, 1993), and mild neurocognitive disorder (MND; American Psychiatric Association, 1994). However, one of the most widely used terms in the clinical research literature has been mild cognitive impairment (MCI). Although there are still no universally accepted criteria for MCI, Petersen and colleagues at the Mayo Clinic defined MCI as: “(1) memory complaint; (2) normal activities of daily living; (3) normal general cognitive function; (4) abnormal memory for age; and (5) not demented” (Petersen et al., 1999, p. 304). These criteria for MCI were later adopted by the Quality Standards Subcommittee of the American Academy of Neurology (Petersen et al., 2001). Despite the absence of a definition for abnormal memory (i.e., which test, how abnormal is the performance), many researchers have come to define this as a memory score falling more than 1.5 standard deviations (SDs) below the mean for a person's age, and at times, for age and education. The definition of MCI has evolved slightly over the years, with the previously mentioned criteria being more specifically termed amnestic MCI to highlight memory as the impaired cognitive domain (i.e., non-amnestic MCI has also been used to describe impairments in other cognitive domains). These criteria for amnestic MCI have been employed in several studies (e.g., Boeve et al., 2003; Griffith et al., 2006; Jack et al., 2005; Kantarci et al., 2005; Kryscio et al., 2006) and are the topic of recent international consensus papers (Gauthier et al., 2006; Portet et al., 2006).
The objective assessment of memory abilities in older adults usually involves administering multiple measures of memory, whether in the form of individual tests (e.g., Hopkins Verbal Learning Test-Revised, Benedict et al., 1998; California Verbal Learning Test–2nd Edition, Delis et al., 2000; Rey Complex Figure Test, Meyers & Meyers, 1995) or a memory-specific battery of tests (e.g., the Wechsler Memory Scale–Third Edition; Psychological Corporation, 1997). Clinicians can expect, based on the normal distribution of standard scores, that a certain percentage of healthy older adults will obtain a low score on any given memory test (e.g., 5% of healthy people will score at or below the 5th percentile). Understanding the base rates of low memory scores when several measures are considered in combination is far more complicated, however.
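The complication can be made concrete with a minimal sketch. It assumes, unrealistically, that the scores in a battery are statistically independent; memory scores are in practice positively correlated, so the independence figure tends to overstate the true combined base rate and should be read as a rough ceiling, not an estimate. The function name `prob_at_least_one_low` is hypothetical and is used here only for illustration.

```python
def prob_at_least_one_low(p_low: float, n_scores: int) -> float:
    """Chance that at least one of n_scores scores falls at or below a cutoff.

    p_low is the tail probability of the cutoff for a single score
    (e.g., 0.05 for the 5th percentile). Assumes independent scores,
    which is an illustrative simplification, not a property of any battery.
    """
    if not 0.0 <= p_low <= 1.0:
        raise ValueError("p_low must be between 0 and 1")
    if n_scores < 1:
        raise ValueError("n_scores must be at least 1")
    return 1.0 - (1.0 - p_low) ** n_scores


# One score, five scores, and ten scores with a 5th-percentile cutoff.
for k in (1, 5, 10):
    print(k, round(prob_at_least_one_low(0.05, k), 3))
# 1 0.05
# 5 0.226
# 10 0.401
```

Under independence, the chance of one or more low scores rises from 5% with a single score to roughly 40% with ten scores. Because the degree of correlation among real tests differs from battery to battery, the actual figure has to be determined empirically from normative data.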
Palmer and colleagues illustrated the base rates of low memory scores across a flexible battery of memory tests (Palmer et al., 1998). They examined 132 neurologically and psychiatrically healthy older adults between the ages of 50 and 79 (M = 63.8 years, SD = 7.7) using memory tasks involving story learning (WMS-R Logical Memory; immediate and delayed recall, percent retention), recall of simple geometric designs (WMS-R Visual Reproduction; immediate and delayed recall, percent retention), recall of a complex figure (Rey Osterrieth Complex Figure; 3-minute delayed recall, percent retention), word recognition (Warrington's Recognition Memory Test–Words), and face recognition (Warrington's Recognition Memory Test–Faces). When performance on the five memory measures (i.e., 10 age-corrected normative scores) was examined collectively, nearly 40% had one or more low test scores and nearly 17% had two or more low test scores (i.e., ≤ 1.3 SD below the mean or ≤ 10th percentile; see Table 1). Notably, 13% of the healthy older adults had one or more memory tests with a score in the frankly impaired range (i.e., ≤ 2 SD below the mean; ≤ 2nd percentile). Despite rigorous exclusion criteria, Palmer et al. (1998) illustrated that low memory scores are common in healthy older adults when multiple tests are administered.
Table 1. Number of “impaired” memory scores on a flexible test battery administered to healthy older adults (Palmer et al., 1998)
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20170409223614-78043-mediumThumb-S1355617707070531tbl001.jpg?pub-status=live)
Without knowledge of base rates of low scores, it is possible to over-interpret isolated low scores and, in turn, make erroneous diagnoses. de Rotrou and colleagues used the term “accidental MCI” (de Rotrou et al., 2005) when referring to those patients who were diagnosed with MCI at one time point, but their test scores were found to have “normalized” at a later time. de Rotrou et al. (2005) reported that 48% of those with MCI at the first assessment did not meet criteria for MCI at a one-year follow-up. In the Canadian Study of Health and Aging (CSHA), nearly one-third of the 1790 participants with MCI no longer met diagnostic criteria at the five-year follow-up (Fisk et al., 2003). Ganguli et al. (2004) reported that 55% of their MCI sample no longer met MCI criteria after a two-year interval. Devanand et al. (1997) found 44% and Larrieu et al. (2002) reported over 40% had normal test performance at follow-up. Whether these presumed false positives are the result of individual variability on testing, resolution of non-dementing causes of poor performance (e.g., depression), or measurement error, broadly defined, is uncertain.
Because clinicians administer multiple tests and the results are interpreted in combination, it is critical to be informed of the base rates of low scores across a battery of memory tests. The purpose of this study was to illustrate how often healthy older adults obtain low memory scores when multiple memory tests are interpreted simultaneously. To illustrate this important psychometric concept, we examined the base rates of low memory scores in healthy older adults on the Memory Module of the Neuropsychological Assessment Battery (NAB; Stern & White, 2003a). However, based on previous research examining the base rates of low scores on co-normed batteries (i.e., de Rotrou et al., 2005; Heaton et al., 2004; Heaton et al., 1991; Palmer et al., 1998), it is very likely that this concept is applicable to all fixed and flexible test batteries and is not specific to the NAB.
METHOD
Participants
Participants for the present study were healthy community-dwelling older adults from the United States (N = 742), selected from the NAB standardization sample. The present sample ranged in age from 55 to 79 years (M = 68.1 years, SD = 6.6 years) and had between 5 and 23 years of education (M = 13.5 years, SD = 2.9 years).
In the process of gathering the normative sample, exclusion criteria were employed in order to prevent the inclusion of persons who could have a neurological disease, acquired injury, psychiatric illness, treatment/medication, or physical impairment (i.e., color blindness, visual loss, hearing impairment, or physical disability) that would negatively impact test performance. In addition, 14 people were excluded from the standardization sample post hoc based on poor performance on the Orientation test. All participants for the standardization sample were recruited and tested at five sites across the United States, including Rhode Island Hospital, University of Florida Health Sciences Center, Indiana University, University of California at Los Angeles School of Medicine, and the Psychological Assessment Resources (PAR) offices in Lutz, Florida. These sites were chosen to represent the four geographical regions of the country (Northeast, Midwest, West, and South). A senior neuropsychologist was employed at each of these sites and served as the principal investigator. The publisher oversaw recruitment to ensure that the sampling plan matrix was closely matched. PAR obtained approval from an external research ethics board and all data were collected in compliance with regulations of the institution. In addition, each of the standardization sites obtained approval from an external research ethics board with local jurisdiction, and all data were obtained in compliance with regulations of the respective institutions.
Measures
The NAB (Stern & White, 2003a) is a comprehensive modular battery of tests across five cognitive domains, including Attention, Language, Memory, Spatial, and Executive Functions. This study examined performance on the NAB Memory Module, which consists of multiple measures, including List Learning, Shape Learning, Story Learning, and Daily Living Memory (i.e., medication instructions and name/address/phone number). Descriptions of each memory measure are provided in Table 2. Immediate and delayed recall scores are included for each test, providing a total of 10 demographically corrected T-scores (note: recognition scores are given as percentiles, rather than T-scores). Of these 10 T-scores, 9 are considered primary and contribute to the NAB Memory Index, which is a composite score for the learning and memory tests (i.e., the List Learning List B Immediate Recall T-score does not contribute to the Memory Index score). For additional information regarding the tests and the domain scores on the NAB, please refer to the manuals (Stern & White, 2003b; White & Stern, 2003).
Table 2. Descriptions of the NAB Memory tests
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20170409223614-54448-mediumThumb-S1355617707070531tbl002.jpg?pub-status=live)
The demographically corrected norms for the NAB, developed using a statistical regression method called continuous norming (Gorsuch, 1983), were used for this study. Continuous norming uses a regression equation, based on the entire sample, to produce demographically corrected scores. An advantage of continuous norming is that it eliminates the inaccuracies introduced by traditional tabled norms because normative scores are derived from the entire sample (N = 1448) and each person is compared against their exact age, education, and gender group through a process of analytic smoothing. Traditional tabled norms, in contrast, involve creating distinct groups for each demographic variable, plotting and normalizing the distributions of raw scores, visually inspecting the distribution, and manually “smoothing” minor sampling fluctuations and irregularities. Many common neuropsychological measures use traditional tabled norms (for an example, see page 39 of the WAIS-III/WMS-III technical manual; Psychological Corporation, 1997).
Zachary and Gorsuch (1985) highlighted the potential inaccuracies when correcting for demographic variables with traditional tabled norms in contrast to continuous norming. They presented a hypothetical example using the WAIS-R that involved retesting a person after one month and shifting from one age group to the next (i.e., 34 years, 11 months to 35 years, 0 months). When they used the same sum of scaled scores, the result was an increase of up to 6 points in Full Scale IQ simply by moving to the next age group (see Table 6 from Zachary & Gorsuch, 1985). Using a polynomial regression analysis of the WAIS-R data, they created continuous norms for the WAIS-R that eliminated this “jump” in Full Scale IQ from one age group to the next.
The NAB normative tables were developed with the continuous norming method to minimize potential errors in the derived normative scores when accounting for demographic variables. White and Stern (2003) described the steps taken to calculate each NAB primary score using continuous norming, including (1) using polynomial regression to determine the lines (or curves) of best fit for the progression of means and SDs across groupings of the norming variables; (2) estimating the means and SDs of scores for each normative variable group; and (3) calculating normative scores based on the estimates obtained in the first two steps. The use of continuous norming is not unique to the NAB. Continuous norming has also been used to create normative data for many common neuropsychological tests, including the Wisconsin Card Sorting Test (WCST; Heaton et al., 1993), the Wisconsin Card Sorting Test–64 Card Version (WCST-64; Kongs et al., 2000), and the Canadian standardization of the Wechsler Adult Intelligence Scale–Third Edition (WAIS-III; Psychological Corporation, 2001).
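The three steps can be sketched in simplified form. The example below is not the NAB procedure: it uses age as the only norming variable, a first-degree polynomial (a straight line) fitted by ordinary least squares, and invented band means and SDs. All names (`fit_line`, `t_score`, `band_mid_ages`) are hypothetical.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares line y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Invented raw-score means and SDs for five age bands (not NAB data).
band_mid_ages = [57.0, 62.0, 67.0, 72.0, 77.0]
band_means = [30.0, 29.0, 28.0, 27.0, 26.0]
band_sds = [5.0, 5.0, 5.0, 5.0, 5.0]

# Step 1: fit smooth lines to the progression of means and SDs across age.
mean_intercept, mean_slope = fit_line(band_mid_ages, band_means)
sd_intercept, sd_slope = fit_line(band_mid_ages, band_sds)


def t_score(raw_score, age):
    """Normative T-score (M = 50, SD = 10) for a raw score at an exact age."""
    # Step 2: estimate the expected mean and SD at this exact age.
    expected_mean = mean_intercept + mean_slope * age
    expected_sd = sd_intercept + sd_slope * age
    # Step 3: convert the raw score using those estimates.
    return 50.0 + 10.0 * (raw_score - expected_mean) / expected_sd


# The same raw score one month apart yields nearly the same T-score,
# with no discontinuity at an age-band boundary.
print(t_score(28.0, 64.0 + 11.0 / 12.0), t_score(28.0, 65.0))
```

Because the expected mean and SD change smoothly with age, a person who moves from 64 years, 11 months to 65 years sees almost no change in the derived score. This is the property Zachary and Gorsuch (1985) contrasted with the jump produced by tabled norms.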
Intellectual abilities were estimated using the Reynolds Intellectual Screening Test (RIST; Reynolds & Kamphaus, 2003), an abbreviated administration of the Reynolds Intellectual Assessment Scales (RIAS). The RIST is normed across the lifespan (i.e., ages 3 through 94). The RIST normative sample (N = 2438) was recruited from 41 states and stratified to represent the United States population based on age, gender, ethnicity, education, and region. The RIST was designed as a brief intellectual screening test. The T-scores of the subtests Guess What (which assesses verbal reasoning, vocabulary, language development, and one's general fund of knowledge) and Odd Item Out (which measures nonverbal reasoning, spatial ability, and visual imagery) are summed to produce a composite index score (M = 100, SD = 15). The composite index score was used for this study. For information regarding the reliability and validity of the RIST, please refer to the manual (Reynolds & Kamphaus, 2003).
Analyses
The base rates of impaired memory scores were calculated by using four cutoff scores that might be routinely used in clinical practice, including: (1) more than 1 SD below the mean; (2) below the 10th percentile; (3) at or below the 5th percentile; and (4) more than 2 SDs below the mean. For the Memory Index score, these cutoffs correspond to Index scores falling less than 85, below 81, at or below 76, and less than 70, respectively. For the memory test T-scores, these cutoffs correspond to T-scores falling less than 40, below 37, at or below 34, and less than 30, respectively. For the memory tests, all 10 T-scores were examined simultaneously.
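The counting procedure can be sketched as follows. The T-score cutoffs are taken from the text above; the function names and the four score profiles are hypothetical and invented for illustration (they are not NAB data).

```python
# Cutoff rules for demographically corrected T-scores (M = 50, SD = 10),
# as stated in the text: <40, <37, <=34, and <30.
CUTOFFS = {
    "more than 1 SD below the mean": lambda t: t < 40,
    "below the 10th percentile": lambda t: t < 37,
    "at or below the 5th percentile": lambda t: t <= 34,
    "more than 2 SDs below the mean": lambda t: t < 30,
}


def count_low_scores(t_scores):
    """Number of low scores in one person's profile, for each cutoff."""
    return {
        label: sum(1 for t in t_scores if is_low(t))
        for label, is_low in CUTOFFS.items()
    }


def base_rate(profiles, label, min_low=1):
    """Percentage of profiles with at least min_low low scores under one cutoff."""
    is_low = CUTOFFS[label]
    hits = sum(
        1 for profile in profiles
        if sum(1 for t in profile if is_low(t)) >= min_low
    )
    return 100.0 * hits / len(profiles)


# Invented 10-score profiles for four hypothetical examinees.
profiles = [
    [52, 47, 39, 55, 61, 33, 44, 50, 29, 48],
    [50, 51, 49, 55, 60, 45, 47, 52, 58, 41],
    [38, 42, 50, 46, 53, 57, 49, 44, 61, 40],
    [55, 62, 48, 51, 47, 59, 43, 50, 66, 54],
]

print(count_low_scores(profiles[0]))
print(base_rate(profiles, "more than 1 SD below the mean", min_low=1))  # 50.0
```

Note that a T-score of exactly 40 is not counted under the first cutoff, because the rule is "less than 40", whereas a T-score of exactly 34 is counted under the third, because that rule is "at or below 34".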
Analyses of base rates of low memory scores were conducted for the entire older adults sample (ages 55–79). Analyses were also conducted for five age groups (i.e., 55–59, 60–64, 65–69, 70–74, and 75–79) and for varying levels of education (i.e., ≤11, 12, 13–15, and 16+ years). Base rates of low memory scores were also calculated for different levels of intellectual abilities, including low average (RIST = 80–89), average (RIST = 90–109), high average (RIST = 110–119), and superior/very superior (RIST = 120+).
RESULTS
Mean scores and standard deviations for the Memory Index and 10 Memory tests are presented in Table 3. The means and standard deviations for the Memory Index for the entire older adults sample and across the five age groups closely approximated the theoretical normal distribution for an index score (i.e., M = 100, SD = 15). The means and standard deviations for the 10 memory tests for the entire sample and the five age groups closely approximated the theoretical normal distribution for a T-score (i.e., M = 50, SD = 10). The average scores for the other NAB domain scores (Attention, Language, Spatial, and Executive Functions) by age group also are presented in Table 3 for comparison purposes. The mean scores for all other domains closely approximated the theoretical normal distribution for an index score (i.e., M = 100, SD = 15).
Table 3. Mean scores (standard deviations) on the NAB Memory Index, Memory tests, and other cognitive abilities for older adults
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20170409223614-77120-mediumThumb-S1355617707070531tbl003.jpg?pub-status=live)
The base rates of low Memory Index scores in the older adults standardization sample are presented in Table 4. In the total sample, the base rates of low Memory Index scores approximated the expected values based on the theoretical normal distribution (i.e., 14.4% had a score more than 1 SD below the mean, 8.8% were <10th percentile, 4.4% were ≤5th percentile, and 1.5% were more than 2 SDs below the mean). When the sample was divided into several smaller age ranges, the base rates of low Memory Index scores remained fairly consistent across the five age groups. In addition, the percentage of Index scores falling below the cutoff scores for each age group did not differ from the “expected” percentages. For example, 19.2% of older adults between 75 and 79 years of age obtained a Memory Index score more than 1 SD below the mean, which is not statistically different from the theoretical normal distribution (i.e., 16%; χ2(1) = 0.55, p = .46). The lack of substantial differences in base rates of low Index scores across the age groups was likely the result of using demographically corrected scores. The base rates of low Memory Index scores across levels of education, which are also presented in Table 4, did not vary from the “expected” percentages based on a theoretical normal distribution. Again, this was likely the result of using the demographically corrected scores.
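A comparison of an observed percentage with a theoretical percentage can be sketched as a chi-square goodness-of-fit test with 1 degree of freedom. This is an illustrative sketch, not necessarily the exact computation used for the reported statistics. The function names are hypothetical, and the counts in the second example (19 of 100) are invented because the size of each age group is not given in the text.

```python
import math


def chi_square_1df_p(statistic):
    """Upper-tail p-value for a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2.0))


def goodness_of_fit(n_low, n_total, expected_prop):
    """Test an observed count of low scores against a theoretical proportion."""
    expected_low = n_total * expected_prop
    expected_not_low = n_total - expected_low
    observed_not_low = n_total - n_low
    statistic = (
        (n_low - expected_low) ** 2 / expected_low
        + (observed_not_low - expected_not_low) ** 2 / expected_not_low
    )
    return statistic, chi_square_1df_p(statistic)


# p-value implied by the reported statistic, chi-square(1) = 0.55.
print(round(chi_square_1df_p(0.55), 2))  # 0.46

# Invented counts: 19 of 100 people below the cutoff versus an expected 16%.
statistic, p_value = goodness_of_fit(n_low=19, n_total=100, expected_prop=0.16)
print(round(statistic, 2), round(p_value, 2))
```

The first call reproduces the reported p = .46 from the reported statistic, which is a useful consistency check when reading results of this kind.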
Table 4. Base rates of low Memory Index scores in the NAB older adults sample (55–79 years): Percentage of subjects scoring below the cutoffs
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20170409223614-23519-mediumThumb-S1355617707070531tbl004.jpg?pub-status=live)
The base rates of low Memory Index scores varied dramatically by level of intelligence. The percentage of older adults who obtained a low Memory Index score (i.e., more than 1 SD below the mean) was 44.7% for those with low average RIST scores (RIST = 80–89), compared to 4.8% for those with high average RIST scores (RIST = 110–119; χ2(1) = 40.60, p < .001). When considering the base rates of impaired Memory Index scores (i.e., more than 2 SDs below the mean) across intellectual abilities, 8.2% of older adults with low average RIST scores obtained an impaired Memory Index, compared to 0.8% of older adults with average RIST scores (RIST = 90–109; χ2(1) = 18.42, p < .001). None of the older adults with high average (RIST = 110–119) or superior/very superior RIST scores (RIST = 120+) had a Memory Index more than 2 SDs below the mean.
The base rates of low memory test scores in older adults, when simultaneously considering the 10 memory tests, are presented in Tables 5 to 7. In the total sample (55–79 years old), over half (i.e., 55.5%) had one or more scores below 1 SD from the mean, and 18.5% had three or more scores below 1 SD from the mean. One or more memory test scores at or below the 5th percentile was found in 30.8% of older adults, and one or more extremely low memory scores (below 2 SDs) was found in 16.4% of healthy older adults. The base rates of low memory test scores across the five age groups (see Table 5) and across the four levels of education (see Table 6) had minimal variance and closely resembled the base rates in the entire sample.
Table 5. Base rates of low memory test scores by age groups in older adults on the NAB
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20170409223614-09129-mediumThumb-S1355617707070531tbl005.jpg?pub-status=live)
Table 6. Base rates of low memory test scores by level of education in older adults on the NAB
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20170409223614-33609-mediumThumb-S1355617707070531tbl006.jpg?pub-status=live)
The base rates of low memory scores varied significantly by level of intellectual abilities (Table 7). In older adults with low average RIST scores (RIST = 80–89), 80.1% had one or more and 31.8% had 5 or more low memory scores (i.e., more than 1 SD below the mean). In contrast, in older adults with high average RIST scores (RIST = 110–119), 46.4% had one or more [χ2(1) = 26.0, p < .001] and 1.2% had 5 or more [χ2(1) = 51.4, p < .001] memory scores more than 1 SD below the mean. With ≤5th percentile as a cutoff, 56.5% of older adults with low average intellectual abilities (i.e., RIST = 80–89) had one or more low memory scores compared to 18.0% with superior to very superior intellectual abilities (i.e., RIST = 120+; χ2(1) = 29.6, p < .001).
Table 7. Base rates of low memory test scores by level of RIST in older adults on the NAB
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20170409223614-21964-mediumThumb-S1355617707070531tbl007.jpg?pub-status=live)
DISCUSSION
The purpose of this study was to examine how often healthy older adults get low scores on a battery of co-normed memory tests. To illustrate this principle, we examined the base rates of low memory scores in a large sample of healthy older adults (N = 742) selected from the NAB standardization sample. The tables provided in this study are ready to be used in everyday clinical practice. Regardless of the cutoff scores used (i.e., <1 SD, <10th percentile, ≤5th percentile, and <2 SDs), low memory scores are common when multiple memory measures are administered. As seen in Table 5, 55.5% of the total sample obtained one or more scores more than 1 SD below the mean, and 16.4% obtained one or more scores <2 SDs from the mean. Blackford and LaRue (1989) correctly speculated that “… in a memory battery with many measures, the chances are substantial that at least one score will fall into the impaired range” (p. 303).
The present study replicates and extends the results presented by Palmer and colleagues. Palmer et al. (1998) reported that nearly 40% of healthy older adults between 50 and 79 years old had one or more low memory scores and approximately 17% had two or more low memory scores (i.e., at or below the 10th percentile). They reported that nearly 13% had one or more memory scores 2 SDs below the mean. The base rates of low memory scores in the present study are quite similar to those reported by Palmer et al. (1998). As seen in Table 5, 38.8% of older adults in the present study obtained one or more and 18.6% obtained two or more memory scores below the 10th percentile (i.e., roughly 1.3 SDs or more below the mean). Approximately 16% had one or more memory scores more than 2 SDs below the mean. Similar to Palmer et al. (1998), the present study employed exclusion criteria to ensure the older adults were neurologically and psychiatrically healthy. Although the larger sample size and the use of age- and education-corrected data are obvious strengths of the present study, the inclusion of base rates across varying levels of intellectual abilities provides critical new information that was not available from Palmer et al. (1998).
The base rates of low scores did not vary by age or education in the present study, likely the result of using demographically corrected normative scores. However, performance on neuropsychological measures varies by intelligence and this must be considered when interpreting low memory test scores (see Table 7). In the present study, over 80% of older adults with low average intellectual abilities have one or more low memory test scores (i.e., more than 1 SD below the mean). Even when more stringent cutoff scores are used with older adults with low average intellectual abilities, 56.5% have one or more memory scores at or below the 5th percentile and 33% have one or more below 2 SDs.
Including an estimate of intellectual abilities in this data set illustrates two important points. First, the risk of over-interpreting a low memory test score in older adults with lower intellectual abilities is very high. Clinicians have been aware of this issue since the WAIS-III and WMS-III were co-normed (Psychological Corporation, 1997) and some recent normative data sets for cognitive measures have accounted for intellectual abilities (i.e., Mayo's Older Americans Normative Study; Steinberg et al., 2005a, 2005b, 2005c, 2005d).
Second, clinicians must guard against over-interpreting average (or lower) memory scores in people with high average or superior intelligence. It is tempting to assume that performance on neuropsychological tests has to be commensurate with intellectual abilities in healthy adults. Dodrill (1997, 1999) discussed the misconception (referred to as a “myth” of neuropsychology) that healthy adults with high intelligence (a) should have high neuropsychological abilities and (b) should not obtain low scores. In the present study, healthy older adults with superior to very superior intellectual abilities did obtain some low memory scores. For example, in those participants with superior/very superior intellectual abilities, 44% of older adults had one or more low memory scores (i.e., <1 SD).
Isolated low memory test scores are common in healthy older adults and might represent normal human variability on testing, a long-standing relative weakness (without a recent change in functioning), or measurement error, broadly defined. Well before the first publication of the MCI criteria, Blackford and LaRue (1989) cautioned against the over-interpretation of isolated low memory scores. This same concern has been reported by other authors over the past decade (e.g., de Rotrou et al., 2005; Palmer et al., 1998). Conclusions drawn on isolated low memory scores can lead to false positive clinical inferences or diagnoses. Although clinicians consider several sources of information (i.e., history, self-report, informant report, medical information from other investigations, and differential diagnoses) before concluding the presence of a deficit, there is always the potential to over-interpret an isolated low memory test score. In the research literature on MCI, it is possible that some portion of the 40% to 55% of patients (i.e., de Rotrou et al., 2005; Devanand et al., 1997; Fisk et al., 2003; Ganguli et al., 2004; Larrieu et al., 2002) who no longer had a memory impairment at follow-up might be considered false positives during their first assessment. It is a remarkable finding in the research literature that large minorities of people who seem to meet criteria for MCI do not have it when tested at a later date.
In the present study, the potential to inaccurately label a person as having amnestic MCI (i.e., “abnormal” memory score, which is commonly based on a cutoff of 1.5 SDs below the mean; Petersen et al., 1999) based on a single low memory score is high and varies greatly depending on how the sample is examined. This can be illustrated using ≤5th percentile as a cutoff (i.e., more than 1.5 SDs below the mean). In the entire sample, nearly 31% of healthy older adults have one or more “abnormal” memory scores (see Table 5). Across different levels of intellectual abilities, between 18.0% and 56.5% (Table 7) have one or more abnormal memory scores. Of course, it is also possible that a clinician will fail to identify memory problems (i.e., false negatives) by being too conservative when interpreting test scores. There is always a delicate balance between sensitivity and specificity; unfortunately, neuropsychology has only a modest number of empirical studies that provide clinically useful and precise sensitivity and specificity analyses for a battery of tests used with specific patient populations.
It could be argued that some of the participants in this study, especially those in the upper age range, were at risk for having a dementia and their performance on the memory measures could reflect early changes in memory. Although this is possible, fairly rigorous exclusion criteria were used with this sample to ensure only neurologically and psychiatrically healthy older adults were included. Thus, the base rates of low memory scores might be under-estimated given the healthy status of this older adult sample (in contrast to older adults with a variety of medical and minor psychiatric problems). Unfortunately, a definitive answer to this critique is not knowable.
The results of this study are not specific to the NAB and likely represent a poorly understood psychometric phenomenon across neuropsychological measures. Low memory scores in healthy older adults are common and vary with intellectual abilities. Although the WAIS-III and WMS-III were co-normed in 1997, this type of analysis has yet to be published with those test batteries. The base rate data presented in this article also greatly enhance the routine use of the NAB Memory Module with older adults in clinical settings. Using a demographically corrected, co-normed battery of tests is a psychometrically sophisticated clinical assessment strategy. The tables provided in this study can facilitate the simultaneous interpretation of multiple memory test scores.
It is important that neuropsychologists be well informed of the base rates of low memory scores in healthy older adults, use normative data corrected for age and education, and consider level of intelligence when interpreting scores. Over-interpretation of isolated low memory scores in older adults, without considering base rates or additional information, can impact diagnostic accuracy.
ACKNOWLEDGMENTS
The authors thank Jennifer Bernardo for assistance with manuscript preparation. Portions of this manuscript were presented at the annual conference of the National Academy of Neuropsychology in San Antonio, Texas (October 2006). The manuscript is new and original, is not currently under review by any other publication, and has not been previously published either electronically or in print. Dr. Brooks and Dr. Iverson have no known, perceived, or actual conflict of interest with this research. Dr. White is the co-creator of the NAB and Vice-president of Research for Psychological Assessment Resources, Inc.