Reliable change on the Dementia Rating Scale

Published online by Cambridge University Press:  18 May 2007

OTTO PEDRAZA
Affiliation:
Department of Psychiatry and Psychology, Mayo Clinic, Jacksonville, Florida
GLENN E. SMITH
Affiliation:
Department of Psychiatry and Psychology, Mayo Clinic, Rochester, Minnesota
ROBERT J. IVNIK
Affiliation:
Department of Psychiatry and Psychology, Mayo Clinic, Rochester, Minnesota
FLOYD B. WILLIS
Affiliation:
Department of Family Medicine, Mayo Clinic, Jacksonville, Florida
TANIS J. FERMAN
Affiliation:
Department of Psychiatry and Psychology, Mayo Clinic, Jacksonville, Florida
RONALD C. PETERSEN
Affiliation:
Department of Neurology, Mayo Clinic, Rochester, Minnesota
NEILL R. GRAFF-RADFORD
Affiliation:
Department of Neurology, Mayo Clinic, Jacksonville, Florida
JOHN A. LUCAS
Affiliation:
Department of Psychiatry and Psychology, Mayo Clinic, Jacksonville, Florida

Abstract

A central role of neuropsychological evaluation is the measurement of change in cognitive functioning over time. However, change scores obtained from repeated neuropsychological assessments may be affected by normal variability arising from measurement error and by practice effects caused by repeated measurements. The current study uses reliable change estimates to establish normative rates of change on the Dementia Rating Scale from baseline to first follow-up testing among 1080 cognitively normal adults aged 65 and older. Results showed that a 6-point decline by European American adults or a 9-point decline by African American adults within a 9–15 month test-retest interval represents reliable change. Within a 16–24 month test-retest interval, a 7-point decline among European American adults or an 8-point decline among African American adults represents reliable change. In addition, preliminary cross-validation was performed in a clinical comparison sample of another 22 older adults. The findings are discussed in the context of potential clinical and research applications. (JINS, 2007, 13, 716–720.)

Type
BRIEF COMMUNICATION
Copyright
© 2007 The International Neuropsychological Society

INTRODUCTION

Establishing whether cognition changes over time is one of the primary roles of neuropsychological assessment. Repeated evaluations provide an index of cognitive stability or change following pharmacologic or behavioral treatments, surgical interventions for epilepsy or Parkinson's disease, resolution of toxic exposure or metabolic abnormalities, and cognitive rehabilitation or remediation after traumatic brain injury. In addition, repeated neuropsychological assessments can provide quantitative information on patterns of cognitive decline in neurodegenerative conditions, primary and secondary brain neoplastic processes, and cerebrovascular compromise.

Change scores obtained from repeated neuropsychological assessments, however, are a function of (a) the individual's true cognitive ability level at each testing point, (b) practice effects caused by repeated administrations of the same test, and (c) normal variability in test scores caused by measurement error. Any or all of these elements may vary according to additional factors such as individual characteristics of the test-taker, the duration of the test-retest interval, and the disease process itself.

The guiding purpose for examining change scores is the identification of meaningful change over time. To validly attribute changes in test scores to a change in an individual's true cognitive ability, it is necessary to account for potential practice effects and measurement error. To this end, statistical procedures collectively known as reliable change indices (RCI) have been developed to characterize the expected distribution of change scores and to estimate the likelihood that an individual's score represents true change (i.e., beyond chance fluctuations caused by the unreliability of the test).

The methods for optimally calculating reliable change have been debated extensively in this journal (e.g., Hinton-Bayre, 2000, 2004, 2005a; Maassen, 2004a; Maassen, 2004b; Temkin, 2004) and elsewhere (e.g., Hinton-Bayre, 2005b; Maassen, 2005; Maassen et al., 2006; McGlinchey & Jacobson, 1999). The reader is referred to these sources for a thorough discussion of the methodological and theoretical issues. Briefly, the debate has largely hinged on two issues: (a) whether modifications of the original Jacobson and Truax RCI method are preferable to a standardized regression-based approach (SRB), and (b) which standard error estimate is preferable for use as the denominator in the RCI methods. Regarding the former, Temkin et al. (1999) and Heaton et al. (2001) showed that the RCI method with correction for practice effects (RCIp) performed comparably to more complex regression-based methods, leading the authors to recommend the use of RCIp. Indeed, in a recent letter to this journal, one of the proponents of the SRB approach noted that the two methods are usually in agreement and most studies thus far have found minimal incremental accuracy outside of using the pretest score (Hinton-Bayre, 2005a). Given these observations, the RCIp method was deemed preferable for the current study.

The standard error of measurement of the difference (SEdiff) establishes a prediction interval around the test-retest change score when multiplied by a value from the z-distribution. As described by Iverson et al. (2003) and summarized by Maassen (2004a, 2005), whenever the posttest variance can be estimated and is not assumed to be equal to the pretest variance, the expression

$$SE_{diff} = \sqrt{(S_1^2 + S_2^2)(1 - r_{xy})} \qquad (1)$$

is preferable, where S1 and S2 are the standard deviations of the baseline and follow-up scores and rxy is the test-retest reliability.

The goal of the current study is to provide normative rates of change from baseline to first follow-up testing on a commonly used measure of global cognitive functioning. The Dementia Rating Scale (DRS) is often used in clinical and research settings for early detection, differential diagnosis, and staging of dementia (Jurica et al., 2001). The DRS yields a maximum score of 144 points based on performance in five subtests. Mayo's Older Americans Normative Studies (MOANS) and Mayo's Older African American Normative Studies (MOAANS) have provided age- and education-corrected normative data for DRS scores (Lucas et al., 1998; Rilling et al., 2005); however, there are no reports in the literature regarding what magnitude of change constitutes a meaningful difference from a baseline level of performance. Heaton et al. (2001) noted that normative rates of change may not generalize from nonclinical to clinical samples, particularly if baseline test performance is dissimilar between groups. Therefore, this study also sought to evaluate the generalizability of the obtained reliable change values in a clinical comparison sample of individuals who were cognitively normal at baseline, but received a clinical diagnosis on follow-up assessment.

METHOD

Participants

Normal cognition

Participants with normal cognition included 1080 adults who took part in the MOANS and MOAANS series. Study criteria and recruitment protocols for these projects have been published previously (Ivnik et al., 1990; Lucas et al., 2005). Briefly, cognitively normal older adults were defined as community-dwelling, independently functioning individuals examined by their primary care physician within 1 year of study entry and who met the following criteria: (1) normal cognition based on self, informant, and physician reports; (2) capacity to independently perform activities of daily living based on informant report; (3) no active or uncontrolled CNS, systemic, or psychiatric condition that would adversely affect cognition, based on physician report; and (4) no use of psychoactive medications in amounts that would be expected to compromise cognition or for reasons indicating a primary neurologic or psychiatric illness. Substantial efforts have been made to follow MOANS and MOAANS participants longitudinally and only those who remained cognitively normal upon all follow-up assessments were included in the normal cognition sample. Every person was seen for at least one annual follow-up examination, although many had additional subsequent exams: 86.2% were seen a total of 3 times, 63.8% a total of 4 times, and 36.3% a total of 5 times.

Clinical comparison sample

Reliable change estimates derived from the normal sample were applied to a clinical comparison sample of 22 participants (21 European Americans, 1 African American) who were evaluated within an equivalent 9–24 month interval. The clinical comparison sample was 59.1% male and had a mean baseline age of 82.3 years (SD = 6.3, range: 70–96). All participants in this sample had normal cognition at baseline but received a clinical diagnosis upon their first follow-up assessment. Fifteen (68%) of them were subsequently diagnosed with mild cognitive impairment (MCI) or Alzheimer's disease (AD) based on consensus evaluation from a neurologist, family medicine physician, and neuropsychologist. Six (27%) participants had an interval stroke and one was diagnosed with Parkinson's disease between baseline and first follow-up. Clinical diagnosis was not assigned based on DRS scores.

Statistical Analyses

Within-group test-retest interval differences were evaluated using paired t-tests. Practice effects were defined as the mean difference between follow-up and baseline DRS raw scores. Test-retest reliability (rxy) was calculated as the correlation between baseline and follow-up DRS raw scores. Between-group comparisons were made using unpaired t-tests or χ² tests.

A reliable change index adjusted for practice effects (RCIp) was calculated with a 90% prediction interval (P.I.) based on the standard error as shown in Equation (1). The 90% P.I. was obtained by multiplying SEdiff by the corresponding value from the z-distribution (±1.64). To adjust for practice effects, the resulting value was added to, or subtracted from, the mean difference between follow-up and baseline DRS raw scores. RCIp cutoff values were rounded to the next whole number for ease of interpretation and use. To maximize their clinical utility, RCIp data are presented separately for participants who were retested within 9–15 months and for those retested within 16–24 months of their baseline evaluation. We chose to divide the sample in this fashion to provide clinicians with separate normative data for patients who return for follow-up testing close to one year after baseline and for patients who return after a longer test-retest interval.
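The calculation described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: it assumes the unequal-variance standard error SEdiff = sqrt((S1² + S2²)(1 − rxy)) in the form given by Iverson et al. (2003), the function name `rcip_interval` and the input values are hypothetical, and the bounds are returned unrounded so that a rounding rule can be applied separately.

```python
import math

def rcip_interval(sd_baseline, sd_followup, r_xy, mean_practice, z=1.64):
    """Return (lower, upper) bounds of the practice-adjusted 90% prediction interval."""
    # Standard error of the difference, allowing unequal variances (assumed form)
    se_diff = math.sqrt((sd_baseline**2 + sd_followup**2) * (1.0 - r_xy))
    half_width = z * se_diff
    # Center the interval on the mean practice effect instead of zero
    return mean_practice - half_width, mean_practice + half_width

# Hypothetical inputs, not values from the study
lower, upper = rcip_interval(sd_baseline=4.0, sd_followup=3.0, r_xy=0.75, mean_practice=1.3)
```

Under these assumptions, a change score below `lower` would count as reliable decline and one above `upper` as reliable improvement at the 90% level.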

Recent population-based studies have found that older African Americans experience a more rapid rate of cognitive and functional decline than European Americans, even when sociodemographic factors are controlled statistically (Black & Rush, 2002; Moody-Ayers et al., 2005). In light of this literature, we derived reliable change values for the full normative sample and separately for European American and African American participants.

All data were obtained in full compliance with study protocols approved by the Mayo Clinic Institutional Review Board.

RESULTS

As shown in Table 1, European American participants were on average significantly older and more educated, and included a greater proportion of men. They were retested an average of 2 months later than their African American counterparts. There was no significant group difference in family history of dementia.

Table 1. Demographic characteristics and Dementia Rating Scale (DRS) scores

European American participants obtained significantly higher mean DRS scores than African Americans at baseline and at follow-up assessment. However, mean DRS practice effects were not significantly different between European American (M = 1.3, SD = 4.7) and African American participants (M = 1.7, SD = 6.1). Moreover, mean within-group DRS scores at each time point were in the average clinical range (scaled scores = 10) when using respective published norms. Mean DRS practice effects were not significantly different between men (M = 1.3, SD = 5.2) and women (M = 1.4, SD = 4.5), and were not associated with age (r = −.01, p = .72), years of education (r = −.01, p = .87), or family history of dementia (rpb = .05, p = .09).

The mean DRS scores for the clinical comparison sample were 130.5 (SD = 7.7, range: 113–139) at baseline and 117.7 (SD = 16.7, range: 77–142) at follow-up testing. The mean difference in test-retest scores (M = −12.7, SD = 13.8) was significantly different (t(21.1) = 4.77, p < .001, adjusted for unequal variances) when compared to the mean test-retest difference obtained by participants with normal cognition (M = 1.3, SD = 5.0). When examining only those who obtained a follow-up diagnosis of MCI/AD, their mean test-retest difference (M = −8.5, SD = 9.1) was also significantly different (t(14.12) = 4.21, p < .01, adjusted for unequal variances) compared to participants with normal cognition.
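The first unequal-variance comparison reported above can be approximately reproduced from the summary statistics alone. The sketch below assumes a Welch t-test with Satterthwaite degrees of freedom (the text says only that the test was adjusted for unequal variances); the function name `welch_from_summary` is hypothetical.

```python
import math

def welch_from_summary(m1, sd1, n1, m2, sd2, n2):
    """Welch t statistic and Satterthwaite df from group means, SDs, and sizes."""
    v1, v2 = sd1**2 / n1, sd2**2 / n2  # squared standard error of each group mean
    t = (m1 - m2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df

# Normal cognition (M = 1.3, SD = 5.0, n = 1080) vs.
# clinical comparison sample (M = -12.7, SD = 13.8, n = 22)
t, df = welch_from_summary(1.3, 5.0, 1080, -12.7, 13.8, 22)
```

With these rounded inputs the statistic comes out near 4.75 on about 21.1 degrees of freedom, close to the reported t(21.1) = 4.77; the small gap is plausibly due to rounding of the published means and standard deviations.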

Reliable change results are presented in Table 2. Among all participants, a decline of at least 7 points or an improvement of at least 10 points constituted reliable change at each test-retest interval. Within the 9–15 month test-retest interval, a decline of at least 6 points or an improvement of at least 9 points constituted reliable change for European American participants, whereas a decline of at least 9 points or an improvement of at least 12 points constituted reliable change for African Americans. Within the 16–24 month test-retest interval, a decline of at least 7 points or an improvement of at least 10 points constituted reliable change for European Americans, whereas a decline of at least 8 points or an improvement of at least 14 points constituted reliable change for African Americans. Reliable change data were largely comparable between those with and without a family history of dementia.
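As an illustration of how the group- and interval-specific cutoffs reported above might be applied to an individual examinee, the following sketch encodes them as a lookup table. The cutoff values are taken from the text; the names `CUTOFFS` and `classify_change`, the interval labels, and the example scores are hypothetical.

```python
# (points of decline, points of improvement) that constitute reliable change,
# keyed by group and test-retest interval in months, as reported in the text
CUTOFFS = {
    ("European American", "9-15"): (6, 9),
    ("African American", "9-15"): (9, 12),
    ("European American", "16-24"): (7, 10),
    ("African American", "16-24"): (8, 14),
}

def classify_change(baseline, followup, group, interval):
    """Label a DRS change score relative to the reported RCIp cutoffs."""
    decline, improve = CUTOFFS[(group, interval)]
    change = followup - baseline
    if change <= -decline:
        return "reliable decline"
    if change >= improve:
        return "reliable improvement"
    return "no reliable change"

# Hypothetical examinee: 138 at baseline, 131 at a 12-month retest
label = classify_change(138, 131, "European American", "9-15")
```

In this sketch the same 7-point drop would be labeled "no reliable change" for an African American examinee retested within 9–15 months, because the corresponding cutoff is 9 points.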

Table 2. Reliable change indices

As applied above, RCI values define the boundaries of statistically significant change for individual test scores. RCI values can also be used to estimate the frequency of individual follow-up scores that should fall above or below the adjusted prediction interval: approximately 5% of follow-up scores are expected to fall above the 90% P.I. and 5% below it. Across the 9–15 month test-retest interval, 22 (5.2%) cognitively normal European American adults declined by at least 6 points and 14 (3.3%) improved by 9 points or more. During the same time interval, 4 (2.2%) cognitively normal African American adults declined by at least 9 points and 11 (6.2%) improved by more than 12 points. Within the 16–24 month interval, 20 (4.5%) cognitively normal European American adults declined by at least 7 points, whereas 22 (4.9%) improved by more than 10 points. Finally, among cognitively normal African American adults, 1 (3.1%) declined by at least 8 points and 1 (3.1%) improved by more than 14 points.
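The comparison of observed and expected tail frequencies can be sketched as follows. The function name `tail_rates` and the change scores are hypothetical and are not study data; the sketch only shows how the percentage of scores at or beyond each cutoff might be tallied for comparison with the nominal 5% per tail.

```python
def tail_rates(changes, decline_cutoff, improve_cutoff):
    """Percent of change scores at or beyond the decline and improvement cutoffs."""
    n = len(changes)
    declined = sum(1 for c in changes if c <= -decline_cutoff)
    improved = sum(1 for c in changes if c >= improve_cutoff)
    return 100.0 * declined / n, 100.0 * improved / n

# Hypothetical change scores (follow-up minus baseline)
changes = [-6, -3, 0, 1, 1, 2, 2, 3, 5, 9]
pct_decline, pct_improve = tail_rates(changes, decline_cutoff=6, improve_cutoff=9)
```

In a large normative sample, both percentages would be expected to sit near 5% if the prediction interval is well calibrated.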

In contrast, 16 out of 22 clinical comparison participants (72.7%) declined beyond the respective RCIp cutoff values. No one in the clinical comparison sample showed reliable improvement. When examining only those clinical participants who received a diagnosis of MCI/AD upon follow-up evaluation, 10 out of 15 (66.7%) showed reliable decline, whereas the remainder did not show reliable change.

DISCUSSION

This study sought to improve measurement precision when the Dementia Rating Scale is used to evaluate change in global cognitive functioning. DRS change scores were not significantly associated with age, sex, years of education, or family history of dementia. Among European American older adults, a 6-point decline in DRS scores over a 9–15 month period or a 7-point decline over a 16–24 month period represents statistically significant change. Among African American older adults, a 9-point decline over a 9–15 month period or an 8-point decline over a 16–24 month period represents statistically significant change. The subtle difference in RCIp values between European American and African American participants may be attributed to greater variability in DRS performance by African Americans, which in turn resulted in wider prediction intervals.

Few reliable change studies on normative samples have attempted to cross-validate the obtained results. Using the Wechsler Adult Intelligence Scale, Heaton et al. (2001) showed that RCI values obtained from a sample of 384 nonclinical adult participants did not generalize well to a new sample of 69 stable patients diagnosed with schizophrenia, resulting in larger-than-expected classification errors. The authors suggested that norms for change may generalize better if the two samples demonstrate similar baseline test performance. In an effort to provide preliminary cross-validation of the obtained DRS reliable change indices, we examined change scores in a sample of 22 participants who were initially classified as having normal cognition, but received a clinical diagnosis on follow-up consensus evaluation. In contrast to the proportion of normal cognition participants who demonstrated reliable decline (2.2% to 5.2%), approximately 73% of the clinical comparison sample showed statistically significant decline, thus lending support to the generalizability of the current RCIp criteria for the DRS.

By defining the boundaries of normal chance fluctuation, we can identify with greater precision those individuals whose change scores may be worrisome for progressive cognitive decline. In the context of clinical evaluations, such information can guide test interpretation and diagnostic decisions. In the context of research evaluations, these findings may be used to establish trigger points whenever the DRS is used as a screening measure. For example, a reliable decline on the DRS could “trigger” a comprehensive assessment of that particular research participant.

It is important to note that the DRS reliable change estimates obtained in this study were calculated based on change scores from baseline to first follow-up visit. Thus, it is unclear whether these values will remain valid across various longitudinal time points. Nevertheless, these data may assist clinicians and researchers in determining whether an individual's change on the DRS from baseline to first follow-up represents a reliable indicator of cognitive decline or improvement.

ACKNOWLEDGMENTS

This study was supported by a grant to the first author from the Robert and Clarice Smith Fellowship Program and an NIH Supplement to the Mayo Clinic ADRC (NIA 3P50 AG016574-07S1). The study authors do not have any sources, financial or otherwise, that could result in a conflict of interest pertaining to this manuscript.

REFERENCES

Black, S.A. & Rush, R.D. (2002). Cognitive and functional decline in adults aged 75 and older. Journal of the American Geriatrics Society, 50, 1978–1986.
Heaton, R.K., Temkin, N., Dikmen, S., Avitable, N., Taylor, M.J., Marcotte, T.D., & Grant, I. (2001). Detecting change: A comparison of three neuropsychological methods, using normal and clinical samples. Archives of Clinical Neuropsychology, 16, 75–91.
Hinton-Bayre, A. (2000). Reliable change formula query. Journal of the International Neuropsychological Society, 6, 362–363.
Hinton-Bayre, A. (2004). Holding out for a reliable change from confusion to a solution: A comment on Maassen's “The standard error in the Jacobson and Truax Reliable Change Index.” Journal of the International Neuropsychological Society, 10, 894–898.
Hinton-Bayre, A.D. (2005a). Methodology is more important than statistics when determining reliable change. Journal of the International Neuropsychological Society, 11, 788–789.
Hinton-Bayre, A.D. (2005b). Commentary on Maassen's “Reliable change assessment in sport concussion research.” British Journal of Sports Medicine, 39, 488–489.
Iverson, G.L., Lovell, M.R., & Collins, M.W. (2003). Interpreting change on ImPACT following sport concussion. The Clinical Neuropsychologist, 17, 460–467.
Ivnik, R.J., Malec, J.F., Tangalos, E.G., Petersen, R.C., Kokmen, E., & Kurland, L.T. (1990). The Auditory Verbal Learning Test (AVLT): Norms of ages 55 and older. Psychological Assessment, 2, 304–312.
Jurica, P.J., Leitten, C.L., & Mattis, S. (2001). Dementia Rating Scale-2: Professional Manual. Lutz, FL: Psychological Assessment Resources.
Lucas, J.A., Ivnik, R.J., Smith, G.E., Bohac, D.L., Tangalos, E.G., Graff-Radford, N.R., & Petersen, R.C. (1998). Normative data for the Mattis Dementia Rating Scale. Journal of Clinical and Experimental Neuropsychology, 20, 536–547.
Lucas, J.A., Ivnik, R.J., Smith, G.E., Ferman, T.J., Willis, F.B., Petersen, R.C., & Graff-Radford, N.R. (2005). Mayo's Older African American Normative Studies: Normative data for commonly used clinical neuropsychological measures. The Clinical Neuropsychologist, 19, 162–183.
Maassen, G.H. (2004a). The standard error in the Jacobson and Truax Reliable Change Index: The classical approach to the assessment of reliable change. Journal of the International Neuropsychological Society, 10, 888–893.
Maassen, G.H. (2004b). What do Temkin's simulations of reliable change tell us? Journal of the International Neuropsychological Society, 10, 902–903.
Maassen, G.H. (2005). Reliable change assessment in sport concussion research: A comment on the proposal and reviews of Collie et al. British Journal of Sports Medicine, 39, 483–488.
Maassen, G.H., Bossema, E.R., & Brand, N. (2006). Reliable change assessment with practice effects in sport concussion research: A comment on Hinton-Bayre. British Journal of Sports Medicine, 40, 829–833.
McGlinchey, J.B. & Jacobson, N.S. (1999). Clinically significant but impractical? A response to Hageman and Arrindell. Behaviour Research and Therapy, 37, 1211–1217.
Moody-Ayers, S.Y., Mehta, K.M., Lindquist, K., Sands, L., & Covinsky, K.E. (2005). Black–White disparities in functional decline in older persons: The role of cognitive function. Journal of Gerontology: Medical Sciences, 60A, 933–939.
Rilling, L.M., Lucas, J.A., Ivnik, R.J., Smith, G.E., Willis, F.B., Ferman, T.J., Petersen, R.C., & Graff-Radford, N.R. (2005). Mayo's Older African American Normative Studies: Norms for the Mattis Dementia Rating Scale. The Clinical Neuropsychologist, 19, 229–242.
Temkin, N.R. (2004). Standard error in the Jacobson and Truax Reliable Change Index: The “classical” approach leads to poor estimates. Journal of the International Neuropsychological Society, 10, 899–901.
Temkin, N.R., Heaton, R.K., Grant, I., & Dikmen, S.S. (1999). Detecting significant change in neuropsychological test performance: A comparison of four models. Journal of the International Neuropsychological Society, 5, 357–369.