Introduction
Since its inception in 1999 (Petersen et al., 1999), the concept of mild cognitive impairment (MCI) has evolved. Core clinical criteria for MCI include cognitive impairment in one or more domains compared to appropriate normative data with a suggested deficit level of 1.0–1.5 standard deviations (SD) below normative expectations (Albert et al., 2011). Although current criteria provide guidance on an operational definition of cognitive impairment in MCI, the literature reveals great variability in how MCI has been defined. Conventional methods of diagnosing MCI relying on rating scales and cognitive screening have significant limitations (Smith & Bondi, 2013); slight alterations to the operational criteria for neuropsychological (NP) impairment in MCI can result in anywhere from 10 to 74% of samples being identified as MCI (Jak, Bondi, et al., 2009).
Heavy reliance on a single neuropsychological test as a marker of cognitive impairment is potentially problematic and holds limited interpretive value. A single impaired score within a neuropsychological battery is common in neurologically normal adults (Heaton, Miller, Taylor, & Grant, 2004); 26% of the standardization sample of older adults for the Wechsler Memory Scale-III (WMS-III) had one or more impaired memory scores (≥1.5 SDs below the mean) (Brooks, Iverson, Holdnack, & Feldman, 2008). MCI criteria in Parkinson’s disease have also moved toward comprehensive neuropsychological assessments and inclusion of multiple tests within or at least across cognitive domains for improved diagnostic accuracy (Litvan et al., 2012).
Actuarial neuropsychological criteria have expanding support in the literature for improving diagnostic rigor for MCI (Bondi & Smith, 2014). The specific criteria we have proposed (Jak, Bondi, et al., 2009) require at least two impaired scores (>1 SD below normative expectations) within a cognitive domain and have resulted in MCI diagnoses that better map onto biomarkers for Alzheimer’s disease (AD) and are less prone to false positive errors than conventional approaches (Bondi & Smith, 2014; Clark et al., 2013; Jak, Urban, et al., 2009).
Specifically, the expected smaller hippocampal volumes were found in a community sample of older adults identified as MCI via this actuarial approach when compared to healthy control participants; no hippocampal differences were noted between groups when more conventional Petersen/Winblad criteria (impairment on one neuropsychological score defined as at least 1.5 SD below normative expectations) were applied (Jak, Urban, et al., 2009). Also in a community-based sample, Clark et al. (2013) found that, of those classified as MCI based on conventional Petersen/Winblad criteria, nearly half (48%) performed within normative expectations, on average, despite at least one low score. In contrast, those classified as MCI based on the actuarial neuropsychological approach clustered into four distinct groups based on consistent patterns of neuropsychological deficits, with no cluster-derived normal group (Clark et al., 2013).
In the much larger Alzheimer’s Disease Neuroimaging Initiative (ADNI) sample, Edmonds et al. (2014) also found a “cluster-derived normal” group in which 34% of the sample conventionally characterized as MCI [subjective memory complaint, Mini-Mental State Examination scores ≥24, CDR global score=0.5, impaired Wechsler Memory Scale-Revised (WMS-R) Logical Memory II subtest, not demented] actually performed within normal limits on all other neuropsychological testing. Also in the ADNI sample, Bondi et al. (2014) found that individuals classified by the actuarial neuropsychological method were more likely to have stable MCI (less than 1% reverted to normal), were more likely to progress to dementia, and showed greater correspondence with AD biomarkers [APOE ε4, cerebrospinal fluid (CSF) hyperphosphorylated tau, β-amyloid] than did those classified by conventional criteria.
Research and intervention efforts focus prominently on identification of “early” MCI or preclinical AD (Sperling et al., 2011). Identifying cognitive criteria that best predict development of dementia is important to more effectively target those individuals for early intervention and to ensure a consistently and accurately characterized population for clinical trials. However, reducing false positive errors is also of utmost importance given the psychological burden of receiving an MCI diagnosis. Therefore, to continue to refine diagnostic and prediction models, we examined progression to dementia in the Framingham Heart Study (FHS) via two diagnostic approaches to MCI: the conventional Petersen/Winblad criteria and the Jak/Bondi actuarial neuropsychological method.
Methods
Participants
The FHS is a longitudinal community study, ongoing since 1948, that includes serial exams of the Original cohort and, since 1971, the Offspring cohort (Feinleib, Kannel, Garrison, McNamara, & Castelli, 1975). Surveillance was extended to the Generation 2 cohort in 1991 and is a separately National Institutes of Health-funded initiative until mid-2016. A total of 2551 participants attended the 7th Offspring examination and completed neuropsychological testing (data collected between 1999 and 2005). Participants were followed from the time of neuropsychological testing until development of dementia, death, or December 31, 2013, whichever occurred first. Diagnosis of dementia in the Framingham Offspring Study has been previously described (Chene et al., 2015). In brief, participants with clear or questionable deficits in one or more cognitive domains and/or with dementia severity rated as mild, moderate, or severe are brought to a dementia diagnostic review meeting.
A team of at least one neurologist and one neuropsychologist reviews all available information, including the neurologic and neuropsychological assessments, results from family interview(s), medical and FHS records, and CT/MRI reports, to reach a consensus as to whether Diagnostic and Statistical Manual of Mental Disorders, 4th edition, criteria for dementia are fulfilled and to determine the specific diagnosis. Diagnostic criteria for AD are based on current NIA-AA criteria (McKhann et al., 2011), criteria for vascular dementia are based upon NINDS-AIREN criteria (Roman et al., 1993), and criteria for other types of dementia such as Lewy body disease (which includes PD dementia) and frontotemporal dementia are also carefully specified by published criteria (McKeith, Perry, & Perry, 1999; Miller et al., 1997).
Base rates of MCI (2.4–4.2%) and dementia (0.25–0.40%) in individuals in their early 60s are extremely low (Anstey et al., 2008; Hänninen, Hallikainen, Tuomainen, Vanhanen, & Soininen, 2002); therefore, to focus on MCI and conversion to dementia, we excluded 1129 participants <60 years of age at the time of their examination. We additionally excluded participants for the following reasons: prevalent dementia (N=30), prevalent stroke (N=46), missing education and/or neuropsychological test data, and/or lack of follow-up for subsequent dementia (N=160), resulting in a sample size of 1203 for the present analysis. All participants were aged 60 or older (mean age=68.5; SD=5.7) and Caucasian; 52% were female, and 96% had a high school education or greater (32.4% ≥college degree, 30.4% some college, 32.7% high school degree, 4.5% <high school degree).
Neuropsychological Test Battery
As part of their larger FHS assessment, participants underwent a neuropsychological assessment that tapped major cognitive domains but was limited in overlapping tests because of study-prescribed time constraints (Au et al., 2004). Within this battery, three cognitive domains contained at least two tests, as is necessary for MCI classification (see below). Memory was determined by performance on the WMS Logical Memory Test delayed and recognition scores and the WMS Visual Reproduction Test delayed recall and recognition scores (Wechsler, 1987). Executive Functioning/Attention/Processing Speed was measured by Trail Making Tests A and B (Reitan & Wolfson, 1985). Language tests included the Boston Naming Test (Kaplan, Goodglass, & Weintraub, 1983) and Similarities from the Wechsler Adult Intelligence Scale (Wechsler, 1981). Normative cutoffs were determined by age/education referenced scores.
MCI Classification
Participants were classified at baseline as cognitively normal or MCI according to two different criteria sets that varied the cutoffs for impairment and number of tests required to be in the impaired range. Conventional Petersen/Winblad criteria operationalized impairment as performance on a single cognitive test within a domain greater than 1.5 SD below normative expectations. Jak/Bondi criteria operationalized impairment as performance falling greater than 1 SD below normative expectations on both tests within a cognitive domain. Under both criteria sets, participants were also classified into MCI subtypes: Single Domain Amnestic if only the memory domain was impaired, Single Domain Non-Amnestic if the memory domain was not impaired and only one non-memory domain was impaired, Multiple Domain Amnestic if memory and at least one other domain showed impairment, and Multiple Domain Non-Amnestic if the memory domain was not impaired and more than one non-memory domain was impaired. Because diagnostic classifications were psychometrically determined and inclusion of subjective memory complaints in diagnosis of MCI corresponds to elevated misclassification rates (Edmonds et al., 2014), there was no requirement of subjective memory/cognitive complaint for either the Petersen/Winblad or Jak/Bondi classifications.
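The two decision rules and the subtype assignment described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the study's SAS code: the function names, the domain labels, and the example z-scores are hypothetical, and the Jak/Bondi rule is written as "at least two scores more than 1 SD below norms", which is equivalent to "both tests" only when a domain has exactly two scores.

```python
from typing import Dict, List

# Hypothetical input: z-scores per domain, already adjusted for age and education.
Scores = Dict[str, List[float]]

def impaired_petersen_winblad(z: Scores, cutoff: float = -1.5) -> List[str]:
    """A domain counts as impaired if any single test is more than 1.5 SD below norms."""
    return [d for d, s in z.items() if any(v < cutoff for v in s)]

def impaired_jak_bondi(z: Scores, cutoff: float = -1.0) -> List[str]:
    """A domain counts as impaired if at least two tests are more than 1 SD below norms."""
    return [d for d, s in z.items() if sum(v < cutoff for v in s) >= 2]

def subtype(impaired: List[str], memory: str = "memory") -> str:
    """Map the list of impaired domains onto the four MCI subtypes (or normal)."""
    if not impaired:
        return "Cognitively normal"
    span = "Single Domain" if len(impaired) == 1 else "Multiple Domain"
    kind = "Amnestic" if memory in impaired else "Non-Amnestic"
    return f"{span} {kind} MCI"

# Made-up participant: two mildly low memory scores, one isolated low Trails score.
participant = {
    "memory": [-1.2, -1.3],
    "executive": [-0.4, -1.6],
    "language": [0.1, -0.2],
}
print(subtype(impaired_petersen_winblad(participant)))
print(subtype(impaired_jak_bondi(participant)))
```

In this made-up case the two rules disagree: the single-test rule flags only the executive domain, whereas the two-test rule flags only memory.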
Statistical Analysis
For each neuropsychological test, age and education group (<high school degree, high school degree, some college, ≥college degree) adjusted residuals were calculated to determine normative cutoff values used in each MCI definition. Cox proportional hazards regression was used to calculate hazard ratios (HR) and 95% confidence intervals (CI) for the association of each set of MCI criteria with occurrence of subsequent dementia, adjusted for baseline covariates of age, sex, and education group. Three multivariable models were constructed: model 1 contained baseline covariates plus the Petersen/Winblad MCI definition; model 2 contained baseline covariates plus the Jak/Bondi MCI definition; model 3 contained baseline covariates plus both Petersen/Winblad and Jak/Bondi MCI definitions. Each MCI definition was categorized as overall MCI (yes vs. no) and by using indicator variables for each MCI subtype (with no overall MCI as the referent group). C-statistics were calculated for multivariable models 1–3 and for a baseline model containing age at NP assessment, sex, and education group. The p-value for the change in C-statistic when each MCI definition was added to the baseline model was calculated using a Z-test.
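One way to picture the age- and education-adjusted residuals is as standardized residuals from a linear regression of each test score on age and education-group indicators. The sketch below assumes exactly that model form, fitted by ordinary least squares in plain Python on synthetic data; the study itself used SAS and does not spell out the model, so the function names, group labels, and data here are all hypothetical.

```python
from typing import List, Sequence, Tuple

EDU_GROUPS = ["<HS", "HS", "some college", ">=college"]  # first level is the reference

def design_row(age: float, edu: str) -> List[float]:
    """Intercept, age, and indicators for the three non-reference education groups."""
    return [1.0, age] + [1.0 if edu == g else 0.0 for g in EDU_GROUPS[1:]]

def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def fit_norms(ages: Sequence[float], edus: Sequence[str],
              scores: Sequence[float]) -> Tuple[List[float], float]:
    """Return OLS coefficients and the residual SD (via the normal equations)."""
    X = [design_row(a, e) for a, e in zip(ages, edus)]
    p = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(X, scores)) for i in range(p)]
    beta = solve(xtx, xty)
    resid = [y - sum(b * v for b, v in zip(beta, r)) for r, y in zip(X, scores)]
    sd = (sum(e * e for e in resid) / (len(resid) - p)) ** 0.5
    return beta, sd

def z_score(beta: List[float], sd: float, age: float, edu: str, score: float) -> float:
    """Standardized residual: observed score minus expected score, in residual-SD units."""
    expected = sum(b * v for b, v in zip(beta, design_row(age, edu)))
    return (score - expected) / sd

# Synthetic normative sample (12 people, all four education groups represented).
ages = [60 + 2 * i for i in range(12)]
edus = [EDU_GROUPS[i % 4] for i in range(12)]
noise = [0.5, -0.2, 0.1, -0.4, -0.3, 0.4, -0.1, 0.2, 0.1, -0.5, 0.3, -0.1]
scores = [30 - 0.2 * a + EDU_GROUPS.index(e) + n for a, e, n in zip(ages, edus, noise)]

beta, sd = fit_norms(ages, edus, scores)
print(z_score(beta, sd, 70, "HS", 12.0))  # a score far below expectation gives a negative z
```

A z-score computed this way can then be compared against the −1.0 or −1.5 SD cutoffs used by the two MCI definitions.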
Additionally, we examined stability of each MCI definition among a subset of 792 participants who underwent repeat neuropsychological testing at exam 8, a mean of 6.5±1.2 years after their baseline assessment at exam 7. The beta coefficients derived from the residual calculation at exam 7 were applied to each participant’s exam 8 test scores to determine MCI status at follow-up. A cross-tabulation was performed comparing participants’ MCI status at exams 7 and 8. All statistical analysis was performed using SAS version 9.4 (Cary, NC). A p-value of <.05 was considered statistically significant.
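The cross-tabulation of baseline against follow-up status can be sketched as follows. The sketch assumes that MCI status at both exams has already been derived (with the exam 7 coefficients applied to exam 8 scores, as described above); the function names and the five-person example are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple

LABEL = {True: "MCI", False: "Normal"}

def transition_table(baseline: List[bool], follow_up: List[bool]) -> Dict[Tuple[str, str], int]:
    """Count participants in each (baseline status, follow-up status) cell."""
    return dict(Counter((LABEL[b], LABEL[f]) for b, f in zip(baseline, follow_up)))

def summarize(table: Dict[Tuple[str, str], int]) -> Dict[str, float]:
    """Proportions that stayed in the same category, reverted, or newly met MCI criteria."""
    total = sum(table.values())
    stable = table.get(("MCI", "MCI"), 0) + table.get(("Normal", "Normal"), 0)
    return {
        "stable": stable / total,
        "reverted": table.get(("MCI", "Normal"), 0) / total,
        "new_mci": table.get(("Normal", "MCI"), 0) / total,
    }

# Made-up statuses for five participants at the two exams.
exam7 = [True, True, False, False, False]
exam8 = [True, False, False, True, False]
table = transition_table(exam7, exam8)
print(table)
print(summarize(table))
```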
This study was approved by the Institutional Review Board at Boston University, informed consent was obtained from all subjects, and human data included in this manuscript were obtained in compliance with the Helsinki Declaration.
Results
In the baseline study sample (N=1203), 410 participants (34.1%) were identified as having MCI via Petersen/Winblad criteria and 283 participants (23.5%) were identified as MCI via Jak/Bondi criteria (see Table 1). A total of 742 participants (61.7%) were classified as non-MCI by both the Petersen/Winblad and the Jak/Bondi criteria, and 232 participants (19.3%) were classified as having MCI by both criteria. A total of 178 participants (14.8%) were classified as MCI by Petersen/Winblad but non-MCI by Jak/Bondi. Fifty-one participants (4.2%) were classified as having MCI by the Jak/Bondi criteria but non-MCI by Petersen/Winblad.
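As a quick arithmetic check, the four cross-classification cells reported above should reproduce both the sample size and the marginal MCI counts for each criteria set. The short Python sketch below uses only the counts stated in the text; the variable names are illustrative.

```python
# Cell counts reported in the text for the 2x2 cross-classification.
both_normal = 742   # non-MCI by both criteria
both_mci = 232      # MCI by both criteria
pw_only = 178       # MCI by Petersen/Winblad only
jb_only = 51        # MCI by Jak/Bondi only

n = both_normal + both_mci + pw_only + jb_only
pw_total = both_mci + pw_only   # all Petersen/Winblad MCI cases
jb_total = both_mci + jb_only   # all Jak/Bondi MCI cases

def pct(k: int, total: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100 * k / total, 1)

print(n, pw_total, jb_total)
print(pct(pw_total, n), pct(jb_total, n))
```

The cells sum to the analytic sample of 1203, and the marginals match the 410 and 283 participants reported for the two definitions.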
MCI=mild cognitive impairment; CI=confidence interval.
a MCI definition defined using participants ≥60 years at exam 7 NP battery.
b Multivariable= age at NP, sex, and education group (<high school degree, high school degree, some college, ≥college degree).
c The C-statistic for the multivariable-only model (age, sex, education group) is 0.779 (95% CI [0.721, 0.836]). None of the MCI criteria resulted in any statistically significant improvement in the C-statistic (p-value=.99) as compared to the multivariable-only model.
d Unable to estimate the hazard ratio because no dementia cases occurred in this subgroup.
Drawing from exam 7 data, 58 (4.8%) participants developed incident dementia over a mean follow-up period of 9.6±3.3 years. For the overall MCI diagnostic groups, in models 1 and 2, the association with dementia was similar (Petersen/Winblad: HR=2.64; p-value=.0002; Jak/Bondi: HR=3.30; p-value <.0001). When the Petersen/Winblad and Jak/Bondi MCI criteria were included in the same model (model 3), the HRs for both definitions were attenuated, and only the Jak/Bondi definition remained statistically significant (Petersen/Winblad: HR=1.58; p-value=.18; Jak/Bondi: HR=2.47; p-value=.008). Addition of either or both the Petersen/Winblad and Jak/Bondi MCI criteria to a baseline model containing age, sex, and education group did not result in any statistically significant increase in the C-statistic (p-value=.99).
When examining MCI subtypes, multi-domain amnestic MCI was significantly associated with incident dementia for both diagnostic approaches (Petersen/Winblad, HR=4.73; p-value <.0001; Jak/Bondi, HR=8.50; p-value <.02). Single-domain non-amnestic MCI was associated with dementia only for the Jak/Bondi definition (HR=2.55; p-value=.005). There was no association between multi-domain non-amnestic MCI and incident dementia for the Petersen/Winblad definition. We were unable to assess the association for the Jak/Bondi definition due to the lack of dementia cases in this subgroup. When the Petersen/Winblad and Jak/Bondi MCI subtypes were included in the same model (model 3), the HRs were attenuated but remained statistically significant for only multi-domain amnestic MCI as defined by the Jak/Bondi criteria (HR=5.98; p-value=.0003). Addition of either or both the Petersen/Winblad and Jak/Bondi MCI subtype definitions did not result in any statistically significant increase in the C-statistic (p-value=.99) relative to the baseline model (age, sex, and education group).
Repeat neuropsychological testing at exam 8 was available for 792 (66%) participants and occurred a mean of 6.5±1.2 years after baseline testing (exam 7). For the Petersen/Winblad criteria, a total of 564 participants (71.4%) remained stable (classified as MCI at both initial and follow-up or were cognitively normal at both assessments). A total of 107 (13.5%) participants reverted from MCI to normal at follow-up, and 119 (15.0%) transitioned from normal to MCI. Applying the Jak/Bondi criteria, 614 (77.5%) participants were stable over the follow-up interval, 90 (11.4%) reverted from MCI to normal, 86 (10.9%) progressed from normal to MCI, and 2 (0.25%) progressed from MCI to dementia (see Table 2).
a A total of 411 participants (n=241 without MCI at exam 7 and n=170 with MCI at exam 7) did not attend exam 8 and have been excluded from the table. The mean time between exams 7 and 8 is 6.5±1.2 years.
b A total of 411 participants (n=280 without MCI at exam 7 and n=131 with MCI at exam 7) did not attend exam 8 and have been excluded from the table. The mean time between exams 7 and 8 is 6.5±1.2 years.
Discussion
In a large community cohort of older adults, we observed that the Petersen/Winblad criteria identified approximately one third of the sample as MCI while the Jak/Bondi criteria classified 24% of the sample as MCI. Despite the difference in the proportion of participants classified as MCI, each MCI definition had a similar magnitude of association with incident dementia. However, only the Jak/Bondi MCI definition remained statistically significantly associated with incident dementia when both diagnostic approaches were included in the same model. The Petersen/Winblad criteria may be over-inclusive, resulting in a high rate of false positive diagnostic errors. Impairment on a single test score has been found previously to inflate (likely artificially) MCI prevalence and reduce specificity (Loewenstein et al., 2009; Trittschuh et al., 2011). Using cluster analytic statistical methods, Bondi et al. (2014), Edmonds et al. (2014), and Clark et al. (2013) all found “cluster-derived normal” subgroups despite their MCI diagnoses when using conventional Petersen/Winblad approaches, further highlighting the tendency for false positive errors when using this conventional diagnostic approach.
Association with incident dementia was similar for both overall MCI diagnostic groups, each having an approximately threefold greater risk of progressing to dementia. This is consistent with others who have found that progression rates are ultimately similar even when components of the diagnostic approach are varied (Fisk, Merry, & Rockwood, 2003). However, when the Petersen/Winblad and Jak/Bondi MCI definitions were included in the same model, only the Jak/Bondi definition remained statistically significantly associated with incident dementia.
The strength of association between each subtype of MCI and incident dementia did not vary substantially by diagnostic approach. Progression to dementia by the single domain amnestic subtype was similar for both criteria (Petersen/Winblad HR=1.99; p-value=.06; Jak/Bondi HR=2.55; p-value=.005). Progression to dementia by the single domain non-amnestic subtype was also similar for both criteria (Petersen/Winblad HR=2.50; p-value=.02; Jak/Bondi HR=2.56; p-value=.05). Multi-domain amnestic presentations had overall similar rates of progression to dementia regardless of MCI diagnostic approach, although the strength of the association was notably higher with the Jak/Bondi criteria (8.5 times greater risk of progression to dementia) than with the Petersen/Winblad criteria (4.7 times greater risk). Unlike prior studies highlighting the instability of the single domain non-amnestic MCI subtype (Ganguli et al., 2011), we found an association between single domain non-amnestic MCI and incident dementia for both criteria.
A significant strength of the present study is the use of longitudinal data spanning nearly a decade on a well characterized community sample. While legacy epidemiological data are rich in many regards, there were some limitations of the neuropsychological battery that should be acknowledged. Creation of a visuospatial domain was not possible. While this domain of functioning would have optimally been included, visuospatial functioning is the last cognitive domain, except in the case of Lewy body dementia, to be impaired in the prototypical progression of cognitive deficits associated with MCI and AD (Salmon & Bondi, 2009). Our own prior work has also revealed only a very small isolated visuospatial subtype (19 of 197 subjects had a very mild −1.0 SD deficit on block design), and conventional MCI criteria failed to identify a visuospatial subtype (Clark et al., 2013).
Similarly, inclusion of Trail Making Tests A and B in the executive functioning/attention/processing speed domain was necessitated by the limited legacy data. While a broader base of tests in this domain would have been ideal, several other published studies that use Trails A and B as a psychomotor speed/executive function domain demonstrate robust cognitive subtypes with this measurement strategy (Bondi et al., 2014; Edmonds, Delano-Wood, Clark, et al., 2015; Edmonds, Delano-Wood, Galasko, Salmon, & Bondi, 2015; Edmonds et al., 2014).
Additional study limitations were that participants were predominantly white and generalizability to other groups may be limited. Positron emission tomography or CSF biomarker information was not available for all participants; therefore, neurobiological factors were not available to supplement/support the progression data. We were not able to perform a direct statistical comparison of the individual MCI subtypes as defined by the Petersen/Winblad and Jak/Bondi criteria due to lack of a common referent group. The number of incident dementia cases and the numbers of participants for some MCI subtypes were small; thus, results for MCI subtypes should be interpreted with caution. The small number of overall dementia cases also precluded examination of progression specifically to Alzheimer’s dementia. It is possible that a larger number of incident cases, particularly of Alzheimer’s dementia, would reveal a relationship between single domain amnestic MCI and dementia. Future studies with even longer periods of follow-up will be critical. Finally, selection bias may have impacted data at exam 8. That is, we only had repeat neuropsychological testing on 66% of our initial sample; thus, our estimates of diagnostic stability may be affected by bias. It is also likely that some of those who developed dementia in the follow-up interval were lost to follow-up, reducing statistical power. Results may have been more robust in the absence of these biases.
In conclusion, the Petersen/Winblad and Jak/Bondi criteria resulted in a similar strength of association with incident dementia, despite the Jak/Bondi criteria classifying approximately 30% fewer participants as having MCI. The Petersen/Winblad criteria may be over-inclusive, resulting in a tendency for false positive errors. Use of the Petersen/Winblad criterion is pervasive in the MCI research literature, and our data call into question the utility of this criterion as it is traditionally implemented. Given the psychological burden of receiving an MCI diagnosis, it is important to identify sensitive and reliable MCI criteria to best identify high-risk cognitive profiles. That MCI criteria involving a third fewer participants can achieve the same rate of progression to dementia is also a valuable consideration for future clinical research.
The available neuropsychological battery was more restricted than would be optimal, although the present study provides data on the application and function of different MCI classifications when using a smaller corpus of neuropsychological tests. In many older/longitudinal or current pharmacological trials, brief cognitive batteries are the norm, and so being able to adapt MCI classifications flexibly to differing neuropsychological batteries and empirically characterizing these approaches is essential. Finally, failure of pharmacologic trials to date to produce an effective drug treatment for AD may, in part, be due to intervening too late in the insidious progression of AD. While originally conceived to represent a preclinical state, MCI diagnostic criteria as recently defined by the NIA (e.g., Albert et al., 2011) require clinical characterization that has been shown neuropathologically to meet criteria for definite AD. Neuropsychological testing, used effectively, can serve as a more sensitive measure of early cognitive changes reflective of a neurodegenerative disorder that could greatly impact the potential efficacy of drug discovery studies. As research and clinical trial efforts continue to move toward identification of “early” MCI (see ADNI-GO and ADNI-II) or detection of preclinical AD (Sperling et al., 2011), further exploration of optimal MCI diagnostic criteria is warranted.
Acknowledgments
This work was supported by National Institutes of Health grant 5 R01 AG016495-13 (P.A.W., M.W.B.) and R01 AG049810 (M.W.B.). The authors report no disclosures or conflicts of interest.