INTRODUCTION
Mild Cognitive Impairment (MCI) has garnered much attention in dementia research for its implication as a prodromal stage of Alzheimer’s disease (AD) (see Morris, Reference Morris2005). Since its establishment as an amnestic syndrome in the presence of otherwise intact cognition and ability to execute activities of daily living (Petersen et al., Reference Petersen, Smith, Waring, Ivnik, Tangalos and Kokmen1999), this well-studied condition has been revised to address and incorporate single-domain and multiple-domain deficits in cognitive abilities other than memory (Petersen & Morris, Reference Petersen and Morris2005). The revision therefore yielded four possible MCI conditions: single-domain amnestic, multiple-domain amnestic, single-domain nonamnestic, and multiple-domain nonamnestic. Research suggests that amnestic MCI (aMCI) patients convert to AD at a rate of 16–41% per year (Gauthier et al., Reference Gauthier, Reisberg, Zaudig, Petersen, Ritchie and Broich2006) as opposed to a rate of 1–2% per year in the general population (Petersen et al., Reference Petersen, Doody, Kurz, Mohs, Morris and Rabins2001). Some propose research criteria for very early AD that rely on a core diagnostic criterion of early episodic memory impairment, supportive features such as the presence of medial temporal lobe atrophy or abnormal cerebrospinal fluid markers, and exclusionary criteria like depression or sudden onset of symptoms (Dubois et al., Reference Dubois, Feldman, Jacova, DeKoskey, Barberger-Gateau and Cummings2007). Thus, the study of aMCI and its relationship to cognitive decline remains an important focus of neuropsychological inquiry.
We employed a novel nonlinear multivariate classification statistical method called Optimal Data Analysis (ODA; Yarnold & Soltysik, Reference Yarnold and Soltysik2005) with the aim of identifying factors in the prediction of aMCI. Our prior work (Jak et al., Reference Jak, Bondi, Delano-Wood, Wierenga, Corey-Bloom, Salmon and Delis2009), as well as the work of others (see Twamley et al., Reference Twamley, Ropacki and Bondi2006, for a review), suggests that specific performances on standardized clinical measures of memory, such as the Wechsler Memory Scale – Revised edition (WMS-R) Logical Memory and the California Verbal Learning Test – Second edition (CVLT-II), are highly predictive of aMCI status within a group of premorbidly nondemented older adults.
METHOD
Participants and Materials
All human data included in this article were obtained in compliance with regulations of the Institutional Review Board of the University of California San Diego. Ninety-four participants were recruited by advertisements through various media sources in and around San Diego, CA (see Table 1). These participants were enrolled in a longitudinal aging study and had been tracked for three years. All were asked to complete an annual battery of psychosocial measures and neuropsychological tests. Participants were assessed for, and when appropriate diagnosed with, aMCI according to criteria delineated in Jak et al. (Reference Jak, Bondi, Delano-Wood, Wierenga, Corey-Bloom, Salmon and Delis2009). The Jak et al. (Reference Jak, Bondi, Delano-Wood, Wierenga, Corey-Bloom, Salmon and Delis2009) method for assigning aMCI diagnoses is based on six variables (age-scaled scores of LMI, LMII, VRI, VRII, and CVLT Trials 1–5 Total and CVLT Long Delay Free Recall standard scores). If participants’ performances on at least two of the memory measures fell one or more standard deviations below their age-appropriate norms (i.e., single-domain aMCI), or if participants met criteria for a deficit in one or more cognitive domains in addition to single-domain aMCI (i.e., multiple-domain aMCI), the participants were classified as aMCI. Participants with a deficit in one or more cognitive domains in the absence of memory problems (i.e., nonamnestic subtypes of MCI) were excluded from the analysis. Otherwise, participants were classified as “no MCI.” At the initial wave of the longitudinal study, no participant qualified for a diagnosis of aMCI or AD. At the time of this investigation, 52 participants had completed the second wave, and 35 of these also had completed the third wave.
Table 1. Demographic data for participants
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160704223215-41891-mediumThumb-S1355617710000512_tab1.jpg?pub-status=live)
The demographic information, genetic measures (apolipoprotein E genotype), psychosocial measures, and neuropsychological tests that comprised the battery included: age, education, gender, apolipoprotein E genotype, the Logical Memory (LM) subtest and the Visual Reproduction (VR) subtest from the Wechsler Memory Scale–Revised edition (WMS-R), the California Verbal Learning Test–Second edition (CVLT-II), the Dementia Rating Scale (DRS), the Digit Span and Block Design subtests from the Wechsler Adult Intelligence Scale–Revised edition (WAIS-R), Trails A and B, the Draw-A-Clock test, the Boston Naming Test (BNT), Verbal Fluency, Category Fluency, Color-Word Interference, Tower Test, Sorting Test, and Trail-Making Test from the Delis-Kaplan Executive Functions System (D-KEFS), the 48-card version of the Wisconsin Card Sorting Test (WCST), the American National Adult Reading Test (ANART), the Independent Living Scale (ILS), and the Geriatric Depression Scale (GDS). In addition, the participants were asked to provide a buccal (cheek) swab to determine their APOE allele genotype (see Saunders, Strittmatter, & Schmechel, Reference Saunders, Strittmatter and Schmechel1993). In the ODA statistical analyses, all of the above measures collected at the first wave were used as the independent variables to predict the occurrence of aMCI at the second wave. Furthermore, the measures assessed at the first and second waves were examined to predict the occurrence of aMCI at the third wave. The dependent variable was the diagnosis of aMCI at the second and third waves, respectively.
Analysis Strategy
Optimal Data Analysis (ODA) was used to explore whether there were any demographic (including APOE genotype), psychosocial, or neuropsychological factors that predicted diagnosis of aMCI in the second and third waves. The specific variables included in the analysis are listed in the Appendix. ODA was performed by the Windows-based computer analysis software (Yarnold & Soltysik, Reference Yarnold and Soltysik2005). This nonlinear multivariate classification method provides a hierarchical classification tree model in which cases are categorized into each group of a dichotomous dependent variable (“aMCI” or “no MCI” in the current study) by pathways branched by independent variables or “nodes.” An advantage of ODA is that there are no necessary assumptions such as multivariate normality, additivity, equality of group sizes, number of variables, or multicollinearity (see Yarnold, Soltysik, & Bennett, Reference Yarnold, Soltysik and Bennett1997, for details).
ODA refers to an independent variable as an attribute and a dependent variable as a class variable (Soltysik & Yarnold, Reference Soltysik and Yarnold1993; Yarnold & Soltysik, Reference Yarnold and Soltysik2005). The class variable must be categorical (either dichotomous or multicategorical), whereas attributes may have any scale of measurement. ODA first sets the best categorical borderline for each attribute, called a cutpoint or decision rule, which classifies cases into each category of a class variable with the maximum percentage accuracy in classification (PAC). ODA uses a special index, called effect strength for sensitivity (ESS), to summarize the percentage of cases belonging to each group that are correctly classified. In other words, a higher ESS indicates that an obtained cutpoint achieves higher PACs in classifying cases into each category. Next, ODA employs a leave-one-out (LOO) validity approach to evaluate the stability of classification performance. This entails repeating the analysis with each observation excluded in turn and checking that classification performance remains consistent across the resulting subsamples. Finally, to evaluate the significance level of classification performance, Fisher’s exact probability test is used.
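The cutpoint search, ESS, and LOO check described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the algorithm implemented in the ODA software: the function names are hypothetical, candidate cutpoints are taken as midpoints between adjacent observed values, only one rule direction (lower scores predict class 1) is tried, and ESS is computed for two classes as the mean PAC rescaled so that 0 represents chance (50%) and 100 represents perfect classification.

```python
from typing import List, Tuple


def best_cutpoint(values: List[float], labels: List[int]) -> Tuple[float, float, float]:
    """Return (cutpoint, mean_pac, ess) for one attribute and a 0/1 class variable.

    Assumed rule direction: value <= cutpoint -> class 1, otherwise class 0.
    A fuller search would also try the reverse direction.
    """
    n1 = sum(labels)
    n0 = len(labels) - n1
    xs = sorted(set(values))
    if n1 == 0 or n0 == 0 or len(xs) < 2:
        raise ValueError("need both classes and at least two distinct values")

    # Candidate cutpoints: midpoints between adjacent distinct observed values.
    candidates = [(a + b) / 2 for a, b in zip(xs, xs[1:])]

    best_cut, best_pac = candidates[0], -1.0
    for cut in candidates:
        hit1 = sum(1 for v, y in zip(values, labels) if y == 1 and v <= cut)
        hit0 = sum(1 for v, y in zip(values, labels) if y == 0 and v > cut)
        # Mean PAC: average of the per-class percentages classified correctly.
        mean_pac = 100.0 * (hit1 / n1 + hit0 / n0) / 2
        if mean_pac > best_pac:  # ties keep the first (lowest) cutpoint
            best_cut, best_pac = cut, mean_pac

    # Two-class ESS (assumed form): 0 = chance (50%), 100 = perfect.
    ess = (best_pac - 50.0) / 50.0 * 100.0
    return best_cut, best_pac, ess


def loo_accuracy(values: List[float], labels: List[int]) -> float:
    """Percentage of cases classified correctly when each is held out in turn."""
    hits = 0
    for i in range(len(values)):
        train_v = values[:i] + values[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        cut, _, _ = best_cutpoint(train_v, train_y)
        predicted = 1 if values[i] <= cut else 0
        hits += int(predicted == labels[i])
    return 100.0 * hits / len(values)
```

With six invented scores, `best_cutpoint([60, 70, 75, 80, 90, 95], [1, 1, 1, 0, 0, 0])` separates the two classes perfectly at 77.5, yet `loo_accuracy` on the same data misclassifies one held-out case. This is the kind of instability the LOO step is meant to expose.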
An attribute that shows the highest ESS, LOO stability, and a significant p value is considered the strongest attribute and is entered as the top node of the hierarchical tree model (Soltysik & Yarnold, Reference Soltysik and Yarnold1993; Yarnold & Soltysik, Reference Yarnold and Soltysik2005). Once the top attribute is selected, the same procedure is performed again within a subsample classified by the top attribute. Consequently, the model gradually builds a tree of several nodes branching out from the top attribute. If no attribute is significant, tree building stops. To finalize the classification tree model, the significance levels of all attributes are retested by a sequentially rejective Sidak Bonferroni-type multiple comparisons procedure. The purposes of this procedure are to control Type I error rate per comparison and maximize statistical power. If the p value of any attribute exceeds the per-comparison criterion, that attribute is pruned from the model.
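The pruning step can be illustrated with a generic step-down Sidak procedure. The sketch below assumes the common Holm-Sidak form, in which the smallest p value is compared with 1 − (1 − α)^(1/k), the next smallest with 1 − (1 − α)^(1/(k − 1)), and so on, stopping at the first failure. The exact variant used by the ODA software is not specified here, and the function name and example p values are hypothetical.

```python
from typing import List


def sidak_step_down(p_values: List[float], alpha: float = 0.05) -> List[bool]:
    """Return, for each attribute, True if it is retained and False if pruned."""
    k = len(p_values)
    # Visit attributes from the smallest to the largest p value.
    order = sorted(range(k), key=lambda i: p_values[i])
    retained = [False] * k
    for rank, idx in enumerate(order):
        # Sidak criterion for the (k - rank) comparisons still under test.
        criterion = 1.0 - (1.0 - alpha) ** (1.0 / (k - rank))
        if p_values[idx] <= criterion:
            retained[idx] = True
        else:
            break  # this attribute and all with larger p values are pruned
    return retained
```

For instance, with three invented node p values of .001, .04, and .03, only the first survives: the second-smallest value (.03) exceeds its criterion of roughly .0253, so it and the remaining attribute are pruned.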
Lastly, it should be noted that, in spite of its unique approach being different from traditional classification methods, the indices used by ODA are compatible with traditional classification method indices, such as the goodness-of-fit index, effect size, and significance level. Therefore, models produced by ODA may be tested according to these parameters. For example, the goodness-of-fit index is comparable to overall classification accuracy, the effect sizes can be calculated by ESS or overall effect strength in ODA, and the significance level is tested by Fisher’s exact probability test.
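Because Fisher’s exact probability test serves as the significance index throughout, a compact version for a 2 × 2 classification table may be useful. The sketch below computes a two-sided p value from the hypergeometric distribution using only the standard library. The two-sided convention (summing all tables no more probable than the observed one) is an assumption, as are the function name and the example counts.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p value for the 2 x 2 table [[a, b], [c, d]].

    Rows could be predicted class and columns actual class (an illustrative
    layout, not one prescribed by the source).
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    # Sum every table at least as extreme as the observed one; the small
    # tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
```

As a check on the logic, a perfectly separated table of six invented cases, `fisher_exact_two_sided(3, 0, 0, 3)`, gives p = .10, which shows how little significance even perfect classification can carry in very small samples.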
RESULTS
There were 8 participants categorized as aMCI (5 single-domain) at the second wave, and 5 categorized as aMCI at the third wave (2 single-domain). Three cases from the second wave and one case from the third wave were dropped in accordance with the pairwise deletion method, because these cases had missing data on measures that were significant in the model (i.e., WMS-R LMII % retention, D-KEFS Trail-Making Number Sequencing scaled score, Geriatric Depression Scale score, and WMS-R LMI MOANS age standard score). Figures 1a and 1b summarize the ODA hierarchical classification tree model of baseline data to predict the occurrence of aMCI at the second wave of the longitudinal study. Forty-nine participants entered into the model as the result of a pairwise deletion method, and overall classification accuracy was 93.88% (p < .001) with an overall effect strength of 79.85%. These values indicate that our model was strongly predictive (see Table 2; for the method to evaluate effect strength, see Yarnold & Soltysik, Reference Yarnold and Soltysik2005). Figure 1 depicts that the classification tree model predicted the development of aMCI with 87.5% accuracy; the participants were highly likely to develop aMCI at the second wave if their memory retention rate on WMS-R LM Delayed Recall versus Immediate Recall was lower than or equal to 78.5% at the first wave, and if they had a scaled score of less than or equal to 14.5 on D-KEFS Trail-Making Number Sequencing scale at the first wave. On the other hand, if the participants scored higher than 78.5% of their memory retention rate on WMS-R LMII at the first wave, aMCI was less likely to occur at the second wave with 94.74% accuracy. In addition, even if the memory retention rate was lower than or equal to 78.5% on WMS-R LMII at the first wave, a higher score than 14.5 on the D-KEFS Trail-Making Number Sequencing scale at the first wave predicted the low likelihood of the occurrence of aMCI at the second wave with 100% accuracy.
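Read as a decision rule, the tree in Figure 1a reduces to two nested comparisons. The sketch below restates the cutpoints reported above. The function and parameter names are hypothetical, and the code illustrates the logic of the rule rather than offering a validated clinical tool.

```python
def predict_wave2_model_1a(lm_retention_pct: float, trails_number_seq_ss: float) -> str:
    """Predict second-wave status from two first-wave scores (Figure 1a).

    lm_retention_pct: WMS-R Logical Memory retention (delayed vs. immediate), in %.
    trails_number_seq_ss: D-KEFS Trail-Making Number Sequencing scaled score.
    """
    # Top node: retention above 78.5% predicts no MCI.
    if lm_retention_pct > 78.5:
        return "no MCI"
    # Second node (reached only when retention <= 78.5%):
    # a scaled score above 14.5 also predicts no MCI.
    if trails_number_seq_ss > 14.5:
        return "no MCI"
    # Retention <= 78.5% and scaled score <= 14.5 predicts aMCI.
    return "aMCI"
```

A participant with 70% retention and a scaled score of 10 would be predicted as aMCI, whereas the same retention with a scaled score of 15 would be predicted as no MCI.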
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160704223215-32384-mediumThumb-S1355617710000512_fig1g.jpg?pub-status=live)
Fig. 1a – 1c. (a) The Optimal Data Analysis (ODA) Hierarchical Tree Model 1 for predicting no MCI versus aMCI one year later based on neuropsychological and psychosocial variables (N = 49); (b) Classification performance summary of Optimal Data Analysis prediction of aMCI one year later (N = 49); (c) Classification performance summary of Optimal Data Analysis prediction of aMCI two years later (N = 34).
Note. Ellipses represent nodes, arrow lines represent branches, and rectangles represent prediction endpoints. Numbers under each ellipse (node) indicate Fisher’s exact p value for each node. Numbers next to arrows indicate the cutpoint for classifying cases into the categories (No MCI or aMCI) for each node. Finally, fractions and percentages below each prediction endpoint indicate the absolute number or percentage of the cases correctly classified.
Table 2. Classification performance summary of Optimal Data Analysis prediction of MCI one year later (N = 49)
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160704223215-79291-mediumThumb-S1355617710000512_tab2.jpg?pub-status=live)
Note. This classification performance was exactly replicated, regardless of whether the second attribute was (1) D-KEFS Trail-Making Number Sequencing scaled score or (2) Geriatric Depression Scale score. Overall classification accuracy is the percentage of the cases classified correctly. Sensitivity is the percentage of how many cases were correctly classified among cases that actually belong to a given category. Predictive value is the percentage of how many cases were correctly classified among cases that were predicted as a given category. Higher percentage indicates greater classification performance. Effect strength overall is the mean of effect strength for sensitivity and effect strength for predictive value. According to Yarnold & Soltysik (Reference Yarnold and Soltysik2005), the effect strength is strong (75% < ES < 90%).
It was also found that the occurrence of aMCI at the second wave was predicted with the same classification accuracy if the Geriatric Depression Scale (GDS) score was used as the second predictor (see Figure 1b). In this case, the first attribute was still memory retention rate on WMS-R LMII, such that a higher score than 78.5% of their memory retention rate predicted a low likelihood of developing aMCI at the second wave with 94.74% accuracy. On the other hand, if memory retention rate was lower than 78.5%, GDS alternatively predicted the likelihood of developing aMCI in the following way: A participant was less likely to develop aMCI at the second wave if their GDS score was less than or equal to 2.5; otherwise, a participant was likely to develop aMCI at the second wave. Note that both Figures 1a and 1b predicted the occurrence of aMCI with the same accuracy of classification performance.
The predictors of the development of aMCI two years later were also examined by ODA. The ODA hierarchical classification tree model for this prediction is more parsimonious with greater classification accuracy than the first model (see Figure 1c). If participants had a score lower than 8.5 as a Mayo’s Older American Normative Scales (MOANS) age standard score on WMS-R LMI at the first wave, they were diagnosed as aMCI at the third wave; otherwise, participants did not qualify for aMCI at the third wave. Note that both prediction endpoints were predicted with 100.00% accuracy. In other words, the overall classification accuracy was 100.00% (p < .001), and the overall effect strength was also 100.00%, which means that the model perfectly predicted the occurrence of aMCI two years later (see Table 3).
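The two-year model amounts to a single threshold on one first-wave score. A minimal sketch of that rule follows, using the cutpoint reported above. The function and parameter names are hypothetical.

```python
def predict_wave3_model_1c(lmi_moans_score: float) -> str:
    """Predict third-wave status from the first-wave WMS-R LMI MOANS
    age standard score (Figure 1c)."""
    # A score lower than 8.5 predicts aMCI; anything else predicts no MCI.
    return "aMCI" if lmi_moans_score < 8.5 else "no MCI"
```

Because MOANS scores are typically whole numbers, the 8.5 cutpoint in practice separates scores of 8 and below from scores of 9 and above.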
Table 3. Classification performance summary of Optimal Data Analysis prediction of MCI two years later (N = 34)
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160704223215-11600-mediumThumb-S1355617710000512_tab3.jpg?pub-status=live)
Note. Overall classification accuracy is the percentage of the cases classified correctly. Sensitivity is the percentage of how many cases were correctly classified among cases that actually belong to a given category. Predictive value is the percentage of how many cases were correctly classified among cases that were predicted as a given category. Higher percentage indicates greater classification performance. Effect strength overall is the mean of effect strength for sensitivity and effect strength for predictive value. According to Yarnold & Soltysik (Reference Yarnold and Soltysik2005), the effect strength is very strong (95% < ES).
DISCUSSION
We employed a novel nonlinear multivariate classification statistical method called Optimal Data Analysis to identify possible predictive factors of developing aMCI in a dataset of neuropsychological and psychosocial measures collected annually for three years from 94 originally nondemented participants. With this method we found that story learning or retention, visuomotor processing speed, and depression were predictive of aMCI one to two years later. No other neuropsychological or psychosocial factors predicted development of aMCI.
Two statistical classification methods have been widely utilized in the literature to conduct exploratory classification analyses: logistic regression analysis (LRA) and discriminant function analysis (DFA). However, these methods assume linearity, where the variability of human behavior is forcefully fit into a mathematical approximation. Specifically, LRA assumes a linear relationship between independent variables and the log odds of a dependent variable, whereas DFA assumes linear combinations of independent variables (i.e., discriminant functions, see Agresti, Reference Agresti2007 and Stevens, Reference Stevens2002). However, the linearity assumption presumes that all observed data should be the same in terms of (1) the set of independent variables, (2) the direction of influence (i.e., positively or negatively predictive), and (3) the coefficient values (or weight) of each independent variable (Yarnold, Soltysik, & Bennett, Reference Yarnold, Soltysik and Bennett1997). If these characteristics are not present, the classification accuracy level is constrained or biased (Soltysik & Yarnold, Reference Soltysik and Yarnold1993; Yarnold & Soltysik, Reference Yarnold and Soltysik2005). In addition to these assumptions, LRA and DFA assume (1) no gross outliers, (2) low multicollinearity of independent variables, (3) the inclusion of independent variables that are all conceptually relevant to a dependent variable, (4) equal and adequate group size, and (5) normality (Agresti, Reference Agresti2007; Jaccard, Reference Jaccard2001; Menard, Reference Menard1995; Peduzzi, Concato, Kemper, Holford, & Feinstein, Reference Peduzzi, Concato, Kemper, Holford and Feinstein1996; Tabachnick & Fidell, Reference Tabachnick and Fidell1989).
In contrast to the linear classification methods, a hierarchical classification tree analysis (CTA) is a nonlinear approach (Yarnold & Soltysik, Reference Yarnold and Soltysik2005; Yarnold, Soltysik, & Martin, Reference Yarnold, Soltysik and Martin1994). The major methods of CTA include classification and regression tree models (e.g., CART; see Breiman, Friedman, Olshen, & Stone, Reference Breiman, Friedman, Olshen and Stone1984) and Optimal Data Analysis (ODA; Soltysik & Yarnold, Reference Soltysik and Yarnold1993; Yarnold & Soltysik, Reference Yarnold and Soltysik2005). These nonlinear methods show some advantages over the linear methods, especially for exploratory analyses. First, CTA theoretically provides a better classification accuracy level than the linear methods, because CTA constructs a hierarchical tree model in which a different set of independent variables with different directions and/or weights are suggested across different partitions of a given sample (i.e., no requirement of forcefully fitting variance into a mathematical estimation). This also means that CTA (1) is less sensitive to gross outliers and (2) detects an interaction effect automatically, without having to create a cross-product variable, which occur in linear classification methods (Bremner & Taplin, Reference Bremner and Taplin2002; Fox, Reference Fox2000; Sonquist & Morgan, Reference Sonquist and Morgan1964).
Furthermore, CTA repeatedly analyzes the overall effect size of each independent variable and enters only the best variable(s) into a model (Breiman et al., Reference Breiman, Friedman, Olshen and Stone1984; Soltysik & Yarnold, Reference Soltysik and Yarnold1993; Yarnold & Soltysik, Reference Yarnold and Soltysik2005), whereas the linear methods compute the partial effect size of each predictor simultaneously to fit all predictors into an overall model. CTA’s unique approach enables (1) selection of a set of independent variables that are all statistically relevant, (2) the ability to ignore a multicollinearity of independent variables, (3) minimization of a loss of observed data by using a pairwise deletion method (rather than a listwise deletion method), and (4) examination of as many independent variables as needed.
Finally, group size is an issue for LRA and DFA because unequal group size can diminish statistical power. In contrast, regardless of group size, CTA maximizes statistical power by using cross-validation (for CART; Breiman et al., Reference Breiman, Friedman, Olshen and Stone1984) or a sequentially rejective Sidak Bonferroni-type multiple comparisons procedure (for ODA; Soltysik & Yarnold, Reference Soltysik and Yarnold1993; Yarnold & Soltysik, Reference Yarnold and Soltysik2005). These procedures determine the size of a CTA model. Thus, CTA does not necessarily assume equality or adequacy of group size to maximize statistical power.
Therefore, CTA (e.g., CART and ODA) is conceptually advantageous over LRA and DFA. But what is the difference between CART and ODA? CART relies on the least squares and maximum likelihood estimation to evaluate “impurity,” an index that indicates the heterogeneity of given categories (e.g., the Gini index, the twoing index, the deviance of nodes; see Breiman et al., Reference Breiman, Friedman, Olshen and Stone1984; Clark & Pregibon, Reference Clark, Pregibon, Chambers and Hastie1992; Bremner & Taplin, Reference Bremner and Taplin2002), whereas ODA employs percentage accuracy in classification (PAC) and Fisher’s exact probability test. In other words, CART uses parametric tests as classification criteria for a given sample (i.e., the normality and linearity are assumed within a category). However, ODA does not require the assumptions of normality and linearity. Thus, Yarnold et al. (Reference Yarnold, Soltysik and Bennett1997) believe that the nonlinear methods using the least squares/maximum likelihood (e.g., CART) “fail to maximize classification accuracy explicitly for the training sample” (p. 1452), compared to ODA, if the assumptions of normality and linearity are seriously violated within a training sample.
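For comparison with PAC, the Gini index mentioned above can be computed directly from class proportions. The sketch uses the standard definition (one minus the sum of squared class proportions) and a size-weighted average to score a candidate split. Function names and example labels are illustrative, and the code is not drawn from any particular CART implementation.

```python
from collections import Counter
from typing import List, Sequence


def gini_impurity(labels: Sequence[str]) -> float:
    """Gini index of a node: 0 for a pure node, larger for mixed nodes."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def split_impurity(left: List[str], right: List[str]) -> float:
    """Size-weighted Gini index of the two child nodes produced by a split."""
    n = len(left) + len(right)
    return (len(left) * gini_impurity(left) + len(right) * gini_impurity(right)) / n
```

A node holding one “aMCI” and one “no MCI” case has a Gini index of 0.5, the two-class maximum, whereas a split that sends each class to its own child node scores 0. CART-style methods choose splits that lower this index, whereas ODA, as described above, selects cutpoints by classification accuracy.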
Previous studies revealed that ODA yielded better classification performance accuracy on predicting cardiac death (Yarnold, Soltysik, & Martin, Reference Yarnold, Soltysik and Martin1994) and mortality of patients with cardiopulmonary resuscitation (Yarnold, Soltysik, Lefevre, & Martin, Reference Yarnold, Soltysik, Lefevre and Martin1998) than LRA. For these and the reasons detailed above, ODA was selected in the present study to achieve our goal – exploring neuropsychological and other predictors of aMCI.
Our findings suggest that lower, and not necessarily impaired, performances on measures of story learning and memory, visuomotor processing speed, and depressive symptoms are predictive of subsequent memory decline in a normal population. These findings, at first glance, appear to be in accord with prior studies that have reported the utility of either delayed recall (Albert, Moss, Tanzi, & Jones, Reference Albert, Moss, Tanzi and Jones2001; Arnaiz & Almkvist, Reference Arnaiz and Almkvist2003; Bäckman et al., Reference Bäckman, Jones, Berger, Laukka and Small2005; Twamley et al., Reference Twamley, Ropacki and Bondi2006) or learning measures (Grober & Kawas, Reference Grober and Kawas1997; Rabin et al., Reference Rabin, Pare, Saykin, Brown, Wishart, Flashman and Santulli2009) in providing strong diagnostic sensitivity for aMCI. However, it is important to note that the results showed that relatively lower scores on either WMS-R LM Delayed Recall, D-KEFS Trail-Making Number Sequencing scale, or Geriatric Depression Scale alone did not provide good predictive value of the occurrence of aMCI at follow-up visits, whereas the predictive power improved significantly when Delayed Recall and either D-KEFS Trail-Making Number Sequencing or depression scores were taken into account. Our model suggests that consideration of additional cognitive features beyond memory buttresses the prediction of progression to aMCI.
Studies of aMCI have relied almost exclusively on delayed recall or retention measures in rendering the diagnosis (Arnaiz & Almkvist, Reference Arnaiz and Almkvist2003). Our findings, however, suggest that the diagnosis of aMCI may be aided by the incorporation of other cognitive and psychosocial functioning measurement strategies. A number of studies have specifically shown the sensitivity of Trail-Making test procedures (Chen et al., Reference Chen, Ratcliff, Belle, Cauley, DeKoskey and Ganguli2001), as well as depressive features (Teng, Lu, & Cummings, Reference Teng, Lu and Cummings2007) in the years preceding a diagnosis of Alzheimer’s disease. As Jak and colleagues (Reference Jak, Bondi, Delano-Wood, Wierenga, Corey-Bloom, Salmon and Delis2009) have pointed out, the use of comprehensive neuropsychological assessment when diagnosing MCI subtypes will help to improve the stability and reliability of diagnosis, as will the use of multiple measurements within a cognitive domain, such as episodic memory. These results may suggest that the conventional practice of relying solely on the use of a delayed recall or retention measure, or rating scale summaries of a single delayed recall measure, may lead to more false positive errors (i.e., misdiagnosing healthy individuals as aMCI; Saxton et al., Reference Saxton, Snitz, Lopez, Ives, Dunn and Fitzpatrick2009) than using a procedure based on multiple measures.
Of particular note is the fact that apolipoprotein E (APOE) genotype and gender were not predictive of aMCI in our sample. The APOE genotype, more specifically possession of the epsilon 4 allele, has been associated with earlier age of onset of Alzheimer’s disease (Corder et al., Reference Corder, Saunders, Strittmatter, Schmechel, Gaskell and Small1993) and with impairments in aMCI (Ramakers et al., Reference Ramakers, Visser, Aalten, Bekers, Sleegers and van Broeckhoven2008). However, it was not identified as a significant predictive factor in our model. Our results suggest that neurocognitive and possibly psychological factors may be more predictive of aMCI than the APOE genotype. In regard to gender, some studies have identified a gender difference in MCI incidence (e.g., Das et al., Reference Das, Bose, Biswas, Dutt, Banerjee and Hazra2007), although others have not (e.g., Panza et al., Reference Panza, D’Introno, Colacicco, Capurso, Del Parigi and Caselli2005). Our results suggest gender is not a factor in the incidence of aMCI, at least when considering neurocognitive and psychosocial factors, supporting the refutation of gender as a risk factor for aMCI.
Limitations of the present study include potential sources of sampling error, such as demographic factors that may not be generalizable to the population as a whole. Our study group’s age range was particularly circumscribed (mean = 77.23, SD = 7.30), and our group had a relatively high level of education (mean = 15.87, SD = 2.49). Our neuropsychological and psychosocial variables were also limited to the battery incorporated for our longitudinal study and may not have addressed factors that could have had an impact on development of aMCI (e.g., neurovascular factors). It is also unknown how many of our aMCI-diagnosed participants will progress to AD. The size of our study sample was not a limitation because ODA as a statistical approach is not limited by traditional sample size power considerations. A final limitation is that our results may be viewed as “circular” given that we examined performances on the same memory measures utilized one or two years later in the diagnosis of aMCI. We do not regard this possibility as reflecting criterion contamination given that we investigated performances on memory measures that were not used in the diagnosis of aMCI at the time that aMCI was diagnosed. In other words, even though the same tests of memory may have been used in the diagnosis of aMCI, the actual test score performances entered into our predictive model were from a different time than diagnosis (i.e., one or two years prior to diagnosis). In addition, the Jak et al. (Reference Jak, Bondi, Delano-Wood, Wierenga, Corey-Bloom, Salmon and Delis2009) method for assigning aMCI diagnoses was based on six variables (age-scaled scores of LMI, LMII, VRI, VRII, and CVLT Trials 1–5 Total and CVLT Long Delay Free Recall standard scores), whereas our predictive models considered a total of 26 memory variables (see Appendix), six of which overlapped with the assignment method of Jak et al. (Reference Jak, Bondi, Delano-Wood, Wierenga, Corey-Bloom, Salmon and Delis2009), although, again, the use of these six test score performances antedated the diagnosis of aMCI (which was based on different test scores from these same tests) by one to two years. As a final remedy to inspect for the possibility of criterion contamination, we again performed ODA analyses excluding those six memory measures used in the Jak et al. (Reference Jak, Bondi, Delano-Wood, Wierenga, Corey-Bloom, Salmon and Delis2009) aMCI classification method. The resulting model trees were identical.
In conclusion, our results have interesting implications for models of the aMCI construct and provide some comparative value to the various definitional schemes recently proposed (see Petersen & Morris, Reference Petersen and Morris2005; Dubois et al. Reference Dubois, Feldman, Jacova, DeKoskey, Barberger-Gateau and Cummings2007, Jak et al. Reference Jak, Bondi, Delano-Wood, Wierenga, Corey-Bloom, Salmon and Delis2009). Some of the advantages of ODA as a statistical approach are that it yields specific cutpoints and a decision tree model that can be cross-validated and empirically tested in future prospective studies. Future research is needed to investigate whether these performance cutpoints in this age range are indeed predictors of aMCI and ultimately of progression to dementia.
ACKNOWLEDGMENTS
This work was supported by grant IIRG 07-59343 from the Alzheimer’s Association (M.W.B.), and National Institute on Aging grants P30 AG10161 (S.D.H), R01 AG012674 (M.W.B.), K24 AG026431 (M.W.B.) and P50 AG05131 (D.P.S.).
APPENDIX
List of attributes analyzed by ODA
1. age as of test date
2. gender
3. handedness
4. examiner
5. education (yrs)
6. ethnicity
7. subject referral
8. ANART VIQ
9. ANART errors
10. WAIS-R digit span forward
11. WAIS-R digit span backwards
12. WAIS-R digit span scaled score
13. WAIS-R digit span MOANS
14. WISC-R block design raw
15. WISC-R block design T score
16. WISC-R block design broken configuration
17. WISC-R block design over time
18. DRS total
19. DRS total T score
20. DRS attention
21. DRS attention T score
22. DRS initiation/perseveration
23. DRS initiation/perseveration T score
24. DRS supermarket items
25. DRS supermarket items T score
26. DRS construction
27. DRS construction T score
28. DRS conceptualization
29. DRS conceptualization T score
30. DRS memory
31. DRS memory T score
32. ADRC form (1 or 2)
33. Boston Naming Test total correct
34. Boston Naming Test total correct T score
35. Boston Naming Test total correct MOANS scaled score
36. BNT spontaneous correct (total)
37. BNT stimulus cues given (total)
38. BNT stimulus cues correct (total)
39. BNT phonemic cues given (total)
40. BNT phonemic cues correct (total)
41. WCST-48 number of categories
42. WCST-48 categories T score
43. WCST-48 nonperseverative errors
44. WCST-48 nonperseverative errors T score
45. WCST-48 perseverative errors
46. WCST-48 perseverative errors T score
47. WCST-48 set losses
48. WCST-48 total errors
49. Trails A
50. Trails A T score
51. Trails A MOANS
52. Trails A no. of errors
53. Trails B
54. Trails B T score
55. Trails B MOANS
56. Trails B no. of errors
57. draw a clock command
58. draw a clock copy
59. verbal fluency version (standard/alternate)
60. letter fluency (f)
61. letter fluency (a)
62. letter fluency (s)
63. letter fluency total raw
64. D-KEFS verbal fluency scaled score
65. letter fluency total T score
66. category fluency (animals) raw
67. D-KEFS category fluency scaled score
68. category fluency (animals) T score
69. D-KEFS color-word interference inhibition scaled score
70. D-KEFS color-word interference inhibition/switch scaled score
71. D-KEFS tower total achievement scaled score
72. D-KEFS sorting test confirmed correct sorts scaled score
73. D-KEFS sorting test sort recognition description scaled score
74. D-KEFS trail-making visual scanning scaled score
75. D-KEFS trail-making number sequencing scaled score
76. D-KEFS trail-making letter sequencing scaled score
77. D-KEFS trail-making number-letter switch scaled score
78. D-KEFS trail-making motor sequencing scaled score
79. WMS-R LMI
80. WMS-R LMI age scaled score
81. WMS-R LMI MOANS age scaled score
82. WMS-R LMII
83. WMS-R LMII age scaled score
84. WMS-R LMII MOANS age scaled score
85. WMS-R LMII % retention
86. WMS-R LMII % retention MOANS age scaled score
87. WMS-R LM recognition %
88. WMS-R LM recognition discrimination percentage
89. WMS-R LM response bias
90. WMS-R VRI
91. WMS-R VRI age scaled score
92. WMS-R VRI MOANS age scaled score
93. WMS-R VRII
94. WMS-R VRII age scaled score
95. WMS-R VRII MOANS age scaled score
96. WMS VRII % retention
97. WMS VRII % retention MOANS age scaled score
98. WMS VRII recognition
99. WMS-R VR recognition discrimination percentage
100. WMS-R VR response bias
101. ILS managing money raw
102. ILS managing money T score
103. ILS managing money problem-solving
104. ILS managing money information
105. ILS health and safety raw
106. ILS health and safety T score
107. ILS health and safety problem-solving
108. ILS health and safety information
109. Geriatric Depression Scale score
110. Geriatric Depression Scale rating
111. CVLT-II
112. CVLT-II list A trials 1–5 total T score
113. CVLT-II long delay free recall T score
114. Overall Abilities
115. Overall Attention
116. Overall Language
117. Overall Visuospatial Skills
118. Overall Executive Functions
119. Overall Memory
120. Overall Living Skills
121. APOE epsilon 4 positive
Note. All attributes listed above were collected at the first wave and the second wave, and each attribute at each wave was individually analyzed by ODA. Class variables were the diagnosis of aMCI at the second wave or the third wave.