Introduction
The rate of Alzheimer's disease is expected to increase two- to three-fold in the coming decades (Brookmeyer, 2011), which threatens to overwhelm healthcare resources at both national and international levels. In response to this challenge, two separate but mutually informative lines of research have emerged in recent years. The first attempts to identify at-risk patients before the onset of Alzheimer's dementia (AD). This effort ultimately culminated in the acceptance of mild cognitive impairment (MCI) by the National Institute on Aging and the Alzheimer's Association (Albert et al., 2011) and by the American Psychiatric Association (APA, 2013) as a diagnosis that captures cognitively symptomatic individuals who are likely to convert to AD. The second line of research attempts to identify pharmacologic and non-pharmacologic interventions that can enhance, maximize, or otherwise prolong functioning in at-risk patients. This area is especially important since there is often a multi-year period of cognitive stability between conversion from "healthy" to MCI and subsequent progression to AD (Boyle, Wilson, Aggarwal, Tang, & Bennett, 2006; Manly et al., 2008; Smith et al., 2007). Debate continues on the extent to which pharmacological agents affect cognition and conversion rates (Allain, Bentue-Ferrer, & Akwa, 2007; Daviglus et al., 2010; Diniz et al., 2009; Farlow, 2009; Raschetti, Albanese, Vanacore, & Maggini, 2007) and is beyond the scope of the current study. Instead, we focus on non-pharmacologic approaches, specifically on techniques that are frequently used in the cognitive rehabilitation of explicit learning and memory, which represent the characteristic areas of impairment in MCI.
Cognitive rehabilitation is considered a practice standard for some patient populations, such as traumatic brain injury and stroke (Cicerone et al., 2005, 2011). These practice guidelines emerged after decades of research and the development of a coherent and clinically oriented framework. Cognitive rehabilitation (and remediation) has also become a topic of considerable interest in patients with schizophrenia (see review by Kurtz, 2012). At this point, the use of cognitive rehabilitation in those with MCI is comparatively understudied and more contentious. For example, a Cochrane Review of randomized controlled trials in those with MCI found no benefit of cognitively based interventions relative to control conditions (Martin, Clare, Altgassen, Cameron, & Zehnder, 2011). It is important to note that these conclusions rested on only three studies that used a wide variety of techniques. Over the past 3 years, there have been at least seven reviews (Belleville, 2008; Cotelli, Menenti, Zanetti, & Miniussi, 2012; Huckans et al., 2013; Jean, Bergeron, Thivierge, & Simard, 2010; Reijnders, van Heugten, & van Boxtel, 2013; Simon, Yokomizo, & Bottino, 2012; Stott & Spector, 2011) and one meta-analysis (Li et al., 2011) that examined whether the cognitive rehabilitation of memory can be effective in those with MCI. The results of all eight of these works generally indicated that patients could benefit from memory rehabilitation.
Although the conclusions of these previous efforts are encouraging, the current review focuses on four critical methodological challenges that have yet to be examined in detail, each of which can have profound effects on this overall field of study. We contend that variability in the diagnostic criteria used, techniques investigated, dosage provided, and outcome measures selected renders a coherent conclusion difficult at best. We also discuss how this variability and the nature of the rehabilitation affect generalization of the trained techniques. Finally, we provide a model that may prove especially useful in designing and comparing future studies. Selecting and developing control interventions is a major challenge in its own right that has previously been discussed in far more detail than is possible in the current review (see Hart, Fann, & Novack, 2008).
Methods
We reviewed and summarized the primary research articles from each of the eight previous reviews and meta-analytic studies noted above (Belleville, 2008; Cotelli et al., 2012; Huckans et al., 2013; Jean et al., 2010; Li et al., 2011; Reijnders et al., 2013; Simon et al., 2012; Stott & Spector, 2011). This resulted in 29 studies, 1 of which was excluded because it was written in Chinese (Chen et al., 2008). We used ISI Web of Knowledge to identify seven additional studies that targeted memory rehabilitation in patients with MCI with a publication date before April of 2013. Thus, a total of 36 studies were included in the current methodological review. The key features of these 36 studies, including behavioral outcomes, can be seen in Supplemental Table 1. We refer the interested reader to the previous comprehensive reviews for a thorough discussion of treatment efficacy. We examined each of these studies according to: diagnostic criteria (Methodological Challenge 1), techniques used (Methodological Challenge 2), dosage provided (Methodological Challenge 3), and outcome measures used (Methodological Challenge 4). Several of the studies lacked methodological detail, especially surrounding the interventions. Where possible, we gathered additional information by contacting study authors. When study authors failed to respond to inquiries, two of the current authors (B.M.H. & M.M.G.) re-reviewed the studies in question and arrived at a consensus decision using the available information.
Table 1 Number of words that a 72-year-old would need to recall to meet standard cutoff scores of -1 or -1.5 SD on three common word list tests
CVLT-II: California Verbal Learning Test – II. CVLT norms are from the manual. The sample size for the CVLT Short is not provided, as the publisher (Pearson) considers this information a trade secret that cannot be published. RAVLT: Rey Auditory Verbal Learning Test. Metanorms from Table 10-61 and MOANS norms from Table 10-67 in Strauss, Sherman, & Spreen (2006). Note that the MOANS values are approximate given their use of scaled scores. RBANS: Repeatable Battery for the Assessment of Neuropsychological Status. Normative data from the RBANS post-publication update (Randolph, 2002).
Methodological Challenge 1: Diagnostic Criteria
The ability to compare treatment efficacy across studies fundamentally depends on the accurate and consistent diagnosis of MCI. As seen in Figure 1, however, there is variability in how previous research studies made this diagnosis. A seemingly encouraging finding was that approximately 81% of reviewed studies used some version of the Petersen et al. (e.g., Petersen et al., 1999; Petersen, 2004) or Winblad et al. (2004) criteria, which are comparable. However, closer examination of each study's inclusion criteria revealed a split between a psychometrically based (11/36) and a clinically based (18/36) diagnosis. This split has become a topic of debate in the literature and primarily centers on the confidence with which clinicians can diagnose MCI as a precursor for AD. For example, Loewenstein, Acevedo, Agron, Martinez, and Duara (2007) found substantial variability in the rate of reversion from MCI to "normal" that depended on the diagnostic criteria and psychometric cutoff values used. Reversion was more likely when patients were less impaired, as defined by either the number of memory tests showing reduced performance or by the level of psychometric impairment. These and related findings (e.g., Albert, Moss, Tanzi, & Jones, 2001; Loewenstein, Acevedo, Agron, & Duara, 2007; Loewenstein et al., 2009; Teng, Tingus, Lu, & Cummings, 2009) suggest that psychometric cutoffs can increase diagnostic certainty, which should translate into a more homogeneous research population. Although we acknowledge the benefits of this approach for diagnostic purposes, we argue that cutoffs may be inappropriate for cognitive rehabilitation-based research for three primary reasons.
Fig. 1 Flow chart of the diagnostic criteria used in the primary research studies (in superscript) that were included in the current review. AAMI = age-associated memory impairment.
First, cutoffs are inconsistent with the most commonly used (i.e., Petersen et al., 1999; Petersen, 2004; Winblad et al., 2004) and current criteria (Albert et al., 2011), which clearly advocate for a clinical diagnosis that is informed by neuropsychological (and other) data (see discussion of clinical vs. psychometric cutoffs on p. 307 of Petersen et al., 1999, p. 187 of Petersen, 2004, and pp. 272–273 of Albert et al., 2011). The intermittent use of psychometric cutoffs in rehabilitation studies may ultimately increase inter-study variability and bias some samples toward more severely impaired patients. We return to this issue when discussing the next two problems.
The second problem with the use of cutoffs relates to the variability in memory test selection and associated normative data. Simply put, there is no "gold standard" for assessing memory functioning, and test selection varies considerably between and even within sites. Word list tests [e.g., the Rey Auditory Verbal Learning Test (Rey, 1941, 1964), California Verbal Learning Test (Delis, Kramer, Kaplan, & Ober, 2000)] are commonly used for clinical purposes, and performance on such measures is sensitive to MCI/AD-related decline (e.g., Estevez-Gonzalez, Kulisevsky, Boltes, Otermin, & Garcia-Sanchez, 2003; Rabin et al., 2009). However, such lists vary in length (generally 9–16 words), semantic relatedness, number and structure of exposures, and other critical factors. While appropriate normative data should minimize concerns about test structure, the quality and nature of such data can also be a source of concern. For example, some tests correct for factors like age, education, ethnicity, and gender while others do not. It is important to note that normative data are often only available or regularly updated for North American populations, which increases the likelihood of error in other parts of the world. Table 1 uses normative data (typically based on North American samples) for three common word list tests to highlight two potential concerns for a theoretical 72-year-old patient.
First, the normative sample sizes vary considerably but are above the accepted number for establishing a normal curve (Fischer, 2010). However, it is possible that the lower end of this "normal" distribution actually represents those with preclinical Alzheimer's disease, since biomarkers have not been widely (if ever) used when collecting normative data. This possibility may mean that "normal" age-related variability in memory functioning is currently over-estimated, especially considering recent evidence that biomarker-positive "normal" older adults demonstrate subtle memory decline relative to biomarker-negative older adults (Sperling et al., 2013). Second, five or fewer words are needed to meet the -1 or -1.5 cutoff values in all cases in Table 1, indicating that floor effects are pervasive. This means that it will be extremely challenging, if not impossible (e.g., MOANS for the RAVLT; RBANS), to document further decline indicative of progression to AD. In essence, the theoretical distinction between lesser impairment in MCI and greater impairment with progression to AD has been compromised. Functionally, this basal level would typically suggest a frank amnestic profile, which is again inconsistent with the diagnosis of MCI. These concerns further highlight the probability that samples will be biased toward more severely impaired patients when cutoff scores are used.
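To make the floor-effect concern concrete, the following sketch (in Python, using hypothetical normative values rather than figures from any test manual) converts a z-score cutoff into the corresponding raw word-list score; when the normative mean is modest, the -1.5 SD threshold sits only a few words above zero, leaving little room to detect further decline.

```python
# Illustrative sketch with hypothetical normative values (not taken from any manual):
# converting a z-score cutoff into a raw word-list score shows how quickly the
# cutoff collapses toward the floor of the test.

def raw_score_at_cutoff(norm_mean: float, norm_sd: float, cutoff_sd: float) -> float:
    """Raw score corresponding to a z-score cutoff (mean + cutoff * SD)."""
    return norm_mean + cutoff_sd * norm_sd

# Hypothetical delayed-recall norms for a 72-year-old on a 15-word list.
norm_mean, norm_sd = 8.0, 3.0

for cutoff in (-1.0, -1.5):
    raw = raw_score_at_cutoff(norm_mean, norm_sd, cutoff)
    print(f"Cutoff {cutoff:+.1f} SD -> roughly {raw:.1f} words recalled")
    # With only a handful of words separating the cutoff from zero, further
    # decline (e.g., progression to AD) is difficult to document psychometrically.
```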
The third problem with using cutoff scores for cognitive rehabilitation research is that they fail to recognize the continuum between "normal" aging and the ultimate diagnosis of AD. This is especially important because "early" MCI patients are more likely to be active and are often still employed. Conversely, "late" MCI patients can be virtually indistinguishable from those with AD, are less likely to be employed, and typically require greater assistance in everyday life. Thus, the treatment goals may vary considerably as a function of disease severity. This continuum certainly presents a challenge for cognitive rehabilitation research and requires careful selection of outcome measures that are widely applicable. We contend that rehabilitation-based research should include the full "MCI spectrum" and that disease severity should be included in the analysis plan. Approximately 53% of the studies reviewed identified patients as having single- or multi-domain MCI (e.g., Petersen, 2004); however, this distinction is problematic for two primary reasons: (1) it is no longer recognized in the Albert et al. (2011) criteria and (2) the designation of multi-domain provides little information about the actual domains affected or the severity of that impairment. A more informative approach may be to use neuropsychological performances as continuous variables and examine their relationship with treatment efficacy, an approach supported by our recent findings. For example, mnemonic strategy-based improvement was positively related to both executive (defined as the Trails B/A ratio) and memory functioning (Delayed Memory Index of the RBANS) and inversely related to the volume of the inferior lateral ventricles (Hampstead, Sathian, Phillips, Amaraneni, Delaune, & Stringer, 2012). Our data suggest that mnemonic strategy training is appropriate for those with "early" MCI, whereas improvement following mere repeated exposure (without strategy use) was unrelated to any cognitive or volumetric variable, suggesting that repeated exposure is appropriate for any stage of MCI. This type of approach will become increasingly important as biomarkers are more widely used in the early identification of Alzheimer's pathology (e.g., Jack, 2012). Thus, using cutoffs would truncate the true range of MCI and negate the identification or use of potentially beneficial treatment methods.
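As one illustration of this continuous-variable approach, the sketch below (Python, with simulated data and hypothetical variable names, not our actual dataset) regresses training-related gain on baseline executive and memory scores; the fitted coefficients would index how disease severity relates to treatment response.

```python
# Minimal sketch of treating baseline neuropsychological scores as continuous
# predictors of treatment response (simulated data; the relationships are arbitrary).
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 40  # hypothetical sample of MCI participants

trails_ba_ratio = rng.normal(2.5, 0.8, n)         # executive measure (e.g., Trails B/A ratio)
delayed_memory_index = rng.normal(85.0, 10.0, n)  # memory measure (e.g., RBANS Delayed Memory)
noise = rng.normal(0.0, 1.0, n)
# Arbitrary simulated gain score; real coefficients are the empirical question of interest.
training_gain = (0.3 * (delayed_memory_index - 85.0) / 10.0
                 + 0.3 * (trails_ba_ratio - 2.5)
                 + noise)

X = sm.add_constant(np.column_stack([trails_ba_ratio, delayed_memory_index]))
model = sm.OLS(training_gain, X).fit()
print(model.summary())  # coefficients describe severity-by-response relationships
```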
On a final note, there may be inherent differences between rehabilitation studies using patients drawn from clinical settings versus community-based recruitment efforts, a possibility that should be examined in future research. With the importance of consistent diagnostic criteria established, we now turn our attention toward factors that underlie perhaps the most common criticism of cognitive rehabilitation research, which is that the effects do not generalize beyond the trained condition. We discuss this issue below (see Challenge 5), but contend that an inadequate delineation of the conditions under which the various techniques are successful (see Challenge 2), a lack of information about the dose-response relationship (see Challenge 3), and questionable selection of outcome measures (see Challenge 4) can all impede generalization.
Methodological Challenge 2: Techniques Used
Just as there are multiple medications available to treat most medical conditions (e.g., high cholesterol), so too are there multiple techniques that could be used, either alone or in combination, in the rehabilitation of memory. Figure 2 highlights several of the most commonly used memory rehabilitation techniques, which we grouped according to the primary “mechanism of action.” Previous research has distinguished between cognitive training (i.e., methods that seek to enhance specific cognitive abilities or areas of deficit) and cognitive rehabilitation (i.e., methods that seek to improve real-world functioning) (e.g., see Bahar-Fuchs, Clare & Woods, 2013). We acknowledge this distinction, but suggest that clearly delineating the conditions under which these classes and/or specific techniques are effective is vital for optimizing treatment outcome. Failure to do so gives rise to a common criticism of this field of research: that training “teaches the task.” As discussed below, this is more likely to occur with some techniques (e.g., computerized training programs/games) but it is critical to understand that this critique becomes less valid as the ecological relevance of the training techniques/tasks increases. For example, it cannot be considered problematic when a patient reports difficulty remembering faces and names and then demonstrates improvement after training/rehabilitation. However, this does not mean that the patient would also experience improvement remembering routes or upcoming appointments, so training may result in some degree of task specificity. Viewing this relative precision as a limitation would be analogous to viewing a statin as ineffective if it lowers some, but not necessarily all, measures of cholesterol (e.g., low density vs. very low density lipoprotein). Addressing this and similar criticisms requires an understanding of the strengths and weaknesses of the various techniques as well as the rational selection of outcome measures (see Challenge 4). The key is to demonstrate that laboratory-based improvement generalizes to everyday life, which is a challenge in its own right (see Challenge 5).
Fig. 2 Common categories and associated techniques that are used in cognitive rehabilitation and cognitive training research. Note that most of these techniques could use errorful or errorless learning methods.
In this section, we first discuss the likely benefits and limitations of these common approaches and then address the current state of research in this area. An important caveat is that almost any of these approaches could use errorful (i.e., allowing the patient to make errors when learning or recalling information) or errorless (i.e., preventing patients from making errors) learning approaches. The potential benefits of errorless learning in the rehabilitation context are discussed elsewhere (Wilson, 2009), and emerged following the recognition that mistakes interfered with memory-disordered patients' ability to encode and retain new information (see Clare & Jones, 2008, for a review).
Common Approaches
As the name suggests, rehearsal-based approaches rely on the repetition of information over time. However, the nature of this repetition can vary considerably across these techniques. For example, spaced retrieval requires the patient to remember targeted information over progressively longer delays, whereas the technique of subtracting cues gradually removes aspects of the target information (e.g., the letters of a name) over successive exposures. These techniques have demonstrated efficacy in patients who have progressed to AD (Cherry, Walvoord, & Hawley, 2010; Small, 2012), results that reinforce our above conclusion that rehearsal may be most appropriate for late MCI patients. These techniques are effective for teaching specific information (Hampstead, Sathian, et al., 2012; Sitzer, Twamley, & Jeste, 2006) (e.g., the names of new church members), but it is critical to understand that the effects are stimulus/information specific and unlikely to generalize (e.g., to other church members or new members of a Senior Center). Therefore, training needs to be repeated for each new piece of information that an individual wants to learn. Practically, this may take the form of a new church member's name, the location of specific household objects, or a specific route to a new doctor's office. So, although effective, the ultimate utility of these techniques is dependent on the situation and is stimulus specific. Rehearsal can also be time consuming and even complex (e.g., with spaced retrieval), factors that may further limit the clinical utility of these techniques. Although many of the available computerized training programs purport to improve cognitive abilities and there is one report of increased hippocampal activation after training (Rosen et al., 2011), such approaches have traditionally yielded conflicting evidence of generalization to standardized neuropsychological tests (e.g., Kueider, Parisi, Gross, & Rebok, 2012; Owen et al., 2010) or everyday functioning across patient populations (e.g., D'Amato et al., 2011; Lundqvist, Grundstrom, Samuelsson, & Ronnberg, 2010). This limitation again highlights the need for training to be functionally oriented.
Compensatory techniques are designed to alter or augment memory processes, thereby changing the manner in which the patient learns, retains, or retrieves information. External compensatory aids are frequently used by even cognitively intact individuals (e.g., grocery lists; smartphones) and can clearly help improve everyday functioning. Such techniques are probably most effective for prospective tasks like remembering appointments and "to-do" lists. Memory notebooks have long been used in the rehabilitation of memory impairment due to traumatic brain injury (TBI) or stroke and are considered a practice standard as a result (Cicerone et al., 2011). Several well-documented approaches exist, including one described in the Cognitive Rehabilitation Manual published by the American Congress of Rehabilitation Medicine (Haskins et al., 2012). Greenaway, Duncan, and Smith (2013) modified this traditional notebook approach and have shown that a regimented training process improves activities of daily living and memory self-efficacy in patients with MCI while also reducing caregiver distress. Although such external aids can be effective, several limitations exist. The main limitation is that the individual is highly dependent on these external aids, meaning that task failure is almost guaranteed if the aid is lost or forgotten. Additionally, the use of such aids may not always be appropriate. For example, social norms dictate that an individual's name be recalled relatively quickly, so using an external aid would be especially cumbersome and awkward for this purpose. Likewise, retrieving specific information will become especially challenging with the passage of time given the accumulation of pages (digital or physical), notes, or bookmarks. Although patients may be able to locate such information by referencing key events (e.g., holidays), patients with MCI have difficulty retaining the temporal aspects of information (Gillis, Quinn, Phillips, & Hampstead, 2013) and associating information (Hampstead et al., 2011; for a review, see Sperling, 2007), findings that suggest that such temporal or associative referencing will be especially challenging.
Mnemonic strategies comprise the internal compensatory category and are cognitive "tools" that facilitate the organization and association of new information, thereby facilitating a deeper level of processing. These techniques include processes like semantic organization, semantic elaboration, and mental imagery. The "internal" (i.e., cognitive) nature of these techniques means that the patient can use them virtually anywhere. Mnemonic strategies are considered a practice standard for treatment of those with mild memory deficits after traumatic brain injury (Cicerone et al., 2011), a fact consistent with our previous finding that these techniques may be most beneficial in early MCI (Hampstead, Sathian, Phillips, et al., 2012). Although comparatively less work has been performed in the field of aging and dementia, a meta-analysis (Verhaeghen, Marcoen, & Goossens, 1992) and several large-scale studies that included mnemonic strategies within larger programs (Craik et al., 2007; Oswald, Rupprecht, Gunzelmann, & Tritt, 1996; Willis et al., 2006) revealed that these techniques facilitate learning and memory in healthy older adults. Our previous findings (Hampstead, Sathian, Moore, Nalisnick, & Stringer, 2008; Hampstead, Sathian, et al., 2012) and those of other groups (Belleville et al., 2011) indicate that mnemonic strategy training can be effective in MCI patients. Because such strategies engage several cognitive processes, it is possible that they can restore the use of "normal" brain regions (or networks) and/or engage alternative compensatory regions (or networks) to achieve behavioral improvement. In fact, we demonstrated increased fMRI-based activation in, and effective connectivity between, several key prefrontal and parietal regions (Hampstead, Stringer, Stilla, Deshpande, et al., 2011) as well as a partial restoration of hippocampal activation in those with MCI (Hampstead, Stringer, Stilla, Giddens, & Sathian, 2012). Belleville and colleagues (2011) also suggested that the right temporoparietal junction may play a compensatory role during strategy use in MCI patients.
These findings and the associated behavioral improvements reinforce the notion that mnemonic strategies adaptively alter the manner in which patients process information. As such, they may hold particular promise for generalizing to other situations and everyday life since they involve general "rules" that, once learned, can be applied across settings. As with the other approaches, however, several limitations exist. First, these techniques are time consuming and effortful, facts that are especially important to consider when selecting outcome measures that provide limited exposure time (e.g., one word per second as in most word lists) or a large amount of information that exceeds attentional/working memory capacity. Second, the cognitive demands may render this approach too complex for more cognitively compromised patients (Hampstead, Sathian, et al., 2012). Third, using strategies may be more cumbersome than the task warrants. For example, it is far more efficient to make a grocery list (i.e., rely on an external aid) than to mentally picture and associate all of the items on that list. This last limitation again reinforces the need to select techniques that most appropriately match the patient's concerns.
Current Literature
Figure 3 reveals the variability in techniques used for the primary experimental group in the 36 memory rehabilitation studies we reviewed. We contend that understanding the conditions under which the approaches/techniques are effective is a critical first step for developing effective, individualized cognitive rehabilitation programs. Nearly half (47%) of the reviewed studies used a single treatment approach. Among these, rehearsal-based approaches were investigated slightly more often than were compensatory techniques. These findings reveal that there is a relatively small, but comparatively precise, body of research that can be used to develop more comprehensive programs. As discussed below, however, it is difficult to directly compare the outcomes of these studies given the highly variable dose (see Challenge 3) and wide array of outcome measures used, many of which are unrelated to the primary intervention techniques (see Challenge 4). Two-thirds of all the studies had total samples of 30 or fewer patients (24/36; 66.7%), which raises additional concerns about statistical power and applicability to the larger MCI population (Figure 4).
Fig. 3 Number of studies (references in superscript) using each category (or approach) of techniques (percent of total in parentheses). Note that these designations refer to the primary intervention group. Several studies included active control conditions that used one or more of these techniques. The description of the intervention was vague in eight studies, in which case the intervention was assigned to the most appropriate category based on the information available. Studies in the “Other” category generally involved some variation of psychosocial education or psychotherapy.
Fig. 4 Total sample size (experimental + control groups) for each of the 36 studies reviewed based on type of intervention.
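To put the sample sizes noted above in perspective, a rough power calculation (a sketch using conventional effect-size benchmarks, not a re-analysis of any reviewed study) shows that a two-group comparison with 15 participants per arm is adequately powered only for large effects.

```python
# Approximate power for a two-sample t-test with n = 15 per group (alpha = .05).
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
for d in (0.3, 0.5, 0.8):  # small, medium, and large effects by Cohen's conventions
    power = analysis.power(effect_size=d, nobs1=15, alpha=0.05, ratio=1.0)
    print(f"d = {d}: power ~ {power:.2f} with n = 15 per group")

# Participants needed per group for 80% power at a medium effect size.
n_required = analysis.solve_power(effect_size=0.5, alpha=0.05, power=0.80, ratio=1.0)
print(f"n per group for 80% power at d = 0.5: ~ {n_required:.0f}")
```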
Half of the studies (50%) used multiple approaches to intervention. This format is more analogous to clinical practice than are the single treatment approaches discussed above and may achieve beneficial effects by exposing patients to a range of options that are adopted based on personal preference (i.e., a “buffet” approach). It is also possible that the combination and duration (see Challenge 3) of these programs facilitates synergistic effects between the various techniques. However, the combination of techniques negates any potential causal inference between a specific technique and an observed cognitive change. Rather, it only allows for the general conclusion that the specific combination of approaches resulted in improvement. This could result in the inclusion of ineffective or unnecessary techniques within a larger rehabilitation program, thereby wasting valuable resources. This limitation would be especially problematic when the clinician is faced with a limited number of billable sessions, particularly with the current trend toward empirically based treatments that may ultimately require the clinician to justify each technique used. It is also difficult to quantify the amount of exposure to, or experience with, each technique. This limitation is addressed in more detail below (see Challenge 3) but raises the possibility that the amount of training, and not the techniques themselves, is responsible for any lack of observed benefit or generalization.
Given the variability in techniques used, we have proposed a hierarchical model that will help identify effective techniques and the conditions under which they are successful (see Future Directions).
Methodological Challenge 3: Dose
Knowing the proper dose is a fundamental precondition for any treatment, yet such knowledge is lacking in the field of cognitive rehabilitation compared to other disciplines. For instance, the motor rehabilitation literature has demonstrated that functional improvement and the associated neuroplastic change require thousands upon thousands of trials. This includes 2,500 repetitions of hand movements (Boyd & Winstein, 2003), 18,000 for a finger tracking task (Carey et al., 2002), and 31,500 for sequenced finger movements (Karni et al., 1995). Likewise, constraint-induced movement therapy provides patients with hundreds of hours of use with the affected limb (Wolf et al., 2006, 2008). Cognitive processes are far more complex than many of these basic motor activities, so it may be reasonable to assume that comparable or even greater exposure is necessary to fully address these complex processes. It is also important to recognize the difference between clinical approaches, which generally provide effectiveness data, and research studies, which generally provide efficacy data. For example, clinically oriented cognitive rehabilitation often uses a patient-specific approach in which the amount of treatment (i.e., sessions, hours, trials) varies depending on the targeted area(s)/abilities and the patient's adherence to the treatment regimen. One example of this approach is the Ecologically Oriented Neurorehabilitation of Memory (EON-Mem; Stringer, 2007), which is tailored to real-world memory tasks that are both important to, and challenging for, each individual patient. Nightly homework is given and patients continue performing characteristic exercises with each memory task until they have demonstrated a reasonable level of mastery. This approach can be clinically effective (e.g., Stringer, 2011); however, it is largely inconsistent with the research setting, where manualized, inflexible, and time-limited randomized controlled trials (RCTs) are considered the gold standard. As such, RCTs are unlikely to fully accommodate an individual patient's needs. Such differences have led to calls for more clinically oriented research methods without a strict reliance on RCTs (e.g., Whyte, Gordon, & Gonzalez Rothi, 2009).
As noted, there have been few attempts to identify dose-response relationships in studies of cognitive rehabilitation with MCI. This may relate to the difficulty in defining "dose," since it could refer to the exposure (i.e., number of trials or amount of time) necessary to learn specific information, the exposure necessary to learn to use a given technique or techniques, or the exposure necessary to demonstrate either short- or long-term improvement in functioning. The most common way of expressing dose in the 36 studies we reviewed was in the number of sessions and hours per session. The number of sessions in these studies ranged from 1 to 103, and Figure 5 shows the total number of hours (rounded to the nearest hour) for each study by technique category. It is noteworthy that the duration of intervention did not appear to be empirically based since justification was not typically provided. In essence, the research seemed to use a "best guess" approach. As could be expected, studies using multiple techniques provided more training (i.e., hours) than did those using only one type of intervention. Rehearsal-based studies using computerized programs provided 40–67 hr of training, yet only two of the six provided justification for the amount of training. Barnes et al. (2009) and Rosen et al. (2011) provided variable lengths of training that depended on patients achieving either asymptotic performance or over 80% completion on a given portion of the program. However, a dose-response relationship has not been established using these programs despite the relative ease with which this could be done.
Fig. 5 Total hours of treatment for each of the reviewed studies based on type of intervention.
Regardless of the approach, it could be argued that the number of hours is an insensitive metric since it provides little information about how much practice patients actually receive using the trained technique. Conversely, a trial-based metric may be more beneficial in this regard. For example, we found an inverse relationship between "long-term" memory of face-name associations and the number of trials needed to learn these stimuli using mnemonic strategies (Hampstead et al., 2008). Our recent RCT showed that mnemonic strategies enhanced memory after a single trial (Cohen's d = 1.18 in healthy older adults; d = 0.97 in MCI) and these benefits persisted over additional trials and subsequent sessions (see data presented in Table 3 of Hampstead, Sathian, et al., 2012).
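The effect sizes mentioned above are Cohen's d values; as a reference point, here is a minimal sketch of one common (pooled-standard-deviation, between-group) way of computing d. The numbers plugged in are placeholders rather than the published data, and the published values may have been derived differently.

```python
# Pooled-SD Cohen's d for a between-group comparison (placeholder values only).
import math

def cohens_d(mean_tx, sd_tx, n_tx, mean_ctrl, sd_ctrl, n_ctrl):
    """Between-group Cohen's d using the pooled standard deviation."""
    pooled_sd = math.sqrt(((n_tx - 1) * sd_tx**2 + (n_ctrl - 1) * sd_ctrl**2)
                          / (n_tx + n_ctrl - 2))
    return (mean_tx - mean_ctrl) / pooled_sd

# Hypothetical single-trial recall (items correct) in strategy vs. exposure-only groups.
print(round(cohens_d(mean_tx=7.5, sd_tx=2.0, n_tx=20,
                     mean_ctrl=5.5, sd_ctrl=2.2, n_ctrl=20), 2))  # about 0.95
```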
Given the lack of information in this area, we recommend that future research (1) provide a rationale for the dose provided and (2) consider using a trial-based measure of dose as opposed to a gross time- or session-based one. Of course, determining the dose at which a given technique or combination of techniques is effective is only relevant if the outcome measures are appropriately selected.
Methodological Challenge 4: Outcome Measures
Most medical research targets a specific condition or disease (e.g., blood pressure, cancer), so precise outcome measures can be obtained, directly compared across studies, and then translated into clinical practice. However, the study of cognition is far less precise. Just as there are no "gold standards" for assessing memory impairment, none exist for assessing outcome following memory rehabilitation. This is an obvious problem when attempting to compare results across studies, especially when taking Challenges 1–3 into account. We have identified four general types of outcome measures based on a review of the 36 studies and have summarized some of the benefits and limitations of each type (Table 2). We contend that outcome measures need to be carefully selected based on both theoretical and practical rationales and that these rationales need to be clearly conveyed in the research report. Although seemingly straightforward, such information was rarely provided in the 36 studies we reviewed. In fact, only approximately 64% of the studies provided justification for a general domain (e.g., memory, attention) while a mere 28% justified the use of specific tests/tasks.
Table 2 General types of outcome measures and some benefits and limitations associated with each
Table 3 demonstrates the profound variability in outcome measures used across the 36 studies. Multiple outcome measures of the same type (e.g., neuropsychological tests) appeared to be the rule rather than the exception within studies, a finding that reinforces concern about the theoretical justification of these measures as well as the statistical power of the studies themselves. Neuropsychological measures were, by far, the most commonly used, with a total of 49 different measures across 31 studies. The Mini-Mental State Examination (Folstein, Folstein, & McHugh, 1975) was the most often used individual measure (12 times), the utility of which could certainly be questioned given its non-specific nature and potential ceiling effects in mildly impaired populations. Across studies, the inferred rationale was that neuropsychological tests should be sensitive to rehabilitative efforts since both target cognitive processes. This assumption is fundamentally linked to the nature of the training program and is limited by the psychometric properties of the outcome measures. These assumptions will be incorrect in several instances. For example, some version of a word list task was used in 17 studies, yet such tasks are more likely to demonstrate change when patients are taught ways of semantically organizing information, whereas change is less likely when the intervention focuses on the acquisition of specific information/content. The design of most neuropsychological tests (including word lists) is inappropriate for measuring the efficacy of external aids since their use would violate standardization. Furthermore, the limited stimulus exposure (e.g., one word per second) may mitigate the use of internal mnemonic strategies or rehearsal-based approaches, especially in bradyphrenic patients. The ultimate ecological relevance of such measures is tenuous (Bjornebekk, Westlye, Walhovd, & Fjell, 2010; Farias, Harrell, Neumann, & Houtz, 2003; Ruff, 2003). These criticisms do not mean that previous research is critically flawed. For instance, Belleville and colleagues (2011) used a word list format to elucidate the neural mechanisms underlying treatment success in patients with MCI. Instead, these criticisms highlight the need for researchers to justify the inclusion of such tests while also considering more ecologically relevant/valid methods as primary outcome variables.
Table 3 Specific tests used as outcome measures in the 36 reviewed studies
Self-report (36 measures in 23 studies) and informant-report (13 measures in 8 studies) measures were the next most commonly used, respectively, but there was again little consistency across studies. Measures assessing activities of daily living (ADLs) and emotional functioning, especially anxiety and depression, were most frequently used. Using measures of ADLs is debatable depending on how strictly one adheres to the diagnostic criteria for MCI, which require patients to generally be independent. However, research has demonstrated that patients with MCI often show some functional limitations, the severity of which appears to increase as patients progress toward AD (Brown et al., 2011; Burton, Strauss, Bunce, Hunter, & Hultsch, 2009). Therefore, the potential range for improvement is likely to be far smaller in "early" MCI than in "late" MCI, resulting in bias toward the latter. Although administered in only three studies, the Multifactorial Memory Questionnaire (MMQ) may be an especially promising self-report measure given its focus on clinically relevant aspects of memory, strong psychometric properties (see Troyer & Rich, 2002, for a full description), and sensitivity to change after treatment in both healthy older adults (Carretti, Borella, Zavagnin, & De Beni, 2011) and MCI (Kinsella et al., 2009; Troyer, Murphy, Anderson, Moscovitch, & Craik, 2008).
Despite the importance of targeting ecologically relevant tasks/abilities, only 10 of the 36 studies (27.8%) used such measures to assess outcome. The limited use of such measures may be yet another reason why generalization is rarely observed and can likely be traced back to the facts that (1) there is a paucity of measures from which to select and (2) the few available measures are rarely used in clinical practice. However, a case could be made for including the Rivermead Behavioural Memory Test (RBMT; RBMT-Extended; Wilson, Cockburn, Baddeley, & Hiorns, 1989; Wilson et al., 1998; Wilson et al., 2008) in this category given its demonstrated ecological validity (Wilson et al., 1989) and multiple test versions that make it attractive for rehabilitation research. If the RBMT is counted in this category, a total of 18 studies (50%) used ecological tasks. Despite its strengths, the RBMT may be relatively insensitive to change given the small number of items within each subtest (i.e., ceiling effects) (Wester, Leenders, Egger, & Kessels, 2013) and limitations in the normative data (Strauss, Sherman, & Spreen, 2006, pp. 843–845). Wills, Clare, Shiel, and Wilson (2000) suggested that combining the different forms may help overcome the former limitation, and the RBMT-Extended is another viable possibility for this purpose, but these options do not address concerns about the normative data. Other measures like the Ecologic Memory Simulations (Stringer, 2011) assess memory functioning across several ecologically based tasks, have multiple versions, and have also shown clinical benefit following cognitive rehabilitation in a mixed neurological sample (Stringer, 2011), but have not been used in an aged population. Given the relative lack of tests and the growing importance of such measures, it is clear that the development of ecologically valid tests would benefit rehabilitation research as well as the general field of neuropsychology.
In light of these concerns, we suggest that (1) the strengths and limitations of each type of outcome measure be carefully considered during the study design phase, (2) the rationale for each outcome measure be clearly stated and theoretically and/or functionally related to the treatment approach, and (3) ecological measures be included whenever possible. Care must also be taken, however, to avoid pitfalls associated with the overuse of the same outcome measures across studies in the absence of a control group, since this was found to potentially bias results in the TBI and stroke cognitive rehabilitation literature (Rohling, Faust, Beverly, & Demakis, 2009). Ultimately, designing the study and outcome measures to assess patient-specific goals may hold particular promise since it would increase patient motivation and adherence to the treatment regimen.
Methodological Challenge 5: Generalization
The ultimate goal of any rehabilitation program is to improve functioning within the patient's everyday life. Achieving this goal is easier when there is a definitive task that is applicable across a wide range of activities. Returning to the motor rehabilitation field, it is clear that regaining the use of an affected limb within the laboratory or clinical setting can have widespread impact on "real-world" activities (e.g., Wolf et al., 2006). Again, cognition is less precise and individuals must learn and remember a vast array of information to function well in everyday life. Each of the Challenges discussed above can affect generalization, whether due to disease progression, an unknown dose-response relationship, and/or outcome measures that have minimal relation to everyday life. Perhaps most important are the issues raised in Challenge 2. It is clear that no single technique or even combination of techniques is sufficient for meeting all of the possible situations in everyday life. Therefore, it is inappropriate to criticize cognitive rehabilitation research (or clinical practice) for a lack of widespread or global memory improvement; however, concerns about the ecological relevance/validity of training and failure to demonstrate improvement in relevant "real world" tasks are unquestionably valid and should be the focus of future research. These issues were highlighted by Dr. Barbara Wilson (2009), a pioneer in the field of cognitive rehabilitation, when she wrote, "We need to take into account individual preferences and styles because different people may prefer different strategies and, as far as possible, we should focus on things that the person with memory impairments wants and needs to learn. This means we should work on material that will be useful in everyday life. Finally, generalization or the transfer to real life must be built into the training program" (p. 81). Using the existing literature to guide future studies, the key question is how to build such generalization into the training paradigm. In the next section, we offer a model that may be helpful in this regard.
Future Directions
The marked variability in the literature reviewed above leads us to the conclusion that additional work investigating the benefits of individual categories and/or techniques is appropriate at this point in time (see Challenge 2). We previously presented a hierarchical model for establishing the efficacy of internal mnemonic strategies (Hampstead, Sathian, et al., 2012). In light of the current review, we believe it appropriate to revise this model to integrate the Challenges discussed above (Figure 6). At the first and most basic stage of the hierarchy, the technique should help patients learn and remember specific information (e.g., the names of specific church members). If the goal is to improve specific cognitive abilities, as in cognitive training, the approach should improve the targeted abilities. Failure to demonstrate such basic benefits means that the technique likely holds minimal value for those with MCI and should not be included in a comprehensive rehabilitation program. The second stage should demonstrate that patients are capable of independently using the technique to learn information similar to the training conditions (e.g., using the technique to learn the names of a social group). Failure at this stage suggests that the technique is primarily useful under the conditions of Stage 1. Stage 3 requires the critical leap from experimental conditions to a broader range of ecological situations (e.g., learning a new route) or broader cognitive abilities. Basic behavioral (Skinnerian) principles for promoting generalization can be implemented in this process. For example, training could start with experimental stimuli (e.g., specific face-name pairs), progress to laboratory/clinic-based analogues of real-world tasks (e.g., a mock cocktail party), and finally focus on real-world environments (e.g., having the patient use the technique(s) at church, social events, and other similar situations). Explicit instruction should be provided in how patients should apply the technique to other types of information (e.g., routes), and the generalization process should be reinforced. While the persistence of any improvements can and perhaps should be addressed at each stage, this information will be especially critical during Stage 3 because it would dictate when "booster" sessions are provided in the clinical program. Failure to generalize across situations would indicate the need to alter training processes and/or focus training efforts on areas that are most important to the patient (i.e., reverting to Stage 2). In essence, the hierarchy progresses from a content-based to a rule-based approach to learning. Consistent with the major challenges identified in this review, we recommend that the effects of disease severity and dose be considered at each stage. Outcome measures should be selected to answer stage-specific questions.
Fig. 6 Hierarchical approach for establishing empirical support for memory rehabilitation techniques. Adapted with permission from Hampstead, B. M., Sathian, K., Phillips, P. A., Amaraneni, A., Delaune, W. R., & Stringer, A. Y. (2012). Mnemonic strategy training improves memory for object location associations in both healthy elderly and patients with amnestic mild cognitive impairment: A randomized, single blind study. Neuropsychology, 26, 385–399. Copyright 2012, APA.
We posit that this approach will facilitate direct comparisons across studies and clarify how a given technique (or category of techniques) is most appropriately used for those with MCI. Once established, the empirically supported techniques could be combined into more comprehensive programs that target an individual patient's needs.
Biomarkers and Neuroimaging
Given the advances in biomarkers of Alzheimer's disease, future studies should integrate these measures into their inclusion criteria and analysis plans (e.g., compare intervention effects in biomarker positive vs. negative participants). This may be especially effective when examining methods to prolong functioning of older adults who are biomarker positive yet cognitively asymptomatic. However, additional guidelines are needed to ensure that the various biomarkers are collected, analyzed, and reported in a standardized manner across sites. This is a challenge unto itself but should be aided through large-scale efforts like the Alzheimer's Disease Neuroimaging Initiative (ADNI; http://www.adni-info.org). These same biomarkers could then be used to predict treatment success (e.g., Hampstead, Sathian, et al., 2012). In a recent review, Belleville and Bherer (2012) reported evidence of cognitive training-induced changes in both structural and functional neuroimaging measures. We are especially supportive of such efforts and believe that this type of multi-method approach will facilitate the understanding, development, and selection of techniques that are most effective in those with MCI. Such data could also provide a priori rationale for pairing cognitive training techniques with specific pharmacologic agents or other non-pharmacologic approaches (e.g., exercise).
Summary
While there has been considerable progress in the early identification of those at risk of AD and other forms of dementia, treatment options are lacking. Several novel pharmacological agents are under development; however, these are years away from being widely available under the best circumstances. Even if such agents completely arrested the disease process, patients would presumably experience residual cognitive impairment. Thus, there is a clear need to identify non-pharmacologic interventions that can help maintain and/or maximize functioning in those with MCI, yet success depends on improved methodological rigor. We have highlighted four methodological challenges that are critical to consider in designing and interpreting research studies, each of which can impact the fifth challenge of promoting generalization to everyday life. The proposed model may provide a useful framework for establishing the conditions under which different techniques are effective. This model may also inform other, more holistic, approaches to rehabilitation (Huckans et al., 2013) and have synergistic interactions with other non-pharmacologic interventions like exercise, diet, and general cognitive stimulation. Given the available evidence, a well-developed and empirically supported, multi-method approach likely holds the most promise for maintaining functioning in this growing population.
Acknowledgments
This work arose from an invited lecture given at the XVII International Symposium in Geriatric Psychiatry, sponsored by the Old Age Research Group (PROTER), University of São Paulo, São Paulo, Brazil. We thank Dr. Cassio Bottino for his support in this regard. Grant support from the Department of Veterans Affairs, Veterans Health Administration, Office of Research and Development, and Rehabilitation Research and Development Service [B6366W to BMH] is acknowledged. The contents of this manuscript do not represent the views of the Department of Veterans Affairs or the United States Government. Dr. Anthony Y. Stringer is the author of the Ecologically Oriented Neurorehabilitation of Memory (EON-Mem) program and receives royalties from its sale. There are no other conflicts of interest or financial disclosures. Each author provided significant intellectual contribution to warrant authorship and declares that he/she has seen and approved this manuscript. Dr. Benjamin M. Hampstead had full access to all the data in the study and had final responsibility for the decision to submit for publication.