INTRODUCTION
Early and accurate diagnosis of neurodegenerative disorders is a global health priority, as the number of individuals with dementia worldwide is estimated to reach 152 million by 2050 (Prince et al., 2013). Neuropsychological assessment is used to detect cognitive dysfunction and help determine if the impairment is associated with underlying neuropathological changes, making this testing a crucial component of establishing a dementia diagnosis. For example, episodic memory impairment related to medial temporal lobe dysfunction is a hallmark feature of early Alzheimer’s disease (AD) that helps distinguish it from other neurodegenerative disorders and normal age-related changes in cognition. An important consideration for neuropsychological assessment is test selection. Ideally, a battery includes enough measures to comprehensively assess each cognitive domain without overtaxing patients. There are numerous measures of episodic memory spanning different modalities (verbal vs. nonverbal) and formats (structured vs. unstructured). Understanding the utility of these measures is important for the evidence-based neuropsychological assessment of memory disorders.
Story memory tests, such as the Wechsler Memory Scale – 4th edition (WMS-IV; Wechsler, 2009) Logical Memory (LM) subtest, are among the most commonly used measures in research and clinical contexts. The original LM test was published in 1945 and has undergone several iterations since then (Wechsler, 1945). In general, examinees are read a short story and asked to recall it immediately after it is presented and again after a short delay. Large-scale research studies, such as the Alzheimer’s Disease Neuroimaging Initiative (ADNI) and the Anti-Amyloid Treatment in Asymptomatic Alzheimer’s Disease (A4) trial, have relied on story memory performance as part of their diagnostic criteria to objectively identify cognitive impairment. However, research has demonstrated that ADNI’s mild cognitive impairment (MCI) criteria may result in diagnostic inaccuracies, as reclassification of MCI with actuarial diagnostic criteria that include comprehensive neuropsychological testing results in stronger associations with AD biomarkers and higher risk of progressing to dementia (Bondi et al., 2014; Edmonds et al., 2015).
Word list learning tasks, such as the Hopkins Verbal Learning Test – Revised (HVLT-R; Brandt & Benedict, 2001), are another frequently used type of verbal memory test and require examinees to learn a list of words over multiple trials and then recall the words after a delay. Story memory and word list learning tasks are often included together in neuropsychological test batteries. Research has shown that story memory and list learning tasks are both sensitive to memory changes associated with MCI and AD (Petersen et al., 1999; Rabin et al., 2009) and are highly correlated with each other (Delis, Cullum, Butters, Cairns, & Prifitera, 1988). A systematic review of neuropsychological measures that predict progression from MCI to dementia found that delayed recall subtests from story recall and list learning measures each had an overall accuracy of greater than 90% (Guild Paragraph: Sensitivity = 0.96, Specificity = 0.83; RAVLT: Sensitivity = 0.92, Specificity = 0.94; Belleville, Fouquet, Hudon, Zomahoun, & Croteau, 2017). Such findings raise the question of whether these types of tests are interchangeable and whether including both in a test battery is redundant.
Differences in the formats of these tasks suggest they may each provide unique information. The narrative format of story memory tasks provides context and structure, which can aid memory, whereas list learning tasks are unstructured and allow for the examination of the ability to learn over repeated presentations (list learning tasks are typically presented over 3–5 trials, whereas story memory tasks are presented once, sometimes twice at most). There is some evidence that list learning tasks are more strongly associated with executive dysfunction than story memory tasks (Brooks, Weaver, & Scialfa, 2006; Tremont, Halpert, Javorsky, & Stern, 2000; Tremont, Miele, Smith, & Westervelt, 2010; Zahodne et al., 2011), which could contribute to higher rates of list learning impairment. Alternatively, the unstructured format of list learning tasks may make them more challenging than story memory tasks and, therefore, more sensitive to detecting early changes in memory. For example, previous work in this area found that list learning was better at distinguishing cognitively normal individuals from those with MCI, as compared to story memory (Rabin et al., 2009).
Although studies have found associations between individual episodic memory tests and hippocampal volumes in cognitively normal older adults (Hackert et al., 2002; Rosen et al., 2003; Zimmerman et al., 2008) and in individuals with AD (Marchiani, Balthazar, Cendes, & Damasceno, 2008; Sarazin et al., 2010), direct comparisons between structured and unstructured verbal memory tests in terms of their association with underlying cortical areas are limited. Ezzati et al. (2016) found that, among cognitively healthy older adults, list learning performance was associated with total and left hippocampal volume, whereas paragraph recall was not significantly related to hippocampal volumes, consistent with other studies in healthy older adults (Marquis et al., 2002; Rodrigue & Raz, 2004). The finding that list learning tasks, but not story memory tasks, are associated with hippocampal volumes in cognitively normal adults may support the early sensitivity of list learning tasks. In contrast, in a mild AD sample, both list learning and story memory delayed recall were significantly associated with hippocampal volume (Wolk & Dickerson, 2011). Story memory tasks may not reflect hippocampal atrophy until later in the disease process.
In addition to verbal memory, assessment of nonverbal or visuospatial memory is a standard component of neuropsychological evaluations. Nonverbal memory tasks, such as the Brief Visuospatial Memory Test – Revised (BVMT-R; Benedict, 1997), typically involve the reproduction of geometric designs immediately after stimulus presentation and after a delay period. Some findings support material-specific lateralization of verbal memory tests with left medial temporal lobe structures and nonverbal/spatial memory tests with right medial temporal lobe structures in older adults with normal cognition (Ezzati et al., 2016) and those with AD (De Toledo-Morrell et al., 2000; Petersen et al., 2000). In contrast, previous work by Bonner-Jackson et al. (2015) found that verbal (HVLT-R) and nonverbal (BVMT-R) memory in a clinical sample were both significantly associated with bilateral hippocampal volumes, though a greater number of BVMT-R indices showed positive associations with hippocampal volumes than HVLT-R.
To examine the utility of including story memory tests in neuropsychological assessment of older adults, the current study directly compared story memory to list learning and nonverbal memory on rates of impairment and association with hippocampal volumes in a memory disorder clinic sample. It was expected that story memory would be impaired less often and that list learning and nonverbal memory would be more strongly associated with hippocampal volumes than story memory. This study also aimed to extend findings by Bonner-Jackson et al. (2015) by comparing the association of verbal and nonverbal memory performance with left and right hippocampal volumes.
METHOD
Participants
This study was conducted in compliance with regulations of the Institutional Review Board of the Cleveland Clinic and in accordance with the Helsinki Declaration. Archival data were obtained from a clinical data repository consisting of patients who completed a neuropsychological evaluation as part of their routine care at a specialty outpatient neurology clinic for cognitive disorders. As patients are often referred for diagnostic clarification purposes, the cognitive disorder diagnosis (i.e., cognitively normal, MCI, or dementia) and etiology of cognitive deficits are often unknown at the time of data entry, which occurs immediately after testing is completed. Therefore, diagnosis and underlying causes are not available in the data repository. It captures a mixed clinical sample referred for neuropsychological assessment due to concerns about cognition (mostly memory complaints), which is reflective of patients seen in memory disorder clinics. In general, the most frequent diagnosis seen in this clinic is AD; other typical diagnoses include vascular cognitive impairment, frontotemporal dementia, Parkinson’s disease and atypical Parkinsonian syndromes, MCI, and subjective cognitive impairment.
Patients were selected for this study if they were ≥ 65 years old and completed all three episodic memory measures of interest (described below). For patients who had repeated neuropsychological evaluations, only data from their first evaluation were included. This resulted in a total sample of 1617 older adult patients (mean age = 74.4 years, range = 65–93 years) who were included in analyses related to impairment rates. A subset of included participants (n = 182) had magnetic resonance imaging (MRI) data that underwent a quality assessment process. Three participants with significant outlier hippocampal data were removed, resulting in a sample of n = 179 for hippocampal volume analyses. Demographic and descriptive information is presented in Table 1 for the total sample and the MRI sample.
Table 1. Demographic and clinical information
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20220720141039989-0084:S1355617721000850:S1355617721000850_tab1.png?pub-status=live)
LM = Logical Memory; HVLT-R = Hopkins Verbal Learning Test – Revised; BVMT-R = Brief Visuospatial Memory Test – Revised; TIV = Total Intracranial Volume.
ᵃAge-adjusted LM scaled scores are presented for the total sample and raw scores are presented for the MRI sample; ᵇAge-adjusted T scores are presented for the total sample and raw scores are presented for the MRI sample.
Procedure and materials
Neuropsychological measures
Three well-validated episodic memory measures administered as part of a comprehensive neuropsychological battery were included: the WMS-IV (Wechsler, 2009) LM subtest, the HVLT-R (Brandt & Benedict, 2001), and the BVMT-R (Benedict, 1997). Standard forms for each of these measures were used.
For the WMS-IV LM subtest, local clinical protocols indicate the use of the Older Adult version (aged 65–90) for all individuals over the age of 65, which involves oral presentation of two short stories, with the first story being presented twice. Examinees are asked to immediately recall the stories after each presentation (Immediate Recall; range = 0–53). A delayed free recall trial is completed 25–35 min later, for which examinees are asked to retell both stories (Delayed Recall; range = 0–39). Age-adjusted scaled scores for Immediate Recall and Delayed Recall based on published normative data for the WMS-IV were used to examine impairment rates; raw scores were used for hippocampal volume analyses.
The HVLT-R involves oral presentation of a 12-item word list over three learning trials. After each trial, examinees are asked to recall as many words as possible in any order. Scores for each trial are summed to create an immediate recall score (Total Recall, referred to here as Learning; range = 0–36). After a 20-minute delay, a delayed free recall trial is completed (Delayed Recall; range = 0–12). Age-adjusted T scores for Learning and Delayed Recall based on published normative data for the HVLT-R were used to examine impairment rates; raw scores were used for hippocampal volume analyses.
The BVMT-R is a nonverbal memory test that has a structure similar to that of the HVLT-R. For each of the three learning trials, examinees view a stimulus page with six geometric figures on it for 10 s and then are asked to draw as many figures in their correct location as possible (Total Recall, referred to here as Learning; range = 0–36). After a delay, participants are asked to spontaneously produce the figures again (Delayed Recall; range = 0–12). Age-adjusted T scores for Learning and Delayed Recall based on published normative data for the BVMT-R were used to examine impairment rates; raw scores were used for hippocampal volume analyses.
Structural MRI parameters
Imaging was performed using a 3.0 T MRI scanner (Siemens Verio prior to September 1, 2016; Siemens Skyra after that date) and 32-channel head coil. Verio acquisition was as follows: 3D magnetization-prepared rapid acquisition gradient echo (MPRAGE); 160–190 slices/1 mm³ or 1 mm² × 1.2 mm; sagittal acquisition; repetition time [TR] = 2300 ms; echo time [TE] = 2.98 ms; flip angle = 9°. Skyra protocol was identical except for TE = 2.83 ms. Gross segmentation was assessed with an in-house quality assessment protocol involving the review of imaging by research assistants prior to analysis. Volumetric analysis was conducted with FreeSurfer 6.0 (Dale, Fischl, & Sereno, 1999; Dale & Sereno, 1993; Fischl, Liu, & Dale, 2001; Fischl et al., 2002, 2004; Iglesias et al., 2015; Reuter, Rosas, & Fischl, 2010; Ségonne et al., 2004; Ségonne, Pacheco, & Fischl, 2007; Sled, Zijdenbos, & Evans, 1998).
Statistical analysis
Statistical analyses were conducted using SPSS, version 26 (IBM Corp, 2019). To examine rates of impairment, age-adjusted scaled scores (for LM) and T scores (for HVLT-R and BVMT-R) based on each test’s published normative data were converted to Z scores to provide a common cutoff across measures. Impairment was defined as >1.5 SD below age-adjusted normative means. Age-adjusted scores were used to increase ecological validity, as these scores are typically considered for diagnostic decision-making in clinical neuropsychological assessment. Frequency distributions were used to examine cumulative rates of impairment for each memory test. Participants were also grouped according to their pattern of impairment on memory testing based on delayed recall performance across all measures, and frequency distributions examined patterns of impairments (e.g., impaired on all three measures, impaired on HVLT-R and BVMT-R only, etc.).
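The conversion to a common Z-score metric and the grouping of participants by impairment pattern can be illustrated with a short Python sketch. It is not the SPSS procedure used in the study. It assumes the conventional scaled-score metric (mean = 10, SD = 3) and T-score metric (mean = 50, SD = 10), and all function names and example scores are hypothetical.

```python
from collections import Counter

# Conventional normative metrics (assumed here; confirm against each test manual).
SCALED_MEAN, SCALED_SD = 10.0, 3.0  # Wechsler scaled scores (LM)
T_MEAN, T_SD = 50.0, 10.0           # T scores (HVLT-R, BVMT-R)
CUTOFF_Z = -1.5                     # impaired if more than 1.5 SD below the mean


def scaled_to_z(scaled):
    """Convert an age-adjusted scaled score to a Z score."""
    return (scaled - SCALED_MEAN) / SCALED_SD


def t_to_z(t_score):
    """Convert an age-adjusted T score to a Z score."""
    return (t_score - T_MEAN) / T_SD


def is_impaired(z):
    """Apply the common cutoff: strictly more than 1.5 SD below the mean."""
    return z < CUTOFF_Z


def impairment_pattern(lm_scaled, hvlt_t, bvmt_t):
    """Label which delayed recall measures fall in the impaired range."""
    flags = {
        "LM": is_impaired(scaled_to_z(lm_scaled)),
        "HVLT-R": is_impaired(t_to_z(hvlt_t)),
        "BVMT-R": is_impaired(t_to_z(bvmt_t)),
    }
    impaired = [name for name, flag in flags.items() if flag]
    return "+".join(impaired) if impaired else "none"


# Hypothetical delayed recall scores: (LM scaled, HVLT-R T, BVMT-R T).
patients = [(5, 34, 50), (10, 50, 50), (4, 30, 28), (9, 33, 31)]
pattern_counts = Counter(impairment_pattern(*p) for p in patients)
for pattern, count in pattern_counts.items():
    print(f"{pattern}: {100 * count / len(patients):.1f}%")
```

Under these assumptions, a T score of exactly 35 (Z = −1.5) is not counted as impaired, because the cutoff requires performance more than 1.5 SD below the mean.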
Raw scores from memory tests were used for hippocampal volume analyses to allow for the inclusion of age as a covariate. Given the nature of the population, several variables were not normally distributed, partially due to a high number of participants who did not recall any information after the delay periods. Therefore, nonparametric partial correlations, controlling for age, education, and total intracranial volume (TIV), were conducted to examine the associations between individual memory measures and hippocampal volumes. Fisher’s r-to-z transformations testing dependent correlations from a single sample compared the magnitude of correlations between the two hemispheres and between the different memory measures (Lee & Preacher, 2013). Linear regression models were fit using age, education, and TIV to predict left and right hippocampal volumes separately. Memory scores were then individually added to separate models to determine the increase in the predictive value of including each memory test score.
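One common way to obtain a nonparametric partial correlation is to rank-transform every variable and then apply the recursive partial correlation formula to the ranks. The Python sketch below illustrates that approach. Whether it matches the exact SPSS procedure used here is an assumption, and the function names and toy data are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, covariates):
    """Partial correlation of x and y, removing covariates one at a time."""
    if not covariates:
        return pearson(x, y)
    z, rest = covariates[-1], covariates[:-1]
    r_xy = partial_corr(x, y, rest)
    r_xz = partial_corr(x, z, rest)
    r_yz = partial_corr(y, z, rest)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def partial_spearman(x, y, covariates):
    """Rank-based (Spearman) partial correlation controlling for covariates."""
    ranked = [average_ranks(c) for c in covariates]
    return partial_corr(average_ranks(x), average_ranks(y), ranked)


# Toy data only: a memory score, a volume, and one covariate.
memory = [1, 2, 3, 4, 5, 6]
volume = [2, 1, 4, 3, 6, 5]
age = [1, 3, 2, 5, 4, 6]
print(round(partial_spearman(memory, volume, [age]), 3))
```

In the study, the covariate list would contain age, education, and TIV; the recursion handles any number of covariates.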
RESULTS
Impairment rates (n = 1617)
Overall, participants were impaired on LM less often than HVLT-R or BVMT-R (Figure 1). For LM, 21.6% of participants were impaired on learning, and 35.7% were impaired on delayed recall. For HVLT-R, 34.8% of participants were impaired on learning, and 48.8% were impaired on delayed recall. For BVMT-R, 51.6% were impaired on learning, and 46.1% were impaired on delayed recall.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20220720141039989-0084:S1355617721000850:S1355617721000850_fig1.png?pub-status=live)
Fig. 1. Percentage of impaired scores (>1.5 standard deviations below normative means). LM = Logical Memory; HVLT-R = Hopkins Verbal Learning Test – Revised; BVMT-R = Brief Visuospatial Memory Test – Revised.
Patterns of impairment on memory testing based on delayed recall performance across all measures were also examined. Impairment on all three measures was the most common pattern of impairment (25%), followed by impairment on both HVLT-R and BVMT-R (10.3%). Impairment on both verbal memory measures with intact BVMT-R occurred less often (5.3%). Impairment on LM and BVMT-R with intact HVLT-R was the least frequent pattern, occurring in only 1.7% of patients. For those who were impaired on only one memory test, impairment on BVMT-R was most common (9.2%), followed by HVLT-R (8.3%), and then by LM (3.8%).
Association with hippocampal volumes (n = 179)
Correlational analyses
Learning and delayed recall raw scores for all three memory measures showed significant positive correlations with hippocampal volumes (r’s = .264 – .432, p’s < .001; Table 2). Using a two-tailed test of significance, Fisher’s r-to-z transformation results indicated that the magnitude of the correlations was not differentially associated with right and left hippocampus: LM Immediate Recall (z = −0.26, p = .79); LM Delayed Recall (z = −0.19, p = .85); HVLT-R Learning (z = −0.88, p = .38); HVLT-R Delayed Recall (z = −0.13, p = .90); BVMT-R Learning (z = 0.78, p = .44); BVMT-R Delayed Recall (z = 0.75, p = .46).
Table 2. Partial correlations between memory performance and hippocampal volumes
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20220720141039989-0084:S1355617721000850:S1355617721000850_tab2.png?pub-status=live)
**p < .001; LM = Logical Memory; HVLT-R = Hopkins Verbal Learning Test – Revised; BVMT-R = Brief Visuospatial Memory Test – Revised.
Fisher’s r-to-z transformations were also used to compare the magnitude of correlations between hippocampal volume and different memory measures. The correlation between right hippocampus and HVLT-R Learning (r = .264) was significantly smaller than the correlation between right hippocampus and BVMT-R Learning (r = .432, z = −2.79, p = .005), whereas the correlations with left hippocampus were not significantly different (HVLT-R: r = .299; BVMT-R: r = .403; z = −1.72, p = .08). The correlation between right hippocampus and HVLT-R Learning (r = .264) trended toward being significantly smaller than the correlation between right hippocampus and LM Immediate Recall (r = .372; z = 1.98, p = .05), whereas the correlations with left hippocampus were not significantly different (z = 1.53, p = .13). The remaining comparisons between each memory test and hippocampal volume were not significantly different: LM Delayed Recall and HVLT-R Delayed Recall (right: z = 1.05, p = .29; left: z = 1.10, p = .27); LM Immediate Recall and BVMT-R Learning (right: z = −0.94, p = .35; left: z = −0.33, p = .74); LM Delayed Recall and BVMT-R Delayed Recall (right: z = −0.55, p = .58; left: z = 0.03, p = .97); HVLT-R Delayed Recall and BVMT-R Delayed Recall (right: z = −1.39, p = .16; left: z = −0.82, p = .41).
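Comparing two correlations that share one variable (for example, two memory scores each correlated with the same hippocampal volume) requires a test for dependent correlations. The Python sketch below uses Steiger's (1980) Z test with a pooled correlation, which is assumed here to be the kind of computation behind the reported z values. The function name is hypothetical, and the correlation between the two memory scores (0.50) is a placeholder because it is not reported in the text.

```python
import math


def steiger_dependent_z(r_jk, r_jh, r_kh, n):
    """Compare two dependent correlations that share variable j.

    r_jk, r_jh: correlations of the shared variable j with k and with h.
    r_kh: correlation between the two non-shared variables.
    n: sample size.
    Returns the z statistic and its two-tailed p value.
    """
    # Fisher r-to-z transformation of each correlation.
    z_diff = math.atanh(r_jk) - math.atanh(r_jh)
    # Pooled squared correlation, used in the covariance term.
    r_bar_sq = ((r_jk + r_jh) / 2) ** 2
    psi = r_kh * (1 - 2 * r_bar_sq) - 0.5 * r_bar_sq * (1 - 2 * r_bar_sq - r_kh ** 2)
    s = psi / (1 - r_bar_sq) ** 2
    z = z_diff * math.sqrt(n - 3) / math.sqrt(2 - 2 * s)
    # Two-tailed p value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# Reported right-hippocampus correlations for HVLT-R and BVMT-R Learning,
# combined with a PLACEHOLDER inter-test correlation of 0.50.
z, p = steiger_dependent_z(r_jk=0.264, r_jh=0.432, r_kh=0.50, n=179)
print(f"z = {z:.2f}, p = {p:.3f}")
```

Because the inter-test correlation is a placeholder, the printed value is not expected to reproduce the reported statistic. The sketch only shows how the test depends on the three correlations and the sample size.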
Linear regressions
Multiple linear regression models predicting hippocampal volumes based on age, education, TIV, and memory scores are presented in Table 3. Age, education, and TIV were entered in Block 1 of the model and explained a significant amount of variance in right (R² = .29, F(3, 175) = 23.51, p < .001) and left hippocampal volumes (R² = .27, F(3, 175) = 21.10, p < .001). The individual addition of memory subtest scores significantly improved the fit of the models to predict left and right hippocampal volumes (p’s < .001).
Table 3. Effect of adding memory subtest as a predictor of hippocampal volume
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20220720141039989-0084:S1355617721000850:S1355617721000850_tab3.png?pub-status=live)
Control variables include age, education, and TIV. Each row reflects a separate regression model. All regression coefficients and ΔR² statistics are significant at p < .001.
LM = Logical Memory; HVLT-R = Hopkins Verbal Learning Test – Revised; BVMT-R = Brief Visuospatial Memory Test – Revised.
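The model comparison summarized in Table 3 follows the usual hierarchical regression logic, which can be written out as a worked formula. The notation below (the subscripts, q, and k) is introduced here for illustration and is not taken from the original analysis output.

$$
\Delta R^2 = R^2_{\text{full}} - R^2_{\text{Block 1}}, \qquad
F_{\text{change}} = \frac{\Delta R^2 / q}{\left(1 - R^2_{\text{full}}\right) / (n - k - 1)}
$$

Here q is the number of predictors added in the second block and k is the total number of predictors in the full model. With n = 179, one added memory score (q = 1), and four predictors in the full model (k = 4), the change statistic would have (1, 174) degrees of freedom, which is consistent with the (3, 175) degrees of freedom reported for Block 1.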
DISCUSSION
The findings from this study demonstrate that patients referred for neuropsychological assessment in a memory clinic are less likely to be impaired on story memory (WMS-IV LM) compared to list learning (HVLT-R) and nonverbal memory (BVMT-R). Despite this difference, all three measures were similarly associated with hippocampal volumes. Consistent with previous findings in a mixed clinical sample (Bonner-Jackson et al., 2015), verbal and nonverbal memory indices were significantly correlated with bilateral hippocampal volumes; however, nonverbal learning showed a stronger association with the right hippocampus compared to list learning.
Story memory being impaired less often than list learning is consistent with research by Tremont et al. (2010), who examined verbal memory impairment rates in a smaller sample of patients with MCI (n = 90). The difference in impairment rates suggests that unstructured, list learning tasks are more sensitive to memory decline than structured, story memory tasks. Alternatively, the higher rates of impairment on unstructured memory tasks could be related to the influence of executive dysfunction on performance. An early study by Tremont et al. (2000) demonstrated that patients with executive dysfunction performed worse on learning and recall conditions on a list learning task but not story memory. However, findings from the later study by Tremont and colleagues (2010) indicated that learning trials on LM and HVLT-R were affected by executive dysfunction, whereas delayed recall was not. Therefore, some evidence suggests that the differences in delayed recall impairment rates seen in the present study are not fully explained by executive functioning. Even if impairment on unstructured memory tasks reflects executive dysfunction in part, list learning tasks have demonstrated utility to differentiate individuals with and without cognitive impairment and predict progression to dementia (Belleville et al., 2017; Rabin et al., 2009). Future directions include examining executive functioning performance and its association with unstructured versus structured memory tests within this sample.
The differences in impairment rates between story memory and list learning support that these tasks differentially reflect memory functioning. Inclusion of only one of these verbal memory measures in a test battery would result in different conclusions regarding cognitive functioning. Reliance on only story memory might underdiagnose early memory impairment, which is particularly detrimental to the goal of detecting individuals at risk for AD for the purposes of intervention. Only including list learning could result in overdiagnosis of memory impairment if the influence of executive dysfunction is not considered. However, the inclusion of a story memory task with list learning may not necessarily mitigate the risk of overdiagnosis. For example, diagnostic criteria for MCI developed by Jak, Bondi, and colleagues (Bondi et al., 2014; Jak et al., 2009) require two or more impaired scores within a cognitive domain. These criteria would most often be met by impaired learning and recall on a list learning task or impairment on list learning and nonverbal memory (given the higher rates of impairment); therefore, the inclusion of LM would not result in a different diagnostic outcome. Furthermore, only a small proportion of cases (~5%) were impaired on story memory when list learning was intact. For these patients, the amount of information presented in story memory without the opportunity for repeated exposure may contribute to this pattern of impairment and could reflect their memory difficulties in daily life. Interpretation of both measures in the context of a full neuropsychological profile is likely beneficial.
This is the first study to examine impairment rates of both verbal and nonverbal memory measures in a clinical sample of older adults. Although verbal memory tests are considered especially sensitive to early changes associated with AD and are more often prioritized in clinical trials and research batteries, nonverbal delayed recall was impaired at a rate similar to that of list learning delayed recall (46.1% and 48.8%, respectively). Interestingly, total learning on nonverbal memory was highly impaired in this sample (51.6%), more so than list learning (34.8%). Deficits in both learning and retention of nonverbal information have been found in the early stages of AD (Contador, Fernández-Calvo, Cacho, Ramos, & Lopez-Rolon, 2010). One of the challenging aspects of the BVMT-R is that it requires examinees to recall not only the designs but also the location of the designs on the display, and spatial localization of items has been found to be impaired in early AD (Anderson, De Jager, & Iversen, 2006). Although comparisons to other nonverbal memory tests were not made given the available data, the spatial localization and learning over repeated presentations components of the BVMT-R are likely strengths compared to other measures and increase its sensitivity.
Profiles of memory impairment were also examined. Being impaired on all three measures was the most common pattern of impairment, which was expected given the nature of the sample, and supports the validity of these tests. As noted above, verbal memory decline is traditionally considered one of the earliest changes associated with AD. Therefore, it might be assumed that a large proportion of patients would demonstrate impairment on both list learning and story memory, but this profile did not occur frequently (5.3%). In contrast, impairment on the unstructured memory tests (list learning and nonverbal memory) with intact story memory was almost twice as common (10.3%) as verbal memory impairment alone. This finding supports that an unstructured format results in increased sensitivity to cognitive impairment (whether that be memory impairment alone or the influence of executive dysfunction). The list learning and nonverbal memory tasks also share a multi-trial learning format, whereas story memory is typically presented over a single trial. It may be expected that the repeated presentations would result in lower rates of impairment, which is true for individuals with normal cognition, but impaired learning is a characteristic deficit associated with medial temporal dysfunction. Therefore, the differences in administration and format rather than modality (verbal vs. nonverbal) seem to drive the patterns of memory impairment on these measures.
Another reason impairment rates may differ between tests is differences in normative data. The impairment rates in this study were determined using age-adjusted scores based on the published normative data for each test. Inclusion of individuals with cognitive impairment in a normative sample would result in lower average performance, and in turn, a lower threshold for impairment. The WMS-IV manual (Wechsler, 2009) discusses this issue, as concerns were raised that the WMS-III normative sample may have included MCI and mild dementia cases. To mitigate this problem, the norming process for the WMS-IV Older Adult battery involved cognitive screening and assessment of functional status. The norming of the BVMT-R and HVLT-R, which are co-normed, also involved cognitive screening of older adults to reduce the risk of including individuals with cognitive impairment in their normative samples. Thus, it is unlikely that normative sample differences are the primary cause of discrepancies in impairment rates, but they cannot be ruled out. Co-norming of nonverbal memory, list learning, and story memory would be beneficial and allow for a more direct comparison of impairment rates.
To examine if these three memory measures were differentially associated with underlying neuroanatomical structures associated with memory, we correlated memory performance with hippocampal volumes and examined prediction of hippocampal volumes when memory scores were added to regression models that already accounted for age, education, and TIV. Findings demonstrated that learning and delayed recall were significantly associated with hippocampal volumes across measures. Previous research by Bonner-Jackson and colleagues (2015) demonstrated that both the HVLT-R and the BVMT-R were significantly correlated with hippocampal volumes in a memory clinic population, but LM was not examined. The current study replicated the findings between hippocampal volumes and the unstructured memory tests (HVLT-R and BVMT-R) and expanded the findings to a structured, verbal memory test (LM). Contrary to expectations, the magnitude of the correlations between story memory and hippocampal volume was similar to that of the correlations between hippocampal volumes and list learning and nonverbal memory. Only the learning trial from list learning had a weaker correlation with right hippocampus than learning trials from story memory (trended toward significance) and nonverbal memory (significant). This finding is consistent with previous research examining neuroanatomical correlates of learning versus delayed recall on a list learning task, which indicated that learning (particularly early learning trials) shows a weaker association with medial temporal lobe structures than delayed recall (Putcha, Brickhouse, Wolk, & Dickerson, 2019; Wolk & Dickerson, 2011). Overall, there is no evidence that story memory is less associated with hippocampal integrity than unstructured verbal and nonverbal memory tests among patients who present to a memory disorders clinic.
Also consistent with previous findings by Bonner-Jackson et al. (Reference Bonner-Jackson, Mahmoud, Miller and Banks2015), lateralization of verbal memory with left hippocampal volume and nonverbal memory with right hippocampal volume was not found. Research comparing list learning to picture learning in AD and healthy controls also found both tests were associated with overlapping networks, contrary to expectations (Slachevsky et al., Reference Slachevsky, Barraza, Hornberger, Muñoz-Neira, Flanagan, Henríquez and Delgado2017). High rates of impairment on all three measures in the current sample may reflect progression to later stages of a neurodegenerative disease process. As other studies have demonstrated lateralization of memory performance in cognitively normal or MCI samples (Ezzati et al., Reference Ezzati, Katz, Zammit, Lipton, Zimmerman, Sliwinski and Lipton2016), more widespread atrophy at later disease stages could explain the lack of lateralization findings in clinical samples. In a study by Peng et al. (Reference Peng, Feng, He, Chen, Liu, Liu and Luo2015), only left hippocampal volume was correlated with list learning performance at follow-up in individuals with MCI, whereas list learning was associated with both left and right hippocampal volumes in AD patients. These findings support the importance of studying mixed clinical samples, as findings from highly controlled research samples may not generalize to clinical practice. Future examination of lateralization in the early versus late stages of neurodegenerative disease would be beneficial. Future studies could also examine if material-specific lateralization is associated with specific memory profiles (e.g., do patients with impaired verbal memory but intact nonverbal memory show reduced left hippocampal volumes?). The subset of patients with available MRI data was too small to allow for comparisons between different memory profiles.
Strengths of this study include examination of memory performance in a large clinical cohort from a specialized neurology clinic. Examining a mixed clinical sample increases ecological validity and generalizes well to other memory disorder clinic settings that focus on older adults. It provides information about how these commonly used memory measures function in a real-world setting, whereas findings from a narrowly defined sample (e.g., a carefully screened AD sample that excluded those with comorbidities) do not reflect the typical patient population referred to the clinic. However, the absence of data on cognitive diagnosis or potential etiology in the dataset is also a limitation. For example, being able to examine the frequency of intact story memory performance in cases that were diagnosed with amnestic MCI (based on HVLT-R and/or BVMT-R performance) would help determine whether relying on story memory alone would have resulted in underdiagnosis of cognitive impairment. Having data on likely etiology would also provide useful information about impairment rates and neuroanatomical correlates in specific patient populations, which is a future direction. However, as demonstrated by a large body of literature, associations between memory performance and hippocampal volumes are not unique to one particular disease process and are demonstrated in cognitively normal individuals as well (Van Petten, Reference Van Petten2004).
Our study, which includes a broader comparison of memory tests (two verbal and one nonverbal) than previously examined, uniquely contributes to the understanding of expected patterns of memory impairment, as well as the utility of these measures to detect underlying cortical changes. A limitation of this study is the examination of one time point. Longitudinal analysis comparing dementia risk and progression rates of different memory impairment profiles is recommended for future studies. Additionally, our sample was predominantly non-Hispanic White with relatively high educational attainment, which may limit generalizability to other samples.
Overall, this study was the first to examine rates of impairment on both verbal and nonverbal memory measures in a clinical sample of older adults. Story memory was impaired less frequently than list learning and nonverbal memory, raising concern that it may underdiagnose memory disorders if used in isolation. When memory profiles were examined, impairment on all three measures was the most common pattern of impairment, followed by impairment on list learning and nonverbal memory. Despite differences in impairment rates, story memory was associated with hippocampal volumes to a similar degree as the other measures. Assessment of memory via different formats and modalities provides unique information, and these measures are valid methods for assessing hippocampal integrity. Of course, it is important to consider these findings in combination with clinical decision-making (based on the referral question, presenting complaints, history, and observations) to guide test selection. Additional research examining the neural correlates of different memory profiles and the sensitivity of measures at different disease stages is warranted.
FINANCIAL SUPPORT
This work was supported by the Nevada Exploratory Alzheimer’s Disease Research Center (NIH P20 AG068053 to C.G.W., J.Z.K.C., and J.B.M.), the National Institute of General Medical Sciences (NIGMS; NIH-1 P20 GM109025 to J.Z.K.C., J.B.M., and S.J.B.), the Women’s Alzheimer’s Movement (J.Z.K.C.), the Engelstad Foundation (S.L.J.), and Keep Memory Alive (J.B.M.).
CONFLICTS OF INTEREST
None.
ETHICAL STANDARDS
This study was conducted in compliance with regulations of the Institutional Review Board of Cleveland Clinic and in accordance with the Helsinki Declaration.