
NIH Toolbox Cognition Battery (NIHTB-CB): List Sorting Test to Measure Working Memory

Published online by Cambridge University Press:  24 June 2014

David S. Tulsky*
Affiliation:
Rusk Institute/Department of Rehabilitation Medicine, Department of Orthopedic Surgery, Department of General Medicine, New York University Langone Medical Center, New York, New York; Spinal Cord Injury Laboratory, Neuropsychology and Neuroscience Laboratory, Kessler Foundation, New Jersey
Noelle Carlozzi
Affiliation:
Department of Physical Medicine and Rehabilitation, University of Michigan, Ann Arbor, Michigan
Nancy D. Chiaravalloti
Affiliation:
Spinal Cord Injury Laboratory, Neuropsychology and Neuroscience Laboratory, Kessler Foundation, New Jersey
Jennifer L. Beaumont
Affiliation:
Department of Medical Social Sciences, Northwestern University, Chicago, Illinois
Pamela A. Kisala
Affiliation:
Rusk Institute/Department of Rehabilitation Medicine, New York University Langone Medical Center, New York, New York
Dan Mungas
Affiliation:
Department of Neurology, University of California, Davis, California
Kevin Conway
Affiliation:
National Institute on Drug Abuse, Rockville, Maryland
Richard Gershon
Affiliation:
Department of Medical Social Sciences, Northwestern University, Chicago, Illinois
*
Correspondence and reprint requests to: David S. Tulsky, Assessment Research and Translation, Rusk Institute/Department of Rehabilitation Medicine, New York University Langone Medical Center, Ambulatory Care Center 240 E. 38th Street, 17th Floor, New York, New York 10016. E-mail: david.tulsky@nyumc.org

Abstract

The List Sorting Working Memory Test was designed to assess working memory (WM) as part of the NIH Toolbox Cognition Battery. List Sorting is a sequencing task requiring children and adults to sort and sequence stimuli that are presented visually and auditorily. Validation data are presented for 268 participants ages 20 to 85 years. A subset of participants (N=89) was retested 7 to 21 days later. As expected, the List Sorting Test had moderately high correlations with other measures of working memory and executive functioning (convergent validity) but a low correlation with a test of receptive vocabulary (discriminant validity). Furthermore, List Sorting demonstrates expected changes over the age span and has excellent test–retest reliability. Collectively, these results provide initial support for the construct validity of the List Sorting Working Memory Measure as a measure of working memory. However, the relationship between the List Sorting Test and general executive function has yet to be determined. (JINS, 2014, 20, 1–12)

Type
Special Series
Copyright
Copyright © The International Neuropsychological Society 2014 

Introduction

The National Institutes of Health (NIH) has recognized that the lack of continuity of outcomes measures across the research portfolio hinders the ability to aggregate findings across studies and populations, and has called for an effort to improve the “economies of scale in the research enterprise” by identifying common variables that could be analyzed across studies (Hodes, Insel, & Landis, 2013). Therefore, the NIH Toolbox project was initiated to develop a set of state-of-the-art outcomes measures that can serve as common data elements for the evaluation of cognitive, social, emotional and motor functioning across the lifespan (Gershon et al., 2010, 2013). The Cognitive Health Domain consists of discrete subtests designed to evaluate Working Memory, Executive Function, Processing Speed, Episodic Memory, and Language, respectively, in a total of less than 30 min (Weintraub et al., 2013). This manuscript describes the validation of the new measure of working memory, the List Sorting Test, in adults without known neurological deficits. The development and validation of the List Sorting Test for use in children have been previously described by Tulsky and colleagues (Tulsky et al., 2013).

The term working memory (WM) was first introduced by Newell and Simon (Newell, 1973; Newell & Simon, 1972) but is now more closely associated with Alan Baddeley and his colleagues (Baddeley, 1987, 1992, 2002, 2010; Baddeley & Hitch, 1974). WM commonly refers to the cognitive process involved in storing and manipulating a limited amount of information over a brief period of time (Baddeley, 1987, 1992). It is essentially an updated conceptualization of “short-term memory,” a passive storage buffer that became overloaded when the amount of information exceeded its capacity (Miller, 1956). WM refers to the ability to (1) process information across tasks and modalities, (2) hold information in a short-term store, (3) manipulate information, and (4) hold the products of that manipulation in the same short-term store. The working memory conceptual workspace is an active system that forms a strong basis for information processing overall (Baddeley, 2002).

In the original conceptualization, WM was comprised of a central executive system (CES) and two “slave” systems, the phonological loop (PL) and the visuospatial sketchpad (VS) (Baddeley, 1986; Baddeley & Hitch, 1974; Logie, 1996). The CES is the most important but least understood component of the working memory model (Baddeley, 1996); it is theorized to have no retention capacity, but instead provides a set of mechanisms for controlling all the WM processes through its attentional capacity. The CES is thus thought to be responsible for the selection, initiation, and termination of processing routines within working memory. Baddeley (1996) described the CES as a “homunculus” because it is such a powerful system that is closely related to executive functioning. According to the classic model postulated by Baddeley, the CES functions include the coordination of the two slave systems and the manipulation of their contents. The CES is also responsible for the control of encoding and retrieval strategies, as well as the focusing, dividing, and switching of attention. In contrast, the slave systems are thought to be responsible for the temporary maintenance of information; the phonological loop is where verbal material is stored and processed, while the visuospatial sketchpad is where spatial material is stored and processed. In the face of empirical findings that did not support a three-component system of working memory, Baddeley (2000) updated the model, introducing a fourth component: the episodic buffer, which is a limited capacity system that provides passive and temporary storage of information. The episodic buffer provides an interface between the above subsystems and information from perception and long-term memory (Baddeley, 2000, 2001, 2010).
The Baddeley model has been widely and successfully applied in basic and applied psychology, as well as in neuroscience (Miyake & Shah, 1999; Repovs & Bresjanak, 2006).

WM is important for a wide variety of cognitive skills, such as long-term memory (Jonides, 1995), learning (Kirasic, Allen, Dobson, & Binder, 1996), reasoning (Salthouse, Mitchell, Skovronek, & Babcock, 1989), problem solving, planning, and active listening (Jonides, 1995). Miyake and colleagues (2000) have identified WM as an “updating and monitoring” component of executive functions that plays a central role in WM performance, since it manipulates WM content to select, from time to time, the goal-relevant items. McCabe, Roediger, McDaniel, Balota, and Hambrick (2010) specifically examined the relationship between WM and executive functioning, concluding that the two constructs share a factor of “executive attention” which predicts higher level cognitive processes such as episodic memory. Indeed, WM is necessary for common yet complex everyday activities which require multitasking. For instance, people will often take notes while talking on the telephone at work, perform mental arithmetic at school, or compare and calculate prices at a store. Individuals with significant WM deficits have great difficulty processing and learning information (Smith, Jonides, & Koeppe, 1996), and basic everyday tasks like writing and reading comprehension may be affected by impairments in WM (Engle, 1996). As such, WM is a critical and primary component of cognition, and WM disruption can be particularly problematic to everyday functioning.

The importance of WM in cognition throughout the lifespan is evident when one considers that published studies examine WM from infancy through older adulthood (Pelphrey & Reznick, 2003). Structural components of WM are thought to be in place by age 4 (Alloway, Gathercole, & Pickering, 2006) and much work has been done with WM in pediatric populations (Conlin, Gathercole, & Adams, 2005). However, the vast majority of these studies have involved school-age children and beyond (e.g., ≥ 6 years), likely due to the increased ease of communication and ability to follow instructions in children ages 6 and over. Studies have demonstrated that WM improves significantly in conjunction with overall cognitive development, with WM span doubling in capacity between the ages of 5 and 10 (Riggs, McTaggart, & Simpson, 2006) and tripling by the time the person reaches young adulthood (Dempster, 1981). WM integrity has been linked to scholastic development (Hitch, Towse, & Hutton, 2001) and letter knowledge (de Jong & Olson, 2004), and its impairment has been linked to reading disabilities (de Jong, 1998). WM remains relatively stable throughout adulthood, while reduced WM capacity is evident in older adults. Many investigators have posited that declines in cognitive functions will often result in poor WM performance. For instance, Tim Salthouse (1994) attributes WM decline in older adulthood to decreased processing speed, rather than to changes in WM itself.
Other explanations of age-related WM decline have attributed it to heightened distractibility and deficits in inhibiting irrelevant information in older adults, rather than direct deficits in WM (Hasher, Stolzfus, Zacks, & Rypma, 1991).

Impaired WM is common in medical disorders with neurological involvement such as Multiple Sclerosis (Grigsby, Ayarbe, Kravcisin, & Busenbark, 1994; Grigsby, Busenbark, Kravcisin, Kennedy, & Taylor, 1994), Traumatic Brain Injury (McAllister, Flashman, Sparling, & Saykin, 2004; Willmott, Ponsford, Hocking, & Schönberger, 2009), HIV (Woods, Moore, Weber, & Grant, 2009), and Schizophrenia (Van Snellenberg, 2009). Such WM deficits have been documented both behaviorally (Demaree, DeLuca, Gaudino, & Diamond, 1999) and in neuroimaging studies (Chiaravalloti et al., 2005; McAllister et al., 2004). Additionally, such deficits have been shown to impact higher order cognitive processes such as long-term memory (Litvan, Grafman, Vendrell, & Martinez, 1988; Rao et al., 1993) and executive control (Orellana & Slachevsky, 2013). However, debate remains as to whether these difficulties arise from impairment in the CES (D’Esposito et al., 1996) or from within the PL (Litvan et al., 1988; Rao et al., 1993).

Given the prevalence of WM dysfunction in neurological populations, it is not surprising that various neuroimaging techniques have been applied to examine the neurofunctional basis of these deficits. Since the inception of neuroimaging, WM has received considerable attention in healthy individuals in an effort to understand its anatomical and functional architecture (Braver et al., 1997; Fuster, 1989; Goldman-Rakic, 1987; Jacobsen, 1935; Manoach et al., 1997; Milner, 1964). Such studies have revealed a distributed network of cerebral involvement during WM functioning, including activation of the prefrontal and premotor regions of the frontal lobes (e.g., the middle and inferior frontal gyri) (Belger et al., 1998; Braver et al., 1997; Courtney, Ungerleider, Keil, & Haxby, 1997; Crosson et al., 1999; Curtis & D’Esposito, 2003; Grossman et al., 1994; Linden, 2007) as well as parietal (Braver et al., 1997; McAllister et al., 1999; Paulesu, Frith, & Frackowiak, 1993) and temporal activation (Paulesu et al., 1993; Salmon et al., 1996; Seidman et al., 1998).
Some regions have been shown to be modality-specific, responsible for verbal working memory in particular (Curtis & D’Esposito, 2003; Linden, 2007). Other regions appear to be multimodal in nature (Curtis & D’Esposito, 2003; Klingberg, Kawashima, & Roland, 1996; Linden, 2007). Most research in healthy individuals has supported the Baddeley WM model through anatomic differentiation between the cerebral substrates of maintenance (PL and VS) and manipulation (CES). Findings indicate that simple maintenance of information is associated with ventral prefrontal regions (Awh et al., 1996). The dorsal prefrontal cortex becomes involved when information needs to be manipulated (Beatty et al., 1996; D’Esposito et al., 1998; Owen, Evans, & Petrides, 1996), which is thought to reflect central executive system (CES) recruitment.

WM capacity has also been linked to differential patterns of cerebral activation during WM task performance across the lifespan. Results indicate that the development of adult cognition is reliant upon increased cerebral activation in localized regions necessary for effective WM performance (Scherf, Sweeney, & Luna, 2006). Specifically, young children have shown limited activation of core WM regions (e.g., dorsolateral prefrontal cortex and parietal regions), with greater activation documented in these regions in adolescents (Klingberg, Forssberg, & Westerberg, 2002). Adults demonstrate the most specialized network, including left dorsolateral prefrontal cortex, ventrolateral prefrontal cortex, and the supramarginal gyrus (Scherf et al., 2006). These changes in cerebral activation patterns have shown functional relevance, with WM capacity being positively correlated with such activation (Klingberg et al., 2002; Crone, Wendelken, Donohue, van Leijenhorst, & Bunge, 2006; Kwon, Reiss, & Menon, 2002). In fact, neural network models have indicated that the strength of frontoparietal connectivity underlies WM capacity (Edin et al., 2009; Edin, Macoveanu, Olesen, Tegner, & Klingberg, 2007).

WM is frequently assessed with tasks requiring simultaneous storage and processing, such as complex span tasks (e.g., backward digit span, letter-number sequencing, etc.) and various updating tasks (e.g., n-back, keeping track, etc.) (Conway et al., 2005; Schmiedek, Hildebrandt, Lovden, Lindenberger, & Wilhelm, 2009). WM is significantly related to other tasks of sustained attention, executive functioning, and processing speed. Kyllonen and Christal (1990) demonstrated that WM is related to reasoning tests and discussed the fact that WM is highly related to general intelligence. Miyake et al. (2000) showed that working memory, while separate from other forms of executive function, is related to set shifting and inhibition and shares an underlying commonality with them. Similarly, Lehto (1996) demonstrated that WM tasks are correlated with the Wisconsin Card Sorting Tasks or set shifting tasks. WM abilities and executive functioning have been noted to be similarly impacted in populations such as adults with Huntington’s Disease (You et al., 2013) and children with specific language impairment (Vugs, Hendriks, Cuperus, & Verhoeven, 2014), with some authors discussing WM as a facet of executive functioning (Hofmann, Schmeichel, & Baddeley, 2012). Given the importance of WM to overall cognitive development and the execution of higher order cognitive tasks, in the NIH Toolbox emphasis and concern was placed upon the selection of a valid and reliable task that could be used across the lifespan and that could be administered quickly as a core component of a 30-min battery.

As reported by Tulsky et al. (2013), the Toolbox List Sorting Working Memory Test is a sequencing task requiring children and adults to sort and sequence information. Participants are presented with a series of stimuli (i.e., illustrated pictures of an animal or a piece of food), each of which is both visually and auditorily presented by computer (see Figure 1). A picture of each stimulus is displayed on the computer monitor for 2 s while the name of the stimulus is simultaneously being read via a computerized voice; stimuli are presented one after another, in a seamless manner. The examinee is required to remember each stimulus in a series, mentally re-order them from smallest to largest, and recite the names of the stimuli in this order. The final version of this task included both a 1-list and 2-list component. In the “1-list” section, only one type of item stimulus is presented (e.g., “animals” or “food”). Participants begin by sequencing a two-item string, and with each correct response the string is increased by a single item (up to a maximum of a seven-item string). If the participant is unable to sequence the string correctly, they are provided with a second trial of the same number of items in the string; the task is discontinued when the participant provides incorrect responses on two trials with the same number of items in the string or when the participant correctly sequences all seven items. Participants begin with the 1-list version of the task (which requires sequencing among a single category, either “animals” or “food”), and then, upon discontinuing the 1-list section, move on to the 2-list section (which requires sequencing of both “food” and “animals” within a string). In the 2-list section, the examinee is required to sort the stimuli by category before sequencing the stimuli in size order, requiring the examinee to both sort and dual sequence the information.
Participants begin the 2-list section with a two-item string, which is increased by a single item with each correct response (also up to a maximum of a seven-item string). They are provided a second trial of the same number of items in the string following an incorrect response, and the task is discontinued when the participant provides incorrect responses on two trials with the same number of items or when the participant correctly sequences all seven items. List Sorting scores are based upon a sum of the total correct across both lists, which comprises the List Sorting “Total Score.” The raw sum score is then transformed to a standardized t-metric (mean=50, SD=10).
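The administration rules described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the NIH Toolbox software: the stimulus names, their size ranks, the order in which the two categories are recited in the 2-list section, and the function names `expected_response` and `run_section` are all hypothetical. The sketch returns a trial log rather than a score, because the exact credit assigned to each trial is not specified in the text.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical stimuli and size ranks, for illustration only.
SIZE_RANK: Dict[str, int] = {
    "cherry": 1, "apple": 2, "pumpkin": 3,   # food
    "mouse": 1, "dog": 2, "horse": 3,        # animals
}
CATEGORY: Dict[str, str] = {
    "cherry": "food", "apple": "food", "pumpkin": "food",
    "mouse": "animal", "dog": "animal", "horse": "animal",
}


def expected_response(items: Sequence[str], two_list: bool = False,
                      category_order: Tuple[str, ...] = ("food", "animal")) -> List[str]:
    """Return the correct recitation order for one string of stimuli.

    1-list: smallest to largest.
    2-list: sort by category first, then smallest to largest within each
    category (the category order used here is an assumption).
    """
    if two_list:
        return sorted(items, key=lambda s: (category_order.index(CATEGORY[s]), SIZE_RANK[s]))
    return sorted(items, key=lambda s: SIZE_RANK[s])


def run_section(is_correct: Callable[[int, int], bool],
                min_len: int = 2, max_len: int = 7) -> List[Tuple[int, int, bool]]:
    """Administer one section (1-list or 2-list) and return the trial log.

    is_correct(length, trial) reports whether the participant sequenced a
    string of the given length correctly on trial 1 or 2.
    """
    log: List[Tuple[int, int, bool]] = []
    for length in range(min_len, max_len + 1):
        first = is_correct(length, 1)
        log.append((length, 1, first))
        if first:
            continue  # correct response: lengthen the string by one item
        second = is_correct(length, 2)
        log.append((length, 2, second))
        if not second:
            break  # two failures at the same length: discontinue
    return log
```

For example, `expected_response(["dog", "apple", "mouse", "cherry"], two_list=True)` yields `["cherry", "apple", "mouse", "dog"]` under the assumed ranks and category order.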

Fig. 1 Examples of One-List and Two-List List Sorting Task. 1-List List Sorting requires participants to sequence items according to a single category, whereas 2-List List Sorting requires sequencing that involves an alternation between two different categories. The above is a sample NIH Toolbox List Sorting Test item. All NIH Toolbox-related materials are ©2012 Northwestern University and the National Institutes of Health.

The NIH Toolbox Cognition team chose a sequencing task because such a task had proven successful in measuring WM in the Wechsler Adult Intelligence Scale, Third Edition (Tulsky, Saklofske, & Zhu, 2003; Wechsler, 1997), in individuals with schizophrenia (Gold, Carpenter, Randolph, Goldberg, & Weinberger, 1997), and in older adults (Mungas, Reed, Marshall, & Gonzalez, 2000). The team developed a task that would be valid for the youngest children and that was modeled on the sequence test in the Spanish and English Neuropsychological Assessment Scales (Mungas et al., 2000), in which animals and/or fruit were verbally presented to examinees who were instructed to re-order the stimuli in size order.

Extensive pilot development was conducted to help mold the test into a task that could be completed by very young children (Tulsky et al., 2013). However, a substantial percentage of children ≤6 did not complete the first items correctly on either the 1-list or 2-list tasks, prompting the team to remove the test from the early childhood battery. The test is recommended as part of the core Toolbox battery with children ≥7. The administration instructions, starting points, and discontinue rules are identical for children and adults on both the 1-list and 2-list items. Since information is presented both visually and auditorily, scores do not simply reflect one subcomponent (e.g., PL or VS) of WM. Instead, the List Sorting Test is designed to globally assess the WM system.

Method

Participants

A sample of 268 participants ages 20 to 85 years (n=159; ages 20–60 years and n=109; ages 65–85 years) was recruited from four testing sites across the United States. Participants ages 20–60 were recruited with flyers advertising a research study for healthy volunteers. For the older adults (age group, 65–85 years), 62 of the participants were recruited from a pool of known cognitively healthy individuals participating in other NIH studies while the remaining 47 participants were recruited through the use of flyers. No formal screening of health or cognitive functioning was performed, but self-report questionnaires provided information about health status. Study coordinators established and monitored a stratification matrix so that key demographic variables (i.e., age, education level, gender, and ethnicity) could be distributed. A subset of 89 participants (approximately 33%) completed a retest 7 to 21 days later to assess test–retest reliability. The participants in the validation phase are described in detail in Weintraub et al. (2013). The institutional review board at each collaborating site reviewed and approved this study, and all study procedures were carried out in accordance with the principles outlined by the Helsinki Declaration.

Measures

Participants were tested with the Toolbox List Sorting Working Memory Test and criterion measures including the Wechsler Adult Intelligence Scale, 4th Edition Letter-Number Sequencing subtest and the Paced Auditory Serial Addition Test (Gronwall, 1977) (that would provide indices of concurrent validity) and the Peabody Picture Vocabulary Test, 4th Edition (that would provide an index of discriminant validity). Other criterion measures of executive functioning (e.g., Delis-Kaplan Executive Function System Color-Word Test) and processing speed (e.g., the Wechsler Adult Intelligence Scale, Fourth Edition Processing Speed Composite) were expected to be related to WM measures. Finally, other NIH Toolbox measures were administered (in particular, the Dimensional Change Card Sorting and the Flanker tests) which are described in other manuscripts in this special issue (Zelazo et al., 2013).

The Toolbox List Sorting Working Memory Test

As described previously, the List Sorting Test involves size order sequencing of familiar stimuli. In the validation version, the task was discontinued when two trials of the same length were failed. This task takes approximately 10 min to administer. Test scores consisted of combined total items correct on the 1- and 2-list versions of the task (maximum 28).

Validation Measures

WAIS-IV Letter-Number Sequencing (Wechsler, 2008)

In this test, participants are presented with a mixed list of numbers and letters and asked to repeat the list by saying the numbers first in ascending order and then the letters in alphabetical order. This subtest has a strong WM component (Crowe, 2000; Gold et al., 1997; Haut, Kuwabara, Leach, & Arias, 2000) and has demonstrated discrimination between individuals with TBI and matched controls (Donders, Tulsky, & Zhu, 2001). Scores reflect the number of correct responses (maximum 30).
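As a brief illustration of the response rule just described, the expected answer for a Letter-Number Sequencing item can be derived as follows. This is a minimal sketch; the function name and the example item are hypothetical and are not taken from the WAIS-IV materials.

```python
from typing import List, Sequence


def lns_expected_response(items: Sequence[str]) -> List[str]:
    """Numbers first in ascending order, then letters in alphabetical order."""
    numbers = sorted(int(s) for s in items if s.isdigit())
    letters = sorted(s.upper() for s in items if s.isalpha())
    return [str(n) for n in numbers] + letters
```

Under this rule, a presented string such as T, 9, A, 3 has the expected response 3, 9, A, T.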

WAIS-IV Coding (Wechsler, 2008)

This test requires the participant to associate numbers and symbols using a key. It has demonstrated sensitivity to motor coordination, short-term memory deficits, and visuoperceptual problems (Tulsky et al., 2003). Scores reflect the number of correct responses in 120 s (maximum 135).

WAIS-IV Symbol Search (Wechsler, 2008)

This subtest is designed to measure speed at processing new information. Participants are presented with a series of paired groups; each pair consists of a target group (two symbols) and a search group (five symbols). The participant must identify whether either target symbol is part of the search group. Scores reflect number of correct responses minus number of incorrect responses in 120 s (maximum 60).

WAIS-IV Processing Speed Index (PSI) (Wechsler, 2008)

PSI is a composite score that combines performance on WAIS-IV Coding and WAIS-IV Symbol Search. PSI scores are significantly lower in individuals with several neuropsychological and psychoeducational problems (Hawkins, 1998; Martin, Donders, & Thompson, 2000).

DKEFS Inhibition

This is a three-part test of cognitive flexibility, ability to suppress a habitual response in favor of an unusual one, and selective attention. First, the examinee names the colors of boxes (red, green, or blue). Next, the examinee reads the names of the colors (“red,” “green,” “blue”) that appear in black print. Finally, the examinee responds to the interference card in which the words are printed in non-corresponding color (e.g., “red” printed in blue ink) with the instructions to ignore the printed words and report only the color of ink in which the word is printed. For this study, we examined the color-word inhibition score, which provides a measure of susceptibility to interference; higher scores indicate better performance.

Paced Auditory Serial Addition Test (PASAT) (Gronwall, 1977)

The PASAT is a measure of cognitive function that specifically assesses auditory information processing speed and flexibility, attention, and calculation ability. Single digits are presented every 2 s and the participant must add each new digit to the one immediately before it. Scores reflect the number of correct sums given (maximum 50).
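The serial-addition rule can be made concrete with a short sketch: each expected answer is the sum of the current digit and the digit presented immediately before it, not a running total. The function names and the example digits below are hypothetical, and omitted responses are represented as `None` by assumption.

```python
from typing import List, Optional, Sequence


def pasat_expected_sums(digits: Sequence[int]) -> List[int]:
    """Expected answers: each digit added to the one immediately before it."""
    return [previous + current for previous, current in zip(digits, digits[1:])]


def pasat_score(digits: Sequence[int], responses: Sequence[Optional[int]]) -> int:
    """Number of correct sums given; None marks an omitted response."""
    expected = pasat_expected_sums(digits)
    return sum(1 for e, r in zip(expected, responses) if r == e)
```

For the digits 3, 5, 2, 7 the expected sums are 8, 7, 9; a participant who answers 8, omits the second item, and answers 9 would receive a score of 2.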

Peabody Picture Vocabulary Test, 4th Edition (PPVT-IV) (Dunn & Dunn, 2007)

This is a measure of receptive vocabulary and word retrieval in which participants are asked to identify which of four pictures reflects a specific word. Scores are based on the number of correct responses (maximum 228).

The NIH Toolbox Dimensional Change Card Sorting (DCCS) Test (Zelazo et al., 2013)

The Dimensional Change Card Sort Test is designed to assess the set shifting component of executive functioning. A target visual stimulus is displayed and the participant must match this to one of two choice stimuli according to shape or color. A criterion word “shape” or “color” appears at the bottom of the screen. For adults, the scoring algorithm is based on both accuracy and reaction time.

The NIH Toolbox Flanker Inhibitory Control Test (Zelazo et al., 2013)

The “Flanker” test measures the ability of the participant to inhibit responses to irrelevant task dimensions. On each item, a target stimulus (fish or arrow) is presented in the center, surrounded on both sides by similar stimuli pointing in either the same or different direction as the target stimuli. The examinee must indicate the direction that the central target stimulus is pointing and inhibit a response based upon the “flanking” stimuli. The scoring algorithm integrates accuracy and reaction time.

Wisconsin Card Sorting Test-64 Card Version (Kongs, Thompson, Iverson, & Heaton, Reference Kongs, Thompson, Iverson and Heaton2000)

The Wisconsin Card Sorting Test-64 Card Version is a shortened, 64-card version of the Wisconsin Card Sorting Test, which assesses the ability to shift sets using visual stimuli that are easily verbally mediated. It requires participants to sort pictured cards into piles according to changing rules. The test requires cognitive flexibility since, as the rules for sorting change, the examinee is required to mentally shift sets. The perseverative errors score was used as a criterion measure for this study.

Data Analysis

This study examines associations of the Toolbox List Sorting Working Memory Test scores with age, its test–retest reliability, and its convergent and discriminant validity. Age associations reflect the validity of the Toolbox Cognition tests for detecting age-related cognitive decline during adulthood. Test–retest correlations provide information on the reliability and stability of the measure. Convergent and discriminant validity results provide evidence that the Cognition Battery is measuring the intended constructs.

For the Toolbox and “gold standard” tests, raw scores for each subtest were converted to normally distributed scaled scores that were not age-corrected. These scaled scores were created by first ranking the raw scores, then applying a normative transformation to the ranks to create a standard normal distribution, and finally rescaling the distribution to have a mean of 10 and a standard deviation of 3. These scaled scores were used for all analyses and are not adjusted for age or any other demographic variable. Pearson correlation coefficients between actual age and test performance were calculated to assess the ability of the Toolbox List Sorting Working Memory Test to detect age-related cognitive decline during adulthood. Intraclass correlation coefficients (ICC) with 95% confidence intervals were calculated to evaluate test–retest reliability. ICCs are more likely than Pearson correlation coefficients to be affected by practice effects or other substantial differences that occur during the re-test interval. As a result, Pearson correlation coefficients tend to overestimate reliability coefficients. We computed practice effects, or the change in test scores between the test–retest administrations, by subtracting the performance at time 1 from the performance at time 2. We calculated the standardized estimate of the mean change (or “effect size”) by dividing the mean change score by the standard deviation of the time 1 score (Cohen, Reference Cohen1992). Convergent validity was assessed with correlations between the List Sorting Task and well-established “gold standard” working memory measures (e.g., WAIS-IV Letter-Number Sequencing, PASAT). Also, because working memory and executive functioning tests have been shown to be related, convergent validity was further assessed with correlations between the List Sorting Task and measures of executive function (e.g., D-KEFS Color-Word task, Wisconsin Card Sorting Test, DCCS, Flanker).
Finally, evidence of discriminant validity consisted of lower correlations with a “gold standard” measure of a different cognitive construct (i.e., receptive vocabulary as assessed by the PPVT-IV). To examine the effect of other demographic associations with performance, comparisons were performed using general linear models adjusted for age, education, and gender. Effect sizes are reported as Cohen’s d, with cutoffs of .20, .50, and .80 indicating small, medium, and large effects, respectively.
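The rank-based scaling described above can be illustrated with a minimal sketch. The function name `scaled_scores`, the averaging of tied ranks, and the quantile formula (rank - 0.5)/n are assumptions made for illustration; the text does not state which rank-to-quantile convention was used.

```python
from statistics import NormalDist


def scaled_scores(raw, mean=10.0, sd=3.0):
    """Rank raw scores, map ranks to normal quantiles, rescale to mean 10, SD 3.

    Assumptions (not stated in the source): tied raw scores share their
    average rank, and ranks map to quantiles via (rank - 0.5) / n.
    """
    n = len(raw)
    order = sorted(range(n), key=lambda i: raw[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied values starting at sorted position i.
        j = i
        while j + 1 < n and raw[order[j + 1]] == raw[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank for the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    std_normal = NormalDist()
    # (rank - 0.5) / n keeps every quantile strictly between 0 and 1.
    return [mean + sd * std_normal.inv_cdf((r - 0.5) / n) for r in ranks]


if __name__ == "__main__":
    print(scaled_scores([12, 18, 18, 25, 30]))
```

Because the transformation depends only on ranks, the resulting scaled scores are approximately normal regardless of the shape of the raw score distribution.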

Results

Of the 268 participants in the study, 264 completed the NIHTB List Sorting Test. The minimum score obtained in the sample was 8 points and the maximum score was 28 points; however, only one individual in the sample received a score at the ceiling of 28 points. The median score was 18 while the mean score was 18.32, with a standard deviation of 2.99.

Age Effects

Cognitive abilities are expected to peak in early adulthood and then gradually decline with age. Participants ages 20–60 had average scaled scores of 10.84 (SD=2.78) whereas participants ages 65–85 had average scores of 8.75 (SD=2.80) (see Figure 2). A negative association was seen between age and List Sorting test performance (df=264; r=-0.44; p <.0001); successive age groups tended to complete fewer items correctly during middle and later adulthood.
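As a hedged illustration of the age association, the sketch below computes a Pearson correlation and the t statistic commonly used to test it. The helper names and the toy data are hypothetical, and the sample size of 264 is assumed to be the number of participants who completed the List Sorting Test.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def t_for_r(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * sqrt(n - 2) / sqrt(1 - r * r)


if __name__ == "__main__":
    # Toy data (hypothetical): scaled scores falling with age.
    ages = [20, 40, 60, 80]
    scores = [12, 11, 9, 8]
    print(pearson_r(ages, scores))
    # Reported correlation, with n = 264 assumed.
    print(t_for_r(-0.44, 264))
```

Under these assumptions, a correlation of -0.44 corresponds to a t statistic of about -7.9, which is in line with the reported p value.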

Fig. 2 Toolbox List Sorting Working Memory Test scores by age (means ± 2 standard errors).

Other Demographic Associations

Table 1 shows effect sizes for other demographic factors known to influence cognitive performance, after adjusting for age and other additional relevant demographic variables. There was no significant difference in List Sorting Test or gold standard scores between males and females, with negligible to small effect sizes of 0.00 to 0.15. Caucasian participants scored better than African American or Hispanic participants on all measures, with moderate effect sizes between 0.41 and 0.58. Significant differences were also demonstrated for education, general health, and school problems for the List Sorting Test and gold standard measures.

Table 1 Effect sizes (ES) for comparisons of scores between groups

Note. 1 Gender comparison adjusted for age and education.

2 Race/ethnicity comparisons adjusted for gender, age, and education.

3 Other comparisons adjusted for age.

Test–Retest Reliability

For the participants who completed the retest 7 to 21 days after the initial test, test–retest ICCs were computed and demonstrated a high degree of consistency. The overall ICC for the Toolbox List Sorting Working Memory Test was .77 (95% CI=.67, .84), which was comparable to the test–retest ICC for the criterion measure, the WAIS-IV Letter-Number Sequencing (ICC=.80; 95% CI=.71, .86). The List Sorting Test exhibited a significant but small practice effect (mean change in scaled score units=0.79; SD=1.89; t(88)=3.95; p=.001; d=0.27). This is similar to the practice effect obtained on the WAIS-IV Letter-Number Sequencing (mean=0.49; SD=1.76; t(88)=2.59; p=.01; d=0.18).
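The practice-effect statistics can be sketched as follows, using the definitions given in the Data Analysis section (change = time 2 minus time 1; effect size = mean change divided by the time 1 standard deviation). The function names are hypothetical, and the paired t statistic is assumed to be the mean change divided by its standard error.

```python
from math import sqrt
from statistics import mean, stdev


def practice_effect(time1, time2):
    """Return mean change, SD of change, paired t, and standardized effect size."""
    changes = [b - a for a, b in zip(time1, time2)]
    mean_change = mean(changes)
    sd_change = stdev(changes)
    t_stat = mean_change / (sd_change / sqrt(len(changes)))
    # Effect size as defined in the text: mean change over the time 1 SD.
    effect_size = mean_change / stdev(time1)
    return mean_change, sd_change, t_stat, effect_size


def t_from_summary(mean_change, sd_change, n):
    """Paired t statistic recomputed from reported summary statistics."""
    return mean_change / (sd_change / sqrt(n))


if __name__ == "__main__":
    # Summary values reported for List Sorting (n = 89 retested participants).
    print(t_from_summary(0.79, 1.89, 89))
```

Recomputing t from the rounded summary values gives about 3.94 for List Sorting, which agrees with the reported 3.95 up to rounding. The effect size cannot be recomputed this way because the time 1 standard deviation of the retest subsample is not reported.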

Convergent and Discriminant Validity

Table 2 shows results for convergent and discriminant validity. Convergent validity was demonstrated through the correlations between the List Sorting Test and the WAIS-IV Letter-Number Sequencing (r=.57; p<.0001), the processing speed measures (r=.50 to r=.52; all p<.0001), and the measures of executive function (r=.43 to r=.58; all p<.0001). We had predicted that the correlation between the two WM tasks, and the correlations between WM and executive functioning tests, would be larger than the correlation between the List Sorting task and a truly divergent task, namely receptive vocabulary. Accordingly, discriminant validity was demonstrated through the relatively low correlation between the List Sorting Test and the PPVT-IV, which had the lowest-magnitude correlation of r=.24 (p<.0001). Finally, Fisher’s z tests were conducted to statistically compare the correlation between the List Sorting Test and WAIS-IV Letter-Number Sequencing with the correlation between the List Sorting Test and the PPVT-IV (r=.57 vs. r=.24; p<.001).
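Two correlation coefficients of this kind are often compared using Fisher's r-to-z transformation. The sketch below is a simplified illustration that treats the two correlations as if they came from independent samples, which is an assumption: here both correlations share the List Sorting scores, so a test for dependent correlations would be more appropriate in practice. The function name and the sample size of 264 per correlation are also assumptions.

```python
from math import atanh, sqrt
from statistics import NormalDist


def compare_correlations(r1, n1, r2, n2):
    """Two-sided z test for the difference between two correlations.

    Uses Fisher's r-to-z transformation and assumes independent samples
    (a simplification for illustration).
    """
    z1, z2 = atanh(r1), atanh(r2)  # Fisher transformation of each r
    se = sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    z = (z1 - z2) / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


if __name__ == "__main__":
    # Reported correlations; n = 264 for each is an assumption.
    z, p = compare_correlations(0.57, 264, 0.24, 264)
    print(z, p)
```

Under these assumptions the difference between r=.57 and r=.24 yields a z statistic of roughly 4.6, consistent with the reported p<.001.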

Table 2 Convergent & Discriminant Validity of Toolbox List Sorting Working Memory test

Note. All p<.0001.

Discussion

The NIH Toolbox List Sorting Test is a new sequencing task that is designed to measure WM in 7–85 year olds. The task contains series of stimuli that have to be sequenced in size order. In the first part of the test, a 1-list task, the examinee sequences one type of stimulus (e.g., animals) in size order, from smallest to largest. The examinee is then asked to sequence stimuli from two different categories. This second part of the test, a 2-list task, requires examinees to “bin” the stimuli into one of two categories (animals or food) while simultaneously sequencing each subgroup of stimuli in size order, from smallest to largest.
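The expected response for the two task types can be sketched in a few lines. This is an illustration only, not the Toolbox scoring software: the `Stimulus` type, the numeric sizes, the example items, and the order in which the two categories are reported (food before animals) are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stimulus:
    name: str
    category: str  # "animal" or "food"
    size: float    # hypothetical relative size used only for ordering


def one_list_answer(stimuli):
    """Correct 1-list response: all items from smallest to largest."""
    return [s.name for s in sorted(stimuli, key=lambda s: s.size)]


def two_list_answer(stimuli, category_order=("food", "animal")):
    """Correct 2-list response: bin by category, then order each bin by size.

    The category order is an assumption made for this example.
    """
    answer = []
    for category in category_order:
        group = [s for s in stimuli if s.category == category]
        answer.extend(one_list_answer(group))
    return answer


if __name__ == "__main__":
    items = [
        Stimulus("horse", "animal", 5),
        Stimulus("apple", "food", 1),
        Stimulus("mouse", "animal", 1),
        Stimulus("pumpkin", "food", 3),
    ]
    print(two_list_answer(items))
```

The 2-list task adds a categorization step on top of the size ordering, which is how the test increases WM load without changing the basic instructions.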

Successful WM tests tax the limit of an individual’s storage capacity with a goal of determining when the WM system becomes overloaded. WM tasks often involve multi-tasking activities such as reading a series of sentences and remembering the last word, reciting numbers while performing arithmetic tasks, or remembering a string of numbers and reciting what number occurred two or three numbers back. The two component tasks are often quite different and as such, these activities can be frustrating to the examinee. A strength of the List Sorting Test is that it progressively taxes WM capacity while remaining simple and straightforward. The test can be completed by children older than 7 years of age (Tulsky et al., Reference Tulsky, Carlozzi, Chevalier, Espy, Beaumont and Mungas2013) and by adults through at least age 85.

Unlike traditional tests of WM that focus on either verbal functioning (PL) or visual stimuli (VS), the NIH Toolbox List Sorting Test presents each stimulus in both visual and auditory modalities. The List Sorting Test was designed to provide a general assessment of WM ability rather than provide specific, targeted assessment of the verbal or visual structures underlying WM ability. For a measurement tool like the NIH Toolbox, where the objective is to provide a rapid assessment of WM ability that can be repeated over time, the data presented here support its use for that purpose.

The age-trend data in this study reflect the pattern of gradual decline through adulthood that one would expect from a test of this kind (Dempster, Reference Dempster1981; Gathercole, Pickering, Ambridge, & Wearing, Reference Gathercole, Pickering, Ambridge and Wearing2004; Heaton, Taylor, & Manly, Reference Heaton, Taylor and Manly2003; McAuley & White, Reference McAuley and White2011; Wechsler, Reference Wechsler1952). Performance tends to increase, as expected, through the childhood years, peaking in the early 20s. Test performance then tends to decrease through adulthood. The performance pattern by age mirrors the pattern on other measures of WM as well as other performance-based tests.

Effects of additional demographic factors on the Toolbox List Sorting Working Memory Test were generally small to moderate, and consistent in magnitude with corresponding gold standard measures. Results from the Toolbox List Sorting Test parallel results on other, related cognitive tests (Heaton et al., Reference Heaton, Taylor and Manly2003; Strauss, Sherman, & Spreen, Reference Strauss, Sherman and Spreen2006). Medium effect sizes were found for race/ethnicity and education, a finding that is typical of other WM measures. Furthermore, with regard to ecological validity, the List Sorting Test has small to moderate relationships with current self-reported general health status as well as self-reported history of problems in school.

As predicted, the List Sorting Test demonstrated good test–retest reliability, with an intraclass correlation coefficient of .77. This is comparable to other established tests of WM such as the WAIS-IV Letter Number Sequencing task, where the coefficient was .80. Furthermore, study data provide evidence of convergent and discriminant validity of the List Sorting Test as a measure of WM and executive function and also provide initial support for the construct validity of the List Sorting Test. Convergent validity is seen in the correlation between the List Sorting Test and other measures of WM (i.e., WAIS-IV Letter Number Sequencing and the PASAT). The correlations of r=.57 and r=.50, respectively, provide evidence that these tests are measuring a similar construct. Similarly, discriminant validity is seen in the relatively low correlation between the List Sorting Test and the PPVT, a test of receptive vocabulary (r=.24).

It is important to note that while the List Sorting Test is related to criterion measures of WM, it is also moderately correlated with tests measuring executive functioning, including tests of inhibition, cognitive flexibility, perseveration, and processing speed. This is not a unique finding, as several previous investigators have reported high correlations between WM and executive functioning (Lehto, Reference Lehto1996; Miyake et al., Reference Miyake, Friedman, Emerson, Witzki, Howerter and Wager2000) and other measures of general reasoning (Kyllonen & Christal, Reference Kyllonen and Christal1990). The List Sorting Test had moderately high correlations with the WAIS-IV Processing Speed Index (r=.52) and also with tests of executive functioning (i.e., WCST, DKEFS, DCCS, and Flanker; correlations range from r=.43 to r=.58). Although such overlap is not uncommon for a test of WM, examiners should be aware that the List Sorting Test is related to these additional constructs. Further research will help the research and clinical communities determine if the List Sorting Test is a specific measure of WM or a more global measure of executive functioning.

The validation of a new test is always a process. There is not one study or one statistic that captures construct validity. Originally introduced by Cronbach and Meehl (Reference Cronbach and Meehl1955), the concept of construct validity was updated by Samuel Messick (Reference Messick1980) to entail a process of “marshaling” data from several experiments where the researcher forms hypotheses and uses empirical data to confirm that the test performs as expected. Messick (Reference Messick1991) then went a step further to assert that all types of validity (e.g., content, criterion, discriminant) were really subcomponents of construct validation. The data presented here provide initial empirical support for the NIH Toolbox List Sorting Test as a valid measure of WM. These data can be “marshaled together” with the results presented previously by Tulsky et al. (Reference Tulsky, Carlozzi, Chevalier, Espy, Beaumont and Mungas2013) showing that the List Sorting Test’s performance in children also met the a priori research hypotheses.

At the same time, the results in this manuscript stop short of providing the conclusive validity evidence necessary for usage of the NIH Toolbox List Sorting Test in clinical populations for either clinical trials research or clinical uses. Furthermore, it has yet to be determined if the List Sorting task will measure WM independent of general executive functioning. These limitations should not be regarded as negative properties of the List Sorting Test, but rather as limitations of the data that have been collected to date and a caution until further research is performed. Additional studies are needed to evaluate the performance of this test (and the other NIH Toolbox component tests) in clinical groups. That notwithstanding, the results presented here do provide support for the validity and reliability of the NIH Toolbox List Sorting Test.

Acknowledgments

Funding for this research was provided by contract number N01AG60007 from the National Institutes of Health. The authors have nothing to disclose.

References

Alloway, T.P., Gathercole, S.E., & Pickering, S.J. (2006). Verbal and visuospatial short-term and working memory in children: Are they separable? Child Development, 77(6), 1698–1716.
Awh, E., Jonides, J., Smith, E.E., Schumacher, E.H., Koeppe, R.A., & Katz, S. (1996). Dissociation of storage and rehearsal in verbal working memory: Evidence from positron emission tomography. Psychological Science, 7(1), 25–31.
Baddeley, A. (1986). Working memory. Oxford, England: Oxford University Press.
Baddeley, A. (1987). Working memory (Vol. 11). Gloucestershire: Clarendon Press.
Baddeley, A. (1992). Working memory. Science, 255(5044), 556–559.
Baddeley, A. (1996). Exploring the central executive. Quarterly Journal of Experimental Psychology, 49A, 5–28.
Baddeley, A. (2000). The episodic buffer: A new component of working memory? Trends in Cognitive Sciences, 4(11), 417–423.
Baddeley, A. (2001). Is working memory still working? American Psychologist, 56(11), 851–864.
Baddeley, A. (2002). Is working memory still working? European Psychologist, 7(2), 85–97.
Baddeley, A. (2010). Working Memory. Current Biology, 20(4), R136–R140.
Baddeley, A., & Hitch, G.J. (1974). Working Memory. In G.H. Bower (Ed.), The psychology of learning and motivation: Advances in research and theory (Vol. 8, pp. 47–90). San Diego, CA: Academic Press.
Beatty, W.W., Wilbanks, S.L., Blanco, C.R., Hames, K.A., Tivis, R., & Paul, R.H. (1996). Memory disturbance in multiple sclerosis: Reconsideration of patterns of performance on the selective reminding test. Journal of Clinical and Experimental Neuropsychology, 18(1), 56–62. doi:10.1080/01688639608408262
Belger, A., Puce, A., Krystal, J.H., Gore, J.C., Goldman-Rakic, P., & McCarthy, G. (1998). Dissociation of mnemonic and perceptual processes during spatial and nonspatial working memory using fMRI. Human Brain Mapping, 6(1), 14–32.
Braver, T.S., Cohen, J.D., Nystrom, L.E., Jonides, J., Smith, E.E., & Noll, D.C. (1997). A parametric study of prefrontal cortex involvement in human working memory. Neuroimage, 5, 49–62.
Chiaravalloti, N., Hillary, F., Ricker, J., Christodoulou, C., Kalnin, A., Liu, W.C., … DeLuca, J. (2005). Cerebral activation patterns during working memory performance in multiple sclerosis using FMRI. Journal of Clinical and Experimental Neuropsychology, 27(1), 33–54.
Cohen, J. (1992). A power primer. Psychological Bulletin, 112, 155–159.
Conlin, J.A., Gathercole, S.E., & Adams, J.W. (2005). Children’s working memory: Investigating performance limitations in complex span tasks. Journal of Experimental Child Psychology, 90(4), 303–317.
Conway, A.R., Kane, M.J., Bunting, M.F., Hambrick, D.Z., Wilhelm, O., & Engle, R.W. (2005). Working memory span tasks: A methodological review and user’s guide. Psychonomic Bulletin & Review, 12(5), 769–786.
Courtney, S.M., Ungerleider, L.G., Keil, K., & Haxby, J.V. (1997). Transient and sustained activity in a distributed neural system for human working memory. Nature, 386(6625), 608–611.
Cronbach, L.J., & Meehl, P.E. (1955). Construct validity in psychological tests. Psychological Bulletin, 52, 281–302.
Crone, E.A., Wendelken, C., Donohue, S., van Leijenhorst, L., & Bunge, S.A. (2006). Neurocognitive development of the ability to manipulate information in working memory. Proceedings of the National Academy of Sciences of the United States of America, 103(24), 9315–9320.
Crosson, B., Rao, S.M., Woodley, S.J., Rosen, A.C., Bobholz, J.A., Mayer, A., … Stein, E.A. (1999). Mapping of semantic, phonological, and orthographic verbal working memory in normal adults with functional magnetic resonance imaging. Neuropsychology, 13(2), 171–187.
Crowe, S.F. (2000). Does the letter number sequencing task measure anything more than digit span? Assessment, 7(2), 113–117.
Curtis, C.E., & D’Esposito, M. (2003). Persistent activity in the prefrontal cortex during working memory. Trends in Cognitive Sciences, 7(9), 415–423.
D’Esposito, M., Aguirre, G.K., Zarahn, E., Ballard, D., Shin, R.K., & Lease, J. (1998). Functional MRI studies of spatial and nonspatial working memory. Cognitive Brain Research, 7(1), 1–13.
D’Esposito, M., Onishi, K., Thompson, H., Robinson, K., Armstrong, C., & Grossman, M. (1996). Working memory impairments in multiple sclerosis. Neuropsychology, 10, 51–56.
de Jong, P.F. (1998). Working memory deficits of reading disabled children. Journal of Experimental Child Psychology, 70(2), 75–96.
de Jong, P.F., & Olson, R.K. (2004). Early predictors of letter knowledge. Journal of Experimental Child Psychology, 88(3), 254–273.
Demaree, H.A., DeLuca, J., Gaudino, E.A., & Diamond, B.J. (1999). Speed of information processing as a key deficit in multiple sclerosis: Implications for rehabilitation. Journal of Neurology, Neurosurgery, & Psychiatry, 67(5), 661–663.
Dempster, F.N. (1981). Memory span - Sources of individual and developmental differences. Psychological Bulletin, 89(1), 63–100.
Donders, J., Tulsky, D.S., & Zhu, J. (2001). Criterion validity of new WAIS-III subtest scores after traumatic brain injury. Journal of the International Neuropsychological Society, 7(7), 892–898.
Dunn, L.M., & Dunn, D.M. (2007). Peabody Picture Vocabulary Test - Fourth edition. Minneapolis, MN: NCS Pearson.
Edin, F., Klingberg, T., Johansson, P., McNab, F., Tegner, J., & Compte, A. (2009). Mechanism for top-down control of working memory capacity. Proceedings of the National Academy of Sciences of the United States of America, 106(16), 6802–6807.
Edin, F., Macoveanu, J., Olesen, P., Tegner, J., & Klingberg, T. (2007). Stronger synaptic connectivity as a mechanism behind development of working memory-related brain activity during childhood. Journal of Cognitive Neuroscience, 19(5), 750–760.
Engle, R.W. (1996). Working memory and retrieval: An inhibition-resource approach. In J. Richardson (Ed.), Working memory and human cognition (pp. 89–119). New York: Oxford University Press.
Fuster, J.M. (1989). The prefrontal cortex. New York: Raven Press.
Gathercole, S.E., Pickering, S.J., Ambridge, B., & Wearing, H. (2004). The structure of working memory from 4 to 15 years of age. Developmental Psychology, 40(2), 177–190.
Gershon, R.C., Cella, D., Fox, N.A., Havlik, R.J., Hendrie, H.C., & Wagster, M.V. (2010). Assessment of neurological and behavioural function: The NIH Toolbox. Lancet Neurology, 9(2), 138–139.
Gershon, R.C., Wagster, M.V., Hendrie, H.C., Fox, N.A., Cook, K.F., & Nowinski, C.J. (2013). NIH Toolbox for assessment of neurological and behavioral function. Neurology, 80(Suppl. 3), S2–S6.
Gold, J.M., Carpenter, C., Randolph, C., Goldberg, T.E., & Weinberger, D.R. (1997). Auditory working memory and Wisconsin Card Sorting Test performance in schizophrenia. Archives of General Psychiatry, 54(2), 159–165.
Goldman-Rakic, P.S. (1987). Circuitry of primate prefrontal cortex and regulation of behavior by representational memory. In V.B. Mountcastle (Ed.), Handbook of physiology (pp. 373–417). Bethesda, MD: American Physiological Society.
Grigsby, J., Ayarbe, S., Kravcisin, N., & Busenbark, D. (1994). Working memory impairment among persons with chronic-progressive multiple sclerosis. Journal of Neurology, 241(3), 125–131.
Grigsby, J., Busenbark, D., Kravcisin, N., Kennedy, P.M., & Taylor, D. (1994). Impairment of the working memory system in relapsing-remitting multiple sclerosis. Archives of Clinical Neuropsychology, 9, 134–135.
Gronwall, D.M. (1977). Paced auditory serial-addition task: A measure of recovery from concussion. Perceptual & Motor Skills, 44(2), 367–373.
Grossman, M., Armstrong, C., Onishi, K., Thompson, H., Schaefer, B., Robinson, K., … Silberberg, D. (1994). Patterns of cognitive impairment in relapsing-remitting and chronic progressive multiple-sclerosis. Neuropsychiatry, Neuropsychology, and Behavioral Neurology, 7(3), 194–210.
Hasher, L., Stolzfus, E.R., Zacks, R.T., & Rypma, B. (1991). Age and inhibition. Journal of Experimental Psychology: Learning, Memory, and Cognition, 17(1), 163–169.
Haut, M.W., Kuwabara, H., Leach, S., & Arias, R.G. (2000). Neural activation during performance of number-letter sequencing. Applied Neuropsychology, 7(4), 237–242.
Hawkins, K.A. (1998). Indicators of brain dysfunction derived from graphic representations of the WAIS-III/WMS-III Technical Manual clinical samples data: A preliminary approach to clinical utility. The Clinical Neuropsychologist, 12(4), 535–551.
Heaton, R.K., Taylor, M.J., & Manly, J. (2003). Demographic effects and use of demographically corrected norms with the WAIS-III and WMS-III. In D.S. Tulsky (Ed.), Clinical interpretation of the WAIS-III and WMS-III (pp. 181–210). San Diego: Academic Press.
Hitch, G.J., Towse, J.N., & Hutton, U. (2001). What limits children’s working memory span? Theoretical accounts and applications for scholastic development. Journal of Experimental Psychology-General, 130(2), 184–198.
Hodes, R.J., Insel, T.R., & Landis, S.C. (2013). The NIH Toolbox: Setting a standard for biomedical research. Neurology, 80(Suppl. 3), S1.
Hofmann, W., Schmeichel, B.J., & Baddeley, A.D. (2012). Executive functions and self-regulation. Trends in Cognitive Sciences, 16(3), 174–180. doi:10.1016/j.tics.2012.01.006
Jacobsen, C.F. (1935). Functions of frontal association area in primates. Archives of Neurology and Psychiatry, 33(3), 558–569.
Jonides, J. (1995). Working memory and thinking. In E.E. Smith & D.N. Osherson (Eds.), Invitation to cognitive science: Thinking (2nd ed., Vol. 3, pp. 215–265). Cambridge, MA: MIT Press.
Kirasic, K., Allen, G., Dobson, S., & Binder, K. (1996). Aging, cognitive resources, and declarative learning. Psychology and Aging, 11, 658–670.
Klingberg, T., Forssberg, H., & Westerberg, H. (2002). Increased brain activity in frontal and parietal cortex underlies the development of visuospatial working memory capacity during childhood. Journal of Cognitive Neuroscience, 14(1), 1–10.
Klingberg, T., Kawashima, R., & Roland, P.E. (1996). Activation of multi-modal cortical areas underlies short-term memory. European Journal of Neuroscience, 8(9), 1965–1971.
Kongs, S.K., Thompson, L.L., Iverson, G.L., & Heaton, R. (2000). Wisconsin card sorting test - 64 card version: Professional manual. Odessa, FL: Psychological Assessment Resources.
Kwon, H., Reiss, A.L., & Menon, V. (2002). Neural basis of protracted developmental changes in visuo-spatial working memory. Proceedings of the National Academy of Sciences of the United States of America, 99(20), 13336–13341.
Kyllonen, P.C., & Christal, R.E. (1990). Reasoning ability is (little more than) working-memory capacity? Intelligence, 14, 389–433.
Lehto, J. (1996). Are executive function tests dependent on working memory capacity? The Quarterly Journal of Experimental Psychology, 49A(1), 29–50.
Linden, D.E. (2007). The working memory networks of the human brain. Neuroscientist, 13(3), 257–267.
Litvan, I., Grafman, J., Vendrell, P., & Martinez, J.M. (1988). Slowed information processing in multiple sclerosis. Archives of Neurology, 45(3), 281–285.
Logie, R.H. (1996). The seven ages of working memory. In J. Richardson (Ed.), Working memory and human cognition (pp. 31–65). New York: Oxford University Press.
Manoach, D.S., Schlaug, G., Siewert, B., Darby, D.G., Bly, B.M., Benfield, A., … Warach, S. (1997). Prefrontal cortex fMRI signal changes are correlated with working memory load. Neuroreport, 8(2), 545–549.
Martin, T.A., Donders, J., & Thompson, E. (2000). Potential of and problems with new measures of psychometric intelligence after traumatic brain injury. Rehabilitation Psychology, 45(4), 402–408.
McAllister, T.W., Flashman, L.A., Sparling, M.B., & Saykin, A.J. (2004). Working memory deficits after traumatic brain injury: Catecholaminergic mechanisms and prospects for treatment -- A review. Brain Injury, 18(4), 331–350.
McAllister, T.W., Saykin, A.J., Flashman, L.A., Sparling, M.B., Johnson, S.C., Guerin, S.J., … Yanofsky, N. (1999). Brain activation during working memory 1 month after mild traumatic brain injury: A functional MRI study. Neurology, 53(6), 1300–1308.
McAuley, T., & White, D.A. (2011). A latent variables examination of processing speed, response inhibition, and working memory during typical development. Journal of Experimental Child Psychology, 108(3), 453–468.
McCabe, D.P., Roediger, H.L., McDaniel, M.A., Balota, D.A., & Hambrick, D.Z. (2010). The relationship between working memory capacity and executive functioning: Evidence for a common executive attention construct. Neuropsychology, 24, 222–243.
Messick, S. (1980). Test validity and the ethics of assessment. American Psychologist, 35(11), 1012–1027.
Messick, S. (1991). Validity of test interpretation and use. In M.C. Alkin (Ed.), Encyclopedia of educational research (6th ed.), New York: Macmillan.
Miller, G.A. (1956). The magical number seven plus or minus two: Some limits on our capacity for processing information. Psychological Review, 63(2), 8197.CrossRefGoogle ScholarPubMed
Milner, B. (1964). Some effects of frontal lobectomy in man. In J.M. Warren & K. Akert (Eds.), The frontal granular cortex and behavior (pp. 313334). New York: McGraw-Hill.Google Scholar
Miyake, A., Friedman, N.P., Emerson, M.J., Witzki, A.H., Howerter, A., & Wager, T.D. (2000). The unity and diversity of executive functions and their contributions to complex “frontal lobe” tasks: A latent variable analysis. Cognitive Psychology, 41, 49100.CrossRefGoogle ScholarPubMed
Miyake, A., & Shah, P. (Eds.) (1999). Models of working memory: Mechanisms of active maintenance and executive control. New York: Cambridge University Press.CrossRefGoogle Scholar
Mungas, D., Reed, B.R., Marshall, S.C., & Gonzalez, H.M. (2000). Development of psychometrically matched English and Spanish language neuropsychological tests for older persons. Neuropsychology, 14(2), 209223.CrossRefGoogle ScholarPubMed
Newell, A. (1973). Productions systems: Models of control structures. In W.G. Chase (Ed.), Visual information processing (pp. 463526). New York: Academic Press.CrossRefGoogle Scholar
Newell, A., & Simon, H.A. (1972). Human problem solving. Englewood Cliffs, NJ: Prentice-Hall.Google Scholar
Orellana, G., & Slachevsky, A. (2013). Executive functioning in schizophrenia. Frontiers in Psychiatry, 4, 35. doi:10.3389/fpsyt.2013.00035CrossRefGoogle ScholarPubMed
Owen, A.M., Evans, A.C., & Petrides, M. (1996). Evidence for a two-stage model of spatial working memory processing within the lateral frontal cortex: A positron emission tomography study. Cerebral Cortex, 6(1), 3138.CrossRefGoogle ScholarPubMed
Paulesu, E., Frith, C.D., & Frackowiak, R.S. (1993). The neural correlates of the verbal component of working memory. Nature, 362(6418), 342345.CrossRefGoogle ScholarPubMed
Pelphrey, K.A., & Reznick, J.S. (2003). Working memory in infancy. Advances in Child Development and Behavior, 31, 173227.CrossRefGoogle ScholarPubMed
Rao, S.M., Grafman, J., DiGiulio, D., Mittenberg, W., Bernardin, L., Leo, G.J., Unverzagt, F. (1993). Memory dysfunction in multiple sclerosis: Its relation to working memory, semantic encoding and implicit learning. Neuropsychology, 7(3), 364374.CrossRefGoogle Scholar
Repovs, G., & Bresjanak, M. (2006). Cognitive neuroscience of working memory. Neuroscience, 139, 1413.CrossRefGoogle ScholarPubMed
Riggs, K., McTaggart, J., & Simpson, A. (2006). Changes in the capacity of visual working memory in 5- to 10-year-olds. Journal of Experimental Child Psychology, 95, 18–26.
Salmon, E., Van der Linden, M., Collette, F., Delfiore, G., Maquet, P., Degueldre, C., Franck, G. (1996). Regional brain activity during working memory tasks. Brain, 119(Pt 5), 1617–1625.
Salthouse, T. (1994). The aging of working memory. Neuropsychology, 8(4), 535–543.
Salthouse, T., Mitchell, D.R., Skovronek, E., & Babcock, R.L. (1989). Effects of adult age and working memory on reasoning and spatial abilities. Journal of Experimental Psychology: Learning, Memory, and Cognition, 15(3), 507–516.
Scherf, K.S., Sweeney, J.A., & Luna, B. (2006). Brain basis of developmental change in visuospatial working memory. Journal of Cognitive Neuroscience, 18(7), 1045–1058.
Schmiedek, F., Hildebrandt, A., Lovden, M., Lindenberger, U., & Wilhelm, O. (2009). Complex span versus updating tasks of working memory: The gap is not that deep. Journal of Experimental Psychology: Learning, Memory, and Cognition, 35(4), 1089–1096.
Seidman, L.J., Breiter, H.C., Goodman, J.M., Goldstein, J.M., Woodruff, P.W.R., & Rosen, B.R. (1998). A functional magnetic resonance imaging study of auditory vigilance with low and high information processing demands. Neuropsychology, 12(4), 505–518.
Smith, E.E., Jonides, J., & Koeppe, R.A. (1996). Dissociating verbal and spatial working memory using PET. Cerebral Cortex, 6(1), 11–20.
Strauss, E., Sherman, E.M.S., & Spreen, O. (2006). A compendium of neuropsychological tests: Administration, norms, and commentary (3rd ed.). New York: Oxford University Press.
Tulsky, D.S., Carlozzi, N.E., Chevalier, N., Espy, K., Beaumont, J., & Mungas, D. (2013). NIH Toolbox Cognitive Function Battery (CFB): Measuring working memory. Society For Research In Child Development, Monograph, 78(4), 70–87.
Tulsky, D.S., Saklofske, D.H., & Zhu, J. (2003). Revising a Standard: Evaluation of the Origin and Development of the WAIS-III. Clinical Interpretation of the WAIS-III and WMS-III (pp. 43–92). San Diego: Elsevier Science.
Van Snellenberg, J.X. (2009). Working memory and long-term memory deficits in schizophrenia: Is there a common substrate? Psychiatry Research, 174(2), 89–96.
Vugs, B., Hendriks, M., Cuperus, J., & Verhoeven, L. (2014). Working memory performance and executive function behaviors in young children with SLI. Research in Developmental Disabilities, 35(1), 62–74.
Wechsler, D. (1952). The range of human capacities. Baltimore: The Williams & Wilkins Company.
Wechsler, D. (1997). WAIS-III Administration and Scoring Manual. San Antonio: The Psychological Corporation.
Wechsler, D. (2008). Wechsler Adult Intelligence Scale IV. San Antonio: Harcourt Assessment Inc.
Weintraub, S., Dikmen, S.S., Heaton, R.K., Tulsky, D.S., Zelazo, P.D., Bauer, P.J., Gershon, R.C. (2013). Cognition assessment using the NIH Toolbox. Neurology, 80(11 Suppl. 3), S54–S64.
Willmott, C., Ponsford, J., Hocking, C., & Schönberger, M. (2009). Factors contributing to attentional impairments after traumatic brain injury. Neuropsychology, 23(4), 424–432.
Woods, S.P., Moore, D.J., Weber, E., & Grant, I. (2009). Cognitive neuropsychology of HIV-associated neurocognitive disorders. Neuropsychology Review, 19(2), 152–168.
You, S.C., Geschwind, M.D., Sha, S.J., Apple, A., Satris, G., Wood, K.A., Possin, K.L. (2013). Executive functions in premanifest Huntington’s disease. Movement Disorders, 29(Suppl. 3), 405–409.
Zelazo, P.D., Andersen, J., Richler, J., Wallner-Allen, K., Beaumont, J., & Weintraub, S. (2013). NIH Toolbox Cognitive Function Battery (CFB): Measuring executive function and attention. Society For Research In Child Development, Monograph, 78(4), 16–33.

Fig. 1 Examples of One-List and Two-List List Sorting Task. 1-List List Sorting requires participants to sequence items according to a single category, whereas 2-List List Sorting requires sequencing that involves an alternation between two different categories. The above is a sample NIH Toolbox List Sorting Test item. All NIH Toolbox-related materials are ©2012 Northwestern University and the National Institutes of Health.

Fig. 2 Toolbox List Sorting Working Memory Test scores by age (means ± 2 standard errors).

Table 1 Effect sizes (ES) for comparisons of scores between groups

Table 2 Convergent and discriminant validity of the Toolbox List Sorting Working Memory Test