Introduction
Working memory (WM) is the set of mental processes that enables manipulation of information stored within short-term memory, and provides an interface between sensory perception, long-term memory, and active interaction with one’s environment (Baddeley, 2012; Conway, Jarrold, Kane, Miyake, & Towse, 2007; Miyake & Shah, 1999). This transient storage and active manipulation of goal-relevant information facilitates higher-order cognitive processes, such as reasoning, comprehension, planning and learning (Baddeley, 1986; D’Esposito, 2007; Just & Carpenter, 1992; Was, Dunlosky, Bailey, & Rawson, 2012). The prominent role of WM in diverse cognitive processes has motivated research investigating WM dysfunction across a range of psychiatric and neurologic disorders, including schizophrenia, attention deficit hyperactivity disorder, dementia, and traumatic brain injury (Gagnon & Belleville, 2011; Gorman, Barnes, Swank, Prasad, & Ewing-Cobbs, 2012; Kim et al., 2009; Schweitzer et al., 2000). However, disparate WM theories and approaches to its study have resulted in incongruities in our understanding of its components, functions and dynamics (Baddeley, Banse, Huang, & Page, 2012; Conway et al., 2007; Miyake & Shah, 1999).
Thus, we must first improve our characterization of the neural encoding of normative population variance in WM to provide a framework by which we may define the neural processing variance associated with cognitive and behavioral dysfunction in clinical groups.
Functional neuroimaging has played a pivotal role in refining the cognitive construct and neural representation of WM (Champod & Petrides, 2010; Smith & Jonides, 1997). The n-back task is arguably the most widely used functional neuroimaging paradigm for investigating the neural basis of WM due to its ability to produce robust and consistent neuroactivations and to parametrically vary memory load demands (Braver et al., 1997; Cohen et al., 1997; Jonides et al., 1997; Kane & Engle, 2002). However, despite the n-back task’s strong face validity as a measure of WM, its construct validity as a WM measure has been inconsistently established. Some studies report strong convergent validity between n-back performance and WM-related processes such as WM capacity (Schmiedek, Hildebrandt, Lövdén, Wilhelm, & Lindenberger, 2009; Shamosh et al., 2008; Shelton, Elliott, Hill, Calamia, & Gouvier, 2009; Shelton, Metzger, & Elliott, 2007), various executive functions (Ciesielski, Lesnik, Savoy, Grant, & Ahlfors, 2006), and/or general and fluid intelligence (Colom, Abad, Quiroga, Shih, & Flores-Mendoza, 2008; Gevins & Smith, 2000; Gray, Chabris, & Braver, 2003; Jaeggi, Buschkuehl, Perrig, & Meier, 2010).
However, others have reported weak or negligible correlations between n-back performance and WM-related measures (Friedman et al., 2006; Jaeggi et al., 2010; Kane, Conway, Miura, & Colflesh, 2007), instead associating n-back with simple short-term memory (Oberauer, 2005; Roberts & Gibson, 2002), or concluding that the complexity of n-back tasks requires a combination of processes not easily disentangled or characterized through comparison of performance on existing cognitive tests (Jaeggi et al., 2010).
The inconsistency of these findings limits the interpretability of the n-back as a reliable probe of WM (as mentioned by Conway et al., 2005; Miller, Price, Okun, Montijo, & Bowers, 2009). Thus, the present study sought to validate the n-back as a reliable WM probe by demonstrating its cognitive-behavioral specificity for a domain-general WM construct. We use the letter variant of the n-back task (LNB), generally considered a measure of verbal WM (Cohen et al., 1997; Owen, McMillan, Laird, & Bullmore, 2005). To establish LNB’s construct validity, we correlate LNB task performance with a diverse battery of (1) clinically validated neuropsychological (NP) measures of WM (i.e., auditory, verbal, and visuospatial), to establish convergent validity, and (2) measures related to but conceptually distinct from WM, to establish discriminant validity. We sought to control for the effects of paradigm design-specific variance on the validity relationships by administering NP instruments that included separate subtests which did and did not measure WM, thereby allowing within-instrument measures of convergent and discriminant validity with the n-back.
Furthermore, while several brain regions have been shown to be consistently recruited by the LNB task, it remains unclear how these regions are integrated into larger functional networks. In a 2005 quantitative meta-analysis, 24 normative functional neuroimaging studies of several n-back task variants differing in WM process (i.e., location vs. identity monitoring) and content (i.e., verbal vs. non-verbal material) were compiled to investigate the neuroanatomic representation of WM (Owen et al., 2005). Seven brain regions were identified as consistently activated across all studies, regardless of task variant, including six cortical regions: (1) bilateral and medial posterior parietal cortex; (2) bilateral premotor cortex; (3) dorsal cingulate/medial premotor cortex; (4) bilateral rostral prefrontal cortex; (5) bilateral dorsolateral prefrontal cortex; (6) bilateral mid-ventrolateral prefrontal cortex; and (7) the medial cerebellum. Owen et al. provided strong evidence for the consistent involvement of core frontal and parietal cortical regions across variants of the n-back task, as well as identifying differential subregional and lateralized activation patterns for process- and content-specific task differences.
We sought to expand upon Owen et al.’s findings by identifying the LNB’s network-level neural correlates of WM using independent component analysis (ICA), a data-driven statistical method for identifying functionally connected networks of brain regions (Calhoun, Adali, Pearlson, & Pekar, 2002; McKeown et al., 1998). While previous studies have sought to establish the neural correlates of this task using ICA-based approaches, all have either studied dysfunctional network organization in patients (Cousijn et al., 2014; Nejad et al., 2013; Palacios et al., 2012; Penadés et al., 2013), the organization of the default mode network (DMN) (Esposito et al., 2009, 2006; Sambataro et al., 2010), or focused on neuroimaging method development (Haller, Homola, Scheffler, Beckmann, & Bartsch, 2009; Missonnier et al., 2003). The present study aims to characterize normative neural networks recruited by the LNB task and their relationships to task demand, thereby testing the hypothesis that the LNB and its neural processing networks fulfill convergent and discriminant validity as a WM demand.
METHODS
Subjects
Fifty-two participants [29 female; mean (SD) age=32 (10) years] were recruited via community advertisements in accordance with University of Arkansas for Medical Sciences (UAMS) Institutional Review Board approval and oversight. We strove to recruit a demographically diverse sample of participants by posting flyers and banners on city buses, at local eateries, and in community centers, in addition to posting at the university. Inclusion criteria for the study were native English-speaking adults ages 18–50 without history of psychiatric or neurologic illness. Exclusion criteria included the presence of any DSM-IV psychiatric disorders (except nicotine dependence) as determined by structured clinical interview (SCID-NP) (First, Spitzer, Gibbon, & Williams, 2002), ferromagnetic implants and other contraindications to the high-field MRI environment (determined through medical history), and pregnancy (determined through urinalysis). Eighteen participants were excluded from analyses due to incomplete fMRI or NP data (n=11) or excessive head motion artifact (n=7; see the METHODS, Image Acquisition and Processing, for details). Analyses were conducted on the remaining 34 participants [22 female; mean (SD) age=32 (10) years; range 19–50 years; 31 right-handed; see Table 1 for full demographic information].
† Includes one participant self-reporting as both African-American and Caucasian
Procedures
The data included in the present study were collected as a subset of a larger, multifaceted initiative known as the “Cognitive Connectome (Cognectome)” which seeks to comprehensively map the brain’s functional and structural encoding of individual variation in cognitive and behavioral abilities across cognitive modalities. All procedures were conducted at the Brain Imaging Research Center (BIRC) in the Psychiatric Research Institute at UAMS. Participants first underwent a telephone interview to establish inclusion criteria. Eligible participants were invited to the BIRC for an intake session where they provided written informed consent and underwent SCID-NP assessment and medical evaluation to assess exclusion criteria. Eligible participants underwent two MRI sessions (1 hr each), a battery of computerized assessments (1 hr), and comprehensive NP assessment (3–4 hr). Intakes were conducted in the morning, fMRI sessions in the afternoon (between 1 p.m. and 5 p.m.), and NP assessment (due to length) in morning or afternoon at participants’ convenience. Participants were compensated $25 for completion of each of the four sessions (Intake, one NP, two fMRI), in addition to compensation for parking or bus fare.
Neuropsychological (NP) Assessments
NP instruments were administered per standardized instructions. Administrators were trained by a board-certified clinical neuropsychologist. The following tests were selected from the larger Cognectome test battery for having at least one subscale accepted as a measure of WM and at least one subscale that was not, thus permitting exploration of convergent and discriminant validity. Two additional tests, the Halstead-Reitan Finger Tapping Test and the Boston Naming Test, were included as stand-alone measures of discriminant validity.
Digit Span Test (WAIS-IV)
The Digit Span Test of the Wechsler Adult Intelligence Scales-Fourth Edition (WAIS-IV) was designed to measure span of auditory attention and verbal WM and was administered and scored per standardized instructions (Wechsler, 2008). This version of the Digit Span Test includes Digit Span Forward (DSF), Digit Span Backward (DSB), and Digit Span Sequencing (DSS), which require oral repetition, reversal, and sequencing of number strings, respectively.
Spatial Span Test (WMS-III)
The Wechsler Memory Scale-Third Edition (WMS-III) Spatial Span Test was designed to measure span of visuospatial attention and visuospatial WM and was administered and scored per standardized instructions (Wechsler, 1997). The Spatial Span Test includes Spatial Span Forward (SSF) and Spatial Span Reverse (SSR), which require manual repetition and reversal of visuospatial sequences, respectively.
Test of Everyday Attention (TEA)
The Test of Everyday Attention (TEA) consists of eight subtests designed to measure three major features of attention (selective attention, sustained attention, and attentional switching) as well as auditory-verbal WM (Robertson, Ward, Ridgeway, & Nimmo-Smith, 1994, 1996). Five of the eight subtests were included in the Cognectome test battery and were administered and scored per standardized instructions (Robertson et al., 1994). These included: [TEA 1] Map Search (selective visual attention), [TEA 2-3] Elevator Counting and Elevator Counting with Distraction (sustained attention and auditory-verbal WM, respectively) and [TEA 4-5] Elevator Counting with Reversal using visual or auditory stimuli (attentional switching and auditory-verbal WM, respectively).
D-KEFS Trail-Making Test (TMT)
The Delis-Kaplan Executive Function System (D-KEFS) Trail-Making Test (TMT) was administered and scored per standardized instructions and included five self-paced subtests designed to measure cognitive flexibility/set-shifting (an executive subprocess of WM) while also allowing for the disambiguation of constituent processes embedded in the higher-level task performances (i.e., visuomotor speed, selective visual attention, and temporal sequencing; Delis, Kaplan, & Kramer, 2001). The five self-paced TMT subtests included: [1] Visual Scanning (visuomotor speed, selective visual attention), [2-3] Number Sequencing and Letter Sequencing (visuomotor speed, selective visual attention, temporal sequencing of numerical or alphabetical stimuli, respectively), [4] Number-Letter Sequencing (visuomotor speed, selective visual attention, complex temporal sequencing, cognitive flexibility/set shifting), and [5] Motor Speed (visuomotor speed/agility).
Halstead-Reitan Finger Tapping Test (FTT)
The Halstead-Reitan Finger Tapping Test (FTT) was designed to provide a measure of simple motor speed and was administered and scored per standardized instructions (Reitan & Wolfson, 1985).
Boston Naming Test-2 (BNT)
The Boston Naming Test-2 (BNT) is a picture naming task designed to measure visual confrontational naming, which involves processes such as semantic fluency, lexical retrieval, and speech production. It was administered and scored per standardized instructions (Kaplan, Goodglass, & Weintraub, 2001).
Letter n-back (LNB) fMRI Task
The LNB task was conducted as a block design using Presentation 14.4 (Neurobehavioral Systems, Inc.) and consisted of alternating blocks of 0-back (sensorimotor and sustained attention control) and 2-back (WM) conditions. Task blocks involved the random sequential presentation of uppercase letter stimuli (A-E), with each trial lasting a total of 1500 ms (letter presentation for 1200 ms or until participant made a response, followed by fixation cross presented for remainder of the trial). During 0-back blocks, participants were instructed to respond whenever the letter “A” was shown. For 2-back blocks, participants were instructed to respond if the currently presented letter matched the letter presented two letters prior. Twenty-five percent of trials from each condition were coded as target trials warranting a response. Before fMRI scanning, all participants practiced the task outside the MRI scanner to ensure task comprehension.
The first five participants underwent alternating 90 s blocks of 0-back and 2-back conditions (three blocks each; six total). Each block was preceded by a 6 s Instruction block indicating task condition (“0-back” or “2-back”) and followed by a 20 s Rest (baseline) block consisting of a static fixation cross, for a total duration of 11.1 min. Interim analysis showed comparable neural responses when each task condition was reduced to blocks with shorter total durations, prompting a task redesign to reduce participant fatigue. Subsequent participants underwent alternating 40 s blocks of 0-back and 2-back trials (four blocks each; eight total), which were now preceded by a 5 s Instruction block and followed by a 15 s Rest block for a total task time of 7.3 min.
Performance on the LNB was computed using the Critical Success Index (CSI), a modified estimate of percent accuracy given by the number of hits (correct intentional responses) divided by the sum of hits, false alarms (incorrect intentional responses), and misses (incorrect intentional non-responses) (Wilks, 2011). CSI was preferred over standard percent accuracy because correct intentional non-responses (“rejections”) cannot be discriminated from correct unintentional non-responses when using standard percent accuracy calculations. This ambiguity inflates accuracy estimates, especially for designs with a high percentage of non-response trials, such as the present study (~75% of trials). Thus, the CSI provided a performance measure less biased by the ambiguity of non-response trials (Wilks, 2011). However, for simplicity, we will still refer to the performance scores as “accuracy” or “percent accuracy” throughout the manuscript.
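The CSI definition above can be illustrated with a minimal Python sketch. The function names and the trial counts are hypothetical and chosen only for illustration; they are not taken from the study data. The example contrasts CSI with standard percent accuracy for a block in which 25% of trials are targets.

```python
def critical_success_index(hits: int, false_alarms: int, misses: int) -> float:
    """CSI = hits / (hits + false alarms + misses); correct rejections are ignored."""
    denominator = hits + false_alarms + misses
    if denominator == 0:
        raise ValueError("CSI is undefined without hits, false alarms, or misses")
    return hits / denominator


def percent_accuracy(hits: int, false_alarms: int, misses: int,
                     correct_rejections: int) -> float:
    """Standard accuracy: all correct trials (hits + rejections) over all trials."""
    total = hits + false_alarms + misses + correct_rejections
    return (hits + correct_rejections) / total


# Hypothetical block: 40 trials, 10 targets (25%) and 30 non-targets.
hits, misses = 6, 4                       # responses on the 10 target trials
false_alarms, correct_rejections = 2, 28  # responses on the 30 non-target trials

csi = critical_success_index(hits, false_alarms, misses)
acc = percent_accuracy(hits, false_alarms, misses, correct_rejections)
print(f"CSI = {csi:.2f}, percent accuracy = {acc:.2f}")
```

In this made-up block the many non-response trials pull standard accuracy well above the CSI, which is the inflation the CSI is meant to avoid.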
Construct Validity Analysis of the LNB Task
LNB construct validity was determined using bootstrapping to estimate correlations of LNB performance with each NP assessment. Considering our relatively small sample compared to other behavioral and/or construct validation studies, bootstrapping provided a statistical approach more robust against small sample size than simple correlations alone (Cumming, 2008; Efron & Tibshirani, 1993; Gardner & Altman, 1986; Young & Lewis, 1997). This is because it involves the simulation of an empirical distribution of correlation estimates that represents the true (population) distribution, followed by the evaluation of the stability of these approximations using confidence intervals (CI), a more interpretable (i.e., generalizable) statistical measure of significance than p-values (Cumming, 2008; Efron & Tibshirani, 1993; Gardner & Altman, 1986; Young & Lewis, 1997). Bootstrapping provided estimates of the means, standard errors (SE) and 95% CI for correlations between LNB performance and each NP assessment, as follows. First, 34 subjects were randomly selected with replacement from the observed sample to form a bootstrap resample; this process was repeated 1000 times to form 1000 34-subject resamples. Partial correlations were calculated (controlling for age and education) between LNB accuracy and the raw scores on each NP subscale for each of the 1000 resamples, forming an empirical distribution of the correlation estimates, from which a mean, SE, and 95% CI were calculated. LNB was interpreted as having discriminant or convergent validity with an NP test if the 95% CI of its bootstrapped correlation did or did not include zero, respectively. All analyses of behavioral performance were conducted using Matlab 7.10 (The MathWorks, Inc.).
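The resampling procedure described above can be sketched in Python as follows. This is an illustrative sketch only, not the authors' Matlab code: the rank-based (Spearman) partial correlation computed by the recursive formula, the percentile method for the 95% CI, the function names, and the synthetic data are all assumptions made for the example.

```python
import math
import random
import statistics


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(a, b):
    mean_a, mean_b = statistics.fmean(a), statistics.fmean(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def partial_out(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y given z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def partial_spearman(x, y, z1, z2):
    """Rank correlation of x and y, controlling for z1 and z2 (assumed method)."""
    rx, ry, r1, r2 = (average_ranks(v) for v in (x, y, z1, z2))
    # Remove z1 from each pair first, then remove z2 from x and y.
    xy_1 = partial_out(pearson(rx, ry), pearson(rx, r1), pearson(ry, r1))
    x2_1 = partial_out(pearson(rx, r2), pearson(rx, r1), pearson(r2, r1))
    y2_1 = partial_out(pearson(ry, r2), pearson(ry, r1), pearson(r2, r1))
    return partial_out(xy_1, x2_1, y2_1)


def bootstrap_correlation(x, y, z1, z2, n_resamples=1000, seed=0):
    """Return (mean, SE, (CI low, CI high)) of the bootstrapped correlation."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]  # resample subjects with replacement
        estimates.append(partial_spearman(
            [x[i] for i in idx], [y[i] for i in idx],
            [z1[i] for i in idx], [z2[i] for i in idx]))
    estimates.sort()
    low = estimates[int(0.025 * n_resamples)]        # simple percentile interval
    high = estimates[int(0.975 * n_resamples) - 1]
    return statistics.fmean(estimates), statistics.stdev(estimates), (low, high)


# Synthetic data for 34 hypothetical subjects (not study data).
data_rng = random.Random(42)
age = [data_rng.uniform(19, 50) for _ in range(34)]
education = [data_rng.uniform(10, 20) for _ in range(34)]
lnb = [data_rng.gauss(0.6, 0.2) for _ in range(34)]
np_score = [10 * a + data_rng.gauss(0, 1) for a in lnb]

mean_r, se_r, (ci_low, ci_high) = bootstrap_correlation(lnb, np_score, age, education)
converges = not (ci_low <= 0 <= ci_high)  # CI excluding zero: convergent validity
print(f"mean rho = {mean_r:.2f}, SE = {se_r:.2f}, 95% CI = [{ci_low:.2f}, {ci_high:.2f}]")
```

Under the decision rule stated above, a CI that excludes zero would be read as convergent validity and a CI that includes zero as discriminant validity.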
Image Acquisition and Processing
Participants were scanned using a Philips 3T Achieva X-series MRI scanner (Philips Healthcare, USA). Anatomic images were acquired with a magnetization-prepared rapid gradient echo (MPRAGE) sequence (matrix=256×256, 160 sagittal slices, repetition time/echo time/flip angle [TR/TE/FA]=2600 ms/3.05 ms/8°, final resolution=1×1×1 mm³). Functional images were acquired for 23 participants using an 8-channel head coil with an echo planar imaging sequence [TR/TE/FA=2000 ms/30 ms/90°, field of view=240×240 mm, matrix=80×80, 37 oblique slices (parallel to orbitofrontal cortex to reduce sinus artifact), slice thickness=4 mm, interleaved slice acquisition, final resolution 3×3×4 mm³]. Following an equipment upgrade, functional data for the 11 remaining participants were acquired with a Philips 32-channel head coil (Philips Healthcare, USA). The same image acquisition parameters were used, except with thinner slices (slice thickness=2.5 mm with 0.5 mm gap) and a sequential ascending slice acquisition. The thinner slices were selected to reduce orbitofrontal signal loss caused by sinus cavity artifact.
MRI data preprocessing was performed using AFNI version 2011_12_21_1014 (Cox, 1996). Anatomic data underwent skull stripping, spatial normalization to the ICBM 452 brain atlas and segmentation into white matter, gray matter, and cerebrospinal fluid (CSF) using FSL v5.0.4 (Jenkinson, Beckmann, Behrens, Woolrich, & Smith, 2012). The functional data underwent despiking; slice time correction; deobliquing (to 3×3×3 mm³ voxels); head motion correction; transformation to the spatially normalized anatomic image; regression of motion parameters, mean timecourse of white matter voxels, and mean timecourse of CSF voxels; spatial smoothing with a 6 mm FWHM Gaussian kernel; and scaling to percent signal change.
After preprocessing, ICA was conducted using the Group ICA of fMRI Toolbox (GIFT v1.3) for Matlab (Calhoun, Adali, Pearlson, & Pekar, 2001) to identify and remove sources of signal caused by head motion (Tohka et al., 2008; see METHODS, Independent Component Analysis and General Linear Modeling for detailed ICA procedure). Head motion artifact manifests in the functional data as alternating “bands” or “stripes” of activity corresponding to the order of slice acquisition. For each subject, ICA solved for the optimal number of components as determined by GIFT’s MDL algorithm (typically 150–200 components). Because the pattern of slice acquisition (e.g., all even slices or all odd slices) does not represent biologically relevant brain activity, a liberal threshold (r>0.05) was used to identify components that correlated with slice acquisition. These components were removed from the preprocessed functional data using the “icatb_removeArtifact.m” command in Matlab. Motion artifact was assessed before and after ICA “stripe” removal using single voxel seed-based correlation analyses via AFNI’s “InstaCorr” function. Seven subjects still demonstrated “striping” after ICA removal and were thus excluded from further analysis. Two additional subjects were excluded because excessive signal loss in the orbitofrontal cortex resulted in poor normalization of the functional data to the ICBM 452 template.
Independent Component Analysis and General Linear Modeling
Independent Component Analysis (ICA)
ICA was performed on the LNB fMRI data using Matlab’s GIFT. GIFT uses a two-step data-reduction ICA approach. The first step of the GIFT ICA procedure performs a principal component analysis (PCA) on each individual participant to reduce the dimensionality of each fMRI dataset into subject-specific principal components. The second step performs ICA upon these subjects’ principal components by first concatenating the subject-specific components into a group and then identifying group-level independent components (i.e., components that are consistently represented across all subjects). For the first step, PCAs were performed using GIFT’s expectation maximization and stacked datasets options. For the second step, GIFT’s ICASSO3 toolbox was used to determine the reliability of the ICA components across iterations. ICASSO3 was repeated 20 times using the Infomax algorithm to calculate the stability index (iq) of each component, a measure of how reliably each component was reproduced in the sample across ICASSO iterations.
The GIFT ICA procedure defines components by their spatial independence, which requires that the spatial distributions of the components be independent of one another (Beckmann, 2012; McKeown et al., 1998). Thus, any given brain region is capable of contributing to multiple components (for instance, when the same brain region is transiently recruited by several discrete neural processes throughout the course of the task) as long as the overall spatiotemporal maps are statistically spatially distinct (Calhoun et al., 2001; Xu, Potenza, & Calhoun, 2013). However, ICA model order selection (i.e., the number of components solved for) can greatly impact the extent of segregation and/or overlap of the components’ spatiotemporal maps by altering the stringency of their spatial independence (Calhoun et al., 2001; Ray et al., 2013). Yet methods for determining the optimal number of components for a given sample are either still in debate or in active development (Li, Adali, & Calhoun, 2007; Ray et al., 2013). Thus, the selection of the present study’s ICA model order was motivated by an empirical evaluation of the component quality across several different ICA procedures within our sample, as follows.
Four separate ICAs were conducted (solving for the following combinations of PCA/ICA components: 40/20, 50/25, 60/30, and 70/35) to find the model with both the best component stability (given by the iq estimate) and the best replication of well-supported (“canonical”) functional networks (Ray et al., 2013; Smith et al., 2009). Of the four models, solving for 35 components produced the best balance between component stability (iq ≥.90) and the partitioning of components into well-validated canonical functional networks (Ray et al., 2013; Smith et al., 2009). Because the group ICA approach identifies components common to all individuals, this also ensured the generalizability (and interpretation) of the ICA results across subjects (Calhoun et al., 2001).
In the final step of the GIFT ICA procedure, a back-reconstruction was performed using the GICA3 algorithm to identify the subject-specific neuroanatomical and timecourse representations of every component, followed by the normalization of all data to Z-scores to enable the comparison of subject-specific component maps across subjects. (Note: additional software parameters included full storage of covariance matrix to double precision and usage of selective eigenvariate solvers, as detailed in the GIFT v1.3 User Manual).
General Linear Modeling (GLM)
The GLM analysis assessed task-dependent recruitment of each component. Each subject-specific ICA timecourse underwent GLM using AFNI’s 3dDeconvolve program (code available upon request). The GLM modeled nine parameters: three task conditions (Instructions, 0-back, and 2-back) as parameters of interest, and six head motion parameters (x, y, z, roll, pitch, yaw) included in the baseline model as parameters of no interest. Because trials were terminated by participant responses, we accounted for the effect of trial duration variability upon brain activity using amplitude modulation, which models participants’ brain activity for each task block as varying proportionally to the participants’ mean reaction time (RT) for that block. Beta values (β) were estimated describing the magnitude of component recruitment across each of the task conditions for the following general linear test (GLT) contrasts (controlling for age and education): 0-back vs. Rest, 2-back vs. Rest, and 2-back vs. 0-back. Group-level t tests then identified the components whose β contrasts significantly differed from zero across subjects within the sample [i.e., with p ≤.05 after false-discovery rate (FDR) correction for the 105 contrasts (35 components×3 GLT contrasts)].
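As an illustration of the multiple-comparison step, the following Python sketch applies a false-discovery rate correction to a set of p-values. The text does not state which FDR procedure was used, so the Benjamini-Hochberg step-up procedure here is an assumption, as are the function name and the example p-values. In the study's setting the input would be the 105 p-values from the group-level t tests.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return BH-adjusted p-values and a reject flag for each test (assumed procedure)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted, [q <= alpha for q in adjusted]


# Hypothetical p-values for four contrasts (not study results).
example_p = [0.01, 0.04, 0.03, 0.005]
adjusted_p, significant = benjamini_hochberg(example_p)
for p, q, keep in zip(example_p, adjusted_p, significant):
    print(f"p = {p:.3f}  FDR-adjusted p = {q:.3f}  significant: {keep}")
```

A contrast would be retained when its adjusted p-value is at or below the .05 threshold given above.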
A post hoc analysis was also conducted to determine if the mid-study equipment upgrade from the 8- to 32-channel head coil represented a confounding factor for the GLM analysis. For each ICA component, a one-way analysis of variance (ANOVA) tested whether subjects scanned on either head coil (8 vs. 32) had significantly different GLT βs for the 2-back vs. 0-back contrast. ANOVA results demonstrated no significant effect of head coil on GLT results for any of the components (all p>.05 after FDR correction), supporting the exclusion of head coil as a covariate.
RESULTS
Behavioral Results
Table 2 provides descriptive statistics for participants’ LNB and NP performances. Mean (SD) LNB accuracy was 0.97 (0.09) for 0-back trials and 0.61 (0.21) for 2-back trials. Accuracy was significantly lower for 2-back vs. 0-back trials [paired t test(33)=12.13, p<1×10⁻¹²], consistent with the expected decline in performance resulting from 2-back’s greater task difficulty. Increased task difficulty was also conveyed by parallel increases in RT for 2-back relative to 0-back conditions, with a mean (SD) RT of 0.45 (0.06) and 0.63 (0.07) seconds for 0-back and 2-back trials, respectively [paired t test(33)=13.77; p<1×10⁻¹⁴].
As an initial step of the construct validity analysis, a Lilliefors’ test determined LNB 2-back accuracy to have a non-normal distribution (K=0.155; p <.05), warranting the use of Spearman’s correlations for the bootstrapping procedure. Table 3 and Figure 1 illustrate the 95% CIs calculated from the bootstrapped correlations between LNB and each NP assessment. The following assessments had 95% CIs that did not include zero, indicating convergent validity with the LNB: DSS, SSF, SSR, TEA 4 (accuracy and timing), TEA 5, and TMT 2-4. Conversely, the following tests’ 95% CIs did include zero, indicating discriminant validity with the LNB: DSF, DSB, TEA 1-3, TMT 1 and 5, FTT, and BNT. As described in Table 3, the NP assessments that significantly converged with LNB had mean bootstrapped |ρ| ranging from 0.37 to 0.60, while assessments that discriminated from LNB had mean bootstrapped |ρ| ranging from 0.007 to 0.29.
Abbreviations: SSF=Spatial Span Forward, SSR=Spatial Span Reverse, DSF=Digit Span Forward, DSB=Digit Span Backward, DSS=Digit Span Sequencing, TEA=Test of Everyday Attention, TMT=Trail Making Test, FTT=Finger Tapping Test, BNT=Boston Naming Test-2.
Neuroimaging Results
Of the 35 ICA components identified, 14 were classified as noise artifact components according to the criteria delineated in METHODS, Image Acquisition and Processing, and omitted from subsequent analysis (see Supplementary Table S1 for full descriptions of all 35 components, including noise components). The GLM statistics in Table 4 show that, of the 21 non-noise components, 17 demonstrated significant task-dependent activity. Eight components were significantly more active during 2-back than 0-back: cerebellum (IC10), superior parietal lobule (SPL)/precuneus (IC12), right frontoparietal (RFP; IC21), supplementary motor area/lateral premotor (SMA/LPM; IC22), left frontoparietal (LFP; IC23), dorsolateral prefrontal cortex (DLPFC; IC25), frontoinsular (IC29), and bilateral primary motor (M1; IC30) (all FDR p<.0005; see Figure 2 for images of these components). Nine components were significantly less active during 2-back than 0-back: bilateral anterior insulae (IC4), left primary motor (LM1; IC6), right primary motor (RM1; IC14), anterior DMN (IC17), bilateral amygdalae/hippocampi (IC18), posterior DMN (IC19), dorsomedial prefrontal cortex (DMPFC; IC20), auditory (IC28), and ventromedial prefrontal cortex (VMPFC; IC35) (all FDR p<.05; see Supplementary Figure S2 for images of these components).
Abbreviations: GLT=General Linear Test, IC=independent component, 2B=2-back, 0B=0-back, R=Rest, FDR p-val=False Discovery Rate corrected p-value, SPL=superior parietal lobule, DMN=default mode network, DMPFC=dorsomedial prefrontal cortex, SMA=supplementary motor area, LPM=lateral premotor, DLPFC=dorsolateral prefrontal cortex, VMPFC=ventromedial prefrontal cortex
DISCUSSION
The current study sought to empirically validate the LNB as a WM probe by both assessing its behavioral construct validity and characterizing its specificity for network-level recruitment of WM-related neural processing correlates.
LNB Demonstrates Strong Measurement Specificity for WM Constructs
LNB demonstrated broad convergence with NP tasks assessing auditory and verbal WM. LNB was most strongly convergent with TEA 5, which measures auditory-verbal WM and attentional switching, and was slightly less convergent with TEA 3, which measures auditory-verbal WM and inhibitory control. While LNB has been proposed to involve both attentional switching and inhibitory control processes (Bledowski, Kaiser, & Rahm, Reference Bledowski, Kaiser and Rahm2010; Owen et al., Reference Owen, McMillan, Laird and Bullmore2005; Wager & Smith, Reference Wager and Smith2003), LNB’s stronger correlation with TEA 5 than TEA 3 suggests a greater involvement of attentional switching during the LNB. LNB also exhibited strong convergence with DSS, which requires maintenance and manipulation of auditory-verbal stimuli within WM. Unexpectedly, LNB did not converge with DSB, despite DSB reportedly tapping much the same cognitive abilities as DSS. This finding may reflect the greater WM demands required of DSS’s more complex ordinal sequencing manipulation vs. DSB’s stimulus reversal (Sattler & Ryan, Reference Sattler and Ryan2009; Wechsler, Reference Wechsler2008). Furthermore, the 95% CIs for LNB’s correlation with TEA 5 and DSS (Figure 1) are distinctly different from zero (i.e., robust), suggesting a high probability of replicating this finding in an independent sample. Conversely, the 95% CIs for LNB’s correlation with DSB and TEA 3 are marginally distinct from zero, indicating that these findings are less statistically robust, that is, more likely to differ in an independent replication.
LNB also demonstrated convergence with tasks engaging visual and visuospatial WM processes. LNB was strongly convergent with both SSF and SSR. This finding corroborates previous reports that forward and reverse conditions of the Spatial Span (and the homologous Corsi Block Tapping Task) have comparable WM demands (Kessels, van den Berg, Ruis, & Brands, Reference Kessels, van den Berg, Ruis and Brands2008; Li & Lewandowsky, Reference Li and Lewandowsky1995; Smyth & Scholey, Reference Smyth and Scholey1992; Wilde & Strauss, Reference Wilde and Strauss2002). LNB performance also converged with TEA 4 timing and accuracy, measures of visual attentional switching. Lastly, LNB exhibited strong convergence with TMT 4 and weaker but significant convergence with TMT 2 and 3. We expected convergence with TMT 4 given its temporal sequencing and set shifting demands; however, LNB convergence with TMT 2 and 3 was unanticipated. Similarities in task design may explain this unexpected convergence, as LNB, TMT 2 and TMT 3 all require maintenance of a continuous alphabetical or numerical sequence.
In establishing discriminant validity, we expected LNB performance to be unrelated to both verbal and visual/visuospatial measures of short-term memory, simple sustained attention, vigilance, visuomotor speed, simple motor speed, and language processing. This was demonstrated in the lack of correlation between the LNB and DSF, TEA 2, TMT 1, TMT 5, FTT, and BNT, respectively.
In summary, we report moderate to high convergence of LNB performance with WM measures across sensory modality (i.e., auditory-verbal, visuospatial, visual) and discrimination from measures of short-term memory, attention, vigilance, visuomotor speed, and general language processing (Table 3). Although LNB is generally considered a verbal WM task, we report convergence with both verbal and visual/visuospatial WM measures. Thus, LNB’s assessment of the underlying domain-general WM construct transcended the modality-specific differences that exist across differing WM validity tests. These findings support the LNB as a processing load for the same core cognitive construct measured by canonical WM tasks despite task-specific differences, and demonstrate that their relationships are not mediated by general attentional or perceptual task demands alone. As such, these collective results corroborate the use of the LNB as a robust, domain-general cognitive-behavioral probe of WM.
LNB Demonstrates Specificity for WM-Related Recruitment of Neural Networks
Our combined ICA and GLM approach identified eight task-positive networks (Table 4; Figure 2), which included all seven task-related regions described by previous meta-analysis of n-back tasks (Owen et al., Reference Owen, McMillan, Laird and Bullmore2005). In addition to replicating these past univariate findings, ICA informs how these regions actively interact during task. Four of these previously reported regions (bilateral DLPFC, mid-VLPFC, rostral PFC, and posterior parietal/precuneus regions) were captured within our left and right frontoparietal networks [LFP (IC23) and RFP (IC21)]. These networks were more active during 2-back vs. Rest and 2-back vs. 0-back contrasts but not 0-back vs. Rest, indicating WM-related recruitment specificity. These networks have been implicated in a broad range of cognitive processes, including language, reasoning, attention, and explicit memory (Fox et al., Reference Fox, Laird, Fox, Fox, Uecker, Crank and Lancaster2005; Smith et al., Reference Smith, Fox, Miller, Glahn, Fox, Mackay and Beckmann2009). LFP is often portrayed as having greater domain-general task involvement than RFP (Smith et al., Reference Smith, Fox, Miller, Glahn, Fox, Mackay and Beckmann2009), but is particularly identified in tasks involving verbal WM (Owen et al., Reference Owen, McMillan, Laird and Bullmore2005; Wager & Smith, Reference Wager and Smith2003), as supported by our present findings. Of interest, ICA also identified bilateral DLPFC (IC25) and bilateral SPL/precuneus (IC12) components independent of the LFP and RFP networks, possibly reflecting the intrinsic interhemispheric connectivity of these regions.
ICA revealed Owen et al.’s bilateral premotor, dorsal cingulate/SMA, and medial cerebellar neuroactivations as three separate motor-related components: bilateral M1 (IC30), SMA/LPM (IC22), and cerebellum (IC10). These networks were positively activated across all three n-back GLT contrasts, suggesting a general role in executing motor responses independent of WM load. The premotor cortex has also been associated with updating and maintenance of the temporal order of stimuli encoded during n-back task performance (Wager & Smith, Reference Wager and Smith2003). Of note, the medial cerebellar cluster identified by Owen et al. was represented in both our cerebellar and LFP networks, indicating a possible role in cognitive aspects of LNB performance. Given the known role of the cerebellum in cognition (Desmond & Fiez, Reference Desmond and Fiez1998; Koziol, Budding, & Chidekel, Reference Koziol, Budding and Chidekel2012; Leiner, Leiner, & Dow, Reference Leiner, Leiner and Dow1986; Marien, Engelborghs, & De Deyn, Reference Marien, Engelborghs and De Deyn2001; Strick, Dum, & Fiez, Reference Strick, Dum and Fiez2009) and, more specifically, in WM (Desmond, Gabrieli, Wagner, Ginier, & Glover, Reference Desmond, Gabrieli, Wagner, Ginier and Glover1997; Durisko & Fiez, Reference Durisko and Fiez2010; Hautzel, Mottaghy, Specht, Müller, & Krause, Reference Hautzel, Mottaghy, Specht, Müller and Krause2009), this is not completely unexpected, although additional research tracking cerebellar recruitment in WM may help clarify the interplay of cerebellar and frontal networks. Our ICA also identified a frontoinsular network (IC29) that included Owen et al.’s WM-related precuneus, dorsal cingulate/SMA, and bilateral mid-VLPFC activations. Studies have implicated this network in domain-independent, externally directed task modes in opposition to the DMN that initiate transitions between engagement and disengagement of the frontoparietal networks and the DMN across tasks and stimulus modalities (Dosenbach et al., Reference Dosenbach, Fair, Miezin, Cohen, Wenger, Dosenbach and Petersen2007; Sridharan, Levitin, & Menon, Reference Sridharan, Levitin and Menon2008; Tang, Rothbart, & Posner, Reference Tang, Rothbart and Posner2012). This dynamic switching function has been proposed to permit access to attentional and WM resources following a task-salient event to facilitate goal-directed behavior (Menon & Uddin, Reference Menon and Uddin2010).
Our task-active ICA-derived networks overlapped in their recruitment of the regions identified by Owen et al., suggesting that these common regions subserve multiple, differing functional roles (Xu, Zhang, et al., Reference Xu, Zhang, Calhoun, Monterosso, Li, Worhunsky and Potenza2013). Whereas traditional univariate approaches identify which brain regions are involved in a task, ICA complements such findings by also informing how these regions are co-recruited to form distinct functional networks. The ability of ICA to map the functional organization of these regions into discrete neural processing networks underscores its power for exploring individual differences in the brain’s capacity to dynamically organize cognitive resources in response to task demands (Congdon et al., Reference Congdon, Mumford, Cohen, Galvan, Aron, Xue and Poldrack2010; Xu, Zhang, et al., Reference Xu, Zhang, Calhoun, Monterosso, Li, Worhunsky and Potenza2013).
The ICA and GLM analysis also identified nine “task-negative” networks, that is, networks less engaged during 2-back than 0-back conditions (Table 4 for GLM statistics; Supplementary Figure S2 for images of task-negative components). Because Owen et al. only included task-positive activations, it is unclear whether our “deactivations” are specific to this n-back task variant or generally applicable to n-back tasks. Also, although we identified many of the same networks as Esposito et al. (Reference Esposito, Bertolino, Scarabino, Latorre, Blasi, Popolizio and Di Salle2006), their reporting of only magnitude (not direction) of network activity prevents a comparison of task-negative networks between our studies. Consequently, we discuss these findings in brief.
Although we report activation of bilateral M1 (IC30) across task conditions, ICA also identified separate LM1 (IC6) and RM1 (IC14) components that were deactivated by task. LM1 showed less activity during 2-back vs. 0-back, but more activity for 0-back vs. Rest and no difference in activity between 2-back vs. Rest. RM1 exhibited less activity during 2-back vs. 0-back and 2-back vs. Rest but no significant differences in activity for 0-back vs. Rest. Since participants responded with only their right hand (corresponding to LM1), the lesser RM1 engagement during the 2-back conditions may reflect task-induced deactivation (TID), a relative decrease in neural activity within an uninvolved region in response to increased task demand elsewhere (Allison, Meader, Loring, Figueroa, & Wright, Reference Allison, Meader, Loring, Figueroa and Wright2000; Liu, Shen, Zhou, & Hu, Reference Liu, Shen, Zhou and Hu2011; Zeharia, Hertz, Flash, & Amedi, Reference Zeharia, Hertz, Flash and Amedi2012). The task-deactivation of LM1, however, is harder to interpret. Future work should explore the task-dependent interactions among these motor networks at varying levels of cognitive and psychomotor processing loads, for instance, using motor response-based paradigms with parametric and/or dual-task designs, as is the case in several variants of the n-back.
The bilateral anterior insulae (IC4), amygdalae/hippocampi (IC18), DMPFC (IC20), auditory (IC28), and VMPFC (IC35) networks also exhibited patterns of activity that seemed to reflect TID. However, the current task design was unable to determine whether these TID patterns represented a facilitatory process (i.e., processing resources were redistributed from task-irrelevant regions to support regions involved in task performance) (Arsalidou, Pascal-Leone, Johnson, Morris, & Taylor, Reference Arsalidou, Pascal-Leone, Johnson, Morris and Taylor2013; Jackson, Morgan, Shapiro, Mohr, & Linden, Reference Jackson, Morgan, Shapiro, Mohr and Linden2011; Leech, Kamourieh, Beckmann, & Sharp, Reference Leech, Kamourieh, Beckmann and Sharp2011), or reflected the attenuation of internally focused mentation (i.e., activity in self-referential regions was suspended during the externally focused, goal-directed task) (Gusnard & Raichle, Reference Gusnard and Raichle2001; McKiernan, Kaufman, Kucera-Thompson, & Binder, Reference McKiernan, Kaufman, Kucera-Thompson and Binder2003; Spreng & Grady, Reference Spreng and Grady2010). Future research should explore the involvement of differing cognitive strategies resulting in both task-induced activation and deactivation as it relates to performance of WM tasks, such as the n-back.
Finally, components representing anterior [VMPFC and PCC (IC17)] and posterior [PCC and bilateral inferior parietal (IC19)] elements of the DMN were identified as independent networks. These regions are typically represented as a single network consistently deactivated across demanding task conditions, indicating greater activity during Rest than during effortful cognition (Greicius, Krasnow, Reiss, & Menon, Reference Greicius, Krasnow, Reiss and Menon2003; Greicius, Srivastava, Reiss, & Menon, Reference Greicius, Srivastava, Reiss and Menon2004) as a result of their involvement in internally derived, self-referential thought processes, such as daydreaming, introspection, and theory of mind (Buckner, Andrews-Hanna, Schacter, Kingstone, & Miller, Reference Buckner, Andrews-Hanna, Schacter, Kingstone and Miller2008; Raichle et al., Reference Raichle, MacLeod, Snyder, Powers, Gusnard and Shulman2001; Spreng & Grady, Reference Spreng and Grady2010). We report task-dependent activity for both posterior DMN and anterior DMN. Of interest, however, only posterior DMN significantly differs across all three contrasts, whereas anterior DMN differs only for 2-back vs. 0-back. While the segregation of DMN into anterior and posterior networks may be attributed to network fragmentation when solving for a higher number of components (n=35), increasing evidence suggests that the DMN is inherently modular in its organization and represented by multiple interacting subsystems with differential functional specializations (Andrews-Hanna, Reidler, Huang, & Buckner, Reference Andrews-Hanna, Reidler, Huang and Buckner2010; Buckner et al., Reference Buckner, Andrews-Hanna, Schacter, Kingstone and Miller2008; Laird et al., Reference Laird, Eickhoff, Li, Robin, Glahn and Fox2009; Mayer, Roebroeck, Maurer, & Linden, Reference Mayer, Roebroeck, Maurer and Linden2010; Uddin, Kelly, Biswal, Castellanos, & Milham, Reference Uddin, Kelly, Biswal, Castellanos and Milham2009). Future studies should explore the functional relevance of these individual (sub)networks, especially in the context of WM tasks (Ray et al., Reference Ray, McKay, Fox, Riedel, Uecker, Beckmann and Laird2013). Given past studies’ usage of the LNB to characterize DMN function, the present study provides an important step toward characterizing both the engagement and disengagement of functional networks in service of LNB task performance by illustrating the emergence of WM function through the concerted efforts of functionally diverse networks.
Limitations
One limitation of ICA is that the number of components estimated defines the neuroanatomical representation of the identified networks (Abou-Elseoud et al., Reference Abou-Elseoud, Starck, Remes, Nikkinen, Tervonen and Kiviniemi2010; Pamilo et al., Reference Pamilo, Malinen, Hlushchuk, Seppä, Tikka and Hari2012). While our 35-component ICA produced predominantly stable and functionally relevant networks, some components (i.e., IC27 and IC31) appeared to merge noise artifact with the functional networks, while other networks (i.e., DMN) appeared fragmented across multiple components (IC17 and IC19). Solving for fewer components (i.e., 20, 25, 30), however, led to components that merged distinct functional networks to a greater degree, thus limiting their interpretability. We therefore chose the 35-component ICA as a better representation of canonical functional networks and task-relevant cognitive processing.
The present study sought to determine the convergent/discriminant relationships of the n-back task with clinically validated NP assessments that have previously undergone construct validation. While our sample size is suitably large for a neuroimaging study, psychometric validation studies are typically conducted in much larger samples (Robertson et al., Reference Robertson, Ward, Ridgeway and Nimmo-Smith1996; Burton, Ryan, Axlerod, Schellenberger, & Richards, Reference Burton, Ryan, Axlerod, Schellenberger and Richards2003); thus, replication of our analyses in a larger sample is suggested to confirm the generalizability of these behavioral findings. Additionally, we could not feasibly control for time of NP administration in this study. While between-subject diurnal variance in performance has been reported for some measures of executive function (e.g., Wisconsin Card Sorting Task; Bennett, Petros, Johnson, & Ferraro, 2008), measures of WM (specifically, the Digit Span Test) appear resilient to such effects. To further explore diurnal variance in these measures, these effects should be assessed in a larger within-subject test–retest study design.
Finally, our LNB task was designed to quickly map brain regions subserving WM in healthy and clinical adult populations, and was thus not optimized to assess other factors of interest such as varying network recruitment with parametric load manipulation, error processing, or differential network involvement in WM subprocesses. Thus, our neuroimaging findings may be specific to the LNB task, and should be replicated using other n-back variants and neuroimaging tasks of WM. However, we present these neuroimaging and NP findings with hopes of laying the groundwork for future investigations into these areas.
CONCLUSIONS
To our knowledge, the present study is the first to comprehensively investigate the construct validity of the LNB task as a WM probe and to identify the task-related neural network representation of healthy WM function. Future work will characterize normative WM brain-behavior relationships by assessing how WM network functional organization reflects individual differences in WM ability. By modeling the neural encoding of cognitive and behavioral WM—in both healthy subjects (as described here) and in future clinical populations—we aim to differentiate normative neural processing variance from the specific disparities associated with disrupted brain function in individual patients, thereby extending this neuroimaging model into the personalized treatment of various disorders related to WM dysfunction.
As functional neuroimaging begins to play a larger role in clinical assessment, we need to better understand the relationships between neuroimaging and clinical neuropsychology, and remain receptive to questioning and modifying each in the face of evidence derived from studies such as this. As such, the present study exemplifies the value of merging functional neuroimaging and clinical neuropsychology so that these disparate fields may mutually inform one another, and thus provide a framework to further translate functional neuroimaging into clinical care.
Acknowledgments
This research was supported by the Translational Research Institute (TRI) at the University of Arkansas for Medical Sciences (UAMS) which is funded by the National Institutes of Health (NIH) Clinical and Translational Science Award (CTSA) program (UL1TR000039); the CTSA KL2 Scholars Program (KL2TR000063; to G.A.J.); NIH National Institute of General Medical Sciences Initiative for Maximizing Student Development Fellowship (IMSD; R25GM083247; to T.K.R.) and NIH National Institute of Drug Abuse T32 Addiction Training Grant (T32DA022981; to T.K.R.). CDK served as a member of a scientific advisory meeting for Allergan Pharmaceuticals, served as a member of the national advisory board for Skyland Trail, and is also a co-holder of U.S. Patent No. 6,373,990 (Method and device for the transdermal delivery of lithium). The authors declare no conflicts of interest. All authors contributed to the interpretation and writing of this manuscript. We thank Jonathan Young and Sonet Smitherman of UAMS for their roles in study coordination and MRI scanner operation, and Molly Robbins, Bradford Martins and Sarah Zimmerman of UAMS for their help with data entry.