Published online by Cambridge University Press: 01 July 2005
Action (verb) fluency is a newly developed verbal fluency task that requires the examinee to rapidly generate as many verbs (i.e., “things that people do”) as possible within 1 min. Existing literature indicates that action fluency may be more sensitive to frontal–basal ganglia loop pathophysiology than traditional noun fluency tasks (e.g., animal fluency), which is consistent with the hypothesized neural dissociation between noun and verb retrieval. In the current study, a series of analyses were undertaken to examine the psychometric properties of action fluency in a sample of 174 younger healthy participants. The first set of analyses describes the development of demographically adjusted normative data for action fluency. Next, a group of hypothesis-driven correlational analyses reveals significant associations between action fluency and putative tests of executive functions, verbal working memory, verbal fluency, and information processing speed, but not between action fluency and tests of learning or constructional praxis. The final set of analyses demonstrates the test–retest stability of the action fluency test and provides standards for determining statistically reliable changes in performance. In sum, this study enhances the potential clinical applicability of action fluency by providing demographically adjusted normative data and demonstrating evidence for its reliability and construct validity. (JINS, 2005, 11, 408–415.)
Action (verb) fluency is a newly developed verbal fluency task that requires the examinee to rapidly generate as many verbs (i.e., “things that people do”) as possible within one minute. Piatt and colleagues (Piatt et al., 1999a, 1999b) were the first to describe the action fluency test, which was adapted from an extensive literature indicating that the neural systems involved in the generation (e.g., naming) of nouns and verbs are dissociable (e.g., Damasio & Tranel, 1993). More specifically, prior research indicates that verb generation is primarily associated with the integrity of frontal–striatal–thalamo–cortical loops (e.g., Buckner et al., 1995; Cappa et al., 2002), whereas noun generation is more dependent on the temporal (e.g., Williamson et al., 1998) and inferior parietal cortices (e.g., Warburton et al., 1996). For example, Tranel et al. (2001) reported an association between deficits in action naming and lesions in the left frontal operculum, precentral gyrus (including the underlying white matter), and the anterior insula, whereas deficits in noun (but not action) naming were linked to anterior and inferotemporal lesions.
Developed as an extension of the observed noun–verb retrieval dissociation, action fluency is a measure of verbally mediated executive functions that is particularly sensitive (and perhaps specific) to frontal systems damage relative to traditional verbal fluency tasks in which noun (e.g., animals) or letter cues are used. The construct validity of the action fluency test is supported by several recent studies. The convergent validity of action fluency is demonstrated by its correlation with well-validated clinical tests of executive functions (Piatt et al., 1999a; Woods et al., in press). Evidence of divergent validity is provided by data showing that action fluency does not correlate with tests of posterior neocortical function (i.e., noun naming and verbal episodic memory). Data from clinical samples also support the possible dissociation between noun and verb generation as it extends to generative fluency. For example, Woods et al. (in press) demonstrated a single dissociation between noun (i.e., animal) and action fluency in persons with HIV–1 infection—a condition associated with a preferential disruption of frontostriatal circuits. Using an empirically derived cut-point, HIV–1-infected participants with impaired action fluency scores (<15) were over three times more likely to demonstrate general neuropsychological impairment on a standard battery than participants who performed within normal limits. Action fluency has also demonstrated superior sensitivity to dementia in Parkinson's disease (PDD) as compared to letter and animal fluency (Piatt et al., 1999b).
Although these early findings are encouraging, the clinical usefulness of action fluency is hampered by several important limitations to the existing literature. For instance, while education-corrected normative data have been published for use with older adults (i.e., persons 56–92 years of age; Piatt et al., 2004), no normative data exist for use with younger adults. This is a substantial gap in the literature because the relationships between demographic factors and test performance, on which normative standards are based, can vary widely across age groups (Fastenau, 1998). Relatedly, the research supporting the construct validity of action fluency has been conducted exclusively with clinical samples (Piatt et al., 1999b; Woods et al., in press) and older adults (Piatt et al., 1999a), which raises questions regarding the external validity of these promising findings in younger healthy samples. Finally, the test–retest reliability of action fluency is not known. Demonstrating the reliability of action fluency is an important step toward further establishing its construct validity; moreover, defining significant and reliable changes in action fluency performance would potentially enhance the applicability of this measure for longitudinal assessments in clinical and research settings (e.g., measuring treatment efficacy or disease-related cognitive decline).
Considering these needs, the present study was undertaken to examine the psychometric aspects of action fluency in a sample of younger healthy participants, including (1) the development of demographically corrected normative standards for younger adults; (2) correlational analyses to examine the convergent and divergent validity of action fluency, with the hypothesis that action fluency would correlate with putative tests of executive functions, verbal working memory, verbal fluency, and information processing speed, but not with measures of learning, recognition discrimination, or constructional praxis; and (3) an evaluation of the test–retest reliability of action fluency.
Participants were 174 English-speaking individuals who were enrolled in a clinical research protocol at the San Diego HIV Neurobehavioral Research Center (HNRC). Potential study participants were screened for histories of psychosis, mental retardation, current substance-related disorders (e.g., alcohol dependence), and neurological and medical conditions that might adversely impact cognitive functions (e.g., HIV infection, seizure disorders, closed head injury, neoplasms, cerebrovascular disease, etc.). Table 1 displays the sample's demographic characteristics.
All study participants provided informed, written consent. Participants were administered action fluency in the context of a broader neuropsychological, neurological, medical, and psychiatric evaluation. Examiner instructions for the action fluency test were adapted from Piatt et al. (1999a, 1999b; 2004):
I'd like you to tell me as many different things as you can think of that people do. I do not want you to use the same word with different endings, like eat, eating, and eaten. Also, just give me single words such as eat, or smell, rather than a sentence or phrase. Can you give me an example of something that people do?
If the response was unacceptable, participants were asked to provide another example of an action word (any verb response is acceptable). If the response was acceptable, the examiner stated: “That's the idea. Now you have one minute to tell me as many different things as you can think of that people do.”
The primary variables of interest were the total number of unique verbs generated in 60 s, along with the total number of perseverations (i.e., the repetition or inflection of a previously generated response, including the participant's self-generated example word) and intrusions (i.e., responses that were not verbs). Verb responses that humans could not plausibly perform (e.g., photosynthesize) and questionable noun–verb homonyms (e.g., bear) were queried by the examiner and coded as intrusions if indicated.
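The scoring rules described above can be illustrated with a minimal sketch. It is not the study's scoring procedure: the function name, the examiner-supplied set of intrusions, and the examiner-supplied mapping from inflected forms to base forms are all hypothetical. It also assumes that the participant's example word seeds the list of previously generated responses without being counted toward the total, which the text does not state explicitly.

```python
from typing import Dict, List, Set


def score_action_fluency(
    responses: List[str],
    example_word: str,
    intrusions: Set[str],
    lemma_of: Dict[str, str],
) -> Dict[str, int]:
    """Tally total correct, perseverations, and intrusions for one trial.

    intrusions: lowercase responses the examiner coded as non-verbs (hypothetical input).
    lemma_of:   lowercase inflected form -> base form, supplied by the examiner
                (hypothetical input; no automatic inflection detection is attempted).
    """
    def base(word: str) -> str:
        w = word.strip().lower()
        return lemma_of.get(w, w)

    # Assumption: the self-generated example word counts as already produced,
    # so repeating or inflecting it later is a perseveration.
    seen = {base(example_word)}
    counts = {"total_correct": 0, "perseverations": 0, "intrusions": 0}

    for response in responses:
        word = response.strip().lower()
        if word in intrusions:
            counts["intrusions"] += 1
        elif base(word) in seen:
            counts["perseverations"] += 1
        else:
            seen.add(base(word))
            counts["total_correct"] += 1
    return counts


# Illustrative data only; not drawn from the study.
demo = score_action_fluency(
    responses=["run", "eating", "jump", "table", "running", "eat"],
    example_word="eat",
    intrusions={"table"},
    lemma_of={"eating": "eat", "eaten": "eat", "running": "run"},
)
print(demo)
```

In this illustration, "eating" and "eat" are perseverations of the example word, "running" is a perseveration of "run", and "table" is an examiner-coded intrusion, leaving two unique verbs.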
The broader neuropsychological test battery was administered and scored by research psychometrists in accordance with published, standardized procedures. The battery included the following tests: (1) Hopkins Verbal Learning Test–Revised (HVLT–R; Brandt & Benedict, 2001); (2) Brief Visuospatial Memory Test–Revised (BVMT–R; Benedict, 1997); (3) Controlled Oral Word Association Test (COWAT–FAS; Benton et al., 1994); (4) animal fluency (Benton et al., 1994); (5) Stroop Color-Word Test (Golden, 1978); (6) Trail Making Test, Parts A and B (Reitan & Wolfson, 1985); (7) Wisconsin Card Sorting Test–64 Card Version (WCST–64; Kongs et al., 2000); (8) Halstead Category Test (Reitan & Wolfson, 1985); (9) Paced Auditory Serial Addition Test (PASAT–200; Diehr et al., 1998); (10) Grooved Pegboard Test (Kløve, 1963); (11) Letter–Number Sequencing, Digit Symbol, and Symbol Search subtests from the Wechsler Adult Intelligence Scale–Third Edition (WAIS–III; The Psychological Corporation, 1997); and (12) the Reading/Word Decoding subtest from the Wide Range Achievement Test–Revision 3 (WRAT–3; Wilkinson, 1993).
The methodology used to derive the demographically adjusted normative standards was adapted from Heaton and colleagues (Heaton et al., 2004). First, one-sample Kolmogorov–Smirnov tests were conducted to evaluate the normality of action fluency raw scores. If a normal distribution was evident (or could be achieved through validated methods of data transformation), raw scores were then converted to scaled scores (M = 10, SD = 3) whereby higher scores reflect better performance. The fractional polynomial regression procedure (Royston & Altman, 1994) was used to examine possible linear (and nonlinear) associations between each demographic variable and the action fluency total raw score. Demographic variables demonstrating a statistically significant association with action fluency total score were then entered as predictors into the final fractional polynomial regression procedure (Royston & Altman, 1994). This procedure uses an iterative algorithm to determine which combination of demographic predictors (both linear and nonlinear) yields the most advantageous fit of the action fluency scaled score data. The residuals from the fractional polynomial regression were then used to generate predicted scaled scores from which the demographically adjusted action fluency T-scores (M = 50, SD = 10) may be derived.
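As an illustration of the general logic of regression-based T-scores, the following sketch standardizes the residual between an observed scaled score and the scaled score predicted from education. It assumes the common residual-standardization form and a simple linear education term as a stand-in for the fractional polynomial model. The function name and all coefficient values are hypothetical and are not the study's normative formula.

```python
def education_adjusted_t(
    scaled_score: float,
    education_years: float,
    intercept: float,
    slope: float,
    residual_sd: float,
) -> float:
    """Convert a scaled score to a demographically adjusted T-score (M = 50, SD = 10).

    intercept, slope, residual_sd are hypothetical regression parameters;
    a fractional polynomial model could replace the linear education term.
    """
    predicted = intercept + slope * education_years  # expected scaled score
    z = (scaled_score - predicted) / residual_sd     # standardized residual
    return 50.0 + 10.0 * z


# Hypothetical parameters chosen only to make the arithmetic easy to follow.
params = dict(intercept=7.0, slope=0.25, residual_sd=2.5)

# A scaled score equal to the education-predicted value maps to T = 50.
print(education_adjusted_t(10.0, 12, **params))
# A scaled score one residual SD above prediction maps to T = 60.
print(education_adjusted_t(12.5, 12, **params))
```

Under this form, the same scaled score yields a lower T-score for a more highly educated examinee, because the predicted score against which it is compared is higher.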
Pearson product-moment correlation coefficients (or the nonparametric Spearman's rank order correlation coefficient) were used to examine the associations between raw scores on action fluency and tests selected on an a priori conceptual basis from the larger battery to explore convergent and divergent validity. For the longitudinal analyses, paired t tests and Pearson product-moment correlation coefficients (or their nonparametric counterparts) were used to examine the correspondence between the action fluency raw scores at Time 1 and Time 2. The standard deviations of the Time 1 and Time 2 difference scores were also generated, which allowed for the calculation of reliable change indices (RCIs) with 90%, 95%, and 99% confidence intervals (Chelune et al., 1993). The critical alpha level was set at .05 for all analyses, except for the construct validity correlational analyses for which an alpha of .01 was used to reduce the risk of Type I error due to multiple comparisons.
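The reliable change computation can be sketched as follows. This is a minimal illustration that builds the interval from the standard deviation of the Time 2 minus Time 1 difference scores, as described above, and centers it on the mean difference (a practice-adjusted variant). The centering choice, the function name, and the data are assumptions for illustration.

```python
from statistics import NormalDist, mean, stdev
from typing import Sequence, Tuple


def rci_interval(
    time1: Sequence[float],
    time2: Sequence[float],
    confidence: float = 0.90,
) -> Tuple[float, float]:
    """Return (lower, upper) bounds for change scores at the given confidence level."""
    diffs = [b - a for a, b in zip(time1, time2)]
    center = mean(diffs)   # average change (assumed centering; could instead be zero)
    spread = stdev(diffs)  # sample SD of the difference scores
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)  # e.g., about 1.645 for 90%
    return center - z * spread, center + z * spread


# Illustrative scores only; not drawn from the study.
baseline = [10, 12, 14, 16]
follow_up = [11, 12, 16, 15]

for level in (0.90, 0.95, 0.99):
    low, high = rci_interval(baseline, follow_up, level)
    print(f"{int(level * 100)}% RCI: {low:.2f} to {high:.2f}")
```

A change score falling outside the chosen interval would be classified as a statistically reliable decline or improvement.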
The distributions of action fluency total correct, intrusions, and perseverations are presented in Figure 1.
There was no correspondence between action fluency total raw scores and age, sex, or ethnicity (all ps > .05); however, a significant linear relationship emerged between action fluency and years of education (r = .28, p = .0002). Thus, only years of education was entered as a predictor variable in the fractional polynomial regression equation predicting action fluency total (the distributions of raw scores across three levels of education are presented in Table 2). Action fluency raw scores were converted to scaled scores. Table 3 displays the appropriate conversions for transforming raw scores to scaled scores. The fractional polynomial regression procedure revealed education to be a significant predictor of action fluency total scaled scores [R² = .08; F(1,173) = 14.05, p = .0002]. The resultant formula for generating education-corrected T-scores for the action fluency total variable is displayed in Table 3.
Results revealed no correspondence between action fluency intrusions and perseverations and demographic factors of age, education, or ethnicity (all ps > .10). A sex effect was evident on perseverations such that men generated significantly fewer perseverative responses than did women (p < .01). However, the severely skewed distribution of perseverations (p < .01) would not permit the use of multivariate fractional polynomial regression methods for generating sex-corrected normative data. Therefore, descriptive statistics and methods for generating sex-corrected T-scores for perseverations are displayed in Table 4.
To examine the convergent and divergent validity of action fluency, hypothesis-driven correlational analyses were conducted between action fluency (raw scores) and measures selected from the larger neuropsychological battery (see Table 5). Results revealed significant associations between the action fluency total score and tests of verbal working memory, executive functions, fine motor skills, information processing speed, and verbal fluency (all ps < .01). In contrast, there was no correspondence between action fluency and measures of praxis, learning, or recognition discrimination (all ps > .05). Exploratory analyses revealed no significant correlations between action fluency intrusions and perseverations and the other neuropsychological tests, with the exception of a small, negative association between intrusions and BVMT-R Recognition Discrimination (p = .007).
Eighty-two of the original 174 participants (47%) underwent repeat testing (see Table 6). Although participants who were followed longitudinally were older (43.5 ± 11.0 years) than those who did not undergo repeat testing (34.6 ± 10.8 years), there were no between-groups differences in education, sex, ethnicity, handedness, estimated verbal IQ, or any action fluency variable (all ps > .05). The test–retest data in Table 7 reveal good temporal stability for the action fluency total score, but slightly poorer reliability for intrusions and perseverations. No significant practice effects were evident for any action fluency variable over the approximately 1-year test–retest interval. Reliable change index (RCI) confidence intervals for 90%, 95%, and 99% are displayed in Table 7 for the total correct, intrusions, and perseverations variables.
Results from this study enhance the potential clinical applicability of action fluency by providing education-adjusted normative data for use with younger adults. Since demographic factors are known to influence cognitive test performance, the use of normative standards is critical to ensure accurate interpretation of an individual's test performance relative to demographically similar others (Heaton et al., 2004). Consistent with the findings of Piatt et al. (2004), we observed a positive association between action fluency total score and educational attainment, but not with age, sex, or ethnicity. Similar relationships between verbal fluency and education have also been observed with measures of animal and letter fluency (e.g., Gladsjo et al., 1999). In contrast, no demographic variable was associated with intrusions or perseverations, with the exception of a very modest (d = 0.5) sex difference in perseverative responses. Thus, it is recommended that standard scores for action fluency perseverations be generated separately for men and women (see Table 4).
Extending prior studies of convergent validity in older adults (Piatt et al., 1999a) and persons with HIV–1 infection (Woods et al., in press), we observed significant associations between action fluency total and putative tests of executive functions, verbal working memory, verbal fluency, and information processing speed. These findings were consistent with our a priori hypotheses and indicate that action fluency shares a generally modest proportion of the variance with tests measuring these related cognitive constructs. Importantly, evidence of divergent validity was provided by the lack of significant correlations between action fluency and tests of cognitive functions more associated with the posterior neocortex (i.e., learning, recognition discrimination, and constructional praxis). It is unlikely that these non-significant correlations reflect Type II error as the current study was adequately powered to detect small-to-medium effect sizes (power = .85 to detect r = .25 with N = 174 and alpha = .01). Exploratory analyses, however, revealed no discernible pattern of correlations between action fluency intrusions and perseverations and the battery of neuropsychological tests. Such null findings raise questions regarding the convergent validity of these variables as indicators of executive functions in healthy populations; in other words, errors are so infrequently generated that they are not highly informative in nonclinical samples. Error analyses on traditional verbal fluency tests have historically yielded fairly inconsistent results (e.g., Suhr & Jones, 1998; cf. Butters et al., 1986), which may reflect the inherent difficulty in analyzing variables with low base rates (Woods et al., 2004). Whether action fluency errors possess predictive or discriminative value in clinical samples remains to be determined by future research.
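The power statement above can be checked approximately with the Fisher z transformation. The sketch below is an approximation, not the authors' power calculation. Whether a one- or two-tailed criterion was used is not stated in the text, so both are computed, and the function name is hypothetical.

```python
from math import atanh, sqrt
from statistics import NormalDist


def correlation_power(r: float, n: int, alpha: float, two_tailed: bool = True) -> float:
    """Approximate power to detect a population correlation r with sample size n."""
    norm = NormalDist()
    shift = atanh(r) * sqrt(n - 3)             # expected test statistic under H1
    tail = alpha / 2.0 if two_tailed else alpha
    crit = norm.inv_cdf(1.0 - tail)            # critical z value
    power = 1.0 - norm.cdf(crit - shift)       # rejection in the expected direction
    if two_tailed:
        power += norm.cdf(-crit - shift)       # negligible opposite-tail rejection
    return power


print(round(correlation_power(0.25, 174, 0.01, two_tailed=False), 3))
print(round(correlation_power(0.25, 174, 0.01, two_tailed=True), 3))
```

Under this approximation, the one-tailed criterion gives a value close to the reported .85, whereas the two-tailed criterion gives a somewhat lower value (roughly .78).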
Action fluency demonstrated good one-year test–retest reliability. There was no indication of a practice effect in action fluency performance at one-year follow-up. The stability coefficient and standard deviation of change scores for action fluency total scores are generally consistent with published data using letter fluency (e.g., Basso et al., 1999). The relatively lower reliability of intrusions and perseverations may be related to the restricted range of observed scores (i.e., floor effects), which is particularly problematic in research with younger, healthy adults (Woods et al., 2005). The RCIs provided in Table 7 are intended to assist clinicians and researchers in more accurately classifying statistically reliable changes in action fluency performance. Change scores that fall outside the selected RCI confidence interval are considered to represent a statistically reliable improvement or decline in performance. The use of RCIs may reduce the risk of classification errors that can result from attempting to estimate practice effects and normal test–retest variability to determine whether a meaningful change in performance has occurred without the aid of empirical standards. Studies are nevertheless needed to evaluate the predictive validity of these RCI data in detecting significant changes in various clinical samples (i.e., sensitivity).
It is important to highlight the limitations to the external validity of the current study. Although the study participants had a broad range of demographic characteristics, the sample was largely Caucasian, young, and had attained an average of 14 years of education. Study sample demographic characteristics are a particularly important consideration when using regression-based normative standards, which can be misleading in the event that an individual's particular demographic characteristics are not represented in the normative sample (see Fastenau, 1998; Fastenau & Adams, 1996). While the incorporation of the nonlinear fractional polynomial regression equations somewhat mitigates the risk of misclassification (Heaton et al., 2004), prudent use of these normative data requires that a given client's demographic characteristics be compatible with those of the standardization sample. To this end, the Piatt et al. (2004) normative sample is recommended for use with older adults given that the current sample contained only 7 persons age 60 years or older. Moreover, only 10% of the present sample reported having attained less than a high-school education, which indicates that caution regarding possible false positive classification errors is warranted when applying these normative data to clients with lower levels of education. Finally, the normative sample was largely Caucasian, which may have restricted our ability to detect ethnicity differences in action fluency performance if they truly exist.
Although a convergence of research shows that the neural networks involved in generating nouns and verbs are dissociable, these processes likely overlap to some degree. For example, Tranel et al. (2001) found that left premotor/prefrontal lesions were associated with impairments in both action and object naming. Moreover, the conceptual knowledge of actions is related to frontal, as well as posterior neocortical areas, including left parietal and posterior middle temporal regions (Tranel et al., 2003). Cerebellar structures have also been linked to verb processing (e.g., Sach et al., 2004), although the nature and extent of this link remain controversial (Richter et al., 2004). As a verb generation task, it is likely that action fluency requires a distributed neural network that includes the frontal lobes, as well as more posterior aspects of the neocortex [e.g., posterior middle temporal (MT) region] and adjoining white matter pathways (Tranel et al., 2003). Accordingly, interpretation of action fluency as a pure measure of frontal lobe functions is imprudent, despite promising evidence of its divergent validity.
Future studies may consider examining multiple trials of action fluency that incorporate unique rule-guided search strategies. For example, restrictions might be placed on the generation of inflected verbs (e.g., Sach et al., 2004), noun–verb homonyms (e.g., Tranel et al., 2005), or other conceptual factors, such as actions that require tools or that can only be performed by using one's hands (e.g., Kemmerer & Tranel, 2000). Extending the verb generation literature (e.g., Buckner et al., 1995) to action fluency, another interesting possibility would be to place systematic restrictions on the semantic relatedness of dyadic noun–verb switching trials. The incorporation of additional trials might allow the test user to better delineate the specific nature of the action fluency deficit, as well as perhaps enhance the (already strong) reliability of action fluency.
In sum, findings from this study support the potential clinical application of action fluency by providing demographically adjusted normative data in younger adults, actuarial standards for reliable change, and evidence of construct validity. The development and validation of novel measures of executive functioning such as action fluency are worthwhile endeavors because traditional neuropsychological tests of this domain often lack specificity (Alexander & Stuss, 2000). To this end, action fluency may provide a measure of frontal systems function with superior sensitivity and specificity relative to the traditional letter and animal fluency tasks (Piatt et al., 1999a, 1999b; Woods et al., in press), which place greater demands on posterior neocortical functions (e.g., Pihlajamaki et al., 2000). Accordingly, action fluency might be a useful tool to complement the existing armamentarium of clinical and research neuropsychologists. Cautious interpretation of action fluency is nevertheless recommended, as neuroimaging research indicates that although verb generation reliably activates frontal systems, it also requires the contribution of the posterior neocortex (albeit perhaps to a lesser extent than noun generation; e.g., Perani et al., 1999). The action fluency test will require further validation using neuroimaging technologies, as well as in clinical studies of populations with quantified frontal and posterior neocortical lesions, temporolimbic pathology (e.g., temporal lobe epilepsy), and various neurodegenerative disorders (e.g., Alzheimer's disease and frontotemporal dementia).
The San Diego HNRC group is affiliated with UCSD, the Naval Hospital, San Diego, and the San Diego Veterans Affairs Healthcare System, and includes: Director: Igor Grant, M.D.; Co-Directors: J. Hampton Atkinson, M.D., J. Allen McCutchan, M.D.; Center Manager: Thomas D. Marcotte, Ph.D.; Naval Hospital San Diego: Mark R. Wallace, M.D. (P.I.); Neuromedical Core: J. Allen McCutchan, M.D. (P.I.), Ronald J. Ellis, M.D., Ph.D., Scott Letendre, M.D., Rachel Schrier, Ph.D.; Neurobehavioral Core: Robert K. Heaton, Ph.D. (P.I.), Mariana Cherner, Ph.D., Joseph Sadek, Ph.D., Steven Paul Woods, Psy.D., Corinna Young, Ph.D.; Imaging Core: Terry Jernigan, Ph.D. (P.I.), John Hesselink, M.D., Michael J. Taylor, Ph.D.; Neuropathology Core: Eliezer Masliah, M.D. (P.I.), Dianne Langford, Ph.D.; Clinical Trials Component: J. Allen McCutchan, M.D., J. Hampton Atkinson, M.D., Ronald J. Ellis, M.D., Ph.D., Scott Letendre, M.D.; Data Management Unit: Daniel R. Masys, M.D. (P.I.), Michelle Frybarger, B.A. (Data Systems Manager); Statistics Unit: Ian Abramson, Ph.D. (P.I.), and Deborah Lazzaretto, M.S.
This study was supported by the following grants from the National Institutes of Health: MH62512, DA12065, and MH59745. The views expressed in this article are those of the authors and do not reflect the official policy or position of the Department of the Navy, Department of Defense, nor the United States Government. The authors thank Daniel Tranel, Ph.D., Michael Weinborn, Ph.D., and an anonymous reviewer for their helpful comments. We also thank Deborah Lazzaretto and Jennifer Marquie-Beck for their assistance with statistical analyses and figure preparation, respectively. Portions of these data were presented at the 33rd Annual Meeting of the International Neuropsychological Society in St. Louis, MO, USA.