Introduction
Alzheimer's disease (AD) typically presents with early impairments on tasks involving episodic memory and progresses to more global impairments including language and executive functioning (Braaten, Parsons, McCue, Sellers, & Burns, 2006). The most consistently found language deficit in early-stage AD is impaired word finding, particularly when patients are given a target semantic category to guide the speeded generation of words (Braaten et al., 2006). Consequently, verbal fluency tests are frequently used in clinical settings to aid in the diagnosis of AD, and early-stage patients typically show greater semantic than phonemic fluency impairment, presumably due to disproportionate effects on temporal versus prefrontal brain regions (Henry & Crawford, 2004).
Declines in semantic fluency total word production are found consistently in individuals with AD compared to healthy older adults, whereas the effect of AD on phonemic fluency performance is typically much smaller (Crossley, D'Arcy, & Rawson, 1997; Haugrud, Lanting, & Crossley, 2010; Henry, Crawford, & Phillips, 2004). In addition, individuals with AD produce fewer atypical or low-frequency exemplars than normal adults (Sailor, Antoine, Diaz, Kuslansky, & Kluger, 2004).
Troyer, Moscovitch, and Winocur (1997) proposed that verbal fluency performance can be divided into clustering and switching components. Clustering involves the production of words within a semantic or phonemic subcategory and is proposed to rely primarily on temporal lobe processes. Switching refers to the ability to shift between clusters and is proposed to rely primarily on prefrontal lobe functions.
The model of Troyer et al. (1997) predicts that individuals with AD will show smaller cluster sizes with relatively intact switching rates due to decreases in efficient access to semantic knowledge. These results have been found by some researchers (Troyer, Moscovitch, Winocur, Leach, & Freedman, 1998), while other studies have only partially supported this theoretical difference (Haugrud et al., 2010). Previous researchers in this area have examined groups of individuals diagnosed with AD at varying stages of the disease (Beatty, Testa, English, & Winn, 1997; Epker, Lacritz, & Munro Cullum, 1999; Haugrud et al., 2010; Troster et al., 1998; Troyer et al., 1998), which could explain differences between studies.
Modifications to the scoring procedures established by Troyer and colleagues (1997) have been proposed. For example, Abwender, Swan, Bowerman, and Connolly (2001) proposed two types of switching strategies. Hard switching occurs between two single, non-clustered words or between a clustered word and a single word and is believed to result from the speeded nature of verbal fluency tasks. Cluster switching occurs between two groups of clustered words and is believed to reflect mental flexibility. Lanting, Haugrud, and Crossley (2009) examined the number of novel clusters accessed, the number of clusters returned to in the same trial, and the percentage of clustered words. These variables were included to address limitations of the Troyer and colleagues (1997) model that included single words as a cluster with a score of zero. Finally, Haugrud and colleagues (2010) proposed that errors and perseverations should be removed from calculations of clustering and switching as these intrusions artificially inflate the cluster size scores for individuals with AD.
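The distinction between hard and cluster switches can be illustrated with a minimal Python sketch. The function name classify_switches and the representation of a trial as a list of consecutive cluster lengths (1 for a single, non-clustered word) are assumptions made for illustration; they are not taken from Abwender et al. (2001) or from the scoring program used in the current study.

```python
def classify_switches(cluster_lengths):
    """Count hard and cluster switches for one fluency trial.

    cluster_lengths: lengths of consecutive clusters in output order,
    where 1 denotes a single, non-clustered word (assumed encoding).
    """
    hard = 0
    cluster = 0
    # Every transition between adjacent clusters is one switch.
    for left, right in zip(cluster_lengths, cluster_lengths[1:]):
        if left >= 2 and right >= 2:
            # Transition between two multi-word clusters.
            cluster += 1
        else:
            # At least one side of the transition is a single word.
            hard += 1
    return hard, cluster


# Illustrative trial: a 3-word cluster, a single word, two 2-word
# clusters, then two single words.
print(classify_switches([3, 1, 2, 2, 1, 1]))  # (4, 1)
```

Under this encoding the two counts always sum to the total number of switches, so the sketch subdivides the Troyer et al. (1997) switching score rather than replacing it.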
The current study used the methods of scoring proposed by Troyer et al. (1997), Abwender et al. (2001), and Lanting et al. (2009) to investigate verbal fluency in individuals diagnosed with early-stage AD compared to healthy older adults. The current project had three goals: (1) to examine the variables of Abwender et al. (2001) and Lanting et al. (2009) in a group diagnosed with early-stage AD and, consistent with Haugrud et al. (2010), to examine these variables with errors removed; (2) to determine which of these scoring systems and variables best differentiate AD from healthy aging, contributing to our understanding of fluency decline in AD; and (3) to use a computerized scoring procedure to generate clustering and switching variables to improve scoring accuracy and reliability.
Based on the two-component model of verbal fluency and results from previous research, we hypothesized that the AD group would produce fewer total words on both verbal fluency tasks when compared to the healthy older adult group. Furthermore, we hypothesized that the AD group would produce smaller average cluster sizes on both fluency tasks when compared to the healthy older adult group and fewer total switches on the semantic task. Due to disease-related effects on the semantic store (Braaten et al., 2006), we hypothesized that the AD group would produce fewer novel and repeated clusters, fewer cluster switches, and a smaller percentage of clustered words than the healthy older adult group on the semantic task, but would show no differences on these variables on the phonemic task.
Methods
Participants
All data for this study were collected in compliance with the ethical regulations of the University of Saskatchewan. The current study used archival data from 26 healthy older adults, a subsample of participants recruited for a neuropsychological investigation of normal aging, who were chosen to be comparable in age, years of education, and reading ability to 26 individuals diagnosed with AD according to the NINCDS-ADRDA criteria (McKhann et al., 1984) and recruited from an Aging Research and Memory Clinic. Results for total word production, average cluster size, and total switches, based on hand scoring of data from the current participants, have been reported previously by Haugrud and colleagues (2010). The current study extends this past work to include additional fluency variables not previously analyzed in an AD group. The healthy older adult group (15 females; 11 males) had a mean age of 70.5 (SD = 7.7) with an average of 11.9 (SD = 2.6) years of education. The Alzheimer's disease group (16 females; 10 males) had a mean age of 70.6 (SD = 7.6) with 11.4 years of education (SD = 3.4) and an average Mini-Mental State Examination (MMSE; Folstein, Folstein, & McHugh, 1975) score of 24.7 (SD = 2.9). The groups did not differ in age, F(1,50) = 0.001; p = .971; η² = .001, or education, F(1,50) = 0.410; p = .525; η² = .008. Similarly, there was no significant difference between the healthy older adult group and the Alzheimer's disease group on the Wide Range Achievement Test-3 reading subtest (WRAT-3; Wilkinson, 1993), F(1,40) = 0.274; p = .604; η² = .007, and the average scaled scores indicated average reading level for both groups (M = 101.7; SD = 11.4 and M = 101.3; SD = 11.7, for the normal and AD groups, respectively).
Materials
Participants completed the Controlled Oral Word Association Test (COWAT; Benton & Hamsher, 1989) as a measure of phonemic fluency and the Animal Naming test (AN; Spreen & Strauss, 1991) as a measure of semantic fluency.
Procedures and Scoring
The COWAT consists of three 60-s trials requiring participants to produce as many words as possible that begin with the letters “C,” “F,” or “L.” On the Animal Naming (AN) test, participants are given 60 s to produce as many animal names as possible.
The verbal fluency variables were calculated both with intrusions (i.e., errors and perseverations) included and excluded. Detailed scoring procedures for the calculation of average cluster size and number of switches have been previously reported (Troyer et al., 1997, 2000), as have the procedures for calculating hard and cluster switches (Abwender et al., 2001).
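One way to picture the exclusion of intrusions before scoring is the following Python sketch. It is illustrative only: the function remove_intrusions and the is_valid predicate are hypothetical names, and the simple rules used here (a perseveration is any exact repetition within the trial; an error is any word the predicate rejects) are assumptions that are likely coarser than the published scoring rules.

```python
def remove_intrusions(words, is_valid):
    """Return the trial's words with errors and perseverations dropped.

    words:    responses for one trial, in the order produced.
    is_valid: hypothetical predicate deciding whether a word is an
              acceptable response for the task (assumed, not sourced).
    """
    seen = set()
    kept = []
    for word in words:
        key = word.lower()
        if key in seen:
            continue  # perseveration: word already produced this trial
        if not is_valid(key):
            continue  # error: word does not satisfy the task rule
        seen.add(key)
        kept.append(word)
    return kept


# Illustrative phonemic trial for the letter "F".
trial = ["fish", "fan", "dog", "fish", "fun"]
print(remove_intrusions(trial, lambda w: w.startswith("f")))
# ['fish', 'fan', 'fun']
```

Scoring the same trial once on the raw list and once on the filtered list yields the two versions of each variable, with intrusions included and excluded.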
For the current study, a computer program was developed to generate the verbal fluency scores and to increase the reliability of the scoring procedures. The computer program is written in the Python programming language and relies on word lists to group output according to scoring procedures. In a slight modification to the original scoring measures of Troyer and colleagues (1997), only the criterion of the same first two letters was used to define a cluster for the phonemic task. The computer program was not able to score phonemic clusters of words that are homonyms, that differ by a vowel sound, or that rhyme (Troyer et al., 1997). Using this computer scoring method, the verbal fluency scores were calculated quickly and were highly consistent with those obtained in previous studies using hand scoring methods, demonstrating the efficacy of the computer scoring program. Participant scores of average cluster size and number of switches differed slightly using the computer scoring program compared to the hand scoring method previously published in Haugrud et al. (2010). The largest difference was in the control group semantic average cluster size (M = 1.29; SD = 0.82; and M = 1.01; SD = 0.57, for Haugrud et al. [2010] and the current study, respectively) and the smallest was in the control group phonemic switches, where the results were identical. The differences between the scores reported by Haugrud et al. (2010) and the current computer-generated scores were not statistically significant and reflect slight modifications to the scoring procedures using the computer program and the challenges associated with reliably hand scoring these variables.
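The first-two-letters criterion lends itself to a short illustration. The Python sketch below is not the program used in the current study; it is a minimal reconstruction under stated assumptions: clusters are treated as runs of consecutive words sharing their first two letters, cluster size is counted from the second word of each cluster (so a single word scores zero), and the number of switches is the number of transitions between clusters. All function names are hypothetical.

```python
from itertools import groupby


def phonemic_clusters(words):
    """Group consecutive words that share the same first two letters."""
    return [list(run) for _, run in groupby(words, key=lambda w: w.lower()[:2])]


def mean_cluster_size(clusters):
    """Mean cluster size, counting from the second word of each cluster."""
    if not clusters:
        return 0.0
    return sum(len(c) - 1 for c in clusters) / len(clusters)


def total_switches(clusters):
    """Number of transitions between clusters, single words included."""
    return max(len(clusters) - 1, 0)


# Illustrative trial for the letter "C" (invented responses).
trial = ["cat", "cab", "can", "cold", "cut", "cup", "clean"]
clusters = phonemic_clusters(trial)
print(clusters)
# [['cat', 'cab', 'can'], ['cold'], ['cut', 'cup'], ['clean']]
print(mean_cluster_size(clusters))  # 0.75
print(total_switches(clusters))     # 3
```

A rule of this kind is fully deterministic, which is one plausible reason a computerized procedure would be more consistent than hand scoring; the cost is that the rhyme, vowel-sound, and homonym criteria are not captured.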
The calculations for the remaining variables have been described by Lanting and colleagues (2009). For the calculation of novel and repeated clusters, clusters were defined by the criteria of Troyer and colleagues (1997) for the semantic task, and included words with the same first two letters for the phonemic task. For the semantic task, when a word could be clustered into two different categories, the superordinate category of living environments was used, as described by Troyer and colleagues (1997). Novel and repeated clusters were calculated both including and excluding single words as clusters. Finally, the percentage of clustered words per task was calculated.
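The novel cluster, repeated cluster, and percentage of clustered words variables can be sketched as follows. This is an illustration under assumptions, not the procedure of Lanting et al. (2009): the miniature SUBCATEGORY word list is invented, a repeated cluster is counted each time a previously accessed subcategory is returned to, and when single words are excluded they are ignored entirely (they do not mark a subcategory as accessed). All names are hypothetical.

```python
from itertools import groupby

# Invented miniature word list; real scoring lists are far larger.
SUBCATEGORY = {
    "dog": "pets", "cat": "pets", "hamster": "pets",
    "lion": "african", "zebra": "african", "giraffe": "african",
    "cow": "farm", "pig": "farm",
}


def label_clusters(words, labels=SUBCATEGORY):
    """Return (subcategory, [words]) for each run of same-label words."""
    return [(lab, list(run))
            for lab, run in groupby(words, key=lambda w: labels[w.lower()])]


def novel_and_repeated(clusters, include_singles=True):
    """Count clusters in new subcategories and returns to earlier ones."""
    seen = set()
    novel = repeated = 0
    for label, members in clusters:
        if not include_singles and len(members) < 2:
            continue  # assumed rule: skipped singles leave no trace
        if label in seen:
            repeated += 1
        else:
            seen.add(label)
            novel += 1
    return novel, repeated


def percent_clustered(clusters):
    """Percentage of words that fall in clusters of two or more words."""
    total = sum(len(m) for _, m in clusters)
    clustered = sum(len(m) for _, m in clusters if len(m) >= 2)
    return 100.0 * clustered / total if total else 0.0


trial = ["dog", "cat", "lion", "cow", "pig", "zebra", "giraffe", "hamster"]
clusters = label_clusters(trial)
print(novel_and_repeated(clusters))                         # (3, 2)
print(novel_and_repeated(clusters, include_singles=False))  # (3, 0)
print(percent_clustered(clusters))                          # 75.0
```

The example shows how the two counting conventions diverge: the single words "lion" and "hamster" create or close a repeated cluster only when single words are allowed to count as clusters.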
Average cluster size was calculated according to the original method of Troyer et al. (1997) and re-calculated with single words excluded.
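The effect of excluding single words on average cluster size can be seen in a small worked example. The Python sketch below assumes that a trial is summarized as a list of cluster lengths (1 for a single word) and that cluster size is counted from the second word of each cluster, so that a single word contributes a size of zero; the function name and the encoding are illustrative assumptions.

```python
def mean_cluster_size(cluster_lengths, exclude_singles=False):
    """Mean cluster size for one trial.

    cluster_lengths: number of words in each consecutive cluster,
    with 1 denoting a single word (assumed encoding).
    """
    if exclude_singles:
        # Drop single words so that they no longer pull the mean
        # toward zero.
        cluster_lengths = [n for n in cluster_lengths if n >= 2]
    if not cluster_lengths:
        return 0.0
    # Size is counted from the second word: an n-word cluster scores n - 1.
    return sum(n - 1 for n in cluster_lengths) / len(cluster_lengths)


lengths = [3, 1, 2, 1]  # invented trial: sizes 2, 0, 1, 0
print(mean_cluster_size(lengths))                        # 0.75
print(mean_cluster_size(lengths, exclude_singles=True))  # 1.5
```

In this invented trial, half of the clusters are single words, and removing them doubles the mean, which illustrates why the two versions of the variable can lead to different group comparisons.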
Results
For semantic and phonemic fluency tasks, separate one-way analyses of variance (ANOVAs) were performed on all verbal fluency variables, and partial η² was used as a measure of effect size. When errors and perseverations were removed from the calculation of verbal fluency variables, effect sizes for the significant findings were larger, and consistent with the hypotheses of the current study and past research. As a result, the following results are presented with intrusions removed.
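For readers less familiar with these statistics, the following Python sketch shows how F and η² are obtained in a one-way ANOVA, and how η² can be recovered from a reported F value and its degrees of freedom. In a one-way design with a single factor, partial η² and η² coincide. The function names and the small data set are invented for illustration; the sketch is not the analysis code used in the current study.

```python
def one_way_anova(*groups):
    """Return (F, df_between, df_within, eta_squared) for the groups."""
    values = [x for g in groups for x in g]
    grand_mean = sum(values) / len(values)
    means = [sum(g) / len(g) for g in groups]
    # Between-groups and within-groups sums of squares.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    eta_sq = ss_between / (ss_between + ss_within)
    return f_stat, df_between, df_within, eta_sq


def eta_squared_from_f(f_stat, df_between, df_within):
    """Recover eta squared from a reported F and its degrees of freedom."""
    return f_stat * df_between / (f_stat * df_between + df_within)


# Invented scores for two small groups.
f_stat, df_b, df_w, eta_sq = one_way_anova([1, 2, 3], [2, 3, 4])
print(f_stat, df_b, df_w)  # 1.5 1 4

# Check against the education comparison reported under Participants:
# F(1,50) = 0.410 corresponds to an eta squared of about .008.
print(round(eta_squared_from_f(0.410, 1, 50), 3))  # 0.008
```

With two groups of 26 participants the degrees of freedom are 1 and 50, so the conversion function can be applied directly to any F(1,50) value reported in this section.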
Semantic fluency
Refer to Table 1 for the means and standard deviations of the semantic verbal fluency scores according to group. When compared to the healthy older adult group, the AD group produced significantly fewer total words, F(1,50) = 42.854, p < .001, and significantly fewer total switches, F(1,50) = 24.831, p < .001, hard switches, F(1,50) = 10.244, p = .002, and cluster switches, F(1,50) = 7.050, p = .011. The groups did not differ for semantic fluency average cluster size or for percentage of clustered words, but the AD group produced significantly fewer novel clusters, F(1,50) = 20.154, p < .001, and repeated clusters, F(1,50) = 15.792, p < .001, than the healthy older adult group, including fewer multiple word novel clusters, F(1,50) = 16.583, p < .001, and multiple word repeated clusters, F(1,50) = 4.181, p = .046. Examination of average cluster size excluding single words did not differentiate the AD group from the healthy older adult group.
Table 1 Semantic verbal fluency scores (SD) for participants with Alzheimer’s disease (N = 26) and for a comparison group of healthy older adults (N = 26)

a. Average cluster size excluding single words.
*p < .05; **p < .01; ***p < .001.
Phonemic Fluency
Refer to Table 2 for the means and standard deviations of the phonemic verbal fluency scores according to group. The AD group produced fewer phonemic fluency total words than the healthy older adult group, F(1,50) = 5.602, p = .022. Groups did not differ on number of switches, number of hard or cluster switches, or on average cluster size. The AD group produced significantly fewer novel clusters, F(1,50) = 4.992, p = .030, and multiple word repeated clusters, F(1,50) = 8.521, p = .005, but there was no group difference for repeated clusters or on multiple word novel clusters. The AD group compared to the healthy older adult group produced significantly smaller average cluster size scores when single words were excluded, F(1,50) = 8.878, p = .004.
Table 2 Phonemic verbal fluency variables for participants with Alzheimer's disease (N = 26) and for a comparison group of healthy older adults (N = 26)

a. Average cluster size excluding single word clusters.
*p < .05; **p < .01; ***p < .001.
Discussion
Measures of effect size in the current study demonstrate that semantic fluency total word production best differentiates AD from healthy aging, closely followed by semantic fluency total switches. The variables of Abwender et al. (2001) did not add further information as both hard and cluster switching differentiated groups. Excluding single words from the analysis, consistent with Lanting et al. (2009), did not better differentiate AD from the healthy control group; however, the number of novel clusters accessed did differentiate the AD group from the healthy control group on both the phonemic and semantic tasks. Overall, clustering and switching variables showed larger effects in differentiating groups during semantic versus phonemic fluency tasks, indicating these variables are most informative when examining the effects of AD on semantic verbal fluency.
In contrast to the study hypothesis, healthy older adults and AD participants produced equivalent average cluster size scores during semantic fluency. Haugrud et al. (2010), using the same data set, found that males with AD, but not females, produced significantly smaller average cluster sizes than the healthy comparison group. Clarifying sex differences is an important direction for future research. Alternatively, contrasting findings might result from the use of the computerized scoring system in the current study that produced smaller differences in cluster size and switching scores compared to the hand scoring procedure used by Haugrud et al. (2010). Given that small changes in scoring consistency can change group effects on measures of clustering, average cluster size might not be the most effective method for differentiating AD from healthy aging. The current study was the first to use a computerized scoring system to calculate clustering and switching scores. Use of this program, in contrast to hand scoring procedures, was efficient and reliable and is strongly recommended for future research on clustering and switching variables.
McDowd and colleagues (2011) concluded that, compared to verbal ability, working memory, and inhibition, processing speed better predicts total correct responses and number of clusters produced on verbal fluency tasks in an older adult group. In the current study, lower fluency production in the AD group resulted from lower cluster production or switching rates. This effect was larger for novel clusters compared to repeated clusters. Novel cluster generation might therefore be a measure of intact semantic memory access in AD, rather than speed of processing. Alternatively, reduced switching rates for AD compared to normal participants could result in the reduced number of novel clusters. Future research using regression modeling is needed to investigate measures of executive functioning, processing speed, and access to semantic memory as predictors of novel cluster generation in pathological and normal aging.
In summary, the current study found that total words, number of switches, and number of novel clusters best differentiate healthy older adults and AD participants, with the effects being larger on semantic compared to phonemic fluency. In addition, this study demonstrated the value of using a computerized scoring program to examine clustering and switching strategies in verbal fluency. Results should be replicated with a larger sample to support current findings and to investigate relationships among fluency variables and measures of processing speed, executive functions and semantic access for both normal and cognitively impaired males and females.
Acknowledgments
This research was supported in part by a Frederick Banting and Charles Best Canada Graduate Scholarship from the Canadian Institutes of Health Research awarded to Nicole Haugrud. There were no conflicts of interest for the authors in conducting this research.