A Verbal Naming Test for Use with Older Adults
Development and Initial Validation
Neuropsychologists who work with older adults typically assess word-finding, or naming, when evaluating for the possible presence of Alzheimer’s disease (AD) or primary progressive aphasia, or when evaluating patients after a history of stroke or other lesions in the language-dominant hemisphere. The prevalence of dementia in adults age 71 and older has been estimated to be 14%, with rates increasing to 24% for individuals age 80–89 and 37% for those age 90 and older (Plassman et al., 2007). Both the National Institute on Aging (NIA)/Alzheimer’s Association diagnostic guidelines (McKhann et al., 2011) and the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5; American Psychiatric Association, 2013) include impaired word-finding or naming in their diagnostic guidelines for dementia or neurocognitive disorder. Other types of dementia (e.g., primary progressive aphasia) are also characterized by a profound deficit in word-finding (Gorno-Tempini et al., 2011; Lezak, Howieson, Bigler, & Tranel, 2012). Damage to the left hemisphere, through strokes or other forms of acquired brain injury and temporal lobe epilepsy, can lead to aphasia. Nearly all patients with aphasia have a word-finding deficit (Beeson & Rapcsak, 2006; Blumenfeld, 2010; Goodglass & Wingfield, 1997; Laine & Martin, 2006) that can be long-lasting even as aphasia symptoms improve (Goodglass & Wingfield, 1997).
Evaluation of word-finding is thus paramount in clinical neuropsychological practice for accurate diagnosis of neurocognitive disorders and proper evaluation of cognitive deficits following damage to the language-dominant hemisphere.
Neuropsychologists assess for word-finding difficulty with tests such as the naming subtest of the Neuropsychological Assessment Battery (NAB; Stern & White, 2003; White & Stern, 2003), the Boston Naming Test (BNT; Kaplan, Goodglass, & Weintraub, 1978, 2001), and the naming subtest of the Repeatable Battery for the Assessment of Neuropsychological Status (RBANS; Randolph, 1998), in which an examiner presents photographs (NAB) or line drawings (BNT, RBANS) to patients (e.g., of a lion, a compass, etc.) and asks the patient to name them. Older patients often have visual impairment, however, which can interfere with the validity of these measures. Vision loss is among the top three causes of physical disability in older adults (Lee & Coleman, 2004), and the prevalence rate for vision loss ranges from 10% for adults aged 55–64 to over 20% after age 75 (Ryskulova et al., 2008). In addition, over 700,000 adults are completely blind (Ryskulova et al., 2008). Thus, a key cognitive ability of word-finding may not be adequately assessed for over 20% of older patients. The prevalence rate of blindness and low vision will increase in the coming years due to the growing population of older adults (Eye Diseases Prevalence Research Group, 2004). Additional cortical problems such as visual field cuts can pose challenges to valid test administration. Niemeier (2010) noted how visual impairment in the context of neuropsychological testing can lead to misdiagnosis, and in such situations neuropsychologists should use tests that involve alternative sensory modalities (e.g., audition) whenever possible.
Thus, it would be advantageous to have a word-finding test that can be used with patients with visual impairment.
A non-visual naming test would also be an effective way to assess word-finding in patients who have intact vision. Evidence suggests that a non-visual, auditory/verbal measure of word-finding may detect mild cognitive impairment and/or dementia more effectively than picture-naming tests (Hodges & Patterson, 1995; Miller, Finney, Meador, & Loring, 2010). This may be due to aspects of the functional neuroanatomy of word-finding; word-finding and semantic processing are complex abilities often mediated by several neuroanatomical regions, including lateral temporal, temporoparietal, and anterior temporal regions (Chiang et al., 2014; Grossman et al., 2004). The anterior temporal lobes may serve as storehouses of general semantic knowledge (Brier, Maguire, Tillman, Hart, & Kraut, 2008), or knowledge of more specific concepts such as social conceptual processing (Simmons & Martin, 2009). Although some studies have linked picture-naming performance to anterior temporal lobe volume (Balthazar et al., 2010), other studies (Hamberger, Goodman, Perrine, & Tamny, 2001; Malow et al., 1996) suggest that picture naming is more closely associated with temporoparietal brain regions, while anterior temporal regions may be more related to word-finding in response to verbal definitional prompts.
As such, visual naming tests may not adequately assess aspects of word-finding in patients with primarily anterior temporal dysfunction (e.g., subtypes of frontotemporal lobar degeneration, temporal lobe epilepsy, and focal acquired brain injuries).
In addition, verbal word-finding tasks may be a more ecologically valid way to assess this ability. The word-finding problems associated with AD tend to occur in everyday conversational speech (Nebes, 1989). Indeed, dysnomia has been defined in the context of speech (Goodglass & Wingfield, 1997). As a result, the diagnostic criteria for AD focus on difficulty while speaking (McKhann et al., 2011), as opposed to difficulty identifying pictured objects. Existing word-finding measures, however, do not assess word-finding during actual speech but in the context of looking at a picture and naming it, potentially limiting their ecological validity. Furthermore, self-reported difficulty finding words while speaking has correlated with word-finding in response to a verbal definition but not with naming in response to a picture (Hamberger, Seidel, McKhann, & Goodman, 2010). Picture-naming tests such as the BNT may tap into “inability to recognize… common objects” or “object agnosia” as described in the McKhann et al. (2011) criteria as examples of impaired visuospatial abilities (p. 265), which is distinct from word-finding difficulty. Therefore, a verbally based, non-visual word-finding test may enable clinicians to more accurately detect difficulty in finding words while speaking.
Hamberger and Seidel (2003) created an auditory naming test in their studies on temporal lobe epilepsy. Their measure was developed in an epilepsy clinic and used with a sample of adults with temporal lobe epilepsy. Their normative data are available for 100 healthy adults with a mean age of 34.3 (SD=11.0) and for 56 additional patients of similar age with epilepsy. The Hamberger and Seidel measure represents a valuable addition to the neuropsychological tool kit with a promising methodology. However, the normative sample does not include individuals older than 64, rendering it difficult for clinicians to determine whether an older patient’s performance is normal or impaired. Also, some items may be objectionable to older patients (e.g., “what an old man uses to walk with”) or have become outdated (e.g., “the white stuff used to write on a blackboard”). In addition, the stimuli on this measure consist of words commonly used in everyday spoken language, with frequencies of usage significantly higher than the average frequency of usage of words on the BNT or NAB Naming subtest (Yochim, Rashid, Raymond, & Beaudreau, 2013). Other researchers have suggested that future confrontation naming tests should use predominantly low-frequency stimuli (Randolph, Lansing, Ivnik, Cullum, & Hermann, 1999) to render the scale more sensitive to emerging difficulty retrieving words quickly (Goodglass, Kaplan, & Barresi, 2001; Goodglass & Wingfield, 1997) and to word-finding errors (Kirshner, Webb, & Kelly, 1984; Skelton-Robinson & Jones, 1984).
The aim of this study was to develop a similar non-visual word-finding measure that improved upon the contribution of Hamberger and Seidel (2003) by developing it in an older adult population and by using items of low frequency to increase its sensitivity.
Development of our word-finding measure, the Verbal Naming Test (VNT), followed several critical guidelines to improve upon other related measures. First, a unique aspect of our non-visual word-finding measure was the use of current word frequency ratings of spoken language in its development (Brysbaert & New, 2009), rather than relying upon frequency ratings for written language based on 1961 American literature (Francis & Kučera, 1982; Kučera & Francis, 1967; for more details about the differences between these two word frequency systems, see Yochim et al., 2013). Second, unlike most other existing naming tests (i.e., with the exception of the Action Naming Test; Obler & Albert, 1979), this one includes verbs as stimuli to be named in addition to nouns. Although naming of verbs has not been shown to change with age as much as the naming of nouns (Nicholas, Barth, Obler, Au, & Albert, 1997), aphasia has been shown to affect verb-naming and noun-naming differently (Kohn, Lorch, & Pearson, 1989; Miceli, Silveri, Nocentini, & Caramazza, 1988). Noun and verb naming also differ neuroanatomically, with more temporal lobe activity for naming nouns and more frontal cortex involvement for naming verbs (Damasio & Damasio, 1992; Damasio & Tranel, 1993). The inclusion of 10 verbs on the proposed measure will allow some exploration of neuroanatomical differences in verb and noun naming, and the implications for detecting and tracking progression of AD.
Neuropsychological measures are available for assessing cognitive domains other than word-finding (e.g., attention, memory) without the need for intact vision. The absence of a non-visual measure of word-finding is, therefore, unique in the field of clinical neuropsychology. Such a measure would enable the evaluation of word-finding ability in patients with visual impairment, as well as over the telephone for patients who are unable to travel to a clinic, or in the emerging field of tele-neuropsychology (Cullum, Weiner, Gehrmann, & Hynan, 2006; Grosch, Gottlieb, & Cullum, 2011). The aim of this project was to fill this void by creating a verbal (non-visual) measure of word-finding and to report on its preliminary psychometric properties.
Methods
Participants
This study was conducted under approval from the Institutional Review Board in compliance with the Helsinki Declaration. The sample included 92 patients in a neuropsychological assessment clinic, for whom diagnoses and other neuropsychological test data were available. Another 39 participants took the measure as part of an unrelated research study without receiving diagnoses or taking other neuropsychological measures used in this study, and these 39 participants were included in the sample used for Rasch analyses only. Self-reported ethnic background was 74.8% European American, 12.2% African American, 6.1% Hispanic American, 5.3% Asian American, and 1.6% Other. There were 123 males (94%) and 8 females (6%). Mean age, education, and test performance are provided in Table 1.
Table 1. Participant characteristics (N=131)
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20160921003755483-0284:S1355617715000120:S1355617715000120_tab1.gif?pub-status=live)
Note. Three outliers were removed from the clinic sample after Rasch analyses were completed with the entire sample.
Measures
Patients in the neuropsychology clinic completed the following measures, and raw scores from these were used in correlation analyses:
NAB Naming Test
The Neuropsychological Assessment Battery (NAB) contains two forms of a measure of word-finding, described above, in which patients are shown a photograph and asked to name the pictured object. Form 1 was used in this study. The reliability of scores from this measure has been demonstrated in a nationally representative sample closely matched to U.S. Census data, and the validity of scores from this measure has been demonstrated in healthy participants of all ages as well as in a sample of patients with aphasia (mean age of 59 years, 93% male, and 82% European American; White & Stern, 2003). The validity of scores from the measure was also demonstrated in a sample of healthy adults with a mean age of 75.4 (Yochim, Kane, & Mueller, 2009). Patients who scored more than one SD below the published normative sample on this measure were defined as having dysnomia for analyses described below.
California Verbal Learning Test-Second Edition
The California Verbal Learning Test, second edition (CVLT-II; Delis, Kramer, Kaplan, & Ober, 2000) is a commonly used measure of verbal memory (Rabin, Barr, & Burton, 2005) with extensive normative data. The test involves having an examinee learn a list of 16 words over five trials and then recall the words 20 min later. Scores from this measure have high internal consistency based on split-half reliability, r=.94, in a national sample of 1,087 adults age 16–89 closely matched to U.S. Census data, and scores have demonstrated validity as a measure of verbal memory in a sample of 62 healthy adults with a mean age of 36.8 years, and in a variety of studies cited in the manual (Delis et al., 2000). The measure produces numerous scores, two of which were used in this study: the total words learned in Trials 1–5, and the number of words recalled after a long delay (Long Delay Free Recall).
D-KEFS Verbal Fluency
This subtest, used as a measure of language output and of executive functioning, is part of the Delis-Kaplan Executive Function System (D-KEFS; Delis, Kaplan, & Kramer, 2001). Participants are asked to verbally generate in 1 min as many words as possible that start with the same letter or belong to a certain semantic category. This task is completed three times with three different letters (F, A, and S) and two times with different categories (e.g., animals, boys’ names). Internal consistency reliability for scores on this measure has been shown to range from .85 to .87 for letter fluency and .64 to .76 for category fluency among adults age 60–89 (Delis et al., 2001). Test–retest reliability has been shown to be .88 for letter fluency scores and .82 for category fluency among healthy adults age 50–89 (Delis et al., 2001).
D-KEFS Trail Making Test
The D-KEFS Trail Making Test involves a series of five conditions: visual scanning, number sequencing, letter sequencing, number-letter switching, and motor speed. This study used the fourth condition, Number–Letter Switching, as a measure of complex attention and executive functioning. In this condition, the participant draws a line connecting dots, alternating between dots containing numbers and dots containing letters. Scores from this measure have sufficient test–retest reliability in a sample of 36 healthy adults age 50–89 (Delis et al., 2001) and have been used extensively as a measure of executive functioning in studies on patients with frontotemporal dementia (e.g., with a mean age of 59.6, Huey et al., 2009) and patients with frontal lesions with a mean age of 65.5 (Yochim, Baldo, Nelson, & Delis, 2007). For this study, the total completion time was used as a measure of set-switching, with lower scores indicating better performance.
Judgment of Line Orientation
The Judgment of Line Orientation test (Benton, Sivan, Hamsher, Varney, & Spreen, 1994; Benton, Varney, & Hamsher, 1978) is a measure of visuoperceptual functioning. Participants are shown two lines and asked to pick which two lines, from a stimulus array below, match the spatial orientation of the initial two lines. Scores from the measure had high test–retest reliability (r=0.90) in a sample of 37 patients (Benton et al., 1994), strong convergent validity with other visuospatial measures among patients with cerebrovascular lesions with a mean age of 66 (Trahan, 1998), and a strong relationship with right parietal functioning among patients with focal brain damage and a mean age of 51 (Tranel, Vianna, Manzel, Damasio, & Grabowski, 2009).
Selection of Items for the Verbal Naming Test (VNT)
We used Brysbaert and New’s (2009) listing of words at various frequencies of usage to develop the Verbal Naming Test. A listing of words was generated with frequencies less than 6/1,000,000, which characterizes items 30–60 of the BNT (Yochim et al., 2013). This listing of words was then carefully scanned for possible items, in consultation with the authors on this study and other neuropsychologists. Special care was taken in the choice of stimuli to ensure the following: (1) Items should be spoken rarely enough to be difficult to generate, increasing the measure’s sensitivity, while at the same time (2) being relatively free of relationships with education. (3) Items should be as culture-free and non-objectionable as possible (e.g., a noose is not included as a stimulus). (4) Items should be able to be defined with as short a definition as possible (e.g., “a baby cow”), to ensure comprehension difficulties do not interfere substantially. Pilot testing has shown that even patients with comprehension impairment generate phonemic paraphasic approximations of the target item, indicating they comprehended the stimulus prompt. (5) Items should have a definition prompt that elicits only one word (i.e., avoiding stimulus words that have synonyms) to increase the standardization of administration and scoring. For example, the item “what you push a baby in to go for a walk” was eliminated because it unexpectedly generated several different responses (stroller, perambulator, carriage, buggy). (6) Items should not be limited to concrete nouns, but also include abstract nouns (e.g., “decade”) and verbs. (7) Items should be avoided if the clearest definition included the target word, or a variation of the target word, itself (e.g., “the thing you use to clean your teeth” for toothbrush).
Using these criteria, 60 items were generated, ranging in frequency from 0.55 to 5.29 per million, with two items at the beginning of the test with very high frequency (7.49 and 10.51), to ensure patients experience some success at the beginning of the task. Items were chosen such that there are approximately equivalent numbers of items at five frequency levels of 0.0–0.9, 1.0–1.9, 2.0–2.9, 3.0–3.9, and 4.0–5.3. Sixty items were selected to be similar to the widely used Boston Naming Test. In addition, 60 was thought to be an adequate number of items because elimination of problematic items would still leave enough items for the measure to be reliable and valid. Items were ordered in decreasing frequency, such that the measure becomes more difficult with each item. There are no discontinue rules for administration at this time.
On the test, after a definitional prompt is given, patients have 10 s to independently generate the word, or an approximation of the word such that the examiner can identify it as the target item. This ensures that regional variations in pronunciation of the word are not scored as incorrect. After 10 s, a phonemic cue is provided (e.g., “it starts with ‘um-’”). If the patient still cannot generate the word within 10 s of the cue, the next item is administered. Clinicians should differentiate between articulation errors and phonemic or semantic paraphasic errors. The total score on the measure is the number of items correctly answered without phonemic cues. A delay of 10 s has been used on other established naming tests (e.g., the NAB Naming test; Stern & White, 2003). We also selected 10 s over longer delays provided by other naming tasks (e.g., 20 s on the BNT) because difficulty generating names within only 2 s has been shown to be a sensitive indicator of neurological disorders such as AD and temporal lobe epilepsy (Hamberger & Seidel, 2003; Miller et al., 2010) and healthy controls and patients are able to generate names within 7–8 s on naming tasks (Bell, Seidenberg, Hermann, & Douville, 2003). Moreover, the diagnostic guidelines for dementia due to AD include “hesitations” in finding words as a symptom of AD, and taking longer than 10 s to find a word would indicate the presence of such hesitations. A 10-s time limit also avoids unnecessarily tiring or discouraging patients.
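The scoring rule described above can be sketched in a few lines of Python. This is an illustration only, not the authors' scoring software: the `ItemResult` record, its field names, and the assumption that the post-cue window is also 10 s are hypothetical choices made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional

TIME_LIMIT_S = 10.0  # response window stated in the text


@dataclass
class ItemResult:
    # Seconds until the target word was produced without a cue (None if never).
    uncued_latency_s: Optional[float]
    # Seconds after the phonemic cue until the word was produced (None if never).
    # Assumption: the post-cue window is also 10 s.
    cued_latency_s: Optional[float] = None


def classify(item: ItemResult) -> str:
    """Label one item as 'correct', 'cued_correct', or 'incorrect'."""
    if item.uncued_latency_s is not None and item.uncued_latency_s <= TIME_LIMIT_S:
        return "correct"
    if item.cued_latency_s is not None and item.cued_latency_s <= TIME_LIMIT_S:
        return "cued_correct"
    return "incorrect"


def total_score(items: List[ItemResult]) -> int:
    """Total score: items answered correctly without a phonemic cue."""
    return sum(1 for item in items if classify(item) == "correct")


# Hypothetical administration of four items.
example_items = [
    ItemResult(3.2),         # named spontaneously
    ItemResult(None, 4.0),   # named only after the phonemic cue
    ItemResult(None, None),  # not named at all
    ItemResult(9.9),         # named just inside the time limit
]
example_total = total_score(example_items)
```

Keeping cued successes as a separate label, rather than discarding them, preserves the information that the patient heard and understood the prompt even though the item does not count toward the total.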
Statistical Analyses
Rasch analyses (Bond & Fox, 2007) were completed with the Winsteps software program. Items that showed evidence of bias were sought through Rasch analyses of differential item functioning between European American and non-European American participants. We dichotomized ethnicity in this way because there were not enough African American, Asian American, and Hispanic American participants to perform Rasch analyses specific to each ethnic group. Only eight females participated in this study; thus, formal evaluation of bias related to sex was not possible. After these items were removed, other items showing statistically significant poor infit and outfit characteristics in Rasch analyses (i.e., mean squares less than 0.5 or greater than 1.5) were removed.
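As a rough illustration of the fit screen, the sketch below flags items whose mean-square fit statistics fall outside the 0.5 to 1.5 range. It assumes the infit and outfit values have already been exported from the Rasch software; the item names and values are hypothetical, the `require_both` switch is an assumption about how "infit and outfit" is combined, and the statistical significance check mentioned in the text is not modeled.

```python
from typing import Dict, List, Tuple

LOWER, UPPER = 0.5, 1.5  # mean-square bounds stated in the text


def out_of_range(mean_square: float) -> bool:
    """True when a mean-square statistic falls outside the accepted range."""
    return mean_square < LOWER or mean_square > UPPER


def flag_misfit(
    fit: Dict[str, Tuple[float, float]], require_both: bool = True
) -> List[str]:
    """Return names of items with poor fit.

    fit maps item name -> (infit mean square, outfit mean square).
    require_both=True flags an item only when both statistics are out of
    range; False flags it when either one is.
    """
    flagged = []
    for name, (infit, outfit) in fit.items():
        checks = (out_of_range(infit), out_of_range(outfit))
        if all(checks) if require_both else any(checks):
            flagged.append(name)
    return flagged


# Hypothetical fit statistics for three items.
example_fit = {
    "item_a": (0.33, 0.01),  # both too low (overfit)
    "item_b": (1.02, 0.95),  # acceptable
    "item_c": (0.45, 1.10),  # only infit out of range
}
flagged_strict = flag_misfit(example_fit)
flagged_lenient = flag_misfit(example_fit, require_both=False)
```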
After Rasch analyses were completed, only the clinic patients were used for the following analyses. Correlations with demographic variables and with measures considered highly related (NAB Naming), semi-related (CVLT-II and D-KEFS Verbal Fluency), and unrelated (Judgment of Line Orientation and D-KEFS Trail Making) were calculated. T tests were calculated to determine whether patients with dysnomia performed significantly worse on the measure than patients without dysnomia. (Patients can have dysnomia but not dementia, and patients can have dementia without dysnomia. In our sample, 61% of the patients with dysnomia received diagnoses of dementia, 50% of the patients with dementia had dysnomia, and 18% of the sample had both dementia and dysnomia.) Logistic regression was conducted to determine the measure’s effectiveness in detecting dysnomia. ROC analyses were conducted to arrive at an optimal cut score to use for diagnosing patients as with or without dysnomia. Sensitivity, specificity, positive predictive value, and negative predictive value were calculated. T tests were also calculated to determine if patients without cognitive diagnoses performed better than those with MCI, and if patients with MCI performed better than patients with dementia.
Results
Two items that prompted multiple answers because of multiple synonyms were removed. In Rasch analyses, point-measure correlations ranged from 0.35 to 0.59. Only one item showed significant differential item functioning (DIF contrast=1.71, p<.05) between European Americans and non-European Americans, and it was removed. Two other items showed poor infit (Mean square for both items=0.33) and outfit (Mean square for both items=0.01) that was statistically significant (zstd=−2.0) and they were removed, arriving at a scale with 55 items (see Appendix). Males (mean=48.2; SD=5.4) obtained scores similar to those of the eight females (mean=48.1; SD=6.5).
All subsequent analyses included only the clinic patients. Participants whose total scores on the naming task were more than two SDs below the mean (n=3) were removed for all the following analyses, leading to a total of 89 participants for the following analyses. In the clinic sample, 23 patients were cognitively normal, 30 received diagnoses of MCI, and 36 received diagnoses of dementia due to various causes (11 due to AD, 8 due to AD and vascular disease, 5 due to vascular disease, 3 due to Parkinson’s disease, 4 due to multiple causes, and 5 unspecified).
The 55-item scale had a Cronbach’s alpha of 0.84. Correlations with other measures used in neuropsychological evaluations are presented in Table 2. As can be seen, the measure correlated highly with the NAB Naming test, measures of verbal fluency, and the CVLT-II.
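For readers unfamiliar with the internal consistency statistic reported here, the following sketch computes Cronbach's alpha from a matrix of item scores (rows are examinees, columns are items). The tiny data set is hypothetical and is not drawn from the study; it is chosen so that every item is perfectly consistent with the others, which gives the maximum alpha.

```python
from statistics import pvariance
from typing import List, Sequence


def cronbach_alpha(scores: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for a persons-by-items score matrix.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    k = len(scores[0])  # number of items
    item_columns: List[List[float]] = [
        [row[j] for row in scores] for j in range(k)
    ]
    item_variance_sum = sum(pvariance(col) for col in item_columns)
    total_variance = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical data: four examinees, three dichotomous items (1 = correct).
consistent_scores = [
    [1, 1, 1],
    [0, 0, 0],
    [1, 1, 1],
    [0, 0, 0],
]
alpha_consistent = cronbach_alpha(consistent_scores)
```

Population variances are used for both the items and the totals; using sample variances throughout would give the same alpha, because the correction factor cancels in the ratio.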
Table 2. Correlations between Verbal Naming Test and other variables (N=89)
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20160921003755483-0284:S1355617715000120:S1355617715000120_tab2.gif?pub-status=live)
*p<.05.
**p<.01.
Patients with dysnomia (n=28; 32% of the clinic sample), defined by NAB Naming performance more than one SD below the mean, performed significantly worse on the Verbal Naming Test (mean=41.9; SD=5.3) than did those without dysnomia (mean=50.0; SD=3.6); t(84)=8.2; p<.001. Logistic regression, using VNT scores as a predictor of dysnomia, found a C index of 0.89. The VNT was a significant predictor of dysnomia, Wald (1)=22.8, p<.001, with an overall correct classification rate of 83.7%. An ROC analysis was also conducted to determine how well the Verbal Naming Test detected dysnomia. This analysis found the area under the curve (AUC) to be 0.89. With a cut score of 47.5, sensitivity was 86%, specificity was 74%, positive predictive value was 62%, and negative predictive value was 92%. A cut score of 46.5 resulted in sensitivity of 79%, specificity of 85%, positive predictive value of 71%, and negative predictive value of 89%. We suggest the use of 46.5 as a cut score for diagnosing word-finding impairment, because it results in the least spread between sensitivity and specificity and positive and negative predictive value. This value is also approximately 1 SD below the mean VNT score of clinic patients with normal word-finding.
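The four diagnostic indices and the "spread" criterion used to choose between cut scores can be reproduced from a 2×2 classification table. The study does not report the cell counts; the counts below are hypothetical values back-calculated to be consistent with the reported percentages, assuming 28 patients with dysnomia and 58 without (86 patients, matching the 84 degrees of freedom of the t test). The `spread` function is one plausible way to quantify the criterion and is likewise an assumption.

```python
from typing import Dict


def diagnostic_metrics(tp: int, fn: int, tn: int, fp: int) -> Dict[str, float]:
    """Sensitivity, specificity, PPV, and NPV as percentages."""
    return {
        "sensitivity": 100 * tp / (tp + fn),
        "specificity": 100 * tn / (tn + fp),
        "ppv": 100 * tp / (tp + fp),
        "npv": 100 * tn / (tn + fn),
    }


def spread(m: Dict[str, float]) -> float:
    """Gap between sensitivity and specificity plus gap between PPV and NPV."""
    return abs(m["sensitivity"] - m["specificity"]) + abs(m["ppv"] - m["npv"])


# Hypothetical cell counts (not reported in the study); a "positive" is a
# VNT score below the cut, and the criterion is dysnomia on NAB Naming.
counts_by_cut = {
    47.5: dict(tp=24, fn=4, tn=43, fp=15),
    46.5: dict(tp=22, fn=6, tn=49, fp=9),
}
metrics_by_cut = {cut: diagnostic_metrics(**c) for cut, c in counts_by_cut.items()}
best_cut = min(metrics_by_cut, key=lambda cut: spread(metrics_by_cut[cut]))

for cut, m in metrics_by_cut.items():
    print(cut, {name: round(value, 1) for name, value in m.items()})
print("smallest spread at cut score:", best_cut)
```

Under these assumed counts, each index falls within about half a percentage point of the value reported in the text, and the smaller spread is obtained at the 46.5 cut score.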
Patients with no cognitive impairment (n=23) performed significantly better (mean=51.6; SD=3.6) than patients diagnosed with MCI (n=30, mean=48.1; SD=4.9), t(51)=2.9; p<.01. Patients diagnosed with MCI performed significantly better than patients diagnosed with dementia (n=36; mean=43.7; SD=5.6); t(64)=3.4; p<.01.
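These group comparisons can be checked from the summary statistics alone. The sketch below assumes an independent-samples Student's t test with pooled variance, which is consistent with the reported degrees of freedom (n1 + n2 − 2) but is not stated explicitly in the text; the function name is ours.

```python
import math
from typing import Tuple


def pooled_t(
    mean1: float, sd1: float, n1: int, mean2: float, sd2: float, n2: int
) -> Tuple[float, int]:
    """Student's t (pooled variance) and degrees of freedom from summary stats."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df


# Means, SDs, and group sizes as reported in the text.
t_normal_vs_mci, df_normal_vs_mci = pooled_t(51.6, 3.6, 23, 48.1, 4.9, 30)
t_mci_vs_dementia, df_mci_vs_dementia = pooled_t(48.1, 4.9, 30, 43.7, 5.6, 36)

print(round(t_normal_vs_mci, 1), df_normal_vs_mci)      # 2.9 51
print(round(t_mci_vs_dementia, 1), df_mci_vs_dementia)  # 3.4 64
```

Both values agree with the reported t(51)=2.9 and t(64)=3.4 after rounding.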
Discussion
This study presents psychometrics for the newly developed Verbal Naming Test that improves upon existing word-finding tests in the following ways: (1) stimuli were chosen based on how frequently words are used in everyday spoken language; (2) stimuli were chosen only if rarely used in everyday conversation, increasing the sensitivity of the measure to mild word-finding deficits; (3) the measure can be used in patients with visual impairment; and (4) the measure can be administered in tele-neuropsychological evaluations. Tele-neuropsychology is a growing area in clinical neuropsychology (Cullum et al., 2006; Grosch et al., 2011), and this measure can be used in such evaluations, without concerns over how best to show visual stimuli to patients over the teleconference medium.
Preliminary data suggest that this measure has strong psychometric characteristics, with high internal consistency and validity. Convergent validity was demonstrated by its large correlation with another word-finding measure and medium-large correlations with related measures of verbal fluency and verbal memory, while discriminant validity was confirmed by weaker correlations with less-related constructs (i.e., measures of visuoperception and set-switching). The measure also showed strong sensitivity, specificity, positive predictive value, and negative predictive value in detecting dysnomia with a cut score of 46.5 of 55 (i.e., with scores of 46 or below indicating impairment). There were significant differences in scores between patients with no impairment and with MCI, and between patients with MCI and with dementia, indicating the measure can be useful in detecting these conditions in older adults.
To our knowledge, this is the first word-finding test to incorporate frequency of usage in everyday spoken language in the choice of stimuli. This methodology should make this measure more externally valid, and the choice of rarely used words should increase its sensitivity to early word-finding deficits. As patients develop word-finding difficulty, these problems first occur when thinking of rarely used words, as opposed to commonly used words (Kirshner et al., 1984; Skelton-Robinson & Jones, 1984).
Efforts were made to create a measure that is free from associations with education. Unfortunately, with the removal of three outliers from the data, the correlation with education became significant and medium in size (0.31), indicating that education accounts for 9.6% of the variance in performance. In contrast, the NAB Naming Test did not correlate significantly (r=.19; p=.09) with education in this sample, although the NAB Naming Test has correlated (r=0.32; p<.05) with education in a non-clinical community sample (Yochim et al., 2009). The BNT has consistently been found to correlate with education (for review, see Strauss, Sherman, & Spreen, 2006), accounting for 10–13% of the variance in BNT performance. It is difficult, if not impossible, to create a word-finding test that is completely independent of education, as patients can only have familiarity with the stimuli through some degree of education. Future investigations with this measure will determine if specific items are highly impacted by education, and these items can be removed to arrive at a shorter form. Future normative data can also be stratified by education.
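The variance figures in this paragraph follow from squaring the correlation coefficient. A minimal sketch of that arithmetic, using the correlations quoted above (the function name is ours):

```python
def variance_explained_pct(r: float) -> float:
    """Percentage of variance shared by two variables with correlation r."""
    return 100 * r**2


vnt_education = variance_explained_pct(0.31)            # VNT, this sample
nab_education_clinic = variance_explained_pct(0.19)     # NAB Naming, this sample
nab_education_community = variance_explained_pct(0.32)  # NAB Naming, community sample

print(round(vnt_education, 1))            # 9.6
print(round(nab_education_clinic, 1))     # 3.6
print(round(nab_education_community, 1))  # 10.2
```

On this scale, the 10–13% reported for the BNT corresponds to correlations of roughly .32 to .36, so the education effect observed for the VNT is of similar size.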
Another limitation is that, while the participants were from fairly diverse ethnic backgrounds, the study included only eight women (five of whom were clinic patients), because it was conducted in a Veterans’ health care system. In this sample, males and females on average obtained similar Verbal Naming Test scores, although prior work (Randolph et al., 1999; Welch, Doineau, Johnson, & King, 1996; Zec, Burkett, Markwell, & Larsen, 2007) has found males and females to perform differently on the BNT. Future research with larger numbers of men and women varying in age, education, and ethnicity will determine whether there is differential item functioning based on these demographic variables and will establish the measure’s generalizability to patients with diverse ethnic, educational, and socioeconomic backgrounds. The measure would also benefit from the development of versions in other languages. Like other naming measures, it should only be used cautiously, if at all, for patients who speak English as a second language. Clinicians should use their best judgment in determining whether a patient has been fluent enough in English, for a long enough period of time, for this test to be a valid measure of a patient’s word-finding ability. Research on the use of this measure with younger populations would also help to ensure it works well with younger adults, although most common causes of impaired word-finding (e.g., Alzheimer’s disease, strokes in the language-dominant hemisphere) occur primarily in older adults. Furthermore, the development of an alternate form in future studies will increase the usefulness of the measure for tracking change in word-finding ability over time while limiting practice effects.
As with all neuropsychological measures, sensory (e.g., hearing) impairment must be accounted for in the interpretation of performance. The use of phonemic cues assists clinicians in this regard. If a patient correctly produces a target word after a phonemic cue, it can be assumed that the patient correctly heard the item prompt. Pilot testing has shown that even patients with comprehension impairment generate phonemic paraphasic approximations of the target item, or will often say “I know what it is but I can’t think of it”, indicating they comprehended the stimulus prompt. If pure word deafness or auditory agnosia is suspected, it is recommended that clinicians supplement this measure with a picture-naming test. Fortunately, both of these conditions are known to be rare (Bauer, 2012), and are likely to be detected and observed across measures administered as part of a larger battery (e.g., verbal memory testing). If a clinician observes impaired performance on the VNT in the context of completely preserved spontaneous speech, then syndromes such as these two should be considered in the differential diagnosis.
Rasch analyses were conducted to evaluate infit and outfit of items, and to assess for the presence of differential item functioning between European American and non-European American participants. Only two items showed significantly poor infit and outfit, and one item showed differential item functioning by ethnicity. While our study benefitted from 26% of our sample identifying as ethnic minorities, future research should continue to assess whether particular items have different difficulty levels for various ethnic groups, particularly for the rapidly growing Hispanic American population.
This study presents a new measure to add to the toolkit for clinical neuropsychologists, especially those who perform evaluations for older adults. Because approximately 20% of older adults have some degree of visual impairment (Ryskulova et al., 2008), neuropsychologists need assessment tools that do not require strong visual perception. This measure may also prove to be useful for assessing word-finding in all patients, regardless of their visual acuity. Other cognitive domains (e.g., memory) can be assessed in patients with visual impairment, and this measure will enable the assessment of another critical cognitive domain (i.e., word-finding) in such patients. Other advantages include the lack of a need to use a stimulus booklet to administer the measure, and the measure’s brief administration time (mean of 9.4 min, SD=2.7 in our sample), which is shorter than the typical BNT administration time (10–20 min; Strauss et al., 2006). The measure will also facilitate the increased use and development of tele-neuropsychological evaluations, which will help increase the provision of services to patients in rural settings who may not have access to neuropsychology clinics.
Acknowledgments
There are no conflicts of interest to report. This study was financially supported by the Sierra Pacific Mental Illness Research, Education, and Clinical Center (MIRECC) at the VA Palo Alto Health Care System (B.P.Y.); an Alzheimer’s Association New Investigator Research Grant (S.B., NIRG-09-133592); and the Department of Defense Telemedicine and Advanced Technology Research Center (TATRC, J.K.F., W81XWH-12-1-0584).
APPENDIX: VERBAL NAMING TEST
VERBAL NAMING TEST
1ST EDITION (VNT-1)
Name: ___________________________________ Date: _______________________
SAY: “Now we are going to do something different. I’m going to describe an object or a verb and I want you to tell me the name of what I am describing.”
After each prompt, allow the examinee 10 s to respond. If an incorrect response is given, say “No, it’s something else” and allow the examinee the remainder of the initial 10 s to respond. If no correct response is provided during the initial 10 s, provide the phonemic cue, saying “It starts with the sound… (underlined part of word)”. If after 10 s from the phonemic cue they have not provided the correct word, proceed to the next item.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160922013351-66579-mediumThumb-S1355617715000120_taba1.jpg?pub-status=live)