Words are defined and categorized in many ways. One approach to defining words relates to the different forms in which they occur. The most common of the classifications related to word form are the word type, lemma, flemma, and word family. Word types consist of each unique word form. If we operationalize words as word types, then record, records, and unrecorded are different words. Lemmas are made up of a headword and its inflections, all of which have the same part of speech. If we classify words as lemmas, then a headword (e.g., record) and its inflections (records, recorded, and recording) would be categorized as the same word. Flemmas are a more recent classification type and are similar to lemmas but do not take part of speech into consideration. For example, record and records would make up one lemma as a noun, and record, records, recorded, and recording would make up another lemma as a verb; however, the items in both of these lemmas would be included in one flemma. Word families are made up of a headword, its inflections, and its derivations. If we use word family as the category, then we would also include derivations such as prerecord, recorder, and unrecorded along with their inflections (prerecords, prerecorded, prerecording, records, recording, recordings, recorder, recorders) within the word family for the headword record. Thus, word types provide the narrowest definition of words in these examples, while word families provide the broadest definition.
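The four classifications can be illustrated with a short Python sketch that counts how many "words" the record example yields under each unit. The annotations below (part of speech, lemma headword, family headword) are hand-assigned for illustration only; they are assumptions, not the output of any particular tagger or word list, and the function name `count_units` is hypothetical.

```python
# Hand-annotated forms from the "record" example.
# Each entry: (word form, part of speech, lemma headword, family headword).
# These annotations are illustrative assumptions, not tagger output.
FORMS = [
    ("record", "NOUN", "record", "record"),
    ("records", "NOUN", "record", "record"),
    ("record", "VERB", "record", "record"),
    ("records", "VERB", "record", "record"),
    ("recorded", "VERB", "record", "record"),
    ("recording", "VERB", "record", "record"),
    ("recorder", "NOUN", "recorder", "record"),
    ("recorders", "NOUN", "recorder", "record"),
    ("unrecorded", "ADJ", "unrecorded", "record"),
    ("prerecord", "VERB", "prerecord", "record"),
]


def count_units(forms):
    """Count distinct 'words' under each lexical unit."""
    word_types = {form for form, _, _, _ in forms}        # unique spellings
    lemmas = {(head, pos) for _, pos, head, _ in forms}   # headword + part of speech
    flemmas = {head for _, _, head, _ in forms}           # headword, part of speech ignored
    families = {family for _, _, _, family in forms}      # headword of the whole family
    return {
        "word types": len(word_types),
        "lemmas": len(lemmas),
        "flemmas": len(flemmas),
        "word families": len(families),
    }


unit_counts = count_units(FORMS)
for unit, n in unit_counts.items():
    print(f"{unit}: {n}")
```

Under these assumed annotations the count shrinks as the unit grows, which mirrors the narrowest-to-broadest ordering described above.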
The greatest value of larger lexical units may lie in pedagogy. Presenting headwords together with their inflections and derivations may provide a shortcut to lexical development. It is likely easier to learn different forms of the same words than to learn the same number of unrelated words. Moreover, learning headwords together with their related forms is likely beneficial for learning the inflectional and derivational systems. The greatest value of smaller lexical units may lie in research. Measuring knowledge of smaller lexical units should provide more precise findings than when using larger units because the smaller the lexical unit that is used on a vocabulary test, the more representative that test is of the vocabulary that is assessed. Moreover, ranking words according to their frequency in language is more precise when using smaller units because the ranking is more representative of the headwords in the list.
Although the preceding discussion promotes the value of larger lexical units for pedagogy and smaller lexical units for research, it would likely be misleading to suggest that one lexical unit is most appropriate for all contexts, whether within or across research and pedagogy. The reason for this is that there are several factors that likely affect the value of a lexical unit. The most significant factors might be vocabulary size, morphological knowledge, and proficiency, with each of these factors interrelated to some degree (Bertram et al., 2000; Nagy et al., 1989; Wysocki & Jenkins, 1987). Smaller lexical units appear more sensible with less proficient learners who are unable to recognize the similarities between different forms of a word. In contrast, larger lexical units appear more sensible with more proficient learners who have gained knowledge of the inflectional and derivational systems. Thus, the proficiency of learners should be reported in any discussions of the appropriacy of different lexical units.
The type of lexical knowledge, receptive or productive, is another factor likely to affect the value of the lexical unit. The most commonly presented argument for using word families as the lexical unit is that if learners have knowledge of the form-meaning connections of a family member (e.g., pleasant) as well as knowledge of the morphological system, then they may be able to understand other unfamiliar members of the family (e.g., pleasantly, unpleasant) when they are encountered in context (Nation, 2016; Nation & Webb, 2011; Vilkaitė-Lozdienė & Schmitt, 2020). There is support from L1 research for this argument (Wysocki & Jenkins, 1987). However, there are no studies that have investigated the extent to which derivatives of known L2 headwords can be successfully inferred during reading, listening, and viewing. In contrast, research tends to indicate that both L1 speakers and L2 learners find it challenging to produce all of the derivatives of headwords (Iwaizumi & Webb, 2021; Schmitt & Zimmerman, 2002). Moreover, being able to use a word correctly does not ensure that other morphologically related forms of that word can be used correctly. Thus, researchers tend to agree that word families are not an appropriate lexical unit for measuring productive knowledge (Nation, 2016; Nation & Webb, 2011; Vilkaitė-Lozdienė & Schmitt, 2020).
How might the lexical unit affect research and pedagogy?
L2 learning
There is little research investigating the degree to which the lexical unit influences L2 vocabulary learning. Moreover, there is also a lack of clarity about the extent to which words and their inflected and derived forms are taught and learned together. The similarity between inflected and derived forms should make it easier to learn the different members of lemmas and word families than to learn unrelated words. However, this variation in form may increase the difficulty of learning words encountered in L2 input, at least initially before the morphological system is learned. For example, it is reasonable to question whether encountering the same unfamiliar word type repeatedly when reading or listening or encountering different inflected and derived forms of unfamiliar words affects comprehension and incidental vocabulary learning gains. This is because variation in the forms of unfamiliar items may make it less likely that they are recognized and understood (Reynolds, 2013). The degree to which derived and inflected forms of known words are recognized when they are encountered in meaningful contexts is a useful avenue for further research.
An advantage to researching vocabulary learning using larger lexical units may be that it has greater ecological validity than using smaller lexical units. Teachers and learners are unlikely to control for word form variation during the learning process except in the early stages of lexical development when learners lack knowledge of inflectional and derivational affixes. Once learners have gained knowledge of the English inflectional system and some knowledge of the highest frequency affixes, encounters with different inflected and derived forms are likely to be viewed as opportunities to further develop and strengthen vocabulary knowledge. The disadvantage of researching vocabulary learning using larger lexical units is a lack of clarity of findings (Reynolds, 2013; Reynolds & Wible, 2014). Research investigating L2 vocabulary learning has rarely reported whether word types, lemmas, or word families were learned. Reynolds and Wible (2014) found that within incidental vocabulary learning research the lexical unit differed, with some studies using word types (e.g., Rott, 1999), and other studies using lemmas (e.g., Webb, 2007) and word families (e.g., Pellicer-Sánchez & Schmitt, 2010). Moreover, Reynolds (2015) found some evidence that variation in word form during reading impacted vocabulary learning. Fewer words that varied in form over two to four encounters were learned than those that had no variation in form, with no difference in the amount of learning between inflected and derived forms. This is a useful starting point; examining the degree to which morphological complexity affects vocabulary learning and retention would be a useful area for further research.
There would also be value in investigating other questions such as: To what extent are the inflected and derived forms of words learned together? Is it more effective to learn the same members of a word family together or apart? To what extent do learners with different vocabulary sizes have knowledge of the L2 inflectional and derivational systems? How many times do learners need to encounter unfamiliar L2 derivations to recognize and recall their meanings?
Word lists
There are several reasons why word lists are closely linked with the topic of lexical units. First, the lexical unit varies between word lists. The Academic Word List (Coxhead, 2000), Nation’s (2006) British National Corpus word lists, and Nation’s (2012) British National Corpus/Corpus of Contemporary American English word lists are all made up of word families. There are also several lemma-based word lists. Brezina and Gablasova’s (2015) New General Service List and Gardner and Davies’s (2014) Academic Vocabulary List were both developed using lemmas as the unit of counting words. There are also several lists with multiple lexical units. There are flemma and word family versions of the Academic Spoken Word List (Dang et al., 2017), and word type and lemma-based versions of the Essential Word List (Dang & Webb, 2016b). There is also a version of Gardner and Davies’s (2014) Academic Vocabulary List that consists of word families to go along with the lemma-based version from which it was originally developed. Second, word lists are used as the source of lexical frequency information in lexical profiling studies (for a review of these studies see Nurmukhamedov & Webb, 2019). This has led to studies recommending vocabulary learning targets indicative of listening (Dang & Webb, 2014; Van Zeeland & Schmitt, 2013), reading (e.g., Nation, 2006; Webb & Macalister, 2013), and viewing comprehension (e.g., Webb & Rodgers, 2009). Because the word lists used in lexical profiling studies have used word families as the lexical unit, all these vocabulary learning targets have consisted of learning certain numbers of word families.
If the word lists used in lexical profiling studies used a different lexical unit, the targets might be slightly different. Third, word lists are also used to source items according to their frequencies in tests such as the Vocabulary Levels Test (Nation, 1983; Schmitt et al., 2001; Webb et al., 2017) and the Vocabulary Size Test (Coxhead et al., 2015; Nation & Beglar, 2007).
The lexical unit of the items that make up a word list may affect its validity in two ways. First, smaller lexical units should provide greater transparency about the relative value of the words in a list. This is because there is less ambiguity about the words that provide value within the lexical unit. For example, the frequencies of the different members of the word family for the headword replace in Mark Davies’s (2008–) Corpus of Contemporary American English are as follows: replace 32215, replaced 29221, replacement 16774, replacing 10873, replaces 3251, replacements 2334, irreplaceable 1020, replaceable 561, replacer 92, and replacers 13. The variation in frequencies among the different members makes the value of each item less transparent. If replace is in a list made up of word types, its value is clear. If replace is in a list of lemmas, the value of its items is less transparent because the frequencies of the members range from 3251 to 32215 occurrences. If replace is in a list of word families, the value of each item within the family is much more opaque, with eight members being relatively frequent and two members being infrequent. Second, word lists are typically created in relation to the amount of lexical coverage that they provide in corpora. This is sensible because the greater the lexical coverage that a list provides, the greater its potential value to learners. However, because larger lexical units are made up of a greater number of word types than smaller units, lists that use larger lexical units are likely to account for more coverage. For example, the 1,000 most frequent word types, flemmas, and word families accounted for 76.46%, 80.97%, and 82.95% coverage, respectively, of a 14-million-word corpus (Nation, 2016). These differences in coverage make it challenging to evaluate the value of word lists made up of different lexical units (Dang & Webb, 2016a).
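The transparency point can be made concrete with a small Python sketch that aggregates the replace frequencies quoted above under each lexical unit. The frequencies come from the text; the grouping of replace, replaced, replacing, and replaces into one verb lemma is an assumption made for illustration, as are the names `unit_frequency` and `headword_share`.

```python
# Corpus frequencies quoted in the text for the word family of "replace".
FAMILY_FREQ = {
    "replace": 32215, "replaced": 29221, "replacement": 16774,
    "replacing": 10873, "replaces": 3251, "replacements": 2334,
    "irreplaceable": 1020, "replaceable": 561, "replacer": 92, "replacers": 13,
}

# Assumption for illustration: the verb lemma is the headword plus its
# verbal inflections.
VERB_LEMMA = ["replace", "replaced", "replacing", "replaces"]


def unit_frequency(members):
    """Summed frequency of all forms counted under one list entry."""
    return sum(FAMILY_FREQ[m] for m in members)


def headword_share(members):
    """Proportion of the entry's frequency contributed by the headword form."""
    return FAMILY_FREQ["replace"] / unit_frequency(members)


type_freq = unit_frequency(["replace"])    # list of word types
lemma_freq = unit_frequency(VERB_LEMMA)    # list of lemmas
family_freq = unit_frequency(FAMILY_FREQ)  # list of word families

for label, members in [("word type", ["replace"]),
                       ("lemma", VERB_LEMMA),
                       ("word family", list(FAMILY_FREQ))]:
    print(f"{label}: frequency={unit_frequency(members)}, "
          f"headword share={headword_share(members):.1%}")
```

As the unit grows, the single frequency attached to the entry rises while the share attributable to the headword form falls, which is one way of reading the loss of transparency described in the paragraph.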
It is important to note that word lists are primarily created as resources to aid the learning of vocabulary and much of the discussion in relation to the appropriacy of lexical units in word lists is focused on research rather than pedagogy. The presentation of words in larger lexical units would appear to have value for teaching and learning because it allows learners and teachers to quickly find and study a word, its inflections, and derivations and this may lead to more efficient gains in lexical knowledge. However, there is no empirical support for this assumption, and therefore it would be useful to examine this in future research.
Assessing vocabulary knowledge
The advantage of using larger lexical units such as word families in tests is that, by measuring knowledge of the form-meaning connections of morphologically unrelated words (e.g., play, take, keep rather than play, plays, playful), tests tap into L2 learning of distinct words without tapping into knowledge of the morphological system. The advantage of using smaller lexical units in tests of form-meaning connection is that by assessing knowledge of both morphologically related and unrelated words, a test should provide a more precise measurement of lexical knowledge (Kremmel, 2016). However, the smaller the lexical unit, the larger the number of words that would require assessment. For example, Nation (2016) reported that the most frequent 1000 word families were made up of 3,281 lemmas and 6,838 word types. Measuring a much greater sample of items requires a much greater number of test items. If we were to follow the ratio of 30 items per 1000 words used in earlier versions of the Vocabulary Levels Test (Schmitt et al., 2001; Webb et al., 2017), we would go from a 30-item test to measure knowledge of 1000 word families to a 98-item test to measure knowledge of the 3,281 lemmas, and a 205-item test to measure knowledge of the 6,838 word types. Thus, it would probably make little sense to measure vocabulary size or levels with smaller lexical units. However, there might be great value in developing and validating tests of form-meaning connection designed to evaluate the vocabulary knowledge of beginning L2 learners who are still in the process of learning word parts.
It would be useful to create tests measuring knowledge of the most frequent 800 lemmas in Dang and Webb’s (2016b) Essential Word List, which accounts for 75% of spoken and written English, or the 2,494 lemmas in Brezina and Gablasova’s New General Service List (accounting for 80–82% of the corpora from which it was derived) for a more ambitious evaluation of beginner vocabulary knowledge.
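The test-length arithmetic in this section can be checked with a minimal Python sketch. It assumes the sampling ratio of 30 items per 1000 units mentioned above and the counts of 3,281 lemmas and 6,838 word types reported for the most frequent 1000 word families; the function name `items_needed` and the rounding to the nearest whole item are illustrative choices, not part of any published test specification.

```python
def items_needed(unit_count, items_per_thousand=30):
    """Number of test items at a fixed sampling ratio, rounded to a whole item."""
    return round(unit_count * items_per_thousand / 1000)


# Counts reported for the most frequent 1000 word families.
UNIT_COUNTS = {"word families": 1000, "lemmas": 3281, "word types": 6838}

test_lengths = {unit: items_needed(n) for unit, n in UNIT_COUNTS.items()}
for unit, length in test_lengths.items():
    print(f"{unit}: {length} items")
```

Holding the sampling ratio constant, the test grows roughly threefold for lemmas and nearly sevenfold for word types, which is the practical cost the section weighs against the gain in precision.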
Tests such as the Vocabulary Levels Test (Nation, 1983; Schmitt et al., 2001; Webb et al., 2017) and the Vocabulary Size Test (Coxhead et al., 2015; Nation & Beglar, 2007) were developed to provide a reliable measure of receptive knowledge of the form-meaning connections of words across different word frequency levels. These tests use word families as the lexical unit. This means that the tests include one item for each family (e.g., admire) that is assessed without measuring knowledge of its other family members (admires, admired, admiring, admirable, admirably, admiration, admirer, admirers, admiringly). Although tests that have used word families as the lexical unit were not developed to evaluate knowledge of other family members, there might be the assumption that these tests indicate knowledge of not only the item included in the test, but of all members of a word family for each item. Two earlier studies indicate that this is unlikely to be correct, at least when using receptive recall test formats with minimal or no contextual information provided to cue responses (McLean, 2018; Ward & Chuenjundaeng, 2009). The degree to which knowledge of headwords indicates knowledge of other family members using recognition formats such as multiple-choice or matching is yet to be examined in research, but would be a useful follow-up to these studies. There is also a need to examine the degree to which factors such as test format, contextual cues, item (headword, inflection, and derivation) frequency, receptive vocabulary knowledge, and proficiency affect the degree to which learners are able to demonstrate receptive and productive knowledge of L2 headwords, inflections, and derivations.
However, perhaps of greatest value would be the development of tests designed to measure derivational knowledge of words at different frequencies. Receptive and productive tests of derivational knowledge could be used together with tests measuring knowledge of form-meaning connection and word parts to assess L2 learner vocabulary knowledge more accurately.
Lexical coverage and profiling
Lexical coverage refers to the percentage of known words encountered in input. Research indicates that 95% lexical coverage can provide adequate reading comprehension (Laufer, 1989), but that 98% coverage may be optimal (Hu & Nation, 2000; Schmitt et al., 2011). Research also indicates that 90% lexical coverage may be sufficient for listening (Van Zeeland & Schmitt, 2013) and viewing comprehension (Durbahn et al., 2020). However, as lexical coverage increases beyond 90%, comprehension is likely to improve. Taken together, studies of lexical coverage indicate that the more words that are known in L2 input, the more likely that L2 input will be understood.
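As a rough illustration of the definition, lexical coverage can be computed as the share of running words (tokens) in a text that a learner knows. The sketch below is a simplification under stated assumptions: the toy sentence, the pseudoword blick, the known-word set, and the function name `lexical_coverage` are all invented for illustration, and the word type is used as the unit of counting.

```python
def lexical_coverage(tokens, known_words):
    """Percentage of running words in `tokens` that appear in `known_words`."""
    known = sum(1 for token in tokens if token in known_words)
    return 100 * known / len(tokens)


# Toy text of 20 running words; "blick" stands in for an unknown word.
text = ("the cat sat on the mat and the dog sat on the rug "
        "near the door by the blick wall")
tokens = text.split()
known_words = set(tokens) - {"blick"}

coverage = lexical_coverage(tokens, known_words)
print(f"coverage: {coverage:.1f}%")
print("meets 95% threshold:", coverage >= 95)
print("meets 98% threshold:", coverage >= 98)
```

Counting by lemma or word family instead would mean mapping each token to its headword before the lookup, so the same text and the same learner could yield different coverage figures depending on the unit chosen.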
The extent to which inflected and derivative forms affect comprehension in studies of lexical coverage has not been examined. Recent studies present contrasting arguments about how the lexical unit may affect lexical coverage. Brown (2018) found that 13.4% of the members of the most frequent 5000 word families in Nation’s (2006) British National Corpus word lists were derivations. This led him to suggest that this proportion of derivations may reduce lexical coverage of written text (a larger percentage of words would be unknown) and, in turn, inhibit reading comprehension. Brown et al. (2020) also argue that if L2 learner knowledge is evaluated using tests that use word families as the unit of counting, lexical coverage and comprehension of L2 input may be overestimated if learners cannot understand derivatives. In contrast, Laufer and Cobb (2020) conducted a corpus-driven study of several written text types and found that relatively few derivations were encountered in the texts and a large proportion of those that were encountered included the highest frequency affixes. This led them to suggest that lexical coverage is unlikely to be affected by the use of word families as the lexical unit.
It is important to note that studies of lexical coverage tend to use carefully controlled research designs which involve replacing varying proportions of lower frequency words encountered in a text with pseudowords to provide an accurate estimate of lexical coverage (e.g., Hu & Nation, 2000; Van Zeeland & Schmitt, 2013). These studies include derivations as running words in the texts, and so it would appear that lexical coverage findings are based to some degree on word families as the lexical unit. However, the degree to which the proportion of derived and inflected forms within a text affects both comprehension and lexical coverage thresholds remains to be examined and would be a useful direction for further research.
Lexical profiling research indicates vocabulary learning targets that may be sufficient for reading, listening, and viewing comprehension. For example, lexical profiling studies indicate that knowledge of the most frequent 3000 word families may be sufficient to understand television (Webb & Rodgers, 2009), the most frequent 4000 word families may allow comprehension of academic lectures (Dang & Webb, 2014), and the most frequent 8000–9000 word families may be sufficient to understand most forms of written text (Nation, 2006). There tends to be an assumption in lexical profiling studies that learners who have achieved these learning targets are likely to have learned the inflectional and derivational systems. However, the extent to which learners have morphological knowledge in relation to different vocabulary levels remains to be explored. If learners are unable to understand derivative and inflected forms of headwords encountered during reading, listening, and viewing, then these learning targets might be too low. The only studies that have examined comprehension with learners at differing vocabulary levels have involved comprehension of television. However, both Rodgers (2013) and Durbahn et al. (2020) found that learners who knew fewer word families than the 3000 word family vocabulary learning target (Webb & Rodgers, 2009) were able to understand different TV programs. Further research investigating the degree to which learners with varying L2 vocabulary levels can understand different types of L2 input is needed.
Conclusion
Recently, discussions of lexical units have presented flemmas and word families (McLean, 2018) and lemmas and word families (Brown et al., 2020) as dichotomous options of which one is more appropriate than the other. It is useful to question and investigate the appropriacy of lexical units. However, it would be surprising if one lexical unit makes the most sense for all learners and all aspects of L2 research and pedagogy. With little L2 research conducted on the appropriacy of the different lexical units, researchers should be cautious not to overgeneralize findings. This article has argued that the selection of a lexical unit should depend on several factors. These include learner variables such as vocabulary size, morphological knowledge, and proficiency, the purpose of the lexical unit (research, pedagogy), and the type of use (vocabulary learning, measuring vocabulary knowledge, developing word lists and vocabulary tests, lexical coverage and profiling). This article has also tried to highlight several of the areas in which future research on lexical units is warranted.