Introduction
The lexical abilities of bilingual children support and interact with phonological and grammatical abilities (Kehoe & Girardier, 2020; Pham, 2016; Simon-Cereijido & Gutiérrez-Clellen, 2009), and are therefore crucial for language acquisition, literacy development, and academic success (Janus, Labonté, Kirkpatrick, Davies & Duku, 2019; Krenca, Segers, Chen, Shakory, Steele & Verhoeven, 2020; O'Connor, O'Connor, Tarasuik, Gray, Kvalsvig & Goldfeld, 2018; Prevoo, Malda, Mesman & van IJzendoorn, 2016). Lexical abilities are not a uniform construct but comprise several different components and stages of word processing. Psycholinguistic models have been proposed to capture this complexity. In particular, the logogen model, first introduced by Morton (1969), describes distinct input and output components of oral language processing (see Figure 1). For the purpose of the present study, the distinction between semantic aspects of word processing (located in the semantic system) and the processing of the phonological word form (located in the phonological input and output lexicons) is of special interest. Although three formats of word forms have been described in the literature – the phonological word form, the morphological word form, and the orthographic word form (Berninger, Abbott, Thomson, Wagner, Swanson, Wijsman & Raskind, 2006) – we refer exclusively to the phonological word form throughout this paper.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20220620142801934-0219:S1366728921000936:S1366728921000936_fig1.png?pub-status=live)
Figure 1. Oral language route of the logogen model (based on Kay, Lesser & Coltheart, 1996; Johnson, Paivio & Clark, 1996; and Coltheart, 2004).
So far, the majority of studies on lexical competence in bilinguals has been based on the assessment of vocabulary size, often using word-picture matching or picture naming tasks (e.g., Ehl, Bruns & Grosche, 2020; Hoff & Ribot, 2017; Hoff, Rumiche, Burridge, Ribot & Welsh, 2014; Karlsen, Lyster & Lervåg, 2017; Schaefer, Ehlert, Kemp, Hoesl, Schrader, Warnecke & Herrmann, 2019; Vermeer, 2001). These traditional tasks do not allow a distinction between different components of word processing – in particular, between the semantic system and the phonological input and output lexicons. Other studies focused on semantic processing (e.g., Schwartz & Katzir, 2012; Vermeer, 2001). To capture word form processing more accurately, specific tasks have to be applied. However, targeted investigations into word form processing at school age are sparse (e.g., Gangopadhyay, Ellis Weismer, & Kaushanskaya, 2019; Oppenheimer Fleury & Brandão de Avila, 2015), although word form processing contributes to the further development of vocabulary and literacy skills at primary school age (Verhoeven, van Leeuwe & Vermeer, 2011).
Only a few studies have attempted to disentangle lexical knowledge and lexical processing (e.g., Gangopadhyay et al., 2019; Marchman, Bermúdez, Bang & Fernald, 2020; Marchman, Fernald & Hurtado, 2010; Windsor & Kohnert, 2004). The present study therefore aims to deepen this line of research by investigating word form processing largely independent of lexical knowledge and semantic aspects of word processing. We seek to pinpoint word form processing using specific tasks, compare word form processing of bilingual and monolingual children, and identify and control for factors that may influence their performance.
In the following sections, we will first review current research on overall vocabulary size, followed by studies focusing on word form processing located in the phonological input or output lexicon.
Vocabulary size in bilingual children
When vocabulary size is assessed in a single language, bilingual children have smaller vocabularies in their second language than monolingual peers. This has been shown for word comprehension and word production and for several language pairs, such as Urdu or Punjabi and Norwegian (Karlsen et al., 2017), Russian and Hebrew (Schwartz & Katzir, 2012), Japanese and English (Kuo, Uchikoshi, Kim & Yang, 2016), or Polish or Turkish and German (Schaefer et al., 2019), as well as for diverse language combinations (e.g., Bialystok, Luk, Peets & Yang, 2010; Dixon, Thomson & Fricke, 2020; Hutchinson, Whiteley, Smith & Connors, 2003).
Another method of assessing vocabulary size is based on conceptual vocabulary, i.e., on the number of lexical concepts for which a child knows a word in at least one of his or her languages (Core, Hoff, Rumiche & Señor, 2013; Gross, Buac & Kaushanskaya, 2014). A study on conceptual vocabulary size considering lexical entries in the first language (L1; here: Albanian, Somali, Tamil, Turkish, Urdu/Punjabi, or Vietnamese) and/or the second language (L2; here: Norwegian) demonstrated that the receptive vocabulary size of bilingual children did not differ from that of monolingual children at age 6 to 13 (Monsrud, Rydland, Geva, Thurmann-Moe & Halaas Lyster, published online June 18, 2019). The same outcome was shown for bilingual children acquiring Spanish and English simultaneously compared to monolingual English-speaking children aged 5 to 7, but the expressive conceptual vocabulary was significantly larger in the monolingual than in the bilingual group (Gross et al., 2014). In contrast, Ehl et al. (2020) did not find any significant differences between simultaneous and sequential 5- to 10-year-old Turkish–German bilinguals and German monolinguals concerning the size of the expressive conceptual vocabulary. Two further studies did not find any disadvantage regarding the expressive vocabulary in 4- to 7-year-old and 6- to 10-year-old Russian–German bilinguals compared to monolinguals (Klassert, Gagarina & Kauschke, 2014; Montanari, Abel, Graßer & Tschudinovski, 2018).
To summarize, prior work suggests that vocabulary size is limited when the second language of bilingual children is compared to monolinguals. When conceptual vocabulary is considered, the results are heterogeneous: some studies found limitations in bilinguals, while others did not. The studies reported here are based on picture naming and/or word-picture matching tasks that involve the semantic system and the representation of word forms.
Word form processing in bilingual children
Abilities in word form processing can be investigated through word form-related tasks such as auditory lexical decision (LDT; Claessen & Leitão, 2012; Hein & Kauschke, 2020; Jones & Brandt, 2018; Ripamonti, Lucchelli, Lazzati, Martini & Luzzatti, 2017), rapid naming (RNT; Dixon et al., 2020; Hein & Kauschke, 2020; Messer & Dockrell, 2006), and rhyming (Spencer, Doyle, McNeil, Wambaugh, Park & Carroll, 2000; van Goch, McQueen & Verhoeven, 2014). The LDT enables the assessment of the quality of stored word form representations (Claessen & Leitão, 2012) in the phonological input lexicon. This task encompasses the speeded judgment of randomly presented spoken words as real words or pseudowords by pressing a button or indicating “yes” (real word) or “no” (pseudoword) (Goldinger, 1996). The position and degree of pseudoword manipulation seem to play a crucial role in the level of difficulty of this task (Jones & Brandt, 2018), which might be moderated by the phonological system of the first language (Van der Feest & Fikkert, 2015). Gangopadhyay et al. (2019) examined 9- and 10-year-old English-speaking monolingual and English–Spanish-speaking bilingual children with an LDT at two time points (T1 and T2) one year apart. The results revealed no significant differences between monolingual and bilingual children with respect to reaction times at T1 and T2, but monolingual children were more accurate than bilingual children at T1. At T2, this difference had disappeared, indicating that bilingual children caught up with their monolingual peers within one year. Windsor and Kohnert (2004) also found significant differences in an LDT in 8- to 13-year-old English-speaking monolinguals and English–Spanish bilinguals, but only for judgments about pseudowords, where monolingual children outperformed the bilingual group. No significant differences were identified in a study by Hemsley, Holm and Dodd (2006), who investigated English monolinguals and English–Vietnamese as well as English–Samoan bilinguals at age 11.
Rapid naming is usually assessed by rapid automatized naming (RAN; Denckla & Rudel, 1976) or rapid alternating stimulus tasks (RAS; Wolf, 1986). These tasks have often been used in research on reading and academic outcomes (e.g., Aguilar-Mediavilla, Buil-Legaz, López-Penadés, Sanchez-Azanza & Adrover-Roig, 2019; Bellocchi, Tobia & Bonifacci, 2017), indicating a moderate-to-strong relationship between reading performance and rapid naming, especially with regard to alphanumeric stimuli (Araújo, Reis, Petersson & Faísca, 2015). In addition, the RNT can be used to assess the lexical retrieval of word form representations from the phonological output lexicon (Hein & Kauschke, 2020). In RNTs, familiar items have to be named repeatedly and as fast and accurately as possible after a practice phase (Norton & Wolf, 2012). Due to the practice phase, the items become familiar to the child, reducing the role of semantic information. On the other hand, the involvement of the phonological output lexicon is stronger because of the time pressure and the repeated presentation of the same stimuli. Thus, the RNT requires the same processing components as the picture naming task, but with a different weighting.
Oppenheimer Fleury and Brandão de Avila (2015) did not observe any differences in an RNT between Brazilian Portuguese- and English-speaking bilinguals and Brazilian Portuguese monolinguals at primary school age. Similar results were found by Geva and Farnia (2012) in 10-year-old English monolingual and bilingual children speaking English and another language, and by Dixon et al. (2020), who compared 8-year-old English-speaking monolingual children with children speaking English and another language. Differences between English monolingual and English–Sylheti bilingual children aged 10 to 12 were apparent in one subtest of the RNT (subtest: numbers), with faster response times in bilingual children, but not in the second subtest using common objects (Frederickson & Frith, 1998). This finding was supported by Hutchinson, Whiteley, Smith and Connors (2004), who investigated 6-year-old English-speaking monolinguals and British Asian children learning English as an additional language.
Finally, word form-related abilities can be assessed by rhyming tasks (Grofčíková & Máčajová, 2021; Milberg, Blumstein & Dworetzky, 1988). In this complex task, a child has to deal with word forms on a metalinguistic level by activating phonologically similar word forms in the phonological input and/or output lexicon. Previous studies with bilingual children used different forms of rhyming tasks with a different weighting of sublexical and lexical processing: Spoonerisms require word segmentation and synthesis in order to substitute sublexical units, like phonemes or syllables, in real words and pseudowords. Receptive tasks on rhyme awareness measure the ability to identify rhyming words by segmenting the words into their sublexical components onset and rhyme, and to identify and compare the rhymes. In rhyme production tasks, a word rhyming with a stimulus word has to be retrieved from the phonological output lexicon.
Using a rhyme awareness task, Goriot, Unsworth, van Hout, Broersma and McQueen (2021) found no group differences between 4- to 7-year-old Dutch monolingual children and Dutch–English bilingual children. Similar findings were reported by Soleimani and Arabloo (2018) and Ahmadian, Bahrami and Amini (2016) in 5- and 6-year-old Persian monolinguals and Turkish–Persian or Kurdish–Persian bilinguals as well as by Frederickson and Frith (1998) for spoonerisms. In contrast, Dixon et al. (2020) as well as Hutchinson et al. (2004) revealed significantly better performance of monolingual children compared to bilingual children in spoonerisms and a rhyme awareness task.
To sum up, recent research on word form processing has yielded highly discrepant findings concerning the performance of bilingual compared to monolingual children. This might be caused by the different types of tasks, varying age groups, or the kind of language combination investigated. With respect to factors influencing word form processing, it has been demonstrated that word form processing improves with increasing age in bilingual as well as in monolingual children (Bahn, Vesker, García Alanis, Schwarzer, & Kauschke, 2017; Gangopadhyay et al., 2019; Hein & Kauschke, 2020; Li, Kirby & Georgiou, 2011; Ponari, Norbury & Vigliocco, 2018; van den Bos, Zijlstra & lutje Spelberg, 2002; Windsor & Kohnert, 2004), and can be influenced by gender, with an advantage for girls (Burman, Bitan & Booth, 2008; Wilsenach & Makaure, 2018). In addition, vocabulary size seems to have an impact on word processing. For example, Ainsworth, Welbourne and Hesketh (2016) showed that the quality of children's word form representations correlates significantly with vocabulary. This finding supports the Lexical Restructuring Hypothesis, which postulates a connection between the specificity and granularity of word form representations and vocabulary size (Walley & Metsala, 2003).
So far, to our knowledge, receptive and expressive word form processing skills of bilingual children have never been examined and compared to monolingual peers in the same sample using several word form related tasks, such as LDT, RNT, and rhyming. Thus, we developed an LDT and an RNT with carefully controlled stimuli and employed these tasks in combination with a rhyming task and traditional vocabulary tests (word-picture matching and picture naming) in a diverse sample of German monolingual children and bilingual children speaking German and another language.
The main objective of the present study was to gain more detailed knowledge about word form processing, largely independent of vocabulary knowledge and semantic processing, in bilingual primary school children compared to their monolingual peers. As a starting point, we investigated a large and diverse sample of monolingual and bilingual children searching for factors that may influence the children's performance in word form processing tasks. On the basis of recent research results, age, gender, and vocabulary size were expected to have an impact on word form processing. Given that findings on bilingual children's word form abilities are heterogeneous, we were particularly interested in the question of whether the condition of language acquisition (monolingual vs. bilingual) will be influential or not. In a second step, we aimed at a deeper comparison between bilingual and monolingual children with respect to their word form processing skills, when potential influences of age, gender, and vocabulary size in the L2 are excluded. We tried to realize this by a very careful matching procedure: pairs of monolingual and bilingual children were matched for vocabulary size, age, and gender.
Hypotheses about the expected outcome are formulated per task. With regard to the LDT, differences between monolingual and bilingual children might be apparent in younger children but tend to narrow or even disappear with increasing age (Gangopadhyay et al., 2019). To test this hypothesis, the matched group was divided into younger and older children, comparing monolingual and bilingual children's word form processing abilities at different stages of development. In addition, lexical decision about pseudowords might be particularly difficult for bilingual children, indicating less robust word form representations in the phonological input lexicon (Windsor & Kohnert, 2004). Based on previous findings (Dixon et al., 2020; Geva & Farnia, 2012; Oppenheimer Fleury & Brandão de Avila, 2015), we did not expect differences between monolingual and bilingual children in rapid naming. No clear hypothesis can be formulated for the rhyming task because of the highly heterogeneous results in previous research.
Method
Participants
A total of 233 monolingual (n = 179) and bilingual (n = 54) children were recruited, all attending regular primary schools in North Rhine-Westphalia, Germany. The inclusion criteria for all participants were:
– non-verbal abilities in the normal range, measured by Coloured Progressive Matrices (CPM; Bulheller & Häcker, 2006): exclusion of 9 monolingual and 8 bilingual children
– absence of speech sound disorders: exclusion of 4 monolingual and 5 bilingual children
– normal general development and normal hearing based on parents’ reports: exclusion of one monolingual child
– responses above chance level in the LDT, calculated using the binomial distribution: exclusion of one monolingual child.
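The chance-level criterion for the LDT can be illustrated with a minimal sketch. It assumes a guessing probability of 0.5 per trial, a one-sided test, and α = .05; none of these settings is specified in the text, and the function names are hypothetical.

```python
from math import comb


def upper_tail(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def min_correct_above_chance(n, p=0.5, alpha=0.05):
    """Smallest number of correct responses that pure guessing would
    reach or exceed with probability <= alpha (assumed criterion)."""
    for k in range(n + 1):
        if upper_tail(k, n, p) <= alpha:
            return k
    return None


# 96 LDT items (48 real words, 48 pseudowords), two response options
threshold = min_correct_above_chance(96)
print(f"Minimum correct responses out of 96: {threshold}")
```

Under these assumptions, a child whose number of correct responses falls below the printed threshold could not be distinguished from a child who is guessing.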
One monolingual child had to be excluded due to technical problems. As in Gangopadhyay et al. (2019), an additional inclusion criterion for bilingual participants was exposure to both languages at or before the age of three. Accordingly, two children had to be excluded because of their short time of contact with the German language. In sum, 8.4% of the monolingual participants and 27.8% of the bilingual participants had to be excluded. The final sample (N = 202, see Table 2) consisted of 163 monolingual German-speaking children and 39 bilingual children speaking German and another language. According to parents and/or teachers, the bilingual children (n = 39) were simultaneous (n = 22) or early L2 learners (n = 16; missing data: n = 1) who acquired the German language at or before the age of three, and one of 14 different languages from birth: Russian (n = 12), Turkish (n = 11), Moroccan Arabic (n = 2), English (n = 2), Romanian (n = 2), Spanish (n = 2), Polish (n = 1), Bosnian (n = 1), Greek (n = 1), Bulgarian (n = 1), Italian (n = 1), Albanian (n = 1), Pari (n = 1), and Kurdish (n = 1). This diversity reflects the typical distribution of multilingual children in German primary schools (Statistisches Bundesamt (Destatis), 2020). The parents of the bilingual children reported a mean onset of acquisition of the L2 of 12.42 months of age (SD = 15.71, range = 0–36; missing data: n = 1) and a mean time of exposure of 85.63 months (SD = 17.62; range = 45–114). Because language input in German increases strongly on entry into the German educational system (kindergarten and school), L2 dominance can be assumed for all bilingual children after 2–4 years of consistent exposure (Jia, Kohnert, Collado & Aquino-Garcia, 2006; Oller & Eilers, 2002; Oller, Jarmulowicz, Pearson & Cobo-Lewis, 2011; Pham & Kohnert, 2014).
To match bilingual children to monolingual German-speaking children without a selection bias and to control for confounding effects of age, gender, and vocabulary size, a propensity-matching procedure was used (see below). As a consequence of this matching procedure, the final subsample for the group comparison consisted of 31 bilingual (onset of acquisition [German]: M = 10.67, SD = 15.23; time of exposure [German]: M = 85.63, SD = 16.73; missing data: n = 1) and 31 monolingual children who did not differ significantly with regard to age, gender, non-verbal intelligence (CPM), months of education (since entering school), or scores in a standardized German vocabulary test comprising a picture naming task and a word-picture matching task (Wortschatz- und Wortfindungstest für 6- bis 10-Jährige, WWT; Glück, 2011; see Table 1). The mean scores of the monolingual and bilingual children, divided into annual age groups, were within the normal range for monolinguals in the German vocabulary test for the respective age group.
Table 1. Participant characteristics after propensity matching
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20220620142801934-0219:S1366728921000936:S1366728921000936_tab1.png?pub-status=live)
a M (SD). b CPM (Bulheller & Häcker, 2006). c Raw score in the WWT (Glück, 2011). *p < .05, **p < .01, ***p < .001
In order to compare monolingual and bilingual children's word form processing at different stages of development, the matched subsample was separated into younger and older children (6- and 7-year-old bilingual children and their matched peers: n = 24; mean age = 7;2, SD = 0;9; female: 46%; bilinguals' mean time of exposure to German in months = 76.08, SD = 15.66; mean onset of acquisition in months = 8.0, SD = 14.77; 8- and 9-year-old bilingual children and their matched peers: n = 38; mean age = 8;6, SD = 0;8; female: 39%; bilinguals' mean time of exposure to German in months = 92.00, SD = 14.54; mean onset of acquisition in months = 12.44, SD = 15.68).
Propensity matching
In order to compare monolingual and bilingual children's processing abilities on the word form level without an influence of age, gender, and vocabulary size, propensity score matching was used to select appropriate groups of monolingual and bilingual children from the larger sample of 202 children (39 bilingual children, 163 monolingual children). The matching procedure was intended to equate the groups to the greatest extent possible on the factors age, gender, and vocabulary size, which might have an impact on word processing (see Introduction section). The propensity score creates groups based on a balanced set of categorical and continuous variables (Rosenbaum & Rubin, 1983). Using IBM SPSS Statistics for Macintosh, Version 26.0, the score was calculated via logistic regression models for every child using the variables age, gender, raw score in the picture naming task, and raw score in the word-picture matching task. The nearest-neighbor matching method without replacement was then used to match one bilingual child to one monolingual child, allowing a maximum difference in propensity score (caliper width) of 0.2 (Austin, 2011) between the children of each pair. Eight of the bilingual children had to be excluded because an appropriate monolingual matching partner was lacking.
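The matching step can be sketched as follows. This is a minimal illustration and not the SPSS routine used in the study: it assumes that propensity scores have already been estimated, that the caliper of 0.2 applies to the absolute difference in propensity scores, and that bilingual children are processed greedily in the given order. The function name, child IDs, and scores are hypothetical.

```python
def match_nearest_neighbor(treated, controls, caliper=0.2):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    treated, controls: dicts mapping a child ID to a propensity score.
    Returns (pairs, unmatched_treated).
    """
    available = dict(controls)  # controls that have not been used yet
    pairs, unmatched = [], []
    for t_id, t_score in treated.items():
        if not available:
            unmatched.append(t_id)
            continue
        # Closest remaining control in terms of propensity score
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]  # without replacement
        else:
            unmatched.append(t_id)  # no partner within the caliper
    return pairs, unmatched


# Hypothetical propensity scores (not data from the study)
bilingual = {"B1": 0.30, "B2": 0.35, "B3": 0.90}
monolingual = {"M1": 0.28, "M2": 0.33, "M3": 0.10, "M4": 0.50}

pairs, unmatched = match_nearest_neighbor(bilingual, monolingual)
print(pairs)      # [('B1', 'M1'), ('B2', 'M2')]
print(unmatched)  # ['B3']
```

In this toy example, B3 remains unmatched because no monolingual child lies within the caliper, which mirrors the exclusion of bilingual children without an appropriate matching partner.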
Material
A German standardized vocabulary test (WWT; Glück, 2011), containing a word-picture matching task and a picture naming task, was applied to assess receptive and expressive vocabulary size. The items vary in frequency and number of syllables and include objects, actions, antonyms, and hypernyms.
A rhyming task, as part of a German standardized test of phonological awareness (Phonologie Modellorientiert für Kinder vom Vorschulalter bis zum dritten Schuljahr, PhoMo-Kids; Stadie & Schöppe, 2014), was used to assess rhyming abilities. In this task, a child receives a word and then has to name an appropriate rhyming word. The item set encompasses 24 one- and two-syllabic word stimuli.
In addition, two experimental tasks were designed for the present study: an auditory LDT and an RNT. The word stimuli for both tasks were taken from a German database (childLex; Schroeder, Würzner, Heister, Geyken & Kliegl, 2015), which is based on a corpus of 10 million words extracted from children's books. The stimuli were carefully controlled for several (psycho-)linguistic features that have been shown to influence word processing (see Table S1, Supplementary Materials).
Auditory lexical decision task
The item set of the LDT (see Table S2, Supplementary Materials) contains 48 real words and 48 pseudowords. The 2- and 3-syllabic real words were made up of 24 nouns (concrete nouns, 12 animate, 12 inanimate) and 24 verbs (agentive and non-reflexive verbs, 12 transitive, 12 intransitive). According to one-way ANOVAs, nouns and verbs were comparable with regard to word frequency, word length, syllable structure, morphological complexity, neighborhood density, and AoA (see Table S1, Supplementary Materials). The 48 pseudowords were generated by changing a real word by permutation, addition, deletion, or substitution of one phoneme, following phonotactic rules of German (e.g., Banane [Eng. banana] – Ranane or summen [Eng. to hum] – sumpen). Additionally, the position of modification within the word (initial, medial, final), maintenance of phonetic features of the initial sounds, number of initial sounds, and word stress patterns (Befi-Lopes, Preto Ferreira da Silva & Paiva Bento, 2010; Edwards & Lahey, 1996) were considered when creating the pseudowords. The real words and pseudowords were recorded by a trained female native German speaker in a soundproofed booth. One-way ANOVAs revealed no significant differences between nouns and verbs or between real words and pseudowords concerning spoken word length (mean duration in seconds; Mnouns = .69, SD = .13; Mverbs = .75, SD = .13; F[1, 46] = 2.904, p = .095, η² = .059; Mreal words = .718, SD = .13; Mpseudowords = .764, SD = .116; F[1, 94] = 3.353, p = .070, η² = .034) and mean pitch (in Hz; Mnouns = 171.01, SD = 8.57; Mverbs = 174.85, SD = 10.65; F[1, 46] = 1.893, p = .176, η² = .040; Mreal words = 172.93, SD = 9.75; Mpseudowords = 174.78, SD = 9.4; F[1, 94] = .890, p = .348, η² = .009) of the recorded stimuli.
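For a one-way ANOVA, the effect size η² can be recovered from the F value and its degrees of freedom as η² = (F · df1) / (F · df1 + df2). The short sketch below applies this standard relation to three of the comparisons reported above as a plausibility check; the function name is hypothetical.

```python
def eta_squared(f_value, df_effect, df_error):
    """Eta squared for a one-way ANOVA, computed from F and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F values and degrees of freedom as reported for the recorded stimuli
checks = {
    "duration, real words vs. pseudowords": (3.353, 1, 94),
    "pitch, nouns vs. verbs": (1.893, 1, 46),
    "pitch, real words vs. pseudowords": (0.890, 1, 94),
}

for label, (f, df1, df2) in checks.items():
    print(f"{label}: eta^2 = {eta_squared(f, df1, df2):.3f}")
```

The computed values (.034, .040, and .009) agree with the effect sizes reported for these comparisons.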
Rapid naming task
The stimuli of the RNT (see Table S3, Supplementary Materials) consist of 36 words: 18 disyllabic, monomorphemic, concrete nouns, and 18 agentive verbs. To avoid a possible impact of reading abilities, no alphanumeric stimuli were used. According to Mann-Whitney U tests, nouns and verbs did not differ significantly with regard to frequency, word length, syllable structure, neighborhood density, and AoA (see Table S1, Supplementary Materials). The item set encompassed six subsets of words from familiar semantic categories: two noun subsets (body parts, e.g., Nase, Eng. nose, and animals, e.g., Spinne, Eng. spider), two verb subsets (movements, e.g., gehen, Eng. to walk, and childlike activities, e.g., malen, Eng. to paint), as well as two subsets with alternating stimuli (nouns and verbs associated with the domain ‘animals’, e.g., Vogel and fliegen, Eng. bird and to fly, and nouns and verbs associated with the domain ‘food’, e.g., Apfel and schneiden, Eng. apple and to cut). A professional illustrator created colored drawings of the RNT items. A survey with ten adults and seven children (see pilot study below) ensured that the illustrations elicited the expected target words. Drawings with a naming agreement below 80% were adjusted.
Both experiments (LDT and RNT) were implemented using the OpenSesame software (Mathôt, Schreij & Theeuwes, 2012). In the LDT, the verbal stimuli were presented via headphones (AKG Acoustics k240 studio). The task contained a familiarization phase and a subsequent testing phase. The latter included 96 items, which were presented auditorily in randomized order, with a short break after every 25 items. The assignment of the buttons was pseudo-randomized across children to control for biases due to the participant's handedness. Each trial began with a tone (440 Hz) to get the child's attention, followed by the presentation of the stimulus. Responding to the stimulus by pressing one of the buttons automatically initiated the next trial. OpenSesame stored the accuracy values and response times per trial for later analysis.
In the RNT, the order of the subtests was pseudorandomized across children. A short break between the subtests should maintain the child's attention and motivation. The items of each subset (n = 6) were presented six times in a randomized order, avoiding the immediate repetition of the same item. The RNT in the present study is comparable to the traditional RAN/RAS task (see above) regarding several aspects: the time pressure, the repeated presentation of the stimuli, the practicing phase, and the familiarization with the stimuli before testing. However, the present RNT and the traditional RAN/RAS differed in one important aspect, which is the mode of presentation. In the traditional RAN/RAS task, all stimuli of a subtest are presented at the same time, whereas in the present RNT, stimuli were shown one after the other. This discrete way of presentation provides the possibility to measure the response time of each word production separately. Moreover, mutual interference due to simultaneous item presentation, which might lead to increased response times, can be avoided. Each trial was announced by a visual and auditory signal before a target picture was shown on the screen for about 1560 ms (nouns) and 1760 ms (verbs, mixed categories) on average. The time of presentation for each item was determined based on previous research: naming latencies of monolingual and bilingual children have been reported to vary between 1 and 2 seconds (e.g., Coady, Reference Coady2013; Dockrell, Messer & George, Reference Dockrell, Messer and George2001; Jia et al., Reference Jia, Kohnert, Collado and Aquino-Garcia2006; Kohnert, Bates & Hernandez, Reference Kohnert, Bates and Hernandez1999). 
The shorter presentation time of nouns was motivated by a naming advantage for nouns (e.g., Dockrell et al., 2001; Jia et al., 2006; Kambanaros & Grohmann, 2011; Kauschke & von Frankenberg, 2008). A pilot study with seven 5- to 10-year-old children (mean age = 7;6; SD = 1;7) was conducted before the main study. The degree of difficulty, the imageability of the items, and the presentation time of the stimuli turned out to be appropriate for children of that age.
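The constraint on the RNT trial order (six items, each shown six times, with no item immediately repeated) can be illustrated with a short sketch. This is not the authors' OpenSesame configuration; it is a minimal rejection-sampling approach under the assumption that any order without adjacent repetitions is acceptable. The function name and the placeholder item labels are hypothetical.

```python
import random


def randomized_order(items, repetitions, rng, max_attempts=100_000):
    """Return a random trial order in which no item directly follows itself.

    Rejection sampling: shuffle the full trial list and accept the first
    shuffle without adjacent repetitions (an assumed strategy, not taken
    from the original study).
    """
    pool = [item for item in items for _ in range(repetitions)]
    for _ in range(max_attempts):
        rng.shuffle(pool)
        # Compare every trial with its successor.
        if all(a != b for a, b in zip(pool, pool[1:])):
            return list(pool)
    raise RuntimeError("no valid order found; constraints may be too strict")


# Hypothetical placeholder labels standing in for the six pictures of a subtest.
subtest_items = ["item_1", "item_2", "item_3", "item_4", "item_5", "item_6"]
order = randomized_order(subtest_items, repetitions=6, rng=random.Random(2021))
print(len(order))  # 36 trials per subtest
```

Passing a seeded `random.Random` instance keeps the order reproducible for a given child while still allowing a different seed per participant.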
Procedure
Ethical approval was given by the Ethics Committee of the dbs (Deutscher Bundesverband für akademische Sprachtherapie und Logopädie; file reference: 16-10005-KA-KSpKo) and families provided informed consent, which was signed by parents and children. In addition, parents filled in a questionnaire about their child's development concerning hearing, cognition, and speech and language development.
Data collection took place at school or at home. Children were tested in a quiet room in two sessions of about 45 minutes each. The first session included the vocabulary test (WWT) and the rhyming task (PhoMo-Kids). The second session comprised the two experimental tasks (LDT and RNT) and the CPM. Short breaks between the tasks sustained children's attention, and small gifts after every session promoted their motivation. During the experimental tasks, each participant was seated in front of a laptop (Samsung, model: RV520). For the LDT, children received child-oriented instructions to press the button with a happy green smiley for an existing German word, or the button with a sad orange smiley for a word without meaning, as fast and accurately as possible. For the RNT, the participant was seated in front of the laptop screen; a microphone (Sennheiser, ME64/K6) was located about 10 cm in front of him or her. For later analysis, the experiment was recorded via audio software (Audacity 2.1.2). Before starting each subtest of the RNT, each item was introduced and discussed. The testing phase began when the child could reliably name the pictures of the following subset. Then the participant was instructed to name the pictures as fast and accurately as possible.
Data Analysis
LDT
The dependent variables accuracy and reaction time were calculated for each participant and item. Mean accuracy for one single item (Rauferei, Eng. brawl) was below 50% across all children. Consequently, this item and the corresponding pseudoword were excluded from later analysis (2.1%). Technical problems caused further exclusions of a maximum of three items per participant (3.1%). Hence, 18,405 valid reactions of the full sample (5,650 of the matched subsample) could be considered for the analysis of accuracy. For reaction times, a stepwise procedure to identify outliers led to the exclusion of (a) incorrect responses (full sample: 7.1%; subsample: 8.5%), (b) reactions below 200 ms (full sample: 2.2%; subsample: 2.0%), which might indicate responses by pure chance, (c) reactions above 6000 ms (full sample: 0.1%; subsample: 0.2%), representing slipping attention (Edwards & Lahey, 1996; van den Boer, de Jong & Haentjens-van Meeteren, 2012), and (d) reaction times more than 2.5 SD below or above the participant's mean (full sample: 1.4%; subsample: 2.3%), since such reactions were atypical for that particular participant. Afterward, 89.54% (subsample: 87.01%) of all analyzable reaction time values remained for the analysis of the LDT.
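The four exclusion steps for reaction times can be sketched as a small filter. This is an illustration only, not the authors' analysis script: the trial format (dictionaries with the hypothetical keys `participant`, `correct`, and `rt`) is assumed, and so is the choice to compute each participant's mean and sample SD on the trials that survive steps (a) to (c).

```python
from collections import defaultdict
from statistics import mean, stdev


def trim_reaction_times(trials, low=200, high=6000, sd_cutoff=2.5):
    """Apply exclusion steps (a)-(d) to a list of trial dictionaries.

    Each trial is assumed to look like
    {"participant": "p01", "correct": True, "rt": 950}  (rt in ms).
    """
    # Steps (a)-(c): drop incorrect responses and implausible latencies.
    kept = [t for t in trials if t["correct"] and low <= t["rt"] <= high]

    # Step (d): drop trials beyond +/- 2.5 SD of the participant's own mean.
    by_participant = defaultdict(list)
    for t in kept:
        by_participant[t["participant"]].append(t)

    result = []
    for rows in by_participant.values():
        rts = [t["rt"] for t in rows]
        if len(rts) < 2:  # SD is undefined for a single trial
            result.extend(rows)
            continue
        m, s = mean(rts), stdev(rts)
        result.extend(t for t in rows if abs(t["rt"] - m) <= sd_cutoff * s)
    return result
```

Because step (d) uses a per-participant criterion, a latency that counts as an outlier for a fast child may be retained for a slower one.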
RNT
The analysis of the RNT was conducted on the basis of the recorded audio files. For accuracy, each response was classified as correct production, incorrect production, or invalid trial. Invalid trials were due to signal overload and had to be eliminated (0.9%). To ensure inter-rater reliability for the classification of accuracy, two independent raters analyzed a random subsample of children (n = 10). Cohen's kappa indicated high agreement (κ = .844**). The reaction time of every correct response was measured by manual analysis with the software “Audacity 2.1.2”. Currently, this method is the gold standard for assessing speech onset (Roux, Armstrong & Carreiras, 2017). In some responses, acoustic interference impeded the detection of the word onset. Therefore, these reaction times had to be excluded from the analysis (5.99%). The inter-rater reliability for reaction times was measured by intraclass correlation (ICC), which indicated excellent reliability. The average measure ICC was .992 with a confidence interval from .992 to .993 (F[134, 443] = 1667, p < .001).
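For readers unfamiliar with the agreement statistic, the following sketch shows how Cohen's kappa is obtained from two raters' classifications. It is a generic textbook computation, not the authors' procedure; the function name and the example labels are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equally long lists of categorical ratings."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)

    # Observed proportion of trials on which both raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Agreement expected by chance, from each rater's category frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n**2

    if expected == 1:
        raise ValueError("kappa is undefined when only one category is used")
    return (observed - expected) / (1 - expected)


# Made-up ratings for six responses (not data from the study).
a = ["correct", "correct", "correct", "incorrect", "incorrect", "invalid"]
b = ["correct", "correct", "incorrect", "incorrect", "incorrect", "invalid"]
print(round(cohens_kappa(a, b), 3))
```

Kappa corrects raw percentage agreement for the agreement that two raters would reach by chance alone, which matters here because most responses fall into the "correct" category.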
Results
Predictors of word form processing (full sample)
Descriptive data regarding age, gender, non-verbal intelligence, months of education, and the performance in the different tasks are given in Table 2 for the whole sample as well as for monolingual and bilingual children separately.
Table 2. Descriptive data of the sample
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20220620142801934-0219:S1366728921000936:S1366728921000936_tab2.png?pub-status=live)
a Mean (SD), [range]. b Percentile rank in the CPM (Bulheller & Häcker, 2006). c WWT (Glück, 2011). d Raw score in the PhoMo-Kids (Stadie & Schöppe, 2014).
Stepwise multiple linear regressions were performed (see Table 3), using IBM SPSS Statistics for Macintosh (Version 26.0) on the basis of the data of the whole sample of monolingual and bilingual children (see descriptive data in Table 2). The regression analyses aimed at assessing the contribution of the potential predictors, namely age, gender, the performance in the picture naming and word-picture matching tests, and the type of language acquisition (monolingual or bilingual), to the results in word form processing tasks. The dependent variables were accuracy in rhyming, accuracy and reaction time in the LDT (real words and pseudowords as well as pseudowords only), and accuracy and response time in the RNT. Pearson's correlation analysis showed a strong correlation between picture naming and word-picture matching (r(200) = .754, p < .001). To prevent multicollinearity, these variables were separated for the regression analysis: The performance in word-picture matching served as a potential predictor for performance in the LDT, both being comprehension tasks, and the performance in picture naming served as a potential predictor for the RNT as well as the rhyming task, all three being production tasks. The stepwise regression analyses were stopped when the inclusion of a further predictor did not provide an additional significant increase of R² at p < .05. This approach allowed us to identify the best predictive variables for different aspects of word form processing without significant data loss. Children whose performance in the dependent variable of a given regression was more than 3 SD below the mean were excluded (see Table 3).
Table 3. Results of stepwise multiple linear regression analysis: variables associated with word form processing at p < .05 and tabulated in order of selection.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20220620142801934-0219:S1366728921000936:S1366728921000936_tab3.png?pub-status=live)
*p < .05, **p < .01, ***p < .001
The stepwise regression analyses revealed a crucial contribution of age, vocabulary size, and gender (with better performance of girls) to word form processing (i.e., to accuracy scores and reaction times in the LDT, the RNT, and the rhyming task), while bilingualism failed to explain significant amounts of additional variance (see Table 3). Results showed that age and picture naming explained a significant amount of variance regarding rhyming performance (adjusted R² = .212, F(2, 198) = 26.958, p < .001) and response time in the RNT (adjusted R² = .389, F(2, 197) = 64.327, p < .001). Gender, age, and performance in picture naming significantly explained variance of accuracy values in the RNT (adjusted R² = .275, F(3, 198) = 26.412, p < .001). Variance in accuracy in the LDT can significantly be explained by age, the performance in the word-picture matching task, and gender (adjusted R² = .282, F(3, 194) = 26.765, p < .001), whereas reaction time values in the LDT for real words and pseudowords (adjusted R² = .275, F(2, 197) = 38.708, p < .001) as well as for pseudowords only (adjusted R² = .290, F(2, 199) = 42.026, p < .001) can be explained by age and the performance in the word-picture matching task. Variance in accuracy for pseudowords only (LDT) can significantly be explained by age (adjusted R² = .135, F(1, 198) = 32.045, p < .001). The adjusted R² values for the overall models indicate a moderate to high goodness-of-fit according to Cohen (1988).
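The appeal to Cohen's benchmarks can be made concrete with the f² effect size for multiple regression. The following is an illustrative calculation, not one reported in the study: it assumes that the adjusted R² values are substituted for R² and uses the conventional thresholds of .02 (small), .15 (medium), and .35 (large).

```latex
f^2 = \frac{R^2}{1 - R^2}, \qquad
f^2_{\min} \approx \frac{.135}{1 - .135} \approx .16, \qquad
f^2_{\max} \approx \frac{.389}{1 - .389} \approx .64
```

Under these assumptions, the weakest model (accuracy for pseudowords only in the LDT) corresponds to a medium effect and the strongest model (response time in the RNT) to a large effect.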
Group comparisons between monolingual and bilingual children (matched samples)
Having confirmed that word form processing is influenced by age, gender, and vocabulary size, we aimed at comparing word form processing in monolingual and bilingual children when these factors are balanced. Therefore, two subgroups of monolingual and bilingual children were matched by means of propensity scores (see above), accepting a smaller sample size. This pairwise matching procedure yielded a monolingual and a bilingual group of 31 children each, which did not differ significantly with regard to age, gender, non-verbal intelligence, months of education, and receptive and expressive vocabulary size (see Table 1). Afterward, the groups were compared regarding their word form processing abilities, with age, vocabulary size, and gender equalized between the groups. Descriptive data of monolingual and bilingual children after propensity matching on the rhyming task and both experimental tasks are given in Table 4. According to t-tests for independent samples, the matched groups of monolingual and bilingual children did not differ significantly in any aspect of word form processing.
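The idea behind pairwise matching on a propensity score can be sketched as follows. This is a generic greedy nearest-neighbour procedure, not necessarily the one used in the study: it assumes that a propensity score has already been estimated for every child, and the caliper value, function name, and child identifiers are hypothetical.

```python
def match_pairs(bilingual, monolingual, caliper=0.1):
    """Greedy 1:1 nearest-neighbour matching without replacement.

    Both arguments map a child identifier to a pre-computed propensity
    score. Returns a list of (bilingual_id, monolingual_id) pairs.
    """
    available = dict(monolingual)  # monolingual children not yet matched
    pairs = []
    for bi_id, bi_score in sorted(bilingual.items(), key=lambda kv: kv[1]):
        if not available:
            break
        # Closest remaining monolingual child on the propensity score.
        mono_id = min(available, key=lambda m: abs(available[m] - bi_score))
        # Accept the pair only if the scores are close enough (assumed caliper).
        if abs(available[mono_id] - bi_score) <= caliper:
            pairs.append((bi_id, mono_id))
            del available[mono_id]
    return pairs


# Made-up scores for illustration only.
bilingual_scores = {"b1": 0.30, "b2": 0.62, "b3": 0.95}
monolingual_scores = {"m1": 0.28, "m2": 0.60, "m3": 0.33, "m4": 0.10}
print(match_pairs(bilingual_scores, monolingual_scores))
```

Children without a sufficiently close counterpart remain unmatched, which is why matching of this kind reduces the sample size.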
Table 4. Descriptive data and t-tests for independent samples of monolingual and bilingual children after propensity matching.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20220620142801934-0219:S1366728921000936:S1366728921000936_tab4.png?pub-status=live)
a M (SD).
Age-related group comparisons
To compare monolingual and bilingual children's word form processing at different stages of development, the matched subsample was divided into two subgroups: younger (6- and 7-year-old bilingual children and their matched peers; n = 24) and older (8- and 9-year-old bilingual children and their matched peers; n = 38) children. According to the Mann-Whitney U test, monolingual and bilingual children in the younger as well as in the older group did not differ with regard to age, non-verbal intelligence, months of education, and scores in the word-picture matching task and the picture naming task (see Table 5). In addition, χ² tests confirmed that monolingual and bilingual children in the younger (female monolinguals: 42%, female bilinguals: 50%; p = .682) as well as in the older group (female monolinguals: 42%, female bilinguals: 39%; p = .740) did not differ significantly with regard to gender.
Table 5. Monolingual vs. bilingual children at different stages of development depending on age
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20220620142801934-0219:S1366728921000936:S1366728921000936_tab5.png?pub-status=live)
a M (SD). b CPM (Bulheller & Häcker, 2006). c WWT (Glück, 2011).
*p < .05, **p < .01, ***p < .001
In the younger age group, significant differences were found in the LDT for pseudowords only (accuracy; p = .036), and in the RNT (accuracy; p = .015). The older age groups did not differ in their performance in any of the word form processing tasks (see Table 5).
Discussion
The purpose of the current study was to examine word form processing in bilingual children at school age by conducting carefully designed tasks that assess relevant aspects of word form processing in the receptive (LDT) and expressive (RNT, rhyming) modality, in addition to traditional vocabulary tests (picture naming and word-picture matching). Regression analyses based on a sample of 202 monolingual and bilingual children revealed that age, gender, and vocabulary size, but not bilingualism, predict the performance in word form processing tasks. After matching monolingual and bilingual children for age, gender, and vocabulary size, the two groups were compared, and no significant group differences were found in word form processing tasks. Dividing the matched subsample into younger and older children yielded significant differences between monolingual and bilingual children in the younger group with regard to lexical decision and rapid naming. In the older group, monolingual and bilingual children did not differ significantly in their word form processing abilities. Due to small sample sizes, these results on younger and older subgroups have to be interpreted cautiously.
As expected, age contributed significantly to explaining performance in word form processing tasks. This finding is in line with previous research results showing that word form processing abilities improve with age (Bahn et al., 2017; Hein & Kauschke, 2020; Ponari et al., 2018; van den Bos et al., 2002; Gangopadhyay et al., 2019; Li et al., 2011; Windsor & Kohnert, 2004). Gender also contributed significantly to predicting performance in the LDT and the RNT (but not in the rhyming task) in the sense that girls responded more accurately than boys. This result is consistent with previous research on rapid naming (Burman et al., 2008; Wilsenach & Makaure, 2018), but is a new finding regarding lexical decision. In addition, receptive vocabulary size predicted performance in the LDT, and expressive vocabulary size predicted performance in rhyming and rapid naming. These results corroborate the results of Ainsworth et al. (2016), who found a similar relationship in children aged 3 to 4. In line with the Lexical Restructuring Hypothesis (Walley & Metsala, 2003), our findings point to a sustained relationship between vocabulary size and word form processing: as vocabulary grows, word form representations in the phonological input and output lexicon become more fine-grained through middle childhood. Finally, bilingualism did not contribute significantly to explaining children's performance in any of the tasks.
Hence, word form processing of monolingual and bilingual children seems to be influenced by the same factors, without putting bilinguals at a disadvantage.
In light of the results of the regression analyses, the question arose whether monolingual and bilingual children would differ in their word form processing abilities when matched for age, gender, and vocabulary size. By matching for vocabulary size, we ruled out the possibility that limitations in word form processing emerged simply as a result of limited lexical size. Group comparisons after matching did not reveal significant differences between monolingual and bilingual 6- to 9-year-old children. Thus, the results of the group comparisons support the results of the regression analyses, indicating that word form processing, largely independent of vocabulary knowledge, does not differ between monolingual and bilingual children.
With regard to the performance in the three word-form related tasks (lexical decision, rapid naming, and rhyming), no significant differences were identified between bilinguals and monolinguals across all primary school children. The outcome for the LDT matches previous results of Hemsley et al. (2006). We further expected that differences between monolingual and bilingual children in the LDT would be apparent in younger children, with a tendency to decline with increasing age. In line with our expectation, the results demonstrated that at the beginning of primary school bilingual children were less effective than monolingual children in processing word form representations in the phonological input lexicon. This discrepancy seemed to disappear with increasing age and language experience during primary school. Since Gangopadhyay et al. (2019) found a similar developmental change in bilingual children's lexical decision abilities later, i.e., between age 9 and 10, longitudinal research is needed to investigate how long this discrepancy persists, and at what age exactly bilingual children's word form processing abilities stabilize and resemble those of their monolingual peers. Second, on the basis of Windsor and Kohnert (2004), who investigated children aged 8 to 13 years, monolingual children were hypothesized to perform better than bilingual children in categorizing pseudowords in the LDT. The incorrect classification of a pseudoword (e.g., Menole) as a real word (Melone [Eng. melon]) in particular points to less detailed and stable word form representations in the phonological input lexicon. Although our data for the subgroup of younger children point in a similar direction, the group comparisons including all children did not show differences between monolingual and bilingual children.
As described above (Jones & Brandt, 2018), the difficulty of classifying pseudowords correctly depends on the type of pseudoword manipulation. The more difficult a task becomes, the more likely it is that slight uncertainties, for example due to the acquisition of two or more languages, become visible. Thus, different types of pseudoword manipulation associated with the level of difficulty of the LDT might be an explanation for the heterogeneous results. Additionally, Windsor and Kohnert (2004) focused on 8- to 13-year-old sequential bilinguals with 4 to 8 years of contact with the L2, whereas, in the present study, 6- to 9-year-old bilingual children, who were simultaneous or early L2 learners with 4 to 9 years of contact with the L2, were investigated. Thus, due to their earlier onset of acquisition, our bilingual participants may have established more robust and stable word form representations in the phonological input lexicon, enabling them to classify pseudowords similarly to their monolingual peers.
With regard to the RNT, it should generally be borne in mind that the RNT in the present study and the traditional RAN task, which was often used in previous research, differ in some characteristics and might therefore assess slightly different aspects of word processing. Nevertheless, our results confirmed previous studies on rapid naming (Dixon et al., 2020; Geva & Farnia, 2012; Oppenheimer Fleury & Brandão de Avila, 2015): Bilingual children showed no disadvantage in rapid naming, i.e., in word form processing in the phonological output lexicon. However, we observed similar age-related changes as in the LDT: monolingual and bilingual children differed in the younger age group, whereas no differences could be found in the older group. In terms of the logogen model, the phonological output lexicon might develop on the basis of the phonological input lexicon (Hein & Kauschke, 2020). Thus, it cannot be ruled out that younger bilinguals' difficulties in retrieving word forms from the phonological output lexicon are the consequence of a less fine-grained phonological input lexicon.
In the rhyming task, monolingual and bilingual children did not show any differences in word form processing. This absence of a significant difference in rhyming is also consistent with some earlier findings (Ahmadian et al., 2016; Frederickson & Frith, 1998; Goriot et al., 2021; Soleimani & Arabloo, 2018). Other studies showed an advantage for monolingual children (Dixon et al., 2020; Hutchinson et al., 2004). One explanation for these inconsistent findings might be the use of various kinds of rhyming tasks, as indicated above, with a different weighting of lexical and sublexical processing. These methodological differences limit the comparability of results among rhyming studies.
The rich diversity of our sample may be considered a limitation of the present study. We investigated bilingual children who were simultaneous or early L2 learners, who acquired German as their ambient language at or before the age of three and who, taken together, had acquired a total of 14 different languages from birth. This heterogeneity reflects the typical and natural distribution of children attending primary schools in Germany. Nevertheless, future research is needed to examine word form processing in bilinguals with specific language combinations. Those studies should control the item set for cognates and false friends, considering the level of similarity between the two languages. An investigation of specific language combinations will also facilitate the additional assessment of language competencies in the heritage language. Due to the variety of language combinations in the present study, it was feasible neither to control the items for cognates and false friends nor to assess proficiency in the heritage languages.
Previous research on lexical competence has shown that bilingual children have smaller vocabularies than monolingual children when a single language is considered, while conceptual vocabulary often resembles that of monolingual peers. The present study, however, did not focus on general vocabulary size, but on word processing in the mental lexicon – in particular, on the access to the phonological input lexicon, and on word retrieval from the phonological output lexicon, with as little involvement of general vocabulary skills as possible. In fact, it turned out that these psycholinguistic processes of word form processing were not affected by bilingual language acquisition, since it is not the knowledge that counts, but the effectiveness of processing. Thus, the present findings provide new insights into lexical processing of bilingual school-age children, and point to similar word form-related processing abilities in monolingual and bilingual children, with the tendency of an initial weakness that seems to be overcome during primary school age.
Acknowledgements
We would like to thank all schools, families, and children for their participation; Elisabeth Beckermann, Dominique Polomka, Lezlie Paulina Cuevas Guerra, Sandra Grom, and Claudia Scharfscheer for their help in data collection and/or data analysis; Ulrike Domahs, Frank Domahs, Sebastian Niehüser, and Stefanie Türk for their statistical advice; and Michael Cysouw, Daniela Bahn, Anna Rosenkranz, and Lakshmi Kalyan Sunku for their helpful comments.
Supplementary material
For supplementary material accompanying this paper, visit https://doi.org/10.1017/S1366728921000936
Table S1. (Psycho-)linguistic features of the item sets in the Lexical decision task and the Rapid naming task.
Table S2. Item set of the auditory lexical decision task.
Table S3. Item set of the rapid naming task.
Competing interests
The authors declare none.