The lexicon has often been identified as one of the ‘special’ components of language (e.g., Hauser et al. 2002, Pinker and Jackendoff 2005). According to Wilkins (1972: 111–112), ‘while without grammar very little can be conveyed, without vocabulary nothing can be conveyed’. Studying lexical development is therefore of major importance. A better understanding of lexical development is also important because of the role the lexicon plays in linguistic development in general. First, early lexical and grammatical developments are strongly correlated: without a critical mass of words, children may not be able to develop later morphosyntactic competence (see Bates and Goodman 2001 for a review). Second, lexical abilities are correlated with later linguistic abilities such as reading and writing. For example, evidence for the importance of vocabulary size in learning to read has been provided by longitudinal studies showing that syntactic and vocabulary performance in kindergarten predicts reading performance (Catts et al. 1999) and reliably discriminates between poor and normal readers (e.g., Catts et al. 1999, Hagtvet 2003, Share and Leikin 2004). Lexical development can therefore be considered a good basis for predicting future problems in reading and writing.
The child's early productive lexicon is quantitatively and qualitatively different from the adult's lexicon (Fenson et al. 1993). Between one and three years of age, a child's lexicon is limited, and first words can be semantically and phonologically different from the lexical target. There is a consensus that, with age, the child's lexicon increases and words become more accurate at all linguistic levels. Most studies (for counter-argumentation see Ganger and Brent 2004) also agree that expressive vocabulary develops nonlinearly, with a slow increase in size before the so-called lexical spurt, which is characterized by a sharp acceleration in the rate of vocabulary learning (especially for nouns) around the 50-word stage (Reznick and Goldfield 1992, Poulin-Dubois et al. 1995). The grammatical composition of children's lexicons differs from that of adults’ lexicons and changes with lexical size: smaller lexicons are composed mainly of nouns, predicates emerge in most cases during the lexical spurt, and closed-class items emerge once lexicons comprise 400 different words (Jackson-Maldonado et al. 1993, Bates et al. 1994, Caselli et al. 1995, Maital et al. 2000, Bornstein et al. 2004). These findings are consistent across a wide range of languages, even though some differences related to language type (Tardif 2006, Ma et al. 2009) or methodological issues (Tardif et al. 1999, Clark 2003) have been reported. However, very few studies have been able to show why children systematically produce some words earlier than others. Questions about the acquisition order of specific words remain only partly answered: What factors guide word learning? Do these factors interact with one another? Does the weight of their influence change with age? And finally, do they play the same role regardless of the phonetic, phonological or grammatical nature of words?
Several factors influence the quantitative and qualitative development of first word productions (Bornstein et al. 2004). Among them, two have frequently been taken into consideration: phonetic and phonological development, and the lexical characteristics of the input (Stoel-Gammon 2011). In learning words, children must gain knowledge of form-to-function mapping in their native language as well as learn the articulatory and phonatory movements needed to produce words in an adult-like manner (Stoel-Gammon and Vogel Sosa 2007).
Children's first words display common trends across languages in terms of phonetic and phonological characteristics. Strong similarities across languages, especially in sound types and sound combinations, have frequently been documented (Locke 1983, Stoel-Gammon 1985, Davis and MacNeilage 2000): children prefer to produce oral and nasal stops and glides with a labial or coronal place of articulation. In addition, three CV cooccurrences are frequent in first words: labial consonant + central vowel, coronal consonant + front vowel, and dorsal consonant + back vowel. These segments and cooccurrences are considered ‘simple’ because they can be produced with mandibular oscillations accompanied by phonation, without the independent use of articulators (MacNeilage and Davis 1993, 2000). Together, these crosslinguistic studies suggest a near-universal basis for early phonological acquisition (Locke 1983, Stoel-Gammon 1985, Davis and MacNeilage 2000) and point to the influence of articulatory constraints on first word production: children's early productions are constrained by their production limitations (Davis and MacNeilage 2005).
The influence of phonetic and phonological development on first word production is also supported by studies of novel word production: infants tend to produce novel words composed of sounds already present in their phonetic inventory (Leonard et al. 1982, Schwartz and Leonard 1982). Ferguson and Farwell (1975) observed individual patterns of word selection resulting from differences in children's production capacities. They proposed that children attempt to say words with sounds and syllable structures they can produce accurately, and avoid words that are phonologically difficult for them. Using the Index of Phonetic Complexity (IPC, Jakielski 2000), which makes it possible to assess phonetic complexity in both word targets and the words children actually produce, Ranta and Jakielski (1999) documented a lexical selectivity bias in six English-speaking children. Phonetic complexity appeared to influence lexical selection in children from 16 to 20 months old, but not in children aged 23 months and older. Ward (2001) measured the phonetic complexity of word targets and productions in children 12–24 months of age, and found that target-word IPC values increased by 32% over the 12 months of the study, while production IPC values did not increase over the same period. Moreover, segments available to children in babbling (e.g., labials and coronals) are pronounced more accurately in their first words, and some later-developing sounds and sequences are replaced by early-developing sounds. Early words are thus filtered by immature articulatory capacities (Vihman and Croft 2007).
Contemporary research has convincingly demonstrated that statistical regularities in the input are available to, and used by, children as a possible bootstrap to language acquisition (e.g., Jusczyk et al. 1994, Saffran et al. 1996, Garlock et al. 2001). Reliable input cues have been isolated at the level of the word and at the level of the sounds that make up the word. Thus, multiple word characteristics contribute to language acquisition. Two word characteristics that have been extensively studied in relation to their influence on lexical acquisition are their frequency of occurrence in the input and the kind of phonological neighbourhood they reside in.
In the adult psycholinguistics literature, frequency effects at the single-word level have been almost universally accepted (Bybee and Hopper 2001, Ellis 2002, Bod et al. 2003). Sensitivity to frequency has also been demonstrated in studies of children's lexical development: the more often a word is heard, the earlier it is learned (Patterson 2002, Tomasello 2003, among others).
Most studies on lexical development appear to assume an important role for input frequency, but the evidence for this hypothesis is often indirect. For example, it is well established that parents who talk more to their children have children whose lexicon grows faster (Weizman and Snow 2001). Another indirect piece of evidence is the seminal work of Gopnik and Choi (1990), who showed that children learn a higher proportion of verbs if child-directed speech (CDS) in the language they are learning uses verbs more frequently. Only a few studies of children's speech have found directly that the more frequently children hear a particular word or construction, all other things being equal, the earlier they acquire it. For example, Naigles and Hoff-Ginsberg (1998) and Theakston et al. (2004) have shown that the order of emergence of particular verbs is significantly correlated with their frequency of use in language addressed to children.
The neighbourhood density (ND) of a word is a measure of the number of its phonological neighbours and is calculated as the number of words that differ from the target by one phoneme (Luce and Pisoni 1998). Some words have many phonological neighbours and thus reside in a so-called dense neighbourhood, whereas others have relatively few neighbours and thus reside in a so-called sparse neighbourhood. There is direct evidence that neighbourhood density influences lexical acquisition. When children are exposed to an equal number of novel words from dense and sparse neighbourhoods and learning is tracked over time, children from 17 months to 13 years of age acquire novel words from dense neighbourhoods more rapidly than novel words from sparse neighbourhoods (Storkel and Rogers 2000, Hollich et al. 2002, Storkel 2002).
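To make the neighbourhood density measure concrete, the following minimal sketch counts the neighbours of a target word in a small lexicon. It is only an illustration under stated assumptions: words are represented as tuples of phoneme symbols, "differing by one phoneme" is interpreted as a single substitution, deletion or addition, and the toy lexicon and the function names (`is_neighbour`, `neighbourhood_density`) are hypothetical and not taken from any of the studies cited.

```python
from typing import Iterable, Sequence, Tuple

Phonemes = Tuple[str, ...]


def is_neighbour(a: Sequence[str], b: Sequence[str]) -> bool:
    """Return True if a and b differ by exactly one phoneme.

    Assumption: one phoneme substitution, deletion or addition counts
    as a difference of one phoneme.
    """
    a, b = tuple(a), tuple(b)
    if a == b:
        return False  # a word is not its own neighbour
    if abs(len(a) - len(b)) > 1:
        return False
    if len(a) == len(b):
        # Same length: exactly one position may differ (substitution).
        return sum(x != y for x, y in zip(a, b)) == 1
    # Lengths differ by one: removing one phoneme from the longer
    # word must yield the shorter word (deletion or addition).
    shorter, longer = (a, b) if len(a) < len(b) else (b, a)
    return any(longer[:i] + longer[i + 1:] == shorter for i in range(len(longer)))


def neighbourhood_density(target: Sequence[str], lexicon: Iterable[Sequence[str]]) -> int:
    """Count the distinct lexicon entries that are neighbours of target."""
    unique_words = {tuple(word) for word in lexicon}
    return sum(is_neighbour(target, word) for word in unique_words)


# Hypothetical toy lexicon in a broad phonemic transcription.
LEXICON = [
    ("k", "æ", "t"),       # cat
    ("b", "æ", "t"),       # bat
    ("k", "æ", "p"),       # cap
    ("k", "ʌ", "t"),       # cut
    ("æ", "t"),            # at
    ("k", "æ", "t", "s"),  # cats
    ("d", "ɒ", "g"),       # dog
]

cat_nd = neighbourhood_density(("k", "æ", "t"), LEXICON)
dog_nd = neighbourhood_density(("d", "ɒ", "g"), LEXICON)
print(cat_nd, dog_nd)
```

In this toy lexicon, "cat" resides in a comparatively dense neighbourhood (bat, cap, cut, at, cats), whereas "dog" has no neighbours at all. In practice, ND values depend heavily on the reference lexicon and on the transcription conventions chosen.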
However, neither frequency nor neighbourhood density can be considered in isolation, as both covary with other word characteristics such as word class, phonotactic probability (PP) or word length (WL). For instance, word frequency (WF) covaries with word length and word class: shorter words are more frequent than longer words (Strauss et al. 2006), and nouns and verbs are less frequent than closed-class items. Covariation also exists between neighbourhood density on the one hand, and word frequency, word length and phonotactic probability on the other: neighbourhood density is positively correlated with word frequency (Landauer and Streeter 1973) and phonotactic probability (Vitevitch et al. 1999), and negatively correlated with word length (Pisoni et al. 1985, Shillcock and Bard 1993).
Few studies have considered all these characteristics or attempted to disentangle the influence of one from the others in the acquisition of the lexicon. For instance, Goodman et al. (2008) considered the influence of both frequency and word class on age of acquisition. They showed that lexical input frequency plays a different role in the order of acquisition of nouns and verbs vs. closed-class words: closed-class words are acquired later than nouns and verbs despite their high frequency in the adult input. Goodman et al. also demonstrated, for the first time, an increasing role of frequency with age, taking into account the developmental nature of the process. More recently, McKean et al. (2013, 2014) for English, as well as van der Kleij et al. (2016) for Dutch, examined the effect of phonotactic probability and neighbourhood density on word learning in typically developing children and in children with language impairment. They demonstrated that, in both production and comprehension, PP had a significant influence on lexical acquisition when ND was low: pseudowords in the condition with converging characteristics (low PP and low ND) were learned significantly better than those in the high PP – low ND condition. No effect of PP was found for pseudowords with high ND. Moreover, they showed a change in the influence of PP across ages, switching from a high- to a low-PP advantage.
Another group of studies has considered the influence of both frequency and ND on the lexical acquisition of nouns. Storkel (2009) examined the lexical properties of nouns that were learned earliest vs. nouns that were learned later. She found that the words acquired earliest tended to have higher ND and higher WF than words acquired later. Stokes et al. (2012) explored the impact of neighbourhood density and word frequency, but also of word length, on the vocabulary size (monosyllabic words only) of Danish-speaking children. Regression revealed that ND, WF, WL, and age together predicted 47% of the variance in vocabulary size, with ND, WF, WL, and age separately accounting for 39%, 3.2%, 2.2%, and 2.8% of that variance, respectively. Children with small vocabularies had learned words that came from denser neighbourhoods and were more frequent in the ambient language, and those words were shorter than the words of children with larger vocabularies. Examining the acquisition of monosyllabic nouns and predicates in French children between 16 and 30 months, Kern and dos Santos (2017) showed a similar effect of ND and WF, with ND and WF separately accounting for 32.2% and 12.8% of the variance, respectively. However, when nouns and predicates were analyzed separately, differences emerged: for nouns, the model predicted 64.6% of the variance, whereas the size of the predicate vocabulary was not correlated with either of the two variables.
Even though the link between production constraints and input is intuitively obvious, production development and input effects have so far generally been studied independently in first language acquisition. While the role of input is considered important to early language learning within a function-based perspective (e.g., Snow 1977, Gallaway and Richards 1994, Cameron-Faulkner et al. 2003), it has only rarely been considered in an integrative way, in relation to the output patterns found in the child's early productions. Moreover, only a few studies have adopted a longitudinal perspective. In this type of research, a longitudinal perspective is very important because children develop their linguistic skills in relation to their cognitive and physical development: constraints change with age and can carry different weight according to the linguistic level achieved by the child and/or his/her age. Storkel (2009) simultaneously examined two phonological predictors (mean frequency of segments as a function of their position and mean frequency of biphones) and two lexical predictors (neighbourhood density and word length) to estimate the age of acquisition of words in English-speaking children. Her results show that phonology influences the age of word acquisition over the entire duration of the investigation, whereas lexical characteristics play a role only during the first period (between 16 and 20 months). Finally, it is important to note that most studies have observed language development in English-learning children. Only a few have looked at other languages (Stokes et al. 2012 for English, French and Danish, Kern and dos Santos 2017 for French, and Hansen 2017 for Norwegian).
This predominant focus on English is problematic, as languages have specific characteristics that play a role in learning trajectories and strategies. Several examples show the necessity of taking a crosslinguistic perspective. On the one hand, each language has a specific phonetic and phonological system; systems can differ from one language to another in the number of phonemic units as well as in phonemic diversity, and in phonetic and phonological development. According to the size principle defined by Lindblom and Maddieson (1988), small systems use only the articulatorily simplest consonants. Languages with larger inventories have both simple and articulatorily more elaborated consonants. Only languages with the largest inventories have the most complex consonants, alongside simple and elaborated ones. More specifically, Lindblom and Maddieson show that languages with smaller consonant inventories tend to contain only those consonants that are in various ways inherently simpler (perhaps because they involve smaller movements to pronounce them, or are easier for a listener to distinguish from other sounds). However, languages differ not only in their phonemic repertoire but also in their word structure. In French, for instance, words are more often disyllabic and composed of open syllables than in English, where words are more often monosyllabic and composed of closed syllables (Delattre 1965). All of these characteristics play an important role in children's phonetic and phonological development. On the other hand, children are exposed to different input, in both quantitative and qualitative terms. Children are exposed to greater or lesser amounts of speech, as well as to different types of speech, according to the education level of their parents or the culture in which they grow up.
Several studies have shown that the input to which children of highly educated parents are exposed is of a higher level and more diversified than the input received by children of less-educated parents (Hart and Risley 2003). Moreover, a very recent study also shows that cultural habits can affect the quantity of child-directed speech. Cristia et al. (2017) estimated how frequently children aged 0–11 years receive one-on-one verbal input among Tsimane forager-horticulturalists of lowland Bolivia. Analyses of systematic observations reveal that less than one minute per daylight hour is spent talking to children younger than four years of age, which is very little time in comparison with what is observed in western societies. These results reveal large cross-cultural variation in the linguistic experiences provided to young children. All differences that can directly influence word frequency have to be taken into account in the study of lexical development.
In this thematic issue, lexical acquisition will be viewed as emerging from motor capacities interacting with learning from the input language. Several languages belonging to different language families were studied and compared where possible. The crosslinguistic approach, with various language families and the typological differences they illuminate in varied dimensions of language complexity, provides important insights into the functioning and acquisition of language. Furthermore, the results presented here deal with the development of French and English as well as with two rarely described languages: Tunisian Arabic and Tashlhiyt, both of which have linguistic properties that make them intrinsically worthy of investigation. These languages are particularly interesting as their phonological and syllabic repertoires are composed of “complex” segments and structures, which are usually acquired later by children. Last but not least, all but one of the articles use large corpora composed of longitudinal, naturalistic data: in each case, the interactions of several mother-child dyads were regularly (usually one hour every two weeks) recorded from children's first word production until a few months after their lexical spurt.
There are seven articles in all. The first two show the influence of universal articulatory constraints as well as language-specific phonetic and phonological complexity on language development. Lahrouchi and Kern observe the influence of biomechanical constraints on babbling and first-word production in two children acquiring Tashlhiyt, a Berber language spoken in Morocco. Gayraud et al. compare the increase in word complexity in the target words and actual productions of children acquiring French, Tunisian Arabic and Berber. The next three papers explore the relation between lexical and phonological development in French and/or English. The goal of the study by Davis et al., which followed four children in two language environments (French and English), was to consider the relation between children's phonological capacities and the words they choose to say in the first 50-word phase. More precisely, they compared the phonological characteristics of the word targets children chose to say to their actual productions of those words, to test the hypothesis that, in the earliest period of language development, children choose words predominantly on the basis of their own speech production abilities. Overall, the results did not provide a clear answer about the presence of a choice that was consistent across languages or across the phonological dimensions tested. Rose and Blackmore address relations between lexical and phonological development, with an emphasis on the notion of phonological contrast. In a systematic comparison between the lexical development of two child learners of English and their acquisition of consonants in syllable onsets, they establish a developmental timeline for each child's onset consonant system, which they compare to the types of phonological contrasts present in the child's expressive vocabulary at each relevant milestone. Their data fail to reveal tangible parallels between the two areas of development.
Zamuner and Thiessen investigate the development of new-word imitation in five English-speaking children between 1 and 2 years of age, taking into consideration both the patterns of the target words and children's productive abilities. The results support models of language development in which not only phonological and lexical representations, but also phonetic representations, play a role. The last two papers are methodologically grounded and shed light on the influence of data collection (amount and situation) on the observed results. The article by Glas et al. investigates, in three different languages (French, American English and Tunisian Arabic), the influence of activity type on the characteristics of child-directed speech, in particular on lexical diversity and grammatical complexity. Finally, Yamaguchi deals with the methodological issue of data sampling, trying to show how much data should be considered in relation to the research questions and type of analyses.
With the work presented in these papers, we hope to have shed some light on the process of lexical development. Of course, much more remains to be done. A more integrative model should be developed, with more factors whose weights and interactions would be measured longitudinally and crosslinguistically. Several other factors that clearly contribute to the effectiveness of lexical learning could be added to the model, such as phonological network (Carlson et al. 2014), imageability (Ma et al. 2009) or associative structure (Hills 2013). At this point, we are aware of only one recent article that presents a model of lexical development and identifies an important number of candidate predictors of word learning from a cross-sectional and crosslinguistic perspective (Braginsky et al. 2017). Braginsky et al. considered nine predictors of word learning (babiness, mean length of utterance or MLU, frequency, concreteness, solo frequency, arousal, length, valence, and final frequency) in children from 10 to 35 months old learning ten different languages (Croatian, Danish, English, French, Italian, Norwegian, Russian, Spanish, Swedish, and Turkish). Both comprehension and production of different lexical categories (nouns, predicates and function words) were taken into consideration. First, they found general consistency in the ordering of predictors across languages: babiness, frequency, MLU, and concreteness were relatively stronger predictors of age of acquisition across languages. Second, predictors varied substantially in their weights across lexical categories. For instance, frequency and concreteness were more important for nouns than for predicates. Third, predictors also changed in relative importance across development. 
For example, the effects of concreteness and frequency increased with age. Finally, factors were more or less predictive according to the type of competence measured; for example, word length was far more predictive for production than for comprehension.
Testing such models further would require not only adding other factors, but also adding data from other languages and other language families. One could also include children from language communities that have too rarely been considered so far. It would also be necessary to work on natural spontaneous data (as opposed to parental report) or experimentally collected data. Expanding both factors and data would allow our research community to confirm and extend our current knowledge.