1. Introduction
Every spoken language relies on an identifiable set of phonological contrasts to express semantic differences between both morphemes and words.¹ Phonological contrastiveness can be illustrated with any pair of semantically different words that differ by a single speech sound (e.g., big ~ pig). At the segmental level, this ‘minimal’ pair of words displays a contrast between /b/ and /p/. At the level of phonological features, the same pair exemplifies a contrast in voicing between these two consonants.²
Given its primacy in language functioning, phonological contrast has been a central element of theoretical enquiry in phonology over the past century and a half (Dresher 2016). Within this literature, contrastiveness is essential to the definition of the relevant phonological ‘building blocks’ for any inventory of phonemes (see also Mackenzie 2009, 2013; Hall 2014; Cowper and Hall 2014). However, and perhaps surprisingly, contrastiveness and the building blocks it entails do not figure as prominently within the literature on phonological development (see Dunbar and Idsardi, to appear, for an overview). The notion of phonological contrast is central to Jakobson's (1941) seminal work on this topic, and centrally relevant to virtually every theory utilized within the acquisition literature that makes reference to phonological features (e.g., Fikkert and Levelt 2008). This notion is also inherent to psycholinguistic models of phonological development that take phonological differences between lexical items to be predictive of learners’ behaviours (e.g., Metsala and Walley 1998, Walley et al. 2003). In related literature, usage-based approaches support the claim that usage frequency can predict patterns of phonological development (e.g., Vihman and Croft 2007, Yamaguchi 2012). However, potential relationships between the development and usage, in both perception and production, of phonological contrasts within the child's lexicon and their impact on the acquisition of phonological productive abilities remain largely unexplored (Stoel-Gammon 2011).
In this article, we add to recent literature aimed at addressing this gap. For two child learners of American English, we provide a systematic comparison between their respective patterns of phonological development, usage frequency in our data sample, and corresponding data on their developing lexicons. In a nutshell, our results suggest that the development of phonological productive abilities is governed by neither usage frequency nor the emergence of phonological contrast within the lexicon; rather, the development of these abilities appears to be best captured in terms of phonetic classes of phones and of the speech articulations they involve. We conclude that while contrast remains an essential notion to explain the shape and functioning of spoken languages, it cannot be considered a driving force in the development of phonological productive abilities. Rather, models of phonological development that place phonetic (perceptual and articulatory) categories at their core are better equipped to account for the development of phonological productive abilities in child language learners.
We begin with a survey of literature documenting relationships between lexical development and the development of phonological abilities in production, and also in the areas of phoneme awareness and word learning, and discuss its implications for the notion of phonological contrast. We then turn to our two longitudinal case studies of child learners of English, and document these children's unfolding phonological productive abilities in onset consonants, which we analyze in light of data on the development and use of these children's respective vocabularies.
2. Background
Different lines of investigation in language acquisition build on the notion of phonological contrast. We begin with a sample of the psycholinguistic literature, which generally draws on experimental evidence from speech perception, speech processing, and word learning abilities. We then turn to studies of phonological development in production, which typically draw on cross-sectional or longitudinal corpus data.
2.1 Psycholinguistic models of phonological development
Lexical contrasts play a central role in the psycholinguistic literature on phonological development, especially in models that build on the content and structure of the lexicon. For example, Charles-Luce and Luce (1990) describe phonological knowledge within the lexicon in terms of between-word similarity, whereby words that share phonological characteristics (e.g., pit, bit, kit, sit, …) cluster together within lexical neighbourhoods (here, the [-ɪt] neighbourhood, which involves contrasts between [p, b, k, s, …]). They observe that toddlers’ early lexical neighbourhoods are generally sparse, as they are often populated by only a single or very few word forms. Charles-Luce and Luce make the claim, later challenged, as we will see below, that these properties of early lexicons result in a low level of representational detail, as broad distinctions suffice to mark functional distinctions between the words contained within sparse neighbourhoods. Metsala (1997) and Metsala and Walley (1998) encapsulate this general hypothesis within the Lexical Restructuring Model: “the representations supporting spoken word recognition become increasingly segmental with spoken vocabulary growth, and this change makes possible explicit access to phonemic units” (Metsala and Walley 1998:89).
This general hypothesis, that contrast within the developing lexicon drives phonological development, offers an intuitive view of phonological development: Early, holistic lexical representations gradually become more refined segmentally, as functionally needed to represent phonological contrasts as they enter the child's lexicon. This hypothesis has provided a basis to capture behaviours in language acquisition and word learning (Walley 1993, Storkel 2001, Hollich et al. 2002, Storkel 2002), word recognition (Metsala 1997, Luce and Pisoni 1998), as well as the development of phonological awareness and its relation to literacy development (Metsala and Walley 1998, Metsala 1999, Walley et al. 2003, Ainsworth et al. 2016). For example, in experimental settings, children typically acquire words from dense lexical neighbourhoods before they acquire words from sparser neighbourhoods, as long as lexical competition effects within dense neighbourhoods do not impede word learning (e.g., Hollich et al. 2002, Stokes et al. 2005).
While one can correlate neighbourhood density with requirements about functional contrastiveness, the actual role of contrast in shaping this behaviour remains unclear, given that words from dense neighbourhoods also tend to have a high frequency of occurrence in the ambient language, pointing to effects of phonotactic probability or, more generally, to pressures imposed by usage frequency (Storkel 2003, 2004; Stokes 2010). The relation between contrast within the lexicon and toddlers’ early phonological abilities is also challenged by Coady and Aslin (2003), who compared the productive lexicons of two learners of English against properties of child-directed speech and adult lexical data. They show, in line with Storkel (2001) and Hollich et al. (2002), that neighbourhood-density effects relate primarily to the set of (frequent) words that children acquire early, given that these words also tend to incorporate the most frequent sounds and sound combinations of the ambient language. Following an earlier line of argument by Dollaghan (1994), Coady and Aslin (2003) contend that in order to learn phonologically similar words present in early lexicons, children's representations must already afford a considerable level of representational detail as a condition for the acquisition of these words, which would otherwise be indistinguishable from one another. Finally, concerning the notion of functional contrast, it remains unclear what types of phonological contrasts end up effectively encoded in the lexicon across the developmental period.
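The notion of a lexical neighbourhood can be made concrete with a minimal sketch. The toy lexicon, the broad transcriptions, and the function names below are assumptions introduced for illustration only, and the sketch counts substitution neighbours alone, which is a simplification of the density measures used in the literature.

```python
# Toy lexicon (hypothetical): orthographic word -> tuple of phones in broad IPA.
LEXICON = {
    "pit": ("p", "ɪ", "t"),
    "bit": ("b", "ɪ", "t"),
    "kit": ("k", "ɪ", "t"),
    "sit": ("s", "ɪ", "t"),
    "dog": ("d", "ɔ", "ɡ"),
}


def is_neighbour(a, b):
    """True if two phone sequences differ by exactly one substituted phone."""
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1


def neighbours(word, lexicon):
    """Return the sorted list of words one substitution away from `word`."""
    target = lexicon[word]
    return sorted(
        other
        for other, phones in lexicon.items()
        if other != word and is_neighbour(target, phones)
    )


def density(word, lexicon):
    """Neighbourhood density, here the number of substitution neighbours."""
    return len(neighbours(word, lexicon))
```

In this toy lexicon, pit sits in a comparatively dense [-ɪt] neighbourhood, whereas dog has no neighbours at all, which is the kind of sparseness reported for early vocabularies.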
Indeed, while lexicon-based approaches to phonological development make claims about the need for functional contrastiveness within the lexicon, these approaches do not make explicit claims about the nature of these contrasts, or whether they relate to the types of place, manner, and voicing distinctions traditionally used within the literature on phonological theory. It is indeed possible that lexical contrasts encoded in children's early lexicons transcend the types of categories defined based on the patterning of adult phonological systems. While it would be premature to draw conclusions about this question based on current knowledge, the literature on phonological perception offers some insight into these questions, as place, manner, and voicing distinctions do appear to play a role in predicting toddlers’ (and adults’) gradient patterns of phonological perception.
From a perceptual standpoint, the currently accepted view is indeed that early lexical representations contain a high level of perceptual detail, except in phonological contexts less conducive to accurate phonological perception, such as syllable codas and unstressed syllables more generally (e.g., Vihman et al. 2004, Swingley 2005, Zamuner 2013; see also Zamuner 2011 for a discussion of this literature). The refined perceptual abilities of toddlers are also highlighted by White and Morgan (2008), who conducted a series of experiments during which 19-month-old toddlers were presented with pairs of known versus unfamiliar (pictures of) objects, and simultaneously exposed to auditory stimuli falling into one of five categories: correct pronunciations (e.g., [ʃu] for ‘shoe’), mispronunciations affecting a single feature (e.g., [fu], in which only place of articulation deviates from original ‘shoe’), two features (e.g., [vu], with deviant place and voicing) and three features (e.g., [ɡu], with deviant place, continuancy and voicing) and, finally, a completely novel, unrelated form (e.g., ‘dax’). The results show that these toddlers are in fact extremely sensitive to mispronunciations and, further, that this sensitivity co-varies with the relative degree of phonological deviance between a familiar word and its mispronounced variant, in ways which are qualitatively similar to adult listeners (e.g., Milberg et al. 1988, Connine et al. 1997 on gradient perceptual effects in adults).
This hypothesis is further supported by Swingley (2009), who summarizes a sizeable body of experimental evidence where no correlations were found between young toddlers’ vocabulary sizes and their reactions to mispronounced words (see also Swingley et al. 1999, Bailey and Plunkett 2002, Swingley 2003). In our study below, we reach similar conclusions from the perspective of phonological production, as we fail to make correct predictions about the development of phonological productive abilities based on learners’ vocabulary data.
In sum, while the psycholinguistic literature reveals lexicon-driven effects in the area of word learning and word recognition, it is unclear whether these effects can be related to phonological contrast, a concept which, as the above studies suggest, cannot be easily applied to the content of early phonological lexicons. However, given that toddlers are equipped with both refined perceptual abilities and powerful learning mechanisms, they arguably can engage in learning the sounds and sound distributions present in their target language(s) (Curtin et al. 2001, Johnson 2016, Zamuner et al. 2016), in relative independence from lexical development (e.g., Maye and Gerken 2000, Maye et al. 2002).
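The graded notion of phonological deviance in these stimuli can be sketched as a count of mismatching features. The three-feature scheme and the feature values below are simplifying assumptions made for illustration; they are not the coding used in the experiments.

```python
# Simplified feature values (an assumption for illustration):
# place of articulation, voicing, and continuancy only.
FEATURES = {
    "ʃ": {"place": "coronal", "voiced": False, "continuant": True},
    "f": {"place": "labial", "voiced": False, "continuant": True},
    "v": {"place": "labial", "voiced": True, "continuant": True},
    "ɡ": {"place": "dorsal", "voiced": True, "continuant": False},
}


def feature_distance(a, b):
    """Count the features on which two consonants differ."""
    return sum(FEATURES[a][name] != FEATURES[b][name] for name in FEATURES[a])


def rank_by_deviance(target, candidates):
    """Order candidate onsets from least to most deviant relative to `target`."""
    return sorted(candidates, key=lambda c: feature_distance(target, c))
```

Under these assumptions, the onsets of [fu], [vu] and [ɡu] lie one, two and three features away from the onset of [ʃu], mirroring the one-, two- and three-feature mispronunciation conditions.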
2.2 Formal models of phonological development
As mentioned in the introduction, the notion of contrastiveness is central to traditional descriptions of the phonological systems of adult languages. Surprisingly, however, this notion is seldom discussed in the literature on the development of phonological abilities, even in models anchored in formal theories of phonology. Within this literature, scholars assume (more or less explicitly) that children engage in the production of words which are represented within their lexicons, and focus on descriptive or formal properties of the child's phonological (segmental and/or prosodic) productive abilities (e.g., Smith 1973, Spencer 1986, Barlow 1997, Freitas 1997, Pater 1997, Rose 2000, Goad and Rose 2004, dos Santos 2007, Almeida 2011, Yamaguchi 2012, van ’t Veer 2015). Perhaps the most direct references to the notion of contrast come from works on the acquisition of phonological representation in Dutch-learning children (Fikkert 1994, Levelt 1994, van der Feest 2007, Fikkert and Levelt 2008, van der Feest and Fikkert 2015). Outside of this body of work, contrast within the children's actual lexicons tends to be assessed indirectly, based on considerations about the phonological and phonetic properties of the target language. From there, phonological categories posited within the literature on phonological theory (e.g., phones, syllable structure constituents, phonological features) typically constitute the starting point for analysis; these units are assumed to constitute the target units to be acquired by the child.
Through systematic comparisons between these target units and their renditions by the child, phonologists have uncovered systematic behaviours, generally described as part of developmental stages (see Smith 1973 for an early example). While this method has provided substantial insight into phonological development, many questions remain, concerning, for instance, the nature of these units, their emergence, or what triggers children's transitions from one stage to the next.
Nonetheless, this literature suggests intriguing parallels between lexical and phonological development. An example of this is the phenomenon of lexical selection and avoidance, whereby children limit their attempts at words of select phonological shapes (prosodic or segmental), while they appear to systematically avoid words of other shapes (e.g., Ferguson and Farwell 1975, Leonard et al. 1981, Schwartz and Leonard 1982, Stoel-Gammon and Cooper 1984, Stoel-Gammon 2011, Vihman 2014; see also Kehoe 2015 in the context of bilingual first language acquisition). In a related line of inquiry, scholars have more recently begun to incorporate evidence from studies in speech perception, to uncover how perception can affect lexical and phonological development (van der Feest 2007, Zamuner 2011, Curtin and Zamuner 2014, van der Feest and Fikkert 2015), to investigate the effects of functional pressures such as usage frequency on acquisition (Levelt et al. 1999, Sosa and Stoel-Gammon 2012, Ota and Green 2013, Vihman 2014), and to develop models of phonological development informed by speech phonetics more generally (Vihman and Croft 2007; Menn et al. 2009, 2013; McAllister Byun et al. 2016). We return to the latter in our discussion below.
2.3 Relations between phonological development and the developing lexicon
Building on observations from both bodies of literature surveyed above, Stoel-Gammon (2011) points out that the relation between children's lexical knowledge and the development of their phonological systems remains relatively obscure (see also Saffran and Graf Estes 2006 and Curtin and Zamuner 2014 for related discussions on infant speech perception and word learning).
Based on cross-sectional data on phonological and lexical development in English, Sosa and Stoel-Gammon (2006) associate degrees of intra-word variability in phonological production with levels of phonological development, which they correlate with vocabulary size: as children develop both their phonological systems and their lexicons, their phonological productions become more accurate and less variable. In a related study, Sosa and Stoel-Gammon (2012) show that lexical frequency, as a measure of usage, inversely correlates with intra-word variability in production, but that lexical frequency does not correlate with phonological development. Also addressing the potential influence of usage on phonological development, Ota and Green (2013), based on longitudinal data tracking English-learning children's phonological systems and frequency properties of their caregivers’ speech, show that the frequency of occurrence in caregiver speech of particular phonological structures (e.g., complex syllable onsets), as opposed to the frequency of individual words or phones, can be generally predictive of the rate of phonological development for these structures. However, Ota and Green also report contradictory results, namely concerning the development of (C)Cr clusters, whose development was in fact driven by the articulatory development of the consonant /r/, independent of input frequency pressures: “Taken together with the significant main effects of cluster types, these findings underscore the independent role played by phonological structures in the development of sound production” (Ota and Green 2013:561). More recently, Zamuner et al. (2015), based on the longitudinal tracking of one child learner of French, uncovered similar trends as they relate behaviours observed in the data to the syllable composition of frequently-occurring French words.
However, the findings reported above, in which both phonological and usage-related factors contribute different sources of explanation for behaviours observed in child language phonology, do not provide information about the types of contrasts that children encode in their lexicons across the developmental period, as these studies lack either longitudinal tracking of the evidence or independent data on the children's developing lexicons across the relevant developmental period. Our study below offers an additional step in this direction. Before we move to it, we revisit some of the theoretical and methodological challenges at hand.
3. Interim discussion
The body of research summarized above suggests various relationships, or lack thereof, between lexical development and usage, on the one hand, and the development of word recognition, learning, and phonological production abilities, on the other. Given the central importance of contrast in mainstream models of phonology, it is reasonable to consider that phonological contrast between the words making up the child's lexicon should play a role in the development of phonological productive abilities. However, this possibility faces a number of potentially interfering considerations. The first concerns what the definition of contrast, as a theoretical construct, actually involves, especially from the perspective of first-language learners. Among other considerations, while phonological contrasts may be accessible early to children in the area of speech perception, the reproduction of these contrasts in speech involves several additional mechanisms, for example the mapping of perceptual categories into distinctive articulatory gestures, and the combination of these gestures in speech production (see McAllister Byun et al. 2016 for a recent summary of the relevant literature). The second is the fact that lexical neighbourhoods, and the notion of minimal pairs in general, offer an over-simplified picture of the facts. Not all contrasts between minimal pairs are equal, as they might involve more or less similar speech sound substitutions (e.g., pet ~ bet vs. set ~ met). Further complicating the picture is the fact that even individual phonemes involve different motor plans when produced across different positions (e.g., tab vs. bat, where the closure and release phases of the two consonants are contingent on their position within the word; see Pierrehumbert 2003). More generally, the notion of minimal pair is itself questionable at the theoretical level, as we discuss next.
3.1 Minimal pairs as by-products of phonological distributions
As many of the studies cited above show, the vocabularies of young children learning languages like English and Dutch show surprisingly few actual minimal pairs: “these results do not support a strong role for minimal pairs in helping to refine children's knowledge of the words that were tested” (Swingley 2009: 265). Taken more generally, the traditional notion of minimal pair should be considered more of an easy-to-explain shortcut for the instructor teaching phonology than an actual device for the (child) language learner. This concept is in fact particularly convenient in those languages that, like English and most European languages in general, have isolating word structures, a relatively small consonant inventory, and many commonly-used CVC(V) word forms. However, minimal pairs can be elusive in polysynthetic or agglutinating languages such as Cree, Inuktitut, Mayan Quiche, or Turkish, where word forms are often phonologically and semantically too complex to lend themselves to descriptions based on minimal pairs. While these languages may display minimal pairs at the morpheme level, these morphemes appear in isolation rarely or not at all, and the complete meanings of the relevant morphemes and morpheme combinations may only be acquired at a much later age than are the speech productive abilities required to produce the phonological contrasts they rely on (Courtney and Saville-Troike 2002, Rose and Brittain 2011). Minimal pairs are also elusive in languages with large phonological inventories, such as Kinyarwanda (Kimenyi 1979) or !Xóõ (Bradfield 2014).
At a more basic level, minimal pairs of words can in fact be taken as the by-product of phonological distributions of phones, whether they represent contrastive phonemes or contextual allophones, which are best understood in relation to prosodic properties such as positions within syllables and words, and relative to word stress. Excluding the phenomenon of phonetic free variation, a phone is contrastive to the extent that its occurrence is not phonologically predictable within a given environment, irrespective of the attestation of minimal pairs of words illustrating this contrast. Generalizations about distributions of phones can be attained based on relatively small samples of phonetic data, independent of the morphosyntactic or semantic properties of the language. Phone distributions are in fact central to virtually all phonological learning models (Pierrehumbert 2003, Lin and Mielke 2008, Mielke 2008,³ Boomershine et al. 2008, Munson et al. 2011)⁴ in line with experimental evidence on the learning of phonological contrasts (Maye and Gerken 2000, Maye et al. 2002) and the acquisition of phonotactic knowledge (Zamuner 2013 and references therein). All of these considerations call for empirical verification, looking into both phonological and related lexical development. We now turn to some of the challenges inherent to such investigations.
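The distributional view of contrast described here can be sketched without reference to minimal pairs. In the minimal sketch below, the observations, the environment labels, and the function names are all hypothetical, and sharing an environment is treated only as a necessary condition for contrast, since free variation is set aside as in the text.

```python
from collections import defaultdict

# Hypothetical (phone, environment) observations; the labels are illustrative
# and do not come from the corpora discussed in this article.
TOKENS = [
    ("t", "stressed onset"),
    ("d", "stressed onset"),
    ("t", "word-final coda"),
    ("d", "word-final coda"),
    ("ɾ", "intervocalic, before unstressed vowel"),
]


def environments(tokens):
    """Map each phone to the set of environments in which it is attested."""
    envs = defaultdict(set)
    for phone, env in tokens:
        envs[phone].add(env)
    return dict(envs)


def shared_environments(a, b, tokens):
    """Environments in which both phones occur."""
    envs = environments(tokens)
    return envs.get(a, set()) & envs.get(b, set())


def potentially_contrastive(a, b, tokens):
    """True if neither phone is predictable from the environment alone."""
    return bool(shared_environments(a, b, tokens))
```

With these toy observations, [t] and [d] overlap in two environments and are therefore candidates for contrast, while [ɾ] never shares an environment with [t] and patterns as a contextual variant.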
3.2 Methodological challenges and solutions
Assessment of the development of phonological abilities among language learners has been made much easier in recent years, in particular given the software programs and corpora available through the PhonBank project (http://phonbank.talkbank.org; Rose and MacWhinney 2014). However, assessments of the composition of young children's lexicons remain methodologically difficult to obtain. In a nutshell, how can one precisely assess the level of lexical knowledge of any individual speaker, especially in the case of young learners?
As reported by Stokes et al. (2011), vocabulary development has largely been assessed through various versions and adaptations of the MacArthur-Bates Communicative Development Inventories (henceforth CDI; Fenson et al. 1993, Fenson et al. 2007). CDI inventories consist of periodic (typically monthly) caregiver reports on their children's word usage, which can be used to assess lexical and related phonological characteristics of these children's lexicons (see also Storkel 2004, 2006; Zamuner 2009; Stokes 2010; Stokes et al. 2011; Zamuner et al. 2015). However, CDI inventories are limited both by children's potentially low rates of communicative behaviours and/or by unsystematic compliance with the data recording protocol on the part of the children's caregivers. These inventories are thus likely to underestimate the true extent of the child's lexicon (Paul 2007). Nonetheless, CDI data have also been shown to be generally representative of the most prominent phonological properties of children's lexicons (Rescorla et al. 2005, Heilmann et al. 2005). This level of detail, while limited in some respects, is the best metric currently available for the longitudinal tracking of expressive lexical knowledge.
Keeping these challenges in mind, we turn now to our current study, which builds on much of the research discussed above.
4. Current study
The work reported on here compares the longitudinal development of phonological abilities in two children learning English against CDI data documenting the types of phonological contrasts they had in their lexicons across the same developmental period. In line with Sosa and Stoel-Gammon (2006, 2012) and Ota and Green (2013), our analyses show that the development of phonological abilities is largely independent of the number and types of contrasts that children represent in their lexicons. Further, we provide additional evidence, after Demuth (2007), Levelt and van Oostendorp (2007), Rose (2009), and Rose and Inkelas (2011), that usage frequency is also not a reliable predictor of the development of segmental productive abilities.
4.1 Methodology
We owe the datasets we analyze below to earlier empirical studies by Dr. Barbara Davis, who conducted parallel documentations of the vocabularies (through CDI reports) and phonological productive abilities (through naturalistic data recordings) of typically-developing children learning American English. We selected two of these children for analysis, Georgia and Charlotte, based on the combined availability of both CDI and speech production data over a time span during which they acquired most segmental properties of their target language. These datasets are part of the English-Davis corpus available through PhonBank <http://phonbank.talkbank.org/>. Original publications based on these data include Davis and MacNeilage (1995) and Davis et al. (2002).
We first summarize the methods employed by Davis and colleagues in the collection and transcription of these data. We then describe how we organized the corpora for the purpose of our study.
4.1.1 Data collection and transcription
The participants were identified through informal referral from the surrounding community. Normal speech and language development, including absence of hearing disorders, was established through parental report. The CDI data were collected according to the standard protocol for CDI studies: parents were encouraged to record, at monthly intervals, the words they identified from their children's speech productions, using two supporting inventory questionnaires: CDI-Words and Gestures and CDI-Word and Sentences (Fenson et al. Reference Fenson, Dale, Reznick and Bates1993, Fenson et al. Reference Fenson, Marchman and Thal2007).
CDI data were collected on 37 reports documenting Georgia's expressive vocabulary development between the ages of 0;8.26 and 2;11.25, which includes 13 reports based on the Words and Gestures questionnaire, collected until the child turned 1;5, and 24 reports based on the Words and Sentences questionnaire. The CDI data for Charlotte consist of 32 reports collected between the child's ages of 1;0.26 and 2;7.23, and include two reports from Words and Gestures and 30 reports based on Words and Sentences, the latter used from the time the child was 1;3.14. In parallel to CDI data collection, actual speech production samples were gathered through naturalistic recordings, collected during the period spanning the children's late (canonical) babbling and early word production stages, until they were approximately 2;11. These recordings took place in the children's homes, while they were interacting with their parents or other individuals, also with the experimenter taking part in the interaction at times, however in ways which remained natural and observationally as neutral as possible.
The children's babbles and actual word productions were then transcribed using a combination of IPA characters and diacritics. These transcriptions were later converted for use by the Phon software program (Rose et al. 2006, Rose and MacWhinney 2014), and were linked (time-aligned) to the original audio recordings, which were consulted whenever it was deemed important to verify aspects of the original transcriptions.
4.1.2 Corpus preparation and data mining
The CDI reports were provided to us in the form of orthographic data transcripts in Phon format. In order to attain a maximally representative vocabulary profile of each child's lexicon, we supplemented the CDI vocabulary data with the words we found in their speech corpora at each relevant age, and which had not been documented within the CDI reports. As reported in section 4.2.1, this provided a noticeable addition to our dataset. Using a dictionary of pronounced forms (in citation form) built into Phon, we then assigned IPA transcriptions to each orthographic word represented within each dataset, which provided us with an estimate of the types of phonological units and contrasts potentially represented in the children's vocabularies throughout the development period.Footnote 5
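The vocabulary-building step described above can be sketched as follows. This is only an illustration under stated assumptions: the function names and the three-word dictionary are hypothetical, and the study relied on the pronunciation dictionary built into Phon rather than on a hand-written mapping.

```python
# Hypothetical toy dictionary of citation forms (an assumption of this sketch);
# the study used the dictionary of pronounced forms built into Phon.
CITATION_FORMS = {
    "ball": "bɑl",
    "dog": "dɑɡ",
    "milk": "mɪlk",
}


def build_vocabulary(cdi_words, production_words):
    """Combine CDI-reported words with words attested in the speech corpus."""
    return {w.lower() for w in cdi_words} | {w.lower() for w in production_words}


def assign_ipa(vocabulary, dictionary=CITATION_FORMS):
    """Map each word to a citation-form IPA target; None marks a missing entry."""
    return {word: dictionary.get(word) for word in sorted(vocabulary)}


# Hypothetical month of data: two CDI words plus three words from recordings.
vocab = build_vocabulary(["ball", "dog"], ["Dog", "milk", "uh-oh"])
targets = assign_ipa(vocab)
```

Words without a dictionary entry (here `uh-oh`) are kept with a `None` target in this sketch, so that they can be transcribed by hand rather than silently dropped.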
Using algorithms built into Phon, we then labelled all the IPA transcriptions for syllable positions and obtained one-to-one phone alignments between IPA Target (model) forms and their corresponding IPA Actual (produced) forms, which we then verified manually for maximal accuracy. These aspects of coding are illustrated in Figure 1, a screen shot of a Phon record from Georgia's production corpus. Using these alignments, we tracked all patterns of segmental production, substitution, deletion or epenthesis that occurred in the data.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20181103122154793-0880:S0008413118000129:S0008413118000129_fig1g.gif?pub-status=live)
Figure 1: Sample coding within Phon: Syllabification (through colour coding) and phone alignment between target and actual forms
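Once target and actual forms are aligned phone by phone, the tracking of accurate productions, substitutions, deletions and epentheses reduces to a comparison within each aligned pair. The sketch below illustrates this logic only; it is not Phon's alignment algorithm, and the use of `None` for an empty alignment slot is an assumption made for the example.

```python
def classify_alignment(pairs):
    """Label each aligned (target, actual) phone pair.

    None stands for an empty alignment slot (an assumption of this sketch).
    """
    labels = []
    for target, actual in pairs:
        if target is None and actual is None:
            continue  # nothing aligned in this slot
        if target is None:
            labels.append("epenthesis")    # phone produced without a target
        elif actual is None:
            labels.append("deletion")      # target phone not produced
        elif target == actual:
            labels.append("accurate")
        else:
            labels.append("substitution")
    return labels


# Hypothetical record: target [ɹɛd] produced as [wɛ].
example = [("ɹ", "w"), ("ɛ", "ɛ"), ("d", None)]
labels = classify_alignment(example)
```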
Finally, in order to facilitate our comparisons of CDI against production data, we divided each dataset into one-month periods. After we completed these preparatory steps, we analyzed the corpora in an attempt to uncover relationships between the lexical and phonological properties of the lexical and production data.
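The division into one-month periods can be illustrated with the age notation used in this section (years;months.days, e.g., 1;0.26). The sketch below assumes that a period is defined by the number of completed months, so that days are ignored; the function names are hypothetical.

```python
import re
from collections import defaultdict

# Ages follow the years;months.days convention, with the days part optional.
AGE_PATTERN = re.compile(r"^(\d+);(\d+)(?:\.(\d+))?$")


def age_in_months(age):
    """Convert an age such as '1;0.26' into completed months (days ignored)."""
    match = AGE_PATTERN.match(age)
    if match is None:
        raise ValueError(f"unrecognized age: {age!r}")
    years, months = int(match.group(1)), int(match.group(2))
    return 12 * years + months


def bin_by_month(records):
    """Group (age, item) records into one-month periods keyed by age in months."""
    bins = defaultdict(list)
    for age, item in records:
        bins[age_in_months(age)].append(item)
    return dict(bins)


# Hypothetical records drawn from either CDI reports or recording sessions.
periods = bin_by_month([("1;0.26", "ball"), ("1;0.03", "dog"), ("1;1.10", "milk")])
```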
4.2 Results
We begin our descriptions of the two datasets with an overview of both children's general levels of lexical development and overall linguistic productivity. We then continue with more detailed information about the unfolding of their phonological productive abilities in syllable onsets, which we compare to the relevant phonological content inferred from the CDI data.
4.2.1 General measures
As we can see in the next two figures, Georgia was more precocious than Charlotte in the development of her productive vocabulary. Figure 2 compares the two children based on the CDI data alone, while Figure 3 compares them based on the combined CDI and production data. A closer look at Figure 2 suggests a jump in vocabulary size for Georgia between 1;09 and 1;10, which, however, is not as salient when all the available data are considered in Figure 3.Footnote 6
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20181103122154793-0880:S0008413118000129:S0008413118000129_fig2g.gif?pub-status=live)
Figure 2: Vocabulary size (CDI data only)
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20181103122154793-0880:S0008413118000129:S0008413118000129_fig3g.gif?pub-status=live)
Figure 3: Vocabulary size (number of word types recorded in CDI and production data)
The faster onset and higher rate of vocabulary development displayed by Georgia are also matched by her overall higher level of linguistic productivity, as illustrated in Figure 4 through a comparison between the two children's mean lengths of utterance throughout the period studied.Footnote 7
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20181103122154793-0880:S0008413118000129:S0008413118000129_fig4g.gif?pub-status=live)
Figure 4: Mean Length of Utterance
As we can see from this last figure, the two children display qualitatively similar developmental curves, in spite of the quantitative differences between their respective datasets. The two children also developed their productive abilities in rather similar ways, as we discuss next.
4.2.2 Phonological development vs. lexical development
We begin with a summary of Georgia's and Charlotte's patterns in the development of consonants within singleton (i.e., one-consonant) syllable onsets. While other studies focusing on lexical neighbourhood development generally restrict themselves to particular word shapes, for example CVC word forms (see Zamuner 2009), our aim differs in that we are interested in studying the development of phonological productive abilities in light of the phonological contrasts involved in lexical data. For the sake of feasibility, we limited our research to consonants in singleton onsets. We opted for this syllable position based on robust cross-linguistic evidence that onsets typically offer a privileged position for the development of phonological contrasts (see Spencer 1986, Fikkert 1994, Rose 2000), a fact also verified independently in the case of Georgia (Day 2014). Also, while we considered singleton onsets in all word positions (except for /t, d/ in the flapping context), mastery was first attained in word-initial onsets for every consonant. Throughout our study, we consider a consonant to be mastered by the child when it is produced accurately in the majority (over 50%) of attempts during a one-month period, provided that the same minimum threshold of accuracy is maintained across the subsequent months.Footnote 8
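The mastery criterion just stated can be made explicit with a short sketch. The data layout (accurate productions and attempts per month) and the decision to skip months without any attempt are assumptions of this example, not details reported in the study.

```python
def age_of_mastery(monthly_counts, threshold=0.5):
    """Return the first month from which accuracy stays above the threshold.

    monthly_counts maps an age in months to (accurate, attempts).
    Returns None if the criterion is never met and then maintained.
    """
    mastered_from = None
    for month in sorted(monthly_counts):
        accurate, attempts = monthly_counts[month]
        if attempts == 0:
            continue  # no evidence for this month (an assumption of this sketch)
        if accurate / attempts > threshold:
            if mastered_from is None:
                mastered_from = month  # candidate age of mastery
        else:
            mastered_from = None  # threshold not maintained; start over
    return mastered_from


# Hypothetical counts for one consonant: (accurate productions, attempts).
counts = {12: (1, 4), 13: (3, 4), 14: (2, 4), 15: (3, 5), 16: (4, 4)}
mastery = age_of_mastery(counts)
```

In this hypothetical series, the accuracy at month 13 exceeds 50% but drops back to exactly 50% at month 14, so the candidate is discarded and mastery is dated from month 15, the first month after which the threshold is maintained.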
In the interest of simplicity, we first report these data across three arbitrary time periods: the consonants acquired before the age of 2;0, those acquired after that age, and those which were not yet mastered by the end of the documented period, at 2;11.Footnote 9 As we can see in Table 1, for both children, early-acquired consonants include all target oral and nasal stops, glides, and voiceless, non-dental fricatives. In contrast to this, both children display slower development for voiced fricatives, liquids, and interdentals. Finally, concerning the development of affricates, Charlotte displays a more drawn-out developmental pattern than Georgia.
Table 1: Georgia's and Charlotte's general phonological development
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20181103122154793-0880:S0008413118000129:S0008413118000129_tab1.gif?pub-status=live)
Keeping these general observations in mind, we now compare the development of consonants in production with that of phonological contrast as implied by each child's CDI data.
Table 2 provides a summary of Georgia's data for all target onset consonants, ordered by the age at which they were mastered or, for the consonants not mastered by the end of the observation period, listed at the bottom of the table. The first four columns list (a) the relevant target consonants, followed by the child's age (b) when these consonants were first attested in the expressive vocabulary, (c) when they were first produced in a target-like fashion, and (d) when these consonants were mastered, i.e., produced accurately over 50% of the time by the child. The remaining columns provide (e) the number of times each consonant was present in singleton onsets within attempted forms, and (f) the number of attestations of each consonant in syllable onsets within the child's recorded vocabulary at the age of mastery, as well as (g) a breakdown of these attestations across five general vowel types following these onset consonants, as a measure of contrastiveness in onset position, represented by A (low vowels), E (mid front vowels), I (high front vowels), O (mid back vowels) and U (high back vowels).
Table 2: Georgia's development of consonants in singleton onsets
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20181103122154793-0880:S0008413118000129:S0008413118000129_tab2.gif?pub-status=live)
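The breakdown by following vowel type reported in column (g) can be sketched as a simple tally. The assignment of individual IPA vowels to the five types below is an assumption made for illustration (the text only names the types), and vowels outside the mapping, such as central vowels, are counted as "other".

```python
from collections import Counter, defaultdict

# Hypothetical assignment of IPA vowels to the five general vowel types.
VOWEL_TYPE = {
    "æ": "A", "ɑ": "A", "a": "A",   # low vowels
    "ɛ": "E", "e": "E",             # mid front vowels
    "i": "I", "ɪ": "I",             # high front vowels
    "o": "O", "ɔ": "O",             # mid back vowels
    "u": "U", "ʊ": "U",             # high back vowels
}


def onset_vowel_breakdown(onset_vowel_pairs):
    """Tally, for each onset consonant, the types of vowels that follow it."""
    breakdown = defaultdict(Counter)
    for consonant, vowel in onset_vowel_pairs:
        breakdown[consonant][VOWEL_TYPE.get(vowel, "other")] += 1
    return {consonant: dict(counts) for consonant, counts in breakdown.items()}


# Hypothetical (onset consonant, following vowel) pairs from a vocabulary.
pairs = [("b", "ɑ"), ("b", "i"), ("b", "ɪ"), ("d", "ɑ"), ("d", "ə")]
table = onset_vowel_breakdown(pairs)
```

The number of distinct vowel types recorded for a consonant then gives a rough count of the contrastive environments in which it appears.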
The first general observation we can draw from this table is that the age at which phones were first attested within the lexicon, irrespective of the type of vowels that follow within the lexical form, does not predict order of acquisition. For instance, although [b] and [ɹ] were both attested early in Georgia's lexicon (at 0;08), [b] was mastered early, at 1;00, whereas [ɹ] was not acquired until almost two years later, at 2;10. More generally, we can see that most target consonants were attested relatively early within the lexicon, at which point they either showed mastery or a slower pattern of development, discussed in section 4.3.
Table 3 replicates the analysis for Charlotte. In line with her lower rate of vocabulary development and mean lengths of utterance throughout the observation period, as observed in section 4.2.1, Charlotte acquired each target consonant at a slower rate than Georgia. However, as already reported in Table 1 above, aside from more noticeable difficulties with target affricates, the unfolding of her articulatory abilities is qualitatively similar to that of Georgia.
Table 3: Charlotte's development of consonants in singleton onsets
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20181103122154793-0880:S0008413118000129:S0008413118000129_tab3.gif?pub-status=live)
Taken together, the data for both children fail to support the hypothesis that contrast within the developing lexicon drives phonological development. At the time of mastery, some consonants are found in a large number of words and contrastive environments in each of the children's respective vocabularies, while others – especially those acquired early – appear in only one or a few relevant contexts.
4.2.3 Phonological development vs. usage frequency
As usage-based approaches to language development suggest, it is possible that the figures reported above were skewed by frequency pressures, independently of phonological content within the children's lexicons, obscuring any contrast-based effects within the lexicon. However, apart from phones which are both frequent in the language and acquired early (e.g., [p, m]) or relatively infrequent and acquired later (e.g., [v, z]), this possibility is not supported by the numbers of attempts available in each corpus: some consonants, in particular obstruent and nasal stops, are acquired early despite being recorded only a handful of times in the children's attempted and actual productions by the time of mastery, while others, such as the liquids [l] and [ɹ], show late development in spite of high frequencies of occurrence within target and attempted forms. In fact, as both Day (2014, 2015) and Blackmore (2016) suggest, the only prediction borne out by these datasets is that, by the time a given consonant is mastered, this consonant was already attested in at least some actual productions, either within babbles or actual word forms. While this observation may be taken as support for McCune and Vihman's (2001) suggestion that productivity predicts development, Day (2014, 2015) also shows that this relation is not entirely straightforward: a child may very well produce a phone in babbles without being able to reproduce a similar phone (at least as perceived by human transcribers) in actual word productions. Productivity of a given phone thus appears to be a necessary but not sufficient condition for its mastery within word productions (see also Sosa and Stoel-Gammon 2006, 2012; Sosa 2013; and references therein for related discussion).
These observations are in line with outcomes of other studies available in the literature, which generally fail to support usage frequency as the driving force behind phonological development in production, even though frequency pressures may at times push development patterns in particular directions (Kehoe and Lleó 2003, Demuth 2007, Edwards and Beckman 2008, Rose 2009, Ota and Green 2013; see also Brown 1973 for an early critical discussion of frequency-based explanations). Returning to the data reported above in Tables 2 and 3, we can hypothesize that the slow development of the voiced fricatives [z, v], relative to their voiceless counterparts, was at least in part influenced by the low frequency of these consonants in the language. These results for Georgia and Charlotte suggest a role for practice in phonological development, a factor highlighted in many recent analyses of phonological development. However, the way and extent to which practice actually influences phonological development remains to be explored in more detail, for example in terms of how it can help the child shape stable production patterns for different phonological categories (Sosa and Stoel-Gammon 2006, 2012), a point taken up again in section 5.
4.3 The emergence of productive phonology as an independent system
The results obtained in this study thus contradict hypotheses that assign a central role to lexical contrastiveness or usage frequency in the development of phonological productive abilities. That is not to say that the lexicon is entirely irrelevant to phonological development: beyond default articulations dictated by biomechanical aspects of the vocal tract (MacNeilage and Davis 1990a,b, 2000), the commonly held view that sounds must be represented within lexical forms in order to be acquired remains central.
The observations reported above point more convincingly to the relative independence of representational units and mechanisms involved in phonological production. The existence of separate representational domains within the child's system has recently been formalized within the PRIMIR framework (Processing Rich Information from Multi-dimensional Interactive Representations; Werker and Curtin 2005, Curtin et al. 2011). Within PRIMIR, the input signal is processed along formally independent yet interrelated levels, called ‘planes’. While the original proposal focuses on the ‘General-Perceptual’, ‘Word’, and ‘Phoneme’ planes, these authors suggest that the system can accommodate as many planes as needed to encode relevant properties of the ambient language, as they are identified by the learner. Unavoidably, this must include representations for the types of motor-acoustic pairings central to the reproduction of perceptual phonological categories in spoken forms. This is the locus of the A-map model (McAllister Byun et al. 2016), a novel proposal which supplements PRIMIR in the area of phonological production. Building on many of the considerations behind the Linked-Attractor model (Menn et al. 2009, 2013), one of the aims of the A-map is to formally capture the development of abstract associations between the perceptual categories identified by the child and the articulatory dimensions involved in their reproduction within speech forms.
Inherent to every formal approach to phonological feature representation since at least Jakobson (1941), this type of acoustic-articulatory pairing is also central to non-linear (quantal) approaches to segmental representation based on phonetic evidence (e.g., Halle and Stevens 1959, 1962, 1979; Keyser and Stevens 2006; Stevens and Keyser 2010). Such pairings are also considered central in most recent discussions about the nature and origins of phonological features (Clements and Ridouane 2006; Mielke 2008, 2011; Lin and Mielke 2008; Cowper and Hall 2014; Hall 2014).
Recall from Table 1 that both Georgia and Charlotte acquired all target oral and nasal stops as well as glides relatively early, and also mastered non-dental or voiceless fricatives before dental and/or voiced ones. Recall as well that Charlotte showed difficulties mastering the production of target [ʧ] and [ʤ]. As we shift our focus away from the lexicon and consider these data from a phonological standpoint, it is striking that all of these observations point to natural classes of phones which were already firmly established within seminal works underlying modern phonological theory (Jakobson et al. 1952, Chomsky and Halle 1968, Trubetzkoy 1969). Despite potential effects from frequency such as those noted above for [z] and [v], the developmental patterns observed in both children's datasets are generally in line with expectations about consonantal development, namely that obstruent stops, nasals, and glides be mastered early, and that fricatives, affricates, and liquids be acquired at later stages, also including the articulatory contrast between alveolar and dental fricatives (e.g., Smit 1993, Bernhardt and Stemberger 1998, Sosa 2013). The same observation applies to the main patterns of substitution displayed by each child during pre-mastery stages, listed in Table 4. As can be seen there, these patterns involve phonetic dimensions expected from the properties of the target phone, for example with regard to obstruent voicing, approximant rhoticity and laterality, or, specific to Charlotte, the fricative release required in the production of affricates.
Table 4: Georgia's and Charlotte's main substitution patterns prior to mastery
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20181103122154793-0880:S0008413118000129:S0008413118000129_tab4.gif?pub-status=live)
The nature of these substitution patterns further supports a view of the phonological production system, and its acquisition, as relatively independent of the content and structure of the speaker's lexicon. While the lexicon supplies the learner with target word forms, including the sounds and sound combinations contained within these forms, the acquisition of these phonological units in production is neither dependent on the structure of the child's developing lexicon nor predictable from usage frequency alone.
5. Discussion
The results of our investigation converge nicely with those from infant speech perception studies addressing lexical knowledge, reported in section 2: the content of the child's lexicon cannot be taken as governing patterns of phonological development. Instead, the degree of phonetic detail memorized as part of word forms stored within the lexicon appears to supply the relevant information, independent of functional contrastiveness or usage frequency. As also reported in section 2, while early, sparse lexicons cannot supply children with every possible phonotactic distribution allowed by the target language (for example, word-initial [sf] clusters, while possible in English, are unlikely to be represented in the lexicons of young English-learning children), early lexicons are likely to provide perceptual targets for the most prominent sounds and sound combinations present in the language, and thus supply children with many objects to reproduce through their own speech-motor articulations. As McAllister Byun et al. (2016) argue, articulatory productivity depends on the stability of sensory-motor mappings across word productions, even if these mappings are inaccurate and result in phonological substitutions. Together, these observations also suggest that functional contrastiveness, and its relation to phonological awareness, arguably emerge at a later stage, as the child gradually climbs the phonological ‘ladder of abstraction’ (Munson et al. 2011; see also Pierrehumbert 2003, 2016).
Returning briefly to the phenomenon of selection and avoidance discussed in our introduction, the results and discussion above suggest that children's awareness of their own phonological articulatory abilities, rather than the actual content of their lexicons, might be at the source of these behaviours (Ferguson and Farwell 1975; Menn et al. 2009, 2013; Vihman 2014).Footnote 10 This also implies a certain degree of separation between the lexical and phonological components of the child's developing system.
Importantly, we do not mean to dismiss the theoretical or practical relevance of the lexicon and lexical neighbourhoods in other areas of phonological representation and processing. Dense lexical neighbourhoods such as those that characterize the lexicons of more advanced child learners or adult speakers constitute powerful networks for the processing of phonological representations, whose effects have been noted in tasks such as word learning, lexical retrieval, and the detection of speech errors (Storkel 2006, 2011; Storkel et al. 2006; White and Morgan 2008; see also Stamer and Vitevitch 2012; Chan and Vitevitch 2015 for similar observations in second-language development). Behavioural differences in phonological processing observed across different age groups may also be tied to the relative degree of inter-connectedness within lexicons, for which lexicon size does matter (see also Pierrehumbert 2003, Munson et al. 2005, and references therein for additional discussion).
Finally, our argument is compatible with the view recently expressed by Sosa and Stoel-Gammon (2012: 605) that “[i]t may be that in young children, both metrics [vocabulary size and phonological knowledge] assess the same construct: the degree of abstract phonemic knowledge”. While this must be true if the productive lexicon is used as a metric of vocabulary size, it remains unclear whether this claim can be extended to the child's receptive lexicon, the size of which is arguably larger than any assessment we can obtain from measures of the productive vocabulary, across all developmental stages. This question, as well as further explorations of the relation between lexical knowledge and phonological development, calls for the incorporation of additional measures of lexical knowledge, which we hope to consider in future research.