INTRODUCTION
When infants transition from babbling to first words, they face an additional functional load in managing both their lexical and production systems for meaningful speech (Boysson-Bardies, Vihman, Roug-Hellichius, Durand, Landberg & Arao, 1992; McCune & Vihman, 2001; Stoel-Gammon, 2011). Since infants' speech production capabilities do not advance markedly at the onset of first words (Davis, MacNeilage & Matyear, 2002), they must rely on their existing production capacities in matching phonetic forms to adult word forms (see also Vihman & Croft, 2007).
This study investigated the target words in infants' spontaneous speech in order to understand the phonological characteristics of developing lexicons. The focus of the study was non-adjacent consonant sequences in target words consisting of Consonant–Vowel–Consonant (C1VC2) forms in American English. The data were from infants' spontaneous daily speech during the period when they produced primarily one word at a time (i.e. first-word period). The following sections describe commonalities observed across studies on infant speech production, lexical and phonological development, and infant speech perception.
Phonetic similarities between babbling and first words in speech production
Previous studies indicate that infants' first word productions share many phonetic similarities with their own babbling (Boysson-Bardies & Vihman, 1991; Davis et al., 2002; MacNeilage, Davis & Matyear, 1997; Vihman, Macken, Miller, Simmons & Miller, 1985). In both babbling and early word productions, labial and coronal (including dental, alveolar, and palatal) consonants are reported as the most common places of articulation, while stops, nasals, and glides are reported as common manners of articulation.
In babbling, infants typically begin by producing reduplicated syllables (e.g. Oller, 2000), and consonant repetitions (e.g. [baba]) continue to be prominent in both babbling (Davis & MacNeilage, 1995) and word production (Davis et al., 2002; Kim & Davis, 2015). Since an adult lexicon consists of a wide variety of syllable types, increasing production capacities are critical for infants in matching the characteristics of a variety of target words in their ambient language.
Consonant variegations in first-word productions
Consonant variegations and non-adjacent consonant sequences (e.g. [bawa], [bada]) have been studied less extensively than the acquisition of individual segments and syllable types. In a study of first-word productions in ten English-learning infants, Davis et al. (2002) reported that only about 30% of C1VC2 or Consonant–Vowel–Consonant–Vowel (C1VC2V) word types showed consonant variegation (e.g. [bada]), and approximately 70% of the first words had repeated consonants (e.g. [baba] or [dada]). Early patterns of consonant variegations also showed a preponderance of manner over place changes (i.e. [bawa] more than [bada]; Davis et al., 2002). Davis et al. attributed these patterns to ease of articulation, because manner changes can be accomplished by degree of jaw opening within words, while place variegations require active tongue movements within word forms.
Kim and Davis (2015) investigated consonant repetitions in non-adjacent consonant sequences within words in children's production output. Data were from ten children between 1;0 and 3;0, collected as part of a larger project at the University of Texas at Austin. The authors analyzed non-adjacent consonant sequences with a focus on movement-based principles related to the consonant repetitions observed. ‘Place repetition’ was defined as a production in which one consonant repeats the place of articulation (i.e. labial, coronal, or dorsal) of the other consonant. For example, [ɡʌk] for duck indicates dorsal repetition, while [dʌd] for duck indicates coronal repetition (Kim & Davis, 2015). Repetition patterns in C1VC2 and C1VC2V word forms were analyzed from the overall corpus. In terms of place repetition, labial consonants triggered repetition (e.g. [bap] for top) more frequently than coronal and dorsal consonants. Coronal repetitions (e.g. [tidi] for kitty) were the second most frequent, and dorsal repetition (e.g. [kiki] for kitty) was the least frequent type of place repetition (a hierarchy of labial > coronal > dorsal).
The results of Kim and Davis (2015) indicated that young children produce consonant repetitions well beyond the babbling and first-word periods. Results also suggest that target word forms with variegated consonant sequences might be challenging for them to produce. While Davis et al. (2002) and Kim and Davis (2015) both investigated production output patterns, the question arises of how these consonant patterns in infants' production may relate to patterns in the target words that infants choose to say.
Characteristics of target words during the first-word period
To understand the phonological characteristics of word forms that English-learning children attempt, Stoel-Gammon (1998) analyzed the phonological characteristics of a representative lexicon using words in the MacArthur Communicative Development Inventories (M-CDI; Fenson et al., 1993). A total of 598 words were selected as lexical items typically acquired by age 2;6. The results indicated that simple monosyllables were frequent in the M-CDI words analyzed. Segmentally, bilabial and coronal consonants were the most frequent places of articulation, and stop consonants were the most frequent manner of articulation in the M-CDI words. Stoel-Gammon compared the phonological characteristics of the M-CDI words with production data reported in Templin (1957). She found that phonological features occurring frequently in the M-CDI words (e.g. monosyllables, labials, alveolars, and stops) matched the features reported as being acquired early by Templin (1957). Based on these findings, Stoel-Gammon (1998) concluded that early lexical development and productive phonology are closely related.
In addition to simple segmental patterns, there may be some position-specific patterns for labials and coronals (specifically alveolar consonants) within words. Stoel-Gammon (1998) reported that both labial and alveolar consonants were frequent in word-initial position, but that alveolar consonants were by far the most frequent in word-final position in the M-CDI words for infants and toddlers from birth to 2;6. In a cross-linguistic study of English, French, Swedish, and Japanese, Boysson-Bardies et al. (1992) also reported that labial consonants were most frequent (followed by coronals and dorsals) in initial position in target words in all four languages.
Based on the review of these previous findings, Stoel-Gammon (2011) has further postulated that infants' choice of words is not only related to but influenced by their phonological development at early stages (birth to 2;6). Her arguments are based on phonetic similarities between babbling and first-word productions (e.g. Vihman et al., 1985), as well as the connections between lexical and phonological characteristics (e.g. Maekawa & Storkel, 2006; Stoel-Gammon, 1998; Storkel, 2009).
Labial–coronal effect in production
Previous studies have also indicated that non-adjacent labial–coronal sequences (e.g. pat) are more frequent than coronal–labial sequences (e.g. top), termed the ‘labial–coronal effect’, in first-word production output in American English (MacNeilage, Davis, Kinney & Matyear, 1999, 2000) and Brazilian Portuguese (Teixeira & Davis, 2002). Ingram (1974) originally hypothesized that the first consonant (C1) is simply more front than the second consonant (C2) in C1VC2 forms when infants advance from more basic C1V forms or reduplicated C1VC2V forms. MacNeilage and Davis (2000) proposed that C1V with a labial consonant in the initial position (e.g. [ma] or [wa]) is a basic syllable type in early speech output. They suggested that the labial–coronal effect originates from a fundamental aspect of movement organization, because it is ‘simpler’ to start with a labial consonant (no tongue movement) in the initial position and to add a coronal consonant later in the word (tongue movement required).
Rochet-Capellan and Schwartz (2007) also suggested a labial–coronal sequence is more ‘articulatorily stable’ than a coronal–labial sequence. They studied articulator movements in rapid syllable repetition by adult French speakers, and found that both labial–coronal and coronal–labial sequences tended to become labial–coronal sequences when produced with an increased speech rate. Both Rochet-Capellan and Schwartz (2007) and MacNeilage and Davis (2000) provide production-based explanations for the observed labial–coronal effect.
Labial–coronal effect in other domains
The labial–coronal effect has also been demonstrated in other domains beyond speech production. In a lexical analysis of ten different languages, MacNeilage et al. (1999) reported that labial–coronal sequences occur more frequently than coronal–labial sequences in eight of the ten languages, including English and French. Studies using a large corpus of French showed that labial–coronal sequences are much more frequent than coronal–labial sequences (Gonzalez-Gomez, Hayashi, Tsuji, Mazuka & Nazzi, 2014; Gonzalez-Gomez & Nazzi, 2012). The pattern is more complicated in Japanese, which MacNeilage et al. (1999) reported as a language that did not show a labial–coronal effect. Using two large corpora, Tsuji, Gonzalez-Gomez, Medina, Nazzi, and Mazuka (2012) showed that labial–coronal sequences are indeed more frequent than coronal–labial sequences in the overall adult Japanese lexicon. However, when the words were divided by manners of articulation between plosives and nasals, two different patterns emerged in Japanese. Words with nasal consonants showed a labial–coronal effect, while coronal–labial sequences were more frequent than labial–coronal sequences in words consisting of plosives (Tsuji et al., 2012).
In addition, a series of experimental studies have indicated that infants are sensitive to the labial–coronal effect in speech perception. For instance, French infants showed a preference for frequent (labial–coronal) sequences over infrequent (coronal–labial) sequences in C1VC2 word forms at 0;10 (Gonzalez-Gomez & Nazzi, 2012). Gonzalez-Gomez and Nazzi (2013) demonstrated that French infants were able to segment labial–coronal target words, but not coronal–labial target words, at 0;10. French infants were able to segment both labial–coronal and coronal–labial target words at 1;1 (Gonzalez-Gomez & Nazzi, 2013).
Similar frequency effects were also observed in a word-learning study in French. Fourteen-month-old French infants were able to learn new words that consisted of a frequent (labial–coronal) sequence, although they were not able to learn words with an infrequent (coronal–labial) form (Gonzalez-Gomez, Poltrock & Nazzi, 2013). In contrast, 16-month-old infants were able to learn both frequent (labial–coronal) and infrequent (coronal–labial) forms, suggesting that phonotactic characteristics of the ambient language influence early word learning.
The observed labial–coronal effect in speech perception demonstrates that it is necessary to study non-adjacent phonotactic dependencies separately from the acquisition of individual segments, because coronals occur much more frequently than labials as individual segments in the French lexicon (Gonzalez-Gomez & Nazzi, 2012). These studies on infant speech perception indicate that frequencies of phonotactic dependencies could affect later word learning, because infants may be able to segment and learn words with phonotactically frequent patterns at an earlier age than those with infrequent patterns.
Purpose of this study
As Stoel-Gammon (2011) postulated, the characteristics of target words likely reflect both ambient language characteristics and infants' production capacities. The current study aims to investigate the phonological characteristics of early lexicons by analyzing non-adjacent consonant sequence patterns in target words during the first-word period.
This study focused on consonant sequences in target words consisting of C1VC2 form (e.g. dog, cat). Unless otherwise specified, ‘consonant sequence’ indicates C1 and C2 in a C1VC2 word form, rather than adjacent consonants in a cluster (e.g. C1C2V as in snow). The data for this study were target words attempted by infants in functional and spontaneous contexts.
Predictions
The following three predictions are made to evaluate the postulate that selection of target words is influenced by production abilities (Stoel-Gammon, 2011), and that certain consonants and consonant sequences are more accessible to production systems than other segments and consonant sequences (Davis et al., 2002; MacNeilage & Davis, 2000):
1. Place-repeated sequences: Target words with Labial–V–Labial (Lab_V_Lab) and Coronal–V–Coronal (Cor_V_Cor) sequences are predicted to be frequent during the first-word period. This prediction is based on findings indicating that reduplicated syllables are more basic forms than variegated syllables in babbling (MacNeilage & Davis, 2000), and that consonant repetitions are common in first-word productions (Kim & Davis, 2015).
2. Labial–coronal effect: Target words with a Labial–V–Coronal (Lab_V_Cor) sequence (e.g. pat) are predicted to be more frequent than those with a Coronal–V–Labial (Cor_V_Lab) sequence (e.g. top). Infants may initially produce a LabV syllable without a final consonant (e.g. [ma] or [ba]) (MacNeilage & Davis, 2000) and later add a coronal consonant at the end of the syllable in production output (MacNeilage et al., 1999, 2000). It is predicted that early expressive vocabularies also include more words with a Lab_V_Cor sequence than those with a Cor_V_Lab sequence, reflecting the influence of infants' production capacities on their lexical choices.
3. Consonant sequences with dorsals: Target word types with consonant sequences involving dorsal consonants are predicted to be infrequent. This prediction is based on findings that dorsal consonants are infrequent in production output during the first-word period (e.g. Davis et al., 2002). These types include the following sequences: Labial–V–Dorsal (Lab_V_Dor), Coronal–V–Dorsal (Cor_V_Dor), Dorsal–V–Labial (Dor_V_Lab), Dorsal–V–Coronal (Dor_V_Cor), and Dorsal–V–Dorsal (Dor_V_Dor).
METHOD
Participants
The data analyzed were from the English-Davis corpus in the Child Language Data Exchange System (CHILDES) (MacWhinney, 2000). These data were originally collected for a larger longitudinal study on babbling and early speech (e.g. Davis et al., 2002). There are data from twenty-one participants in the English-Davis corpus. Of these, data from three participants (Sa, Sad, and Wi) were excluded because their samples were mostly from the babbling period. Data from the remaining eighteen participants were analyzed.
Table 1 summarizes participant demographics. Eight of the participants were male, and ten were female. All were from monolingual English-speaking families residing in the Austin, Texas area in the United States. Typical development was established based on results within normal limits from a hearing screening, the Battelle Developmental Inventory (Guidubaldi, Newborg, Stock, Svinicki & Wneck, 1984), as well as a parent case history. These procedures are described in Davis et al. (2002).
Data collection
Data for Aa, An, J, and Kat were originally collected for a dissertation study of the first-word period (Jasuta, 1987). The sessions for these participants were audio-recorded bi-weekly at their home or in the University of Texas Speech and Hearing Center. Jasuta transcribed all of the data from these four participants. Only identifiable words recognized by the parent and observer were transcribed. Jasuta (1987) reported inter-transcriber reliability of 98% and intra-transcriber reliability of 94% for consonants.
For the remaining fourteen participants, the sessions were audio-recorded weekly in the home environment (see Davis et al., 2002). Data collection started at approximately 0;7 and continued until approximately 3;0. The data were phonetically transcribed by the primary transcriber for each participant. Davis et al. reported transcription reliability of 76% for consonants.
For all eighteen participants, the data consist of functional and spontaneous speech. The criteria for a word were that the vocalization had a clear referent and an identifiable adult target word. Context-bound and infant-specific forms were not analyzed (see Vihman & McCune, 1994). Transcriptions were formatted in CHAT (Codes for the Human Analysis of Transcripts) and analyzed using CLAN (Computerized Language Analysis; MacWhinney, 2000).
Data analysis
The data analyzed were from the first-word period. The first-word period for each participant was determined by identifying the onset and endpoint in the transcribed data.
The onset of the first-word period was determined based on the following criteria:
(a) The session included an identifiable word based on observer and parent agreement.
(b) The participant produced at least two tokens of the same word type (e.g. two tokens of dog), or one token each of two different word types (e.g. dog and cat). In other words, one token of one target word in one session was not sufficient to designate the session as the onset of the first-word period.
The endpoint of the first-word period was determined from the onset of word combinations, defined as the first session in which the participant produced three different word combinations, following Aoyama, Peters, and Winchester (2010) and Snow (1994). Two-word combinations were considered productive when the two words in the combination were different (e.g. no cookie, but not cookie cookie). Common adult phrases that are produced as one unit were excluded (e.g. what's that /wada/, thank you /daju/).
The last session before the onset of word combinations was designated as the endpoint of the first-word period.
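The onset and endpoint criteria can be sketched in code as follows. This is only an illustration, not the procedure carried out with CLAN: the session representation (a dict with `words` and `combos`), the function names, and the `FROZEN_PHRASES` set are assumptions introduced here, and the fallback to the last available session when no combinations are found is likewise an assumption.

```python
from collections import Counter

# Hypothetical set of adult phrases produced as one unit; these do not count
# as productive word combinations (the two examples come from the text).
FROZEN_PHRASES = {("what's", "that"), ("thank", "you")}


def meets_onset_criteria(words):
    """Criteria (a) and (b): two tokens of one word type, or two word types."""
    counts = Counter(words)
    return any(n >= 2 for n in counts.values()) or len(counts) >= 2


def productive_combinations(combos):
    """Distinct two-word combinations whose words differ, minus frozen phrases."""
    return {(w1, w2) for w1, w2 in combos
            if w1 != w2 and (w1, w2) not in FROZEN_PHRASES}


def first_word_period(sessions):
    """Return (onset, endpoint) as indices into `sessions`.

    Each session is assumed to be a dict with "words" (identifiable target
    words, one entry per token) and "combos" (two-word combinations as tuples).
    """
    onset = next((i for i, s in enumerate(sessions)
                  if meets_onset_criteria(s["words"])), None)
    if onset is None:
        return None, None
    combo_onset = next(
        (i for i, s in enumerate(sessions)
         if i > onset and len(productive_combinations(s["combos"])) >= 3),
        None)
    # Endpoint: the last session before word combinations begin, or (as an
    # assumption) the last available session if the criterion is never met.
    endpoint = len(sessions) - 1 if combo_onset is None else combo_onset - 1
    return onset, endpoint
```

Under this sketch, a session containing only cookie cookie and thank you would contribute no productive combinations, so it could not mark the onset of word combinations.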
A total of 281 sessions from the eighteen participants were analyzed for their first-word period (see Table 1). Note that the chronological ages in Table 1 may not always be the exact ages at which the participants began producing meaningful words or began combining two words. For some participants, the sessions analyzed may encompass their entire first-word period, while for others the data may cover only part of it. For example, in the first available session for An, he already had three word types, indicating he was producing words before his first available data session at 1;8·05. For Mi, the last available data point was when he was 1;6·19. It was likely that his first-word period continued after the last session. Although the data may not represent all of the participants' entire first-word period, the target words in the analysis were produced during the first-word period for each of the participants.
There were a total of 21,604 tokens of word productions in the 281 transcribed samples. The focus of the analysis was C1VC2 target words produced by the participants in these sessions. The C1VC2 form was chosen as a first exemplar of non-adjacent consonant sequences in this period. The C1VC2 was also the most frequent syllable type in M-CDI words in Stoel-Gammon (1998). Using CLAN, all of the C1VC2 target words from eighteen participants from the English-Davis corpus were identified in the first-word period. C1VC2 word forms were excluded if they were repeated (e.g. look look, night-night) or produced within a longer string (e.g. what that).
Of the 21,604 tokens of word productions, 3,800 tokens (17·6%) were C1VC2 target words (such as mat, cat, and dog). The consonants in the target words were then grouped as labials (/p/, /b/, /m/, /w/, /f/, /v/), coronals (/θ/, /ð/, /t/, /d/, /s/, /z/, /n/, /l/, /r/, /ʃ/, /ʒ/, /tʃ/, /dʒ/, /j/), or dorsals (/k/, /ɡ/, /ŋ/). Words with an initial /h/ (e.g. ham, hat, hug) were excluded from the analysis, because the articulation of /h/ does not involve lip or tongue movement. These classifications of consonantal place of articulation were generally based on Davis et al. (2002). There are some differences between this study and previous studies, including whether /w/ was included as a labial (see Tsuji et al., 2012) and whether fricatives and liquids were included (see Davis et al., 2002).
These target words were then categorized into nine groups based on the consonant sequence patterns within the target word: Lab_V_Lab (e.g. pop, mom), Lab_V_Cor (e.g. mat, pad), Lab_V_Dor (e.g. make, pack), Cor_V_Lab (e.g. top, Tom), Cor_V_Cor (e.g. dad, not), Cor_V_Dor (e.g. dog, take), Dor_V_Lab (e.g. cape, gap), Dor_V_Cor (e.g. cat, can), and Dor_V_Dor (e.g. cake, king). The number of word types and tokens of each consonant sequence category were calculated for each participant.
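The grouping of consonants by place and the nine-way categorization can be illustrated with a short sketch. The phoneme sets follow the groupings listed above; the function names, the `(word, c1, c2)` input format, and the acceptance of ASCII `g` alongside IPA `ɡ` are assumptions made for illustration, and this is not the CLAN procedure used in the study.

```python
from collections import Counter

LABIALS = {"p", "b", "m", "w", "f", "v"}
CORONALS = {"θ", "ð", "t", "d", "s", "z", "n", "l", "r",
            "ʃ", "ʒ", "tʃ", "dʒ", "j"}
# ASCII "g" is accepted next to IPA "ɡ" purely as a convenience (assumption).
DORSALS = {"k", "ɡ", "g", "ŋ"}


def place_of(consonant):
    """Return 'Lab', 'Cor', or 'Dor'; None for unclassified consonants such as /h/."""
    if consonant in LABIALS:
        return "Lab"
    if consonant in CORONALS:
        return "Cor"
    if consonant in DORSALS:
        return "Dor"
    return None


def sequence_category(c1, c2):
    """Label a C1VC2 target word, e.g. ('m', 't') -> 'Lab_V_Cor'."""
    p1, p2 = place_of(c1), place_of(c2)
    if p1 is None or p2 is None:
        return None  # excluded from the analysis (e.g. initial /h/)
    return f"{p1}_V_{p2}"


def tally(tokens):
    """Count word types and tokens per category for one participant.

    `tokens` is an iterable of (word, c1, c2) tuples, one per produced token.
    """
    type_counts, token_counts = Counter(), Counter()
    seen_words = set()
    for word, c1, c2 in tokens:
        category = sequence_category(c1, c2)
        if category is None:
            continue
        token_counts[category] += 1
        if word not in seen_words:
            seen_words.add(word)
            type_counts[category] += 1
    return type_counts, token_counts
```

For instance, two tokens of dog and one token of mat would yield one Cor_V_Dor type with two tokens and one Lab_V_Cor type with one token.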
RESULTS
Overall patterns
There were 749 word types and 3,800 word tokens produced by the eighteen participants in the data analyzed. Note that the frequencies of word types indicate a tally of the types produced by each participant, and not a number of unique word types produced by the eighteen participants. Non-adjacent consonant sequence patterns in C1VC2 target words are summarized in Table 2 (by numbers of word types and tokens) and Figure 1 (by percentages of word types and tokens).
The overall patterns were fairly consistent between word types and word tokens. The most frequent types were Lab_V_Cor and Cor_V_Cor. A Lab_V_Cor sequence occurred in 224 word types (29·9%) and 1,238 tokens (32·6%), and a Cor_V_Cor sequence occurred in 226 word types (30·2%) and 1,181 tokens (31·1%). The least frequent consonant sequence pattern was Dor_V_Dor. It occurred in just 6 word types (0·8%) and 12 tokens (0·3%). Dor_V_Lab was the second least frequent, occurring in 14 word types (1·9%) and 26 tokens (0·7%). In addition, target words that ended with a coronal consonant (Lab_V_Cor, Cor_V_Cor, Dor_V_Cor types combined) were frequent in both word types (525 word types or 70·1%) and word tokens (2,694 tokens or 70·9%). The consistency between word types and tokens indicates that these results are not simply due to a few word types (such as mom or dad) that appear frequently.
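The percentages reported above can be rechecked from the counts given in the text. The sketch below uses only those counts; the variable names are introduced here, and the Dor_V_Cor figures are obtained by subtraction under the assumption that the combined coronal-final totals are the plain sum of the three coronal-final categories.

```python
TOTAL_PRODUCTIONS = 21604          # all word tokens in the 281 sessions
TOTAL_TYPES, TOTAL_TOKENS = 749, 3800

# (word types, word tokens) as reported in the text
COUNTS = {
    "Lab_V_Cor": (224, 1238),
    "Cor_V_Cor": (226, 1181),
    "Dor_V_Lab": (14, 26),
    "Dor_V_Dor": (6, 12),
}
CORONAL_FINAL = (525, 2694)        # Lab_V_Cor + Cor_V_Cor + Dor_V_Cor combined


def pct(n, total):
    """Percentage rounded to one decimal place."""
    return round(100 * n / total, 1)


# Dor_V_Cor is not stated separately in the text; derive it by subtraction.
dor_v_cor = tuple(
    CORONAL_FINAL[i] - COUNTS["Lab_V_Cor"][i] - COUNTS["Cor_V_Cor"][i]
    for i in range(2)
)

for name, (n_types, n_tokens) in COUNTS.items():
    print(name, pct(n_types, TOTAL_TYPES), pct(n_tokens, TOTAL_TOKENS))
print("Dor_V_Cor (derived)", dor_v_cor,
      pct(dor_v_cor[0], TOTAL_TYPES), pct(dor_v_cor[1], TOTAL_TOKENS))
print("C1VC2 share of all tokens", pct(TOTAL_TOKENS, TOTAL_PRODUCTIONS))
```

The derived Dor_V_Cor counts (75 types, 275 tokens) correspond to 10·0% of types and 7·2% of tokens.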
Non-parametric Friedman tests were conducted to evaluate the data statistically. Two separate tests were conducted on the word types and tokens. These tests indicated that there were statistically significant differences among the nine types of consonant sequences in target word types and tokens (χ²(8, N = 18) = 100·78 and 91·61, respectively, p < ·001). Wilcoxon signed-rank tests were conducted to examine the specific differences among the nine types of consonant sequences. There were 36 different 2-way comparisons among the nine types of consonant sequences. The relevant results of Wilcoxon tests are reported in the following sections. To be conservative, only the differences that were found statistically significant in the analysis for both word types and tokens are reported.
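To make the structure of this analysis concrete, the sketch below enumerates the pairwise comparisons among nine categories and computes a basic Friedman statistic from per-participant counts. It is an illustration under stated assumptions, not the software used for the reported values: the function names are introduced here, and the statistic omits the correction for ties that statistical packages typically apply, so it would not be expected to reproduce the reported χ² values exactly.

```python
from itertools import combinations

PLACES = ("Lab", "Cor", "Dor")
CATEGORIES = [f"{c1}_V_{c2}" for c1 in PLACES for c2 in PLACES]

# Every unordered pair of the nine categories: 9 * 8 / 2 = 36 comparisons.
PAIRS = list(combinations(CATEGORIES, 2))


def average_ranks(values):
    """Rank values from 1 to k, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for idx in order[i:j + 1]:
            ranks[idx] = mean_rank
        i = j + 1
    return ranks


def friedman_statistic(rows):
    """Friedman chi-square (no tie correction) with k - 1 degrees of freedom.

    `rows` holds one list per participant with k counts, one per category.
    """
    n, k = len(rows), len(rows[0])
    rank_sums = [0.0] * k
    for row in rows:
        for j, rank in enumerate(average_ranks(row)):
            rank_sums[j] += rank
    return (12 / (n * k * (k + 1)) * sum(r * r for r in rank_sums)
            - 3 * n * (k + 1))
```

With nine categories the test has 8 degrees of freedom, matching the value reported above, and each of the 36 pairs in `PAIRS` would be a candidate for a follow-up signed-rank comparison.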
Place-repeated Lab_V_Lab, Cor_V_Cor, Dor_V_Dor types
The frequencies of target words with place-repeated sequences varied considerably. The most frequent place-repeated type was Cor_V_Cor, which was one of the most frequent types out of all nine categories (30·2% in types and 31·1% in tokens; Table 2 and Figure 1). The Lab_V_Lab sequence was not frequent (5·5% in types and 4·6% in tokens), even though it is a place-repeated form and labial consonants are reported to be acquired early in infant speech production studies (e.g. MacNeilage et al., 1997). The Dor_V_Dor sequence was the least frequent of all nine categories, occurring less than 1% of the time in both types and tokens.
Wilcoxon signed-rank tests indicated that Cor_V_Cor was significantly more frequent than all other sequences except for Lab_V_Cor (Z = –3·12 to –3·73, p = ·001 to ·002). This analysis also indicated that Dor_V_Dor was significantly less frequent than all other sequences except for Dor_V_Lab (Z = –2·58 to –3·73, p = ·001 to ·01).
As can be seen in Table 1, the number of word types and tokens varied greatly among the eighteen participants (compare P and Re, for example). It is possible that one frequent word type by one participant could affect the frequency of a certain consonant sequence type. To address this possibility, individual data are shown for the Lab_V_Lab, Cor_V_Cor, and Dor_V_Dor target words in Table 3. Target words with a Cor_V_Cor sequence were more frequent than those with Lab_V_Lab and Dor_V_Dor sequences in seventeen out of eighteen of the participants, even though the actual numbers and percentages varied among them. The only exception was P, who produced the smallest number of tokens (only two types of Lab_V_Lab target words, and one type of Cor_V_Cor target words). For the least frequent Dor_V_Dor sequence, thirteen out of eighteen participants did not produce any target words of this type (Table 3). Individual data showed that a high frequency of Cor_V_Cor words was not due to a few favorite words repeatedly produced by one participant.
In sum, both group data (Table 2, Figure 1) and individual data (Table 3) showed that word targets with a place-repeated Cor_V_Cor consonant sequence occurred much more frequently than those with Lab_V_Lab and Dor_V_Dor consonant sequences in early word targets. This was demonstrated by the consistency across participants as well as consistency between word types and tokens.
Variegated sequence types: Lab_V_Cor and Cor_V_Lab sequences
Target words with a Lab_V_Cor sequence occurred significantly more frequently than those with a Cor_V_Lab sequence in both target word types (29·9% vs. 4·3%) and target word tokens (32·6% vs. 2·3%, see Figure 1). Lab_V_Cor target words were one of the most frequent categories along with Cor_V_Cor target words. In contrast, target words with a Cor_V_Lab consonant sequence were one of the least frequent categories, along with Dor_V_Lab, Dor_V_Dor, and Lab_V_Lab target words (see Table 2 and Figure 1). Wilcoxon signed-rank tests indicated that Lab_V_Cor was more frequent than Cor_V_Lab (Z = –3·73 and –3·72, p = ·001). Individual data shown in Table 4 also demonstrated that the pattern was consistent across all eighteen participants. All of them, some more than others, attempted more target words with a Lab_V_Cor consonant sequence than those with a Cor_V_Lab consonant sequence.
Sequence types including dorsals: Lab_V_Dor, Cor_V_Dor, Dor_V_Lab, Dor_V_Cor, and Dor_V_Dor sequences
Target words with a non-adjacent consonant sequence that involved a dorsal (Lab_V_Dor, Cor_V_Dor, Dor_V_Lab, Dor_V_Cor, and Dor_V_Dor) were infrequent overall (see Table 2 and Figure 1). As noted earlier, Dor_V_Dor was the least frequent of all nine sequence types (0·8% in target word types and 0·3% in tokens). The second least frequent sequence was the Dor_V_Lab sequence (1·9% in types and 0·7% in tokens), which was significantly less frequent than six of the other types (all except Cor_V_Lab and Dor_V_Dor) (Z = –2·96 to –3·73, p = ·001 to ·003). Frequencies of target words with Lab_V_Dor, Cor_V_Dor, and Dor_V_Cor sequences were in between (ranging from 8·8% to 10·0% in word types and from 7·2% to 10·9% in tokens; see Figure 1).
Individual data for the target words with sequences involving a dorsal consonant are shown in Table 5 (Lab_V_Dor and Dor_V_Lab) and Table 6 (Cor_V_Dor and Dor_V_Cor). As with the Lab_V_Lab, Cor_V_Cor, and Dor_V_Dor sequences (Table 3), the patterns in the individual data appear consistent with the overall patterns in Table 2. Target words with a Dor_V_Lab sequence were less frequent than words with Dor_V_Cor, Cor_V_Dor, and Lab_V_Dor sequences in almost all participants (see Tables 5 and 6). Target words with a Dor_V_Lab sequence were infrequent across all participants, and no participant had more than two types of Dor_V_Lab target words (see Table 5).
DISCUSSION
The goal of the present study was to investigate non-adjacent consonant sequence patterns in target words during the first-word period. This period of speech and language development has been characterized as having a greater functional load compared to the babbling period due to the requirement for infants to interface production capacities with word meanings (Stoel-Gammon, 2011). Three predictions were tested based on a postulate that selection of word targets is influenced by production abilities (Stoel-Gammon, 2011) and previous findings from infants' production capacities (e.g. Davis et al., 2002). First, target words with place-repeated sequences (Lab_V_Lab and Cor_V_Cor) were predicted to be frequent. Second, target words with a Lab_V_Cor sequence were predicted to be more frequent than those with a Cor_V_Lab sequence. Third, target words with consonant sequences involving dorsal consonants were predicted to be infrequent.
The results of this study indicate that target words with some consonant sequences (Cor_V_Cor, Lab_V_Cor) were much more frequent than those with others (e.g. Dor_V_Dor), at least in C1VC2 word forms (approximately 18% of the overall data). A strong labial–coronal effect was found, as target words with a Lab_V_Cor sequence were significantly more frequent than those with a Cor_V_Lab sequence. In addition, consonant sequences involving an infrequent place of articulation in production output (i.e. dorsal; see Davis et al., 2002; Stoel-Gammon, 1985) were also infrequent in the target words.
Overall, these results demonstrate that phonological characteristics of word targets share similarities with production capacities reported in previous studies (e.g. Davis et al., 2002). These findings are consistent with Stoel-Gammon's (1998) findings for M-CDI words, indicating that phonological characteristics of the early lexicons are similar to characteristics of productive phonology. While Stoel-Gammon analyzed individual segments and syllable shapes in M-CDI words, the current study analyzed non-adjacent consonant sequences in target words attempted in functional and spontaneous speech. The consistency between the present findings and those of Stoel-Gammon provides additional support for Stoel-Gammon's (2011) postulate that lexical development is influenced by production capacities in this early period of language development.
Place-repeated sequences
Place-repeated labial and coronal sequences (Lab_V_Lab and Cor_V_Cor) were predicted to be more frequent than other sequences. This prediction was based on previous results indicating that reduplicated syllables are common in babbling and during the first-word period (Kim & Davis, 2015). The predicted pattern was observed for the coronal repetition sequence (Cor_V_Cor). Target words with a Cor_V_Cor sequence were the most frequent of all nine sequence types (30·2% in types and 31·1% in tokens; see Figure 1). However, target words with a Lab_V_Lab sequence were among the least frequent of the nine types (5·5% in types and 4·6% in tokens), and were significantly less frequent than those with a Cor_V_Cor sequence.
Several factors may help account for the high frequency of Cor_V_Cor target words. First, there are more coronal (/θ ð t d s z n l ʃ ʒ tʃ dʒ j r/) phonemes than labial (/p b m w f v/) and dorsal (/k ɡ ŋ/) phonemes in English (Ladefoged & Johnson, 2015). Since many consonant phonemes can appear in either C1 or C2 position of a Cor_V_Cor sequence, there are more possible combinations for a Cor_V_Cor sequence than for labial or dorsal repetitions in English. Second, coronals are the most frequent sound category in M-CDI words (Stoel-Gammon, 1998) and also the most common place of articulation across many languages including English (Keating, 1991; Ladefoged & Maddieson, 1996). Many of the coronal phonemes also appear frequently in English (Delattre, 1965). Thus, the early lexicons seem to reflect phonological characteristics of the adult lexicon in English.
The relative infrequency of Lab_V_Lab sequences is somewhat surprising, as labial consonants are reported to be among the earliest sounds to appear in infants' speech output (Davis & MacNeilage, 1995; Oller, 2000; Vihman et al., 1985). One possibility is that labial consonants do not occur very frequently in the C2 position. Approximately 70% of the target words in this study ended with a coronal consonant. This pattern is consistent with previous studies showing that coronal consonants occurred more frequently than labials in the final position of M-CDI words (Stoel-Gammon, 1998) and in actual productions of first words (Davis et al., 2002). Thus, Lab_V_Lab sequences may be infrequent due to position-specific characteristics of labial consonants.
In a study of non-adjacent consonant repetition, Kim and Davis (2015) reported frequent labial repetitions in production output during this period. They found that labial consonants triggered repetition in sequences including labial–coronal, coronal–labial, dorsal–labial, and labial–dorsal sequences. Based on Kim and Davis (2015), it is possible that actual productions of the words analyzed in this study included more Lab_V_Lab sequences. Both the current outcome and results from Kim and Davis (2015) on place-repeated sequences (Lab_V_Lab and Cor_V_Cor sequences) indicate that it is important to analyze sequential characteristics within words in addition to the acquisition of individual segment inventories in early lexicons and production output.
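The inventory-size argument can be made concrete with a short sketch. The Python below is illustrative only and is not part of the study's analysis: it takes the three place inventories exactly as listed above, labels a C1VC2 target with one of the nine sequence types, and multiplies inventory sizes to get an upper bound on the number of distinct C1–C2 pairs per type. The names `INVENTORY`, `sequence_type`, and `UPPER_BOUND` are hypothetical, and the bounds ignore positional restrictions (for instance, /ŋ/ does not occur word-initially in English), so the real number of attested combinations is smaller.

```python
from itertools import product

# Place inventories as listed in the text (assumed complete for this sketch;
# consonants outside these three places, such as /h/, are not covered).
INVENTORY = {
    "Lab": "p b m w f v".split(),
    "Cor": "θ ð t d s z n l ʃ ʒ tʃ dʒ j r".split(),
    "Dor": "k ɡ ŋ".split(),
}

# Reverse lookup: phoneme -> place label.
PLACE_OF = {ph: place for place, phs in INVENTORY.items() for ph in phs}


def sequence_type(c1: str, c2: str) -> str:
    """Label a C1VC2 target by consonant place, e.g. ('p', 't') -> 'Lab_V_Cor'."""
    return f"{PLACE_OF[c1]}_V_{PLACE_OF[c2]}"


# Upper bound on distinct C1-C2 pairs for each of the nine sequence types,
# ignoring phonotactic restrictions on either position.
UPPER_BOUND = {
    f"{p1}_V_{p2}": len(INVENTORY[p1]) * len(INVENTORY[p2])
    for p1, p2 in product(INVENTORY, repeat=2)
}

print(sequence_type("p", "t"))  # the consonants of "pat"
for label in ("Cor_V_Cor", "Lab_V_Lab", "Dor_V_Dor"):
    print(label, UPPER_BOUND[label])
```

Under these assumptions the bound for Cor_V_Cor (14 × 14 = 196) is several times that for Lab_V_Lab (6 × 6 = 36) or Dor_V_Dor (3 × 3 = 9), which is in line with the first factor proposed above.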
Place-variegated sequences
Target word types involving sequences with dorsal consonants (Lab_V_Dor, Cor_V_Dor, Dor_V_Lab, Dor_V_Cor, and Dor_V_Dor) were predicted to be infrequent in word targets in this period. This prediction was verified, as consonant sequences in word targets with dorsal consonants were infrequent in overall word types and tokens, as well as across the eighteen participants. There are only three dorsal consonants (/k ɡ ŋ/) in the English phoneme inventory, and they do not occur as frequently as coronals in the adult lexicon in English (Delattre, 1965). The overall infrequency of sequences involving dorsal consonants reflects both the characteristics of the ambient language and the production capacities of infants.
It was predicted that Lab_V_Cor consonant sequences in target words would be more frequent than Cor_V_Lab consonant sequences in early expressive vocabularies. This prediction was verified, as a strong labial–coronal effect was observed. Target words with a Lab_V_Cor sequence were significantly more frequent than those with a Cor_V_Lab sequence in both word types and tokens, as well as in the individual data. Gonzalez-Gomez et al. (2014) and MacNeilage et al. (1999) compared frequency ratios between labial–coronal sequences and coronal–labial sequences to demonstrate a labial–coronal effect. In Gonzalez-Gomez et al. (2014), the ratios between labial–coronal and coronal–labial sequences were between 1·56 and 9·80 to 1 in French. The labial–coronal to coronal–labial ratios were 1·48 to 1 in French and 2·55 to 1 in English in MacNeilage et al. (1999). The ratios in this study were 7·0 to 1 in target word types (224 vs. 32) and 14·2 to 1 in word tokens (1,238 vs. 87). These ratios are equivalent to or larger than those reported in Gonzalez-Gomez et al. (2014) and MacNeilage et al. (1999), and indicate a strong labial–coronal effect in target words during the first-word period.
Several proposals have been made for the origin of the labial–coronal effect. Infant perception studies showed that French-learning infants prefer the phonotactically more frequent labial–coronal pattern to the less frequent coronal–labial pattern (e.g. Gonzalez-Gomez & Nazzi, 2012, 2013). This perceptual preference may influence later word learning, as Gonzalez-Gomez et al. (2013) showed that French 14-month-old infants were able to learn frequent (labial–coronal) word forms but not infrequent (coronal–labial) word forms. No such difference was found among French 16-month-old infants, indicating that infants may be able to learn words with a frequent (labial–coronal) sequence earlier than those with an infrequent (coronal–labial) sequence. Moreover, similar studies in Japanese (Gonzalez-Gomez et al., 2014; Tsuji et al., 2012) have indicated that the labial–coronal effect is specific to the ambient language characteristics in infant speech perception, as French-learning and Japanese-learning infants appear to be sensitive to their own ambient language characteristics. Based on these findings, Gonzalez-Gomez et al. (2014) suggested that the labial–coronal effect is a reflection of infants' perceptual sensitivity to ambient language input.
Production-based proposals have also been made for the labial–coronal effect. Rochet-Capellan and Schwartz (2007) provided experimental evidence that labial–coronal disyllables are more articulatorily stable than coronal–labial disyllables. Tsuji et al. (2012) studied articulatory stability of labial–coronal and coronal–labial sequences in Japanese speakers using a similar methodology to Rochet-Capellan and Schwartz (2007). Their results showed that labial–coronal disyllables were articulatorily more stable than coronal–labial disyllables in Japanese, despite a perceptual bias toward coronal–labial sequences by Japanese speakers (Tsuji et al., 2012). Davis et al. (2002) argue that labial consonants are more accessible to infants' production capacities because they are produced by mandibular oscillations without tongue movement. They note that adding a coronal consonant later in the word following a syllable-initial labial consonant would result in a place-variegated pattern that is more accessible for infants.
The present data offer new evidence on the labial–coronal effect in early lexicons in English. In the current study, target words from infants' functional and spontaneous speech were analyzed using data collected longitudinally. These data reflect infants' selections of target words for production, and represent the intersection between lexical and phonological development. The observed labial–coronal effect in this study coincides with findings from both infant speech perception studies (e.g. Gonzalez-Gomez et al., 2013) and infant production studies (e.g. Davis et al., 2002).
Further, the findings of this study coincide with research on the effects of phonotactic probabilities on early word learning. Graf Estes, Edwards, and Saffran (2011) reported that 18-month-old English-learning infants were able to learn phonotactically legal labels (e.g. dref, sloob), but had difficulty with phonotactically illegal labels (e.g. dlef, sroob). Similarly, Zamuner, Gerken, and Hammond (2004) showed that English-learning children (aged 1;8–2;4) were able to repeat nonwords with high phonotactic probabilities more accurately than nonwords with low phonotactic probabilities. Similar effects of phonotactic probabilities were found in young children learning Dutch (aged 2;2–2;8; Zamuner, 2009). These studies show that infants and young children are sensitive to phonotactic characteristics of the ambient language, even at an early stage of language development. The results of the current study also showed that target words with high phonotactic frequency were attempted much more frequently in functional and spontaneous speech during the first-word period. This study is methodologically quite different from experimental studies such as Graf Estes et al. (2011) and Zamuner et al. (2004). Yet the findings from our study and these previous studies both suggest that phonotactic characteristics of the ambient language influence early word learning.
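The ratios reported for this study follow directly from the counts given in parentheses. As a minimal check, assuming only those four counts, the arithmetic can be reproduced as below; the function name `lc_ratio` is a hypothetical helper introduced for illustration.

```python
def lc_ratio(lab_cor: int, cor_lab: int) -> float:
    """Return the Lab_V_Cor to Cor_V_Lab frequency ratio (x in 'x to 1')."""
    if cor_lab <= 0:
        raise ValueError("Cor_V_Lab count must be positive to form a ratio")
    return lab_cor / cor_lab


# Counts taken from the text: word types (224 vs. 32), word tokens (1,238 vs. 87).
type_ratio = lc_ratio(224, 32)
token_ratio = lc_ratio(1238, 87)

print(f"types:  {type_ratio:.1f} to 1")   # 7.0 to 1
print(f"tokens: {token_ratio:.1f} to 1")  # 14.2 to 1
```

The type ratio (7·0) falls within the range reported for French by Gonzalez-Gomez et al. (2014), while the token ratio (14·2) exceeds every ratio cited above.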
Individual differences
The results showed a strikingly consistent pattern across the eighteen participants. For example, target words with a labial–coronal sequence were more frequent than those with a coronal–labial sequence in both word types and tokens in all eighteen participants. Target words with a Dor_V_Dor sequence were infrequent in all participants. These results indicate remarkable coherence in the types of consonant sequences in early lexicons when a large and consistently gathered corpus is available for analysis.
This finding is noteworthy, as individual differences have often been reported in speech and language development in infants and young children (e.g. Vihman & Croft, 2007; Vihman & Greenlee, 1987). Stoel-Gammon (1998) discussed differences in both lexical and phonological development between precocious talkers, typically developing infants, and late talkers. She suggested that the size of infants' lexicons and phonological skills develop hand in hand. In the current analysis, there was indeed great variation in the number of word types and tokens produced by each participant. Some produced numerous word types and tokens (e.g. Re had 597 and C had 291 word types), whereas others produced far fewer word types or tokens (e.g. P and Na). Thus, individual differences were observed in the size of vocabularies as well as overall volubility among these participants. However, patterns for consonant sequence types in their target words were highly similar among all eighteen participants. The observed consistency across participants, along with the consistency across word types and tokens, indicates a robust finding for consonant sequence patterns in early lexicons.
Limitations and suggestions for future studies
This study has several limitations. First, it analyzed only the phonological characteristics of target words, not the infants' actual production output. In some cases the actual production of these words did not have the target consonant sequence (e.g. pat could be produced as [pæ], [pæp], or [pæk]). To fully explore the nature of these patterns in actual speech output, both target characteristics and actual production output need to be analyzed in a future study.
Second, as pointed out by Stoel-Gammon (2011), many of the studies on lexical and phonological development are focused on English-speaking infants. Studies on the labial–coronal effect have been mostly conducted in French (e.g. Gonzalez-Gomez & Nazzi, 2012, 2013; Rochet-Capellan & Schwartz, 2007) and Japanese (Tsuji et al., 2012), although lexical patterns were examined in several more languages in MacNeilage et al. (1999). It is desirable to study perception and production of the labial–coronal effect within the same language to understand this effect in a more comprehensive manner. In addition, more studies on the labial–coronal effect across different languages would affirm the generality of these robust findings for English.
Third, this study analyzed only C1VC2 word forms, which constituted approximately 18% of overall word productions. The same non-adjacent consonant sequences occur in C1VC2V word forms (such as Patty or Tommy), but C1VC2V forms were excluded from the current analysis. In addition, there are other phonotactic dependencies including non-adjacent vowel sequences and adjacent consonant–vowel sequences. Other word forms (such as C1VC2V) and different types of sequences may reveal more about infants' learning of the phonotactic characteristics of their ambient language.
CONCLUSIONS
Infants need to acquire capacities for producing a variety of segments in sequences that match target words in their growing lexicons. Serial characteristics are thus critical to understanding the path(s) infants take to achieve intelligible meaning-based speech. This study contributes a new type of data: phonological characteristics of early expressive vocabularies from functional and spontaneous speech. The results showed consistent patterns in non-adjacent consonant sequences as reported in infant speech perception studies (e.g. Gonzalez-Gomez et al., 2013) and in infant production studies (e.g. Davis et al., 2002; MacNeilage et al., 1997). Overall, this study provides strong support for Stoel-Gammon's (2011) postulate that early lexical development is influenced by productive phonology at early stages of speech and language development.