INTRODUCTION
Emergent bilingualism is a language contact situation faced by children who learn their first language (L1) from birth as the result of linguistic input in their L1-speaking home environments and develop their second language (L2) later in childhood from input received from playmates and the school (Verhoeven, 2007). Interest in language development in emergent bilinguals has grown in recent years with the global rise in population mobility and the growing number of both immigrant families and internationally adopted young children (Pollock, 2007). Inevitably, new types of social context create new linguistic inputs for the children, who must build their L2 and eventually separate the two language systems. The present study is a longitudinal description of vowel development in an early sequential bilingual child. Its goal is to provide new experimental evidence for how emergent bilinguals separate the two languages and, in so doing, contribute to the long-standing debate on language separation (‘one system or two’) in bilingual children.
Vowel development has been studied primarily in monolingual children and with predominant focus on American English. Most studies are based on phonetic transcription and focus on patterns of accuracy as well as types of errors (for a review, see Stoel-Gammon & Pollock, 2008). Using this methodology, a basic consensus is that quality differences between vowel pairs are present in the speech of most children by the age of three years (Stoel-Gammon & Herrington, 1990). In terms of general patterns of acquisition, (1) corner vowels in American English (except for /æ/) are typically acquired before non-corner vowels, (2) tense vowels are acquired before lax vowels, and (3) rhotic vowels are acquired later than non-rhotic vowels. More recent acoustic phonetic investigations further revealed that monolingual children by age 2;6 had mostly acquired vowel quantity distinctions and produced vowel duration ratios in a more adult-like manner (Buder & Stoel-Gammon, 2002).
To date, little research has used acoustic analysis of productions of young preschool children in studying their vowel development. Obtaining good speech samples of children younger than 3;0 and analyzing them acoustically is a major challenge because of their high and variable fundamental frequencies. As speech analysis technology has improved in recent years, acoustic studies of vowel development in children aged 1;3 – 4;0 have become more frequent (see Vorperian & Kent, 2007, for a review). In general, a decrease in acoustic variability in children's productions is interpreted as an increase in articulatory precision resulting from the maturation of the motor control system. For this reason, longitudinal observations of vowel development are particularly insightful because they document acoustic variability among vowels in an individual child over time. Of importance to the current study is a recent longitudinal investigation of vowel development in six monolingual American English children aged 1;6 – 4;0 (McGowan, McGowan, Denny & Nittrouer, 2014). This work found that the shape of the vowel space remained qualitatively constant from 2;6 through 4;0. This apparent constancy indicates that, by the age of 2;6, the children had established the relations among the corner vowels and were thus able to produce categorical distinctions.
Bilingual vowel systems have been studied acoustically primarily in adults and adolescents rather than in children, with the main focus on the interaction between speakers' L1 and L2 (e.g. Flege, 1999; Flege, Schirru & MacKay, 2003; Guion, Flege & Loftin, 2000). It has been shown that both early childhood bilinguals and simultaneous bilinguals can maintain separate categories and do not necessarily merge phonetic categories for similar phonemes (Flege et al., 2003). Early sequential bilinguals can partition their vowel spaces to accommodate the vowels of their L1 and L2 although their vowel productions may still differ from those of the monolingual speakers (Guion, 2003). However, evidence also exists that early bilinguals can produce monolingual-like vowels in two languages (MacLeod, Stoel-Gammon & Wassink, 2009). Another relevant finding is that L1 and L2 can interact differently depending on the age at which L2 is learned. Using acoustic analysis, Baker and Trofimovich (2005) showed that child bilinguals with extended L2 use produced L1 and L2 vowels that were more susceptible to bi-directional influences (i.e. phonetic restructuring as a function of L2 learning affected both languages). However, in adults who began learning their L2 later in life, only a unidirectional influence of the L1 on the L2 was found, which was primarily determined by cross-language similarity of L1 and L2 categories.
While studies of vowel systems in bilingual adults are important for a better understanding of the interaction between L1 and L2, their predictive power with regard to vowel development in young emergent bilingual children is relatively limited. For example, the typical within-subject acoustic variability in adult productions is smaller than in young children, whose acoustic vowel targets become less variable with age but are still lacking adult-like constancy (e.g. Assmann & Katz, 2000). Since a decrease in acoustic variability implies an increase in articulatory precision (Vorperian & Kent, 2007), the path and pace of the acquisition of the acoustic vowel targets in L2 is also dependent upon the maturation of motor control. Thus, in young children, the route of L2 vowel acquisition rests not only on the quantity and quality of the linguistic input but also on highly individualized development of motor skills. Due to the complexity of such interactions, bidirectional and unidirectional influences of one vowel system on the other are more difficult to predict in young children compared with adults.
A longitudinal acoustic phonetic study by Simon (2010) illustrates the complexity of the interactions between L1 and L2 over time from a somewhat different viewpoint, examining the acquisition of English (L2) voice contrast in word-initial stops by a young sequential bilingual native Dutch (L1) child. Both English and Dutch have a voiced-voiceless stop contrast. However, while the English contrast is represented as a short-lag (unaspirated) versus long-lag (aspirated) distinction, the Dutch contrast is manifested differently, as a prevoiced (voiced) versus short-lag (voiceless) distinction. The boy successfully mastered the English contrast within a 7-month observational period which began when he was 3;6, although his English productions were still not monolingual-like. However, during the observational period, he also restructured his Dutch phonetic system so that his Dutch voice contrast boundary shifted toward that in English as a result of the influence of his L2. It is important to note that the child initially transferred the prevoicing in Dutch into English but only to some extent and, at the end of the observational period, failed to systematically produce prevoicing in either language. These longitudinal results are informative with regard to the developmental path in this emergent bilingual child. They show that the initial unidirectional influence of L1 on L2 may be negligible and the new system of phonetic contrasts in L2 may dominate L1, which can still be restructured at this age in the course of bi-directional influence between the two phonetic systems.
To date, due to the paucity of acoustic phonetic data, little is known about vowel development in emergent bilingual children. Addressing this gap, the current longitudinal case study aims to document phonetic development and interaction between L1 and L2 in a young boy who participated in this research over a period of 20 months. The child was born in the United States to Mandarin-speaking parents and was raised in a monolingual context until the age of 3;7. To determine the developmental profile of his vowel system(s), a detailed acoustic phonetic analysis of his vowel productions in Mandarin and English was conducted. Our research interests are threefold. First, we aim to determine the initial state of his L2 vowel space. Second, we examine the process of language separation during his subsequent exposure to English in a preschool. Finally, we aim to establish whether and how his L1 Mandarin system has changed as a function of his L2 development.
With regard to the initial state of the L2 vowel space, we predict that the child will begin with his established (though, possibly, still variable) L1 vowel system. Because Mandarin has only five basic monophthongal vowel phonemes and English has at least twelve nominal monophthongs, we expect that the child will construct his L2 system by first creating L1-based broad categories for acoustically similar L1 and L2 vowels. The expectation of category assimilation is based on the Equivalence Classification Hypothesis in Flege's Speech Learning Model (Flege, 1995), which predicts (although based on analysis of adult vowel systems) that L2 vowels will initially have values similar to those in L1. Support for this prediction also comes from the longitudinal study by Simon (2010).
There are several possibilities as to the subsequent developmental path of the child's L2 and its separation from L1. Tentatively, we hypothesize that the child will first aim to establish new corner vowels in his new L2 vowel space, in parallel to the developmental pattern in L1 in which the corner vowels are typically acquired before non-corner vowels (e.g. McGowan et al., 2014; Stoel-Gammon & Herrington, 1990). The corner vowels in L2 English (which initially may not include /æ/) will be established in the process of category separation. This developmental category separation has been well documented in monolingual infants (Kuhl & Meltzoff, 1996), and the same process appears to be also active in L2 acquisition (e.g. Baker & Trofimovich, 2005; Guion, 2003). Because of the young age of the child, we expect the initial unidirectional influence of L1 on L2 to decrease with his increased experience with L2 (Simon, 2010). After the L2 corner vowels have been established, we expect considerable variability in his production of the non-corner English vowels due to bi-directional interactions among L1 and L2 categories in the process of building a new system of contrasts both within the L2 and between his L1 and L2.
Finally, we expect some changes in the child's L1 system, analogous to Simon's (2010) findings for L1 in relation to the development of voicing contrast in L2. Accordingly, we expect category shifts in the L1 vowel space as a function of L2, which may primarily affect the non-peripheral Mandarin vowels /y, ɤ/. However, although we expect relative constancy in the child's productions of Mandarin vowels at the beginning of his exposure to L2, we also cannot rule out the possibility that his L1 non-corner vowels have not yet been firmly established. The greater variability of these vowels may be related to the maturation of his motor control and not necessarily to the influence of L2.
METHOD
Participant
One male child participated. Both of his parents were native Mandarin speakers who immigrated to the United States where he was born one and a half years later. The child received input in his L1 Mandarin from his parents, who interacted with each other and with the child in Mandarin. Both parents had received at least a college-level education in China. Besides interacting with his family members, the child played mainly with Mandarin-speaking children and had very limited contact with English. At the age of 3;7, he enrolled in an English-language preschool, where he was immersed in an all-English environment three days per week. All three preschool teachers were native English speakers of the central Ohio dialect. There was only one other non-native English-speaking child, but that child also spoke English in class. All class instruction and materials were in English. When he was 4;11, he was enrolled in a full-time (5 days per week) kindergarten program. Both of his kindergarten teachers were native English speakers of the central Ohio dialect and he was the only non-native English pupil in the class totaling sixteen children. The child's parents reported that he preferred to use English at home after starting the kindergarten program. They also reported that he had no hearing impairment or any speech-language disorder.
Speech material
Two sets of words were recorded: (1) Mandarin monosyllabic and disyllabic words each of which included one of five basic Mandarin monophthongal vowel phonemes: /i, y, a, u, ɤ/ (following Duanmu, 2000, but here we use the symbol /ɤ/ rather than /ə/ to refer to the last vowel as it is a more standard usage); and (2) monosyllabic English words each of which included one of eleven basic monophthongal vowel phonemes found in the Ohio dialect of English: /i, ɪ, e, ε, æ, u, ʊ, o, ɔ, ɑ, ʌ/. As a reference, vowel symbols and features are shown in Table 1 for American English spoken in central Ohio and in Table 2 for Mandarin.
Table 1. Symbols and features of eleven American English monophthongal vowels spoken in central Ohio (except for /ɑ/, all back vowels are rounded)
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20160921005611933-0908:S0305000914000531:S0305000914000531_tab1.gif?pub-status=live)
Table 2. Symbols and features of five monophthongal vowels in Mandarin
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20160921005611933-0908:S0305000914000531:S0305000914000531_tab2.gif?pub-status=live)
The basis for word selection included familiarity, word frequency, and picturability. All Mandarin words in our list, except one, can be found in the database of spoken words used daily by preschool Mandarin children (Liu et al., 2008). All English tokens were relatively high-frequency words according to the Kučera and Francis (1967) norms (M = 75·5, s.d. = 74·4, range: 8 – 360) and the Thorndike and Lorge (1944) norms (M = 819, s.d. = 868, range: 75 – 4778). The frequencies of occurrence of each Mandarin and English word used in this study are provided in the Appendices. Phonetic context was not strictly controlled in either word set (although all vowels were produced in a stressed syllable), nor was the tone environment controlled for in the Mandarin set. The two original word lists were balanced so that there was the same number of words for each target vowel. However, several words were eliminated after the first two recording sessions because the child did not recognize them. Thus, in the final word lists, Mandarin /i/ and /ɤ/ each included four stimulus words, /a/ and /y/ each included three, and /u/ included five. The English vowels /i, ɪ, e, ε, æ, u, ʊ, o, ʌ/ each included three stimulus words, /ɔ/ included two, and /ɑ/ included four (see Appendices).
Procedure
The study extended over a 20-month period. The recording procedure had two phases. In the first 12 months (from 3;7 to 4;6), one recording session was conducted each month; the average time between sessions was 31 days. After that, recordings were made in months 15 (4;9), 16 (4;10), 19 (5;1), and 20 (5;2). The word productions were recorded using a picture-naming task in a quiet room at the child's home with his mother present. In each session the child was first recorded saying the Mandarin words and then, after a short 15 – 20 minute break, the English words. The same experimenter – fluent in both Mandarin and English – used Mandarin to interact with him in the Mandarin task and English in the English task. During these sessions, the child was seated in front of a laptop computer wearing a Shure SM10A head-mounted microphone situated approximately one inch from his mouth. Pictures representing target words were randomly ordered and presented on the computer monitor (the same random order was used across all recording sessions).
The child spontaneously produced the target word by answering the experimenter's question of “What is this?” Sometimes, the child produced short phrases or commented on the pictures, but only the vowel in the stimulus word was analyzed. During the recording sessions each stimulus word was produced once and was recorded directly onto a hard disk drive with 16-bit quantization and a 44·1 kHz sampling rate. If the speech signal was too weak, peak clipping occurred, or the child whispered or shouted, he was asked to repeat the word. All recordings were done under the control of a custom Matlab program. In each session, the child produced 19 tokens in the Mandarin set and 33 tokens in the English set (shown in the Appendices).
Acoustic measurements
Spectrographic analysis was used to determine the frequencies of the first two formants, F1 and F2. The formant frequencies were measured at the vowel's midpoint, which was determined on the basis of temporal locations of each vowel's onset and offset in the waveform using a custom Matlab program. Vowel onsets and offsets were determined using standard measurement criteria (Kent & Read, 1992). A second speech analysis program, TF32 (Milenkovic, 2003), was used as an additional visual check of the spectrograms and for hand correction of the automatic formant measurements, if needed. Given the child's relatively high F0, a wide analysis bandwidth (600 Hz) was used. An auditory check of the vowel quality was also done to ensure that no part of a preceding or following consonant was included (this was important for vowels that were preceded or followed by the sonorants /ɹ, l, w/).
The size of the ‘basic’ vowel space (defined by the area bordered by the point vowels) is a parameter often used to characterize the nature of vowel structure differences across prepubertal development of the vocal tract in children as well as across ages, genders, languages, and dialects (e.g. Chung, Kong, Edwards, Weismer, Fourakis & Hwang, 2012; Fox & Jacewicz, 2010; Vorperian & Kent, 2007). We use this measure here to observe the shapes and sizes of the L1 and L2 vowel spaces as a function of L2 exposure. Following a commonly utilized approach, the midpoint formant values of the four corner vowels /i, æ, ɑ, u/ define the vowel space quadrilateral in English (Vorperian & Kent, 2007) and the three corner vowels /i, a, u/ define the vowel space triangle in Mandarin (Chung et al., 2012).
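As an illustration of how an area bordered by corner vowels can be computed, the following minimal sketch applies the shoelace formula to corner-vowel coordinates in the F2 × F1 plane. The use of the shoelace formula, the function name `polygon_area`, and all formant values are assumptions made for this example; they are hypothetical and are not taken from the study's data or software.

```python
def polygon_area(points):
    """Area of a simple polygon via the shoelace formula.

    `points` is a list of (F2, F1) pairs listed in order around the perimeter.
    """
    n = len(points)
    twice_area = 0.0
    for k in range(n):
        x1, y1 = points[k]
        x2, y2 = points[(k + 1) % n]  # wrap around to close the polygon
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


# Hypothetical midpoint values in kHz as (F2, F1); NOT data from the study.
mandarin_triangle = [(3.0, 0.4), (1.8, 1.1), (1.0, 0.45)]           # /i/, /a/, /u/
english_quad = [(3.0, 0.4), (2.4, 1.1), (1.5, 1.0), (1.2, 0.45)]    # /i/, /æ/, /ɑ/, /u/

triangle_area = polygon_area(mandarin_triangle)  # area in kHz^2
quad_area = polygon_area(english_quad)           # area in kHz^2
```

Because the inputs are in kHz, the resulting areas are in kHz², the unit in which vowel space areas are reported in this study. The vertices must be listed in perimeter order; otherwise the polygon self-intersects and the formula no longer returns the enclosed area.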
It is natural that the vowel space area decreases as a function of the developmental increase in vocal tract length (e.g. Chung et al., 2012; Vorperian & Kent, 2007). In order to factor out the effect of vocal tract lengthening, a set of normalized formant frequency values was generated to calculate a normalized vowel space area. We used Lobanov's (1971) procedure, which converts formant values in Hz to z-scores for each individual speaker. This is a normalization procedure that Adank, Smits, and van Hout (2004) found to be one of the most effective. Since the normalized formant frequency values do not directly reflect Hz values, they were then rescaled into Hz-like values using the method suggested by Thomas and Kendall (2007) to facilitate interpretation.
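The z-score conversion at the core of Lobanov's procedure can be sketched as follows. This is a minimal illustration, assuming that each formant is standardized separately over all of one speaker's vowel tokens and that the sample standard deviation is used (implementations differ on this point). The function name `lobanov` and the formant values are hypothetical, and the subsequent rescaling into Hz-like values is not shown because its formula is not given in the text.

```python
from statistics import mean, stdev


def lobanov(values):
    """Convert one speaker's values for a single formant (Hz) to z-scores."""
    m = mean(values)
    s = stdev(values)  # assumption: sample standard deviation
    return [(v - m) / s for v in values]


# Hypothetical F1 and F2 midpoint values (Hz) for one speaker; NOT study data.
f1_hz = [350.0, 420.0, 900.0, 400.0, 600.0]
f2_hz = [3000.0, 2600.0, 1800.0, 1100.0, 1500.0]

f1_z = lobanov(f1_hz)
f2_z = lobanov(f2_hz)
```

After normalization each formant has a mean of 0 and a standard deviation of 1 for that speaker, which removes overall differences in formant range of the kind that accompany vocal tract growth.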
RESULTS
Preliminary analyses indicated that the child's productions did not change considerably within each two-month period. For that reason, the data were collapsed into two-month epochs and the results are presented for each epoch in lieu of each individual session. Results are first presented for basic vowel dispersion patterns followed by an analysis of the vowel space areas.
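The grouping of sessions into epochs can be sketched as below. The sketch assumes that consecutive recording sessions were paired in chronological order, which is consistent with the schedule described in the Procedure section (monthly sessions in months 1 to 12, then months 15, 16, 19, and 20) and with the eight epochs reported, but the exact pairing is an assumption rather than a documented detail.

```python
# Study months in which recordings were made, per the Procedure section.
session_months = list(range(1, 13)) + [15, 16, 19, 20]

# Assumption: consecutive sessions are paired into epochs in chronological order.
epochs = {}
for index, month in enumerate(session_months):
    epoch = index // 2 + 1  # sessions 0-1 -> epoch 1, sessions 2-3 -> epoch 2, ...
    epochs.setdefault(epoch, []).append(month)

for epoch, months in epochs.items():
    print(f"epoch {epoch}: months {months}")
```

Under this pairing, epochs 1 to 6 cover the first 12 months and epochs 7 and 8 cover the later, more widely spaced sessions.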
Acoustic dispersion of L1 Mandarin and L2 English vowels
Shown in Figure 1 are the relative positions of L1 Mandarin and L2 English vowels over the eight epochs, superimposed in the common F1 × F2 plane. Unnormalized (rather than normalized) formant frequency values were plotted because normalization affects the overall size of the vowel space but not the relative position of individual vowels. It is the structure of the vowel system in the acoustic space in terms of the relative position of vowels – which is not changed by vowel normalization – that is of interest here. Figure 1 tracks the development of the child's L2 vowel system relative to L1, which can be divided into three phases: Initiation, Reorganization, and Stabilization.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160922034830-55648-mediumThumb-S0305000914000531_fig1g.jpg?pub-status=live)
Fig. 1. Mean formant frequency values (with standard errors) for L1 Mandarin (in triangle) and L2 English (in quadrangle) vowels produced by the child across eight epochs over a 20-month period. Lines connect the three corner vowels /i, a, u/ in Mandarin (forming a triangular vowel space) and the four traditional corner vowels /i, æ, ɑ, u/ in English (forming a quadrilateral space). Three non-traditional corner vowels /ɪ, ʊ, ʌ/ in English in epoch 1 are connected in dotted line (showing a triangular vowel space similar to Mandarin).
The initial state of L2
The Initiation phase in epoch 1 represents the initial state of the child's L2 vowel system. We observe several broad L2 categories which are clustered near the three L1 corner vowels: /i/ (L2 /i, ɪ, e/), /u/ (L2 /ʊ, u/), and /a/ (L2 /ʌ, ε, æ/). The fourth assimilatory cluster includes the L1 /ɤ/ and L2 /o/ (and perhaps also /ɑ, ɔ/). Of particular interest is the more peripheral location of the English lax /ɪ, ʊ/ relative to the tense /i, u/ and their proximity to the L1 /i, u/, which indicates that the child did not produce the L2 tense/lax and L1 vowels contrastively. The far back locations of L2 /ʊ, u/, along with an unusually low L2 /ʌ/ relative to L2 /æ/, indicate that the child was utilizing his L1 Mandarin vowel space as the base of articulation of his L2 vowels.
The L1-L2 separation
The process of the separation of the two vowel systems spans the epochs 2 – 6, which we call the Reorganization phase. In epoch 2, we find the child's L2 vowels more centralized relative to L1. Clearly, the child began to produce exaggerated contrasts between his L1 /i/ and the L2 /i, ɪ, e/ cluster and between his L1 /u/ and L2 /u/ and /ʊ/, which were well separated from one another. His unusually fronted and lowered L2 /u/ and lowered /i/ contributed to a reduced L2 vowel space. The subsequent development of his L2 system can be characterized as a progressive enhancement of this reduced L2 space. In epoch 6, we observe that the L2 corner vowels /i, ʊ, æ, ɑ/ reached the positions typical of English spoken in central Ohio.
We infer from the plots that, in the Reorganization phase, the child focused on developing contrasts among individual vowels in L2. Developing contrasts among the four back vowels /ʊ, o, ɑ, ɔ/ was a particularly long process, which was not yet complete at the end of data collection. Figure 2 depicts the great positional variation and acoustic instability of the /ʊ-o/ contrast over this period. The development of the /ɑ-ɔ/ contrast was comparatively more typical, starting from a complete acoustic overlap until category separation in the ‘designated’ low back area of the English vowel space. We also note that throughout the Reorganization phase, the child's productions of most L2 vowels were more variable than his L1 vowels, which was reflected in larger standard errors for the formant frequency means.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160922034830-99335-mediumThumb-S0305000914000531_fig2g.jpg?pub-status=live)
Fig. 2. Developmental trajectories for two back L2 English vowel pairs /ʊ, o/ and /ɑ, ɔ/ produced by the child across eight epochs. Data points are redrawn from Figure 1.
The relative stability of L1
In the final Stabilization phase (epochs 7 and 8), the L2 vowel system had mostly stabilized, which was reflected in reduced within-category variation. With respect to the L1, we find the positions of the corner vowels /i, u, a/ to be relatively constant across all eight epochs (Figure 1). To determine whether the L1 and L2 systems in epoch 8 were comparable with those of English and Mandarin monolinguals, we consulted published sources (Jacewicz, Fox & Salmons, 2007; Lin & Wang, 2001) and plotted in Figure 3 his bilingual systems against vowels from adult males using normalized, rescaled formant values.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160922034830-16403-mediumThumb-S0305000914000531_fig3g.jpg?pub-status=live)
Fig. 3. Comparison of the child's Mandarin and English vowel spaces at epoch 8 with the vowel spaces of corresponding monolingual adults. The left panel shows the dispersion of five monophthongal Mandarin vowels produced by monolingual Mandarin adult male speakers (data reported in Lin & Wang, 2001) and the dispersion of eleven nominal monophthongs in American English produced by monolingual English adult male speakers from the central Ohio area (data reported in Jacewicz, Fox & Salmons, 2007). In the middle panel, the child's L2 English vowels are superimposed on those of monolingual English adults. In the right panel, the child's L1 Mandarin vowels are superimposed on those of monolingual Mandarin adults.
The left panel in Figure 3 shows the Mandarin (triangular) and English (quadrilateral) spaces of the adult monolinguals. In the middle and right panels, the child's English and Mandarin systems are superimposed over the corresponding monolingual systems. Clearly, the final L1 and L2 spaces in the child are comparable with those of the monolingual adults. The general dispersion of L2 vowels is like that found in native English, in spite of an apparent counter-clockwise shift of the child's vowel quadrilateral. This shift represents a recent sound change in central Ohio, which was found in the American English vowels of monolingual children and young adults in this area (Jacewicz, Fox & Salmons, 2011a, 2011b). The general shapes of the two Mandarin vowel spaces are also comparable, and the child's corner vowels /i, u, a/ correspond to those in the adults. However, his non-peripheral /ɤ/ and /y/ do not match exactly the positions for the Mandarin adults.
To better understand possible sources of this discrepancy, we examined the pattern of variation for /y/ and /ɤ/ (Figure 1). There was a general trend of /y/-fronting relative to /i/, which was manifested as a progressive decrease in the acoustic distance between /y/ and /i/ with age. A Spearman's correlation was run to determine the relationship between the /i-y/ distance and the child's age. There was a significant negative monotonic correlation between the two variables (rₛ = −·81, n = 8, p = ·015), indicating that the vowels were produced with greater proximity to one another as the child grew older. A different trend was found for /ɤ/, which was first produced as a relatively raised variant and then was lowered and backed in the vowel space. These two patterns are suggestive of two different developmental processes, which will be discussed below.
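A rank correlation of this kind can be sketched as follows. The sketch assumes that the /i-y/ distance is the Euclidean distance between mean vowel positions in the F1 × F2 plane (the text does not specify the distance metric) and uses the classic rank-difference formula for Spearman's coefficient, which is exact only when there are no tied values. The helper names and all distance values are hypothetical and are not the study's measurements.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two (F1, F2) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def ranks(values):
    """Rank values from 1 (smallest) to n (largest); assumes no ties."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    result = [0] * len(values)
    for rank, k in enumerate(order, start=1):
        result[k] = rank
    return result


def spearman_rho(x, y):
    """Spearman's rank correlation via 1 - 6*sum(d^2) / (n*(n^2 - 1))."""
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Hypothetical per-epoch /i-y/ distances in Hz; NOT data from the study.
epoch_numbers = [1, 2, 3, 4, 5, 6, 7, 8]
iy_distance_hz = [900, 950, 800, 700, 720, 600, 500, 450]

rho = spearman_rho(epoch_numbers, iy_distance_hz)
```

A negative coefficient indicates that the distance tends to shrink as the epochs advance, which is the direction of the trend reported above. With only eight epochs, the coefficient rests on eight rank pairs, so a single reversal between adjacent epochs changes its value noticeably.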
The change of the acoustic vowel space areas
In addition to the dispersion patterns in Figure 1, we also examined the L1 and L2 vowel spaces across the eight epochs to observe how their shapes and areas have changed in the child as a function of L2 exposure. As Figure 4 shows, the child's Mandarin vowel space appeared relatively stable across these 20 months. However, his English vowel space showed substantial changes in both the size and general shape – especially during the Reorganization phase.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160922034830-22693-mediumThumb-S0305000914000531_fig4g.jpg?pub-status=live)
Fig. 4. The child's L1 Mandarin triangular vowel space (left) and L2 English quadrilateral vowel space (right) over eight epochs using rescaled normalized formant frequency values.
Shown in Figure 5 are scatter plots of the rescaled normalized areas for both Mandarin and English. Superimposed on both plots are regression lines. Since in epoch 1 the child's L2 space closely resembled that of the Mandarin triangle, we have plotted two areas for his English space: (1) the triangular area based on the peripheral ‘non-corner’ vowels [ɪ, ʊ, ʌ] (shown with an open triangle symbol) and (2) the quadrilateral area to allow comparisons to epochs 2 to 8. As can be seen, the English triangular area value is very close to that of the Mandarin triangle, which supports the view that, in epoch 1, the child was initially basing his L2 vowels on the L1 frame. Regression analysis indicated that there was a significant decline in the Mandarin vowel space across epochs 1 to 8 (F(1,6) = 11·8, p = ·014), but that the change was very gradual (−·004 kHz² per epoch). On the other hand, the increase in the size of the English quadrilateral space across epochs 2 to 8 was not only significant (F(1,5) = 27·9, p = ·003) but the rate of change was more than six times as great in absolute magnitude (·025 kHz² per epoch).
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160922034830-91576-mediumThumb-S0305000914000531_fig5g.jpg?pub-status=live)
Fig. 5. Regression model of the child's rescaled normalized vowel space areas in L1 Mandarin (filled circles) and L2 English (unfilled circles). The unfilled triangle in epoch 1 is added to the plot to represent the early ‘English’ vowel space defined by the three vowels /ɪ, ʊ, ʌ/.
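The kind of regression summarized above can be sketched with an ordinary least-squares fit of area on epoch number. This is a minimal illustration, assuming a simple linear model with one predictor; the function name `linear_fit` and the area values are hypothetical and do not reproduce the study's data. The sketch also shows where the reported degrees of freedom come from: with one predictor, the F test has (1, n − 2) degrees of freedom, giving (1, 6) for eight epochs and (1, 5) for seven.

```python
def linear_fit(x, y):
    """Least-squares line y = intercept + slope * x, with the F test for the slope.

    Returns (slope, intercept, f_statistic, (df_regression, df_residual)).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    fitted = [intercept + slope * a for a in x]
    ss_regression = sum((f - mean_y) ** 2 for f in fitted)
    ss_residual = sum((b - f) ** 2 for b, f in zip(y, fitted))
    df_residual = n - 2
    f_statistic = ss_regression / (ss_residual / df_residual)
    return slope, intercept, f_statistic, (1, df_residual)


# Hypothetical English quadrilateral areas (kHz^2) for epochs 2 to 8; NOT study data.
english_epochs = [2, 3, 4, 5, 6, 7, 8]
english_area = [0.30, 0.33, 0.34, 0.38, 0.40, 0.42, 0.45]

slope, intercept, f_stat, df = linear_fit(english_epochs, english_area)
```

The slope is expressed in kHz² per epoch, the same unit as the rates of change reported in the text, so slopes fitted separately for the two languages can be compared directly.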
DISCUSSION
This longitudinal case study documented the emergence of bilingualism in a vowel system of a preschool-age boy. Through instrumental analysis of his vowel productions in both L1 and L2, three aspects of his phonetic development were examined: (1) the initial state of his L2 vowel space, (2) the process of L1-L2 separation over the course of his increased exposure to L2, and (3) the status of his L1 vowel system as a function of phonetic category formation in L2.
The initial state of L2 vowel space in an emergent bilingual
The results provided compelling evidence that the child utilized his L1 vowel space as the initial base in building his new L2 vowel system. As expected, the child initially clustered acoustically similar English and Mandarin vowels into several large groups in the vicinity of the L1 corners. In epoch 1, he did not produce a contrast between the high tense and lax English vowels nor an accurate distinction between /ʌ/ and /æ/, whose acoustic locations were reversed. The acoustic proximity of the high vowels to those in Mandarin and the low position of /ʌ/ approximating the Mandarin /a/ resulted in a Mandarin-like triangular ‘English’ vowel space which was distinct from the typical English quadrilaterals found in native English children (McGowan et al., 2014). These important findings with respect to the observed acoustic category assimilation are in accord with the Equivalence Classification Hypothesis (Flege, 1995), showing that this framework, originally proposed for adult L2, is also applicable to L2 acquisition in young children.
The process of L1-L2 separation
We hypothesized that the child would first aim to establish new corner vowels in his new L2 vowel space. We expected that he would initially cluster ‘similar’ L1 and L2 vowel categories in a common acoustic space and that this acoustic overlap would gradually decrease with L1-L2 category separation as a function of his learning phonetic distinctions in L2. Instead, the child abruptly separated the two vowel spaces in epoch 2 and reduced his new L2 English space. This finding is not necessarily unexpected because it was previously shown that development often occurs in discrete steps rather than gradually (Lowie, Reference Lowie and Chapelle2012; Simon, Reference Simon2010). We interpret this outcome as an attempt to maximize the contrast between the languages, especially given that this reduction in L2 space was complemented by a slight expansion of his L1 Mandarin space, possibly to further augment the contrast. By temporarily forming a new vowel space periphery in L2 away from the L1 corners and centralizing the L2 space relative to the established Mandarin corners, the child might have attempted to produce the L1-L2 distinctions, using the centralized variants as new corners of his emerging L2 space.
We need to bear in mind that this finding, although informative, comes from a single child and needs to be verified with a greater number of participants. It is possible that this particular strategy is limited to the interaction of phonetic vowel features of Mandarin and English, and a different trajectory may result from contact between languages that both have crowded vowel spaces. The abrupt separation of the two vowel spaces very early in L2 development needs to be verified in future studies with emergent bilinguals to learn more about possible strategies that might be used to construct the L2 on the basis of L1.
Throughout the next 16 months, we observed a gradual and steady growth of the reduced L2 vowel space as the child ‘added’ L2 vowel categories to his L2 system by producing a greater number of acoustic distinctions. At the end of the observational period, not only had the L2 vowel space expanded (its area was even larger than that of L1), but there was also no acoustic overlap among individual L2 vowels, except for the low back corner of the vowel space. The lack of overlap and the decrease in acoustic variability, manifested in reduced standard errors, indicate an increase in articulatory precision, which suggests that the relations among the L2 vowels had been established.
The status of the L1 vowel system as a function of phonetic category formation in L2
A particular strength of this study is the use of a longitudinal design to document both the L2 and L1 productions of the same individual over an extended period of time. We assume that the child had remained monolingual until his immersion in English at preschool. Although he was born in the United States and was thus exposed to various sources of auditory sensory information in English, it is unlikely that these sources could have supplied appropriate input for L2 phonetic learning. Kuhl, Tsao, and Liu (Reference Kuhl, Tsao and Liu2003) provided convincing experimental evidence that L2 phonetic learning in infants is not simply triggered by hearing a foreign language. Crucially, early learning is facilitated by social interaction and social contact with a live person who provides information that is referential in nature. It is therefore unlikely that the child's bilingual development began prior to his active exposure to English at preschool. However, given his young age, we expected his L1 vowel system to be still ‘flexible’ such that it could be restructured under the influence of L2, particularly with respect to the non-peripheral Mandarin vowels.
The results revealed the relative constancy of his Mandarin vowel space in terms of both the dispersion of the corner vowels and the calculated vowel space area. However, there were also considerable ongoing changes in the production of the two non-peripheral vowels /y/ and /ɤ/. Examination of the positional changes of these two vowels leads us to propose that the positional variation of /ɤ/ was affected by category formation in L2 but the vowel /y/ was still developing as an L1 phonetic category, suggesting that developmental processes were still active in L1.
In particular, the trend of /y/-fronting with age corresponds to the typical route of acquisition in monolingual Mandarin children, who acquire /y/ late (Shi & Wen, Reference Shi and Wen2007). The child's variable productions of /y/ relative to /i/ seem to be related to his continuing development of the /i-y/ contrast in L1 rather than to an influence of the L2. By contrast, according to the literature, /ɤ/ is acquired relatively early in Mandarin-learning children, usually by age 3;0 (Shi & Wen, Reference Shi and Wen2007; Si, Reference Si2006). Given that the observational period began at 3;7, we conjecture that the child had already acquired the vowel at that time and that its subsequent positional variations reflected the influence of his emerging L2, particularly his ongoing development of contrasts among several back English vowels.
CONCLUSIONS
The developmental trajectory of the current emergent bilingual child helps us recognize the complexity of the phonetic restructuring of the vowel system. As a whole, the study contributes the finding that it is possible for a young emergent bilingual to restructure and separate their two language systems very early. While capturing the emergence of bilingualism in a single child, this study supports the previous findings that bilinguals first utilize their existing L1 vowel space. Though developmental trajectories may differ between children and L1/L2 combinations, this particular case shows that a new L2 vowel system can be initiated by means of a drastic restructuring of the existing vowel space to create maximal contrast between the two vowel systems. After this abrupt partitioning, the reduced L2 vowel space is gradually expanded as the L2 learner discovers phonological contrasts in L2 and progresses toward realization of acoustic goals set by native monolingual speakers of that language. Eventually, the two vowel systems are separated.
APPENDIX 1
Word list used to elicit Mandarin productions (the target vowels are marked in bold).
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160922034830-28982-mediumThumb-S0305000914000531_tabU1.jpg?pub-status=live)
For each target word, the word frequency was calculated as the number of occurrences of that word divided by the total number of occurrences of all words reported in Liu et al. (2008).
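The frequency measure described in the note above can be written compactly as a relative frequency. The symbols are notation introduced here only for illustration and do not appear in the original: $n(w)$ denotes the number of occurrences of target word $w$, and the sum runs over all words $w'$ in the counts reported in Liu et al. (2008).

$$
f(w) = \frac{n(w)}{\sum_{w'} n(w')}
$$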
APPENDIX 2
Word list used to elicit English productions.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160922034830-72590-mediumThumb-S0305000914000531_tabU2.jpg?pub-status=live)