INTRODUCTION
Many children around the world grow up learning two languages simultaneously. Within the United States, more than 10·5 million children between the ages of 5 and 17 years grow up in households where a foreign language is spoken in addition to English (US Census, 2007). These bilingual children receive language input from their parents that differs from that of monolinguals in several ways. The most obvious difference is that bilingual children receive some input in each of their languages. Beyond this, the input has been found to contain instances of the languages being ‘mixed’ together (Goodz, 1989). This ‘mixing’ of more than one language while speaking is referred to as ‘code-switching’ (CS).
Bilingual speakers can code-switch in multiple ways, to varying degrees, and for different reasons. Some of the factors that have been shown to influence CS include: the linguistic background of the speakers, their age or race, and their role in a conversation (Cheng & Butler, 1989). Furthermore, the level of fluency in the two languages has also been found to influence CS patterns, with more fluent bilinguals having different CS tendencies compared to less fluent bilinguals (Poplack, 1980). Most research to date has described and analyzed code-switching in the context of adult-to-adult speech or speech between an adult and a school-aged child. Little research has examined CS in adults' speech to younger children who are only just learning their native language(s). Only a couple of studies have explored the effect that CS might have on these children's vocabulary development (e.g. Byers-Heinlein, 2013; Place & Hoff, 2011). This work has been based solely on parental report and has relied primarily on caregivers' recollection and estimates of CS instances present in the speech addressed to their child. No previous studies have analyzed actual examples of the speech that bilingual infants are exposed to on a regular basis.
One possibility is that adults, to some degree, avoid code-switching when speaking to young children, for fear of causing linguistic confusion or increasing processing demands. Data from laboratory tasks with adult bilinguals suggest that there are, in fact, some processing costs associated with CS (e.g. Abutalebi, Brambati, Annoni, Moro, Cappa & Perani, 2007; Proverbio, Leoni & Zani, 2004). Hence, it might be expected that there would be similar, if not greater, processing costs to a young child. Previous work by Byers-Heinlein (2013) supports this possibility; she found a significant negative relationship between a parent-report rating of how often parents CS and 18-month-old children's receptive English vocabularies, as well as a marginal negative relationship between parental report of CS and 24-month-old children's productive English vocabularies. That is, greater rates of CS (based on parental report) were associated with infants having smaller productive and receptive vocabularies. It is not clear, however, whether parental report of code-switching is accurate, nor whether different types of CS might have differential effects.
Another possibility is that adults may not completely avoid CS when talking to young children, but may CS to a different degree or in a different manner than with other adults. For instance, they may CS predominantly across sentences (inter-sentential CS), rather than within sentences (intra-sentential CS). Inter-sentential CS generally consists of long strings of words in each language (e.g. I like the red house! ¿A ti cuál te gusta?). Intra-sentential CS, on the other hand, can involve primarily words in one language, with only one or a few words in the other language (e.g. The red casa is the one I like). Intra-sentential CS could potentially be more challenging for children to process, since it requires a rapid switch between lexicons within a single sentence. It also could potentially cause confusion as to which language a novel lexical item belongs; for example, in the sentence above, children might mistake casa as being an English word given the English context and similar phonology across languages. It is hence possible that adults might attempt to avoid this form of CS when talking to infants.
A third alternative is that caregivers might actually use CS as a way of teaching translation equivalencies across languages (TEs: cross-language synonyms). Thus, parents might produce utterances such as Look, it's a kitty! El gatito! as a way of making explicit the fact that the two words refer to the same object. A longitudinal study by David and Wei (2008) analyzed the language exposure and vocabulary of thirteen French–English bilingual children (12–36 months of age), and found that there was a significant correlation between language exposure and translation equivalents. Children with more balanced language input tended to have more TEs in their vocabulary. Language exposure was calculated as a percentage, based on parent report, and TEs were quantified by comparing MCDIs (MacArthur-Bates Communicative Development Inventories) in French and English. The authors examined the proportion of within-sentence CS during a parent–child interaction, and found no relationship between this measure and translation equivalents; however, they did not examine instances of translations in adjacent utterances (such as the one in the example above). In contrast, Poulin-Dubois and colleagues (2012) found only a marginal correlation between parent-reported exposure to a second language and the child's percentage of translation equivalents; they suggest that more balanced exposure leads to a more balanced vocabulary, but not necessarily to translation equivalents. It is important to note that they did not examine code-switching per se, but general percentage exposure to the L2. They suggest that the weak relationship between exposure and the proportion of TEs may relate to whether the vocabulary in the different languages is presented in similar environments as compared to being context-specific.
Assuming the context of the input is indeed a factor, then children who hear the same concept repeated across languages in adjacent sentences (e.g. in a similar environment) might be particularly likely to learn TEs. The mixed findings regarding the relationship between language exposure and the acquisition of TEs suggest that there is a need to further explore this topic. It remains unclear whether various types of CS (e.g. within-sentence CS vs. adjacent-utterance CS) may lead to differences in how TEs are learned (i.e. whether some forms of CS might ‘reinforce’ this learning process more than others).
It is also possible that parents do not alter their code-switching behavior when talking to children as compared to bilingual adults, or that they do so but this has no effect on the children's vocabulary development. Additionally, it is possible that there are differences in code-switches to young children that are an indirect result of other changes made when speaking to infants. For example, because speech to children generally has a shorter, simplified sentence structure, there may be fewer opportunities for certain types of code-switches to occur, even if parents do not actively avoid such behaviors. At present, the extent to which code-switching occurs when speaking to young children and what this might indicate in terms of adult intentions and child outcomes remain unclear.
Bilingual caregivers in different studies have consistently reported code-switching while talking to their infants. However, many questions remain regarding the characteristics of the CS heard by young children and the effect that this might have on their language development. The present study serves as a first attempt to describe more specific patterns of code-switching behavior in parents' speech to young children. Furthermore, we explore whether individual differences in CS influence children's lexical development. If CS is challenging for young language learners to process, then infants who are frequently exposed to CS may be at a relative disadvantage in terms of their vocabulary development compared to infants who are not exposed to CS, or who are exposed to CS less frequently. Alternatively, if parents use CS to try and teach translation equivalents, we might expect that bilingual children whose parents CS more often would have a greater number of overlapping words across their two vocabularies. Thus, there are two alternative hypotheses: CS might be detrimental to language learning as a result of increased processing demands. Alternatively, it might aid language development through the explicit teaching of translation equivalents. Little research distinguishes between these alternatives, yet they would have vastly different implications for parents and educators working with young children. These issues serve as the basis for the current study, which relies on the examination of parental speech samples to explore the amount and nature of CS to young children.
Defining code-switching
In order to examine how often parents CS, it is first necessary to determine what counts as an instance of code-switching. Researchers studying adults have developed a number of theories about the constraints on CS and what circumstances seem to promote or facilitate it (see Cantone, 2007; Isurin, Winford & de Bot, 2009; Muysken, 2000, for detailed reviews of theories and studies). There are ongoing debates regarding the appropriateness of the different theories. Two of the more prominent views of code-switching that are relevant to the current work can be distinguished by their focus on insertion versus alternation of units across the two languages (Boumans, 1998).
From an insertional perspective, CS is thought of as the ‘embedding’ of elements from one language into the syntactic frame of another language. Thus, under this view, there is an asymmetrical relationship between the two languages. According to the Matrix Language Frame (MLF) model developed by Myers-Scotton (1997), one language (the Matrix Language – ML) provides the syntactic frame, while the other language (the Embedded Language – EL) plays a more secondary role. Elements from the EL are embedded into a ‘frame’ that maintains the grammatical structure of the ML. In particular, mixed utterances maintain the word order, inflections, and the system morphemes (e.g. function words) of the ML, and any insertions must maintain congruency with the element of the ML that would have otherwise been used (Boumans, 1998).
From the alternation perspective, CS is viewed as the act of switching back and forth between languages, with switches tending to occur most often between utterances or sentences. Rather than embedding one language into a base language, there is a complete switch from the grammar and lexicon of one language to the other. In this approach, neither language is thought of as being a secondary contributor (Poplack, 1980). Rather, the languages possess equal roles and a speaker can alternate between them at his or her discretion.
These two theoretical approaches have implications for what should count as an instance of code-switching. For example, according to the alternation theory, single word switches are not considered to be true CS, but instead are referred to as ‘nonce borrowings’ (Poplack, Sankoff & Miller, 1988). In contrast, from an insertional perspective, any switched lexical item that does not fit the criteria of an established borrowing (or a word that has been transferred from one language to the other to fill a lexical gap, such as internet) is considered to be a code-switch (Myers-Scotton, 1997). If a speaker is mostly producing Spanish utterances, then produces one English sentence, and continues in Spanish, this would count as a single CS based on a matrix approach, but as two switches (one into English, and a second back into Spanish) based on a switching approach. If the speaker had produced three English sentences, instead of one, each sentence would be considered a separate embedding according to an insertional approach, but the number of sentences in a row would have no implication for a switching approach because they maintained the same language. Thus, the approach used to evaluate utterances produced by bilingual speakers will directly influence important measures such as CS frequency.
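The contrast between the two counting schemes for inter-sentential CS can be made concrete with a short sketch. It is an illustration only, not the coding procedure used in the study: the function name, the language labels `"spa"` and `"eng"`, and the representation of a transcript as a list of per-utterance language labels are all assumptions introduced here.

```python
from typing import List, Tuple


def count_inter_sentential(langs: List[str], matrix: str) -> Tuple[int, int]:
    """Return (insertional_count, alternation_count) for a sequence of utterance languages.

    `langs` holds one hypothetical language label per utterance; `matrix` is the
    speaker's matrix language.
    """
    # Insertional (matrix) view: every utterance outside the matrix language
    # counts as one embedding.
    insertional = sum(1 for lang in langs if lang != matrix)
    # Alternation (switching) view: every change of language between two
    # consecutive utterances counts as one switch.
    alternation = sum(1 for prev, cur in zip(langs, langs[1:]) if prev != cur)
    return insertional, alternation


# One English sentence inside Spanish speech: one embedding, two switches.
one_english = ["spa", "spa", "eng", "spa"]
# Three English sentences in a row: three embeddings, still two switches.
three_english = ["spa", "eng", "eng", "eng", "spa"]

print(count_inter_sentential(one_english, matrix="spa"))    # (1, 2)
print(count_inter_sentential(three_english, matrix="spa"))  # (3, 2)
```

The two toy sequences reproduce the two scenarios described above, which is why the same transcript can yield different CS frequencies under the two approaches.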
EXPERIMENT
In the present study, we analyzed speech samples produced by bilingual caregivers to their infants as a way of seeking a principled understanding of the following six questions: (i) How much do bilingual parents code-switch when talking to their young children? (ii) Are parents more likely to code-switch between utterances than within utterances (which would be expected if they were avoiding multilingual utterances)? And when code-switches occur within a sentence, where do they occur? (iii) Do bilingual parents repeat words across languages to create translation equivalences? (iv) Does parental language fluency or education predict the degree of code-switching? (v) Do parents' self-reports of the frequency with which they code-switch match their actual CS behavior in a laboratory setting? (vi) Does the degree of parental code-switching predict children's vocabulary size? The answers to these questions serve as a first exploration into the nature of bilingual parents' code-switching when speaking to their young children. We address each of these questions separately.
Given the lack of previous work examining actual speech samples of adult CS to infants, there is no set standard regarding which model of CS constraints to follow. In this work investigating parental CS to young children, we implemented Myers-Scotton's guidelines and used an insertional approach to quantify intra-sentential CS. However, we also analyzed CS separately via the switching approach, as will be described more fully in the coding section, below. By measuring CS by both approaches, we ensure that our results can be usefully applied regardless of the outcome of the theoretical debate.
GENERAL METHODS
Participants
The participants were twenty-four caregiver–child dyads. The children were between 17 and 24 months of age (11 males; M = 20·6 months, SD = 1·82). Thirty-three percent of the children had an older sibling in the household, while the other 67% were either first born or an only child. Each child was exposed to both English and Spanish from one or more of their caregivers. The caregiver of interest for this study was the Spanish–English bilingual who spent the most time interacting with the child. This was the mother in all but two cases. All of the children were exposed to a minimum of 30% and a maximum of 70% of both languages since birth and had not been previously diagnosed with any developmental problems. The caregivers spoke different varieties of Spanish: Argentinian Spanish (n = 1), Puerto Rican Spanish (n = 3), Cuban Spanish (n = 1), Salvadorian Spanish (n = 2), Colombian Spanish (n = 2), Peruvian Spanish (n = 4), Dominican Spanish (n = 1), Guatemalan Spanish (n = 5), Panamanian Spanish (n = 1), and Mexican Spanish (n = 2). Two families did not provide this information. Based on questionnaire data (described below), the matrix language was considered to be English for six of the caregivers, and Spanish for the remaining eighteen. Caregivers reported having completed either a master's degree (16% mothers, 13% fathers), bachelor's or equivalent (29% mothers, 25% fathers), or a 2-year college degree or below (46% mothers, 46% fathers). Additionally, one father reported having earned a doctorate degree. Two of the mothers and three of the fathers did not provide this information.
Materials
Two parent-report language questionnaires were used to measure child vocabulary: the MacArthur-Bates Communicative Development Inventory (MCDI) Words and Sentences (Fenson et al., 1993) and the Spanish-adapted version, the MacArthur-Bates Inventarios del Desarrollo de Habilidades Comunicativas Palabras y Enunciados (Jackson-Maldonado, Thal, Marchman, Newton, Fenson & Conboy, 2003). The combination of Spanish and English MCDIs has been used successfully with reliable results in several studies that examined bilingual children's language development (e.g. David & Wei, 2008; Pearson & Fernández, 1994; Pearson, Fernández, Lewedeg & Oller, 1997; Pearson, Fernández & Oller, 1993). Parents were specifically instructed to mark only words that they had heard their child say in that particular language on each MCDI form.
A language history questionnaire was also used to gather information about the language background of the parents, as well as the input provided to the child. The language history questionnaire was adapted from questionnaires by Bosch and Sebastián-Gallés (1997) and Byers-Heinlein (2009), and was written in both Spanish and English. Some questions asked for estimates of parents' proficiency in each language using an ordinal scale from 1 to 7 (1 = little or no knowledge, 7 = like a native speaker) and how and when they learned Spanish and English. Parents were also asked to provide an estimate of the amount of time they spoke with their child in Spanish and in English each day. Other questions asked parents to identify which language they would use in different situations (e.g. with friends, when out shopping), and to rate how true they felt different statements associated with their language use were (e.g. I often start a sentence in English and then switch to speaking in Spanish).
Stimuli
The participants were provided with a selection of twenty-five toys to play with during the play session. The toys included animals (horse, snake, octopus, fish, shark, butterfly, bears, dogs, cat, lobster, rabbit, cow, pig), food items (hot dog, orange, corn, egg), a Mrs Potato Head doll (with removable eyes/nose, mouth, arms, shoes, and hat), and other items that were expected to be somewhat familiar to the children (hairbrush, two pairs of star-shaped sunglasses, a small plane with wheels, and a plastic dog bowl). These items were selected so as to avoid English–Spanish cognates.
Procedure
All play sessions took place in a laboratory setting. A bilingual researcher explained the study to parents, speaking in either English or Spanish, depending on the parent's preference. However, the researcher also spoke briefly in the other language to make it clear that it was a bilingual setting and parents were free to treat it as such.
Parents were instructed to play with their child as they would at home and speak as they would normally in either language. Parents and children sat together on the floor, with a standard set of toys arrayed around them (described above). They were given an Audio Technica lavalier microphone to clip to their clothing, and the session was audio-recorded as an uncompressed WAV file using a Marantz PMD660 Professional Portable Digital Recorder at a sampling rate of 44·1 kHz. Speech samples of the parents were taken from these recordings. Play sessions lasted an average of 13 minutes. During their visit, children were also tested for an unrelated study, not described here.
Coding and analysis
Coders were fluent in both Spanish and English. They used English as their primary language and Spanish as their secondary language on a regular basis. The audio recordings of the play sessions were uploaded to a computer and orthographically transcribed using the Computerized Language Analysis (CLAN) program developed by the CHILDES project (MacWhinney, 2007). The CLAN program was used to link sound files directly to transcripts in small ‘bullet’ segments in order to facilitate accurate transcription. Utterance boundaries were determined using two of three criteria: after pauses longer than one second, after a terminal contour (drop in pitch), and/or after an obvious grammatical structure ending. The length of the samples varied from 88 to 480 utterances (M = 274·17, SD = 87·64, median = 275·5). After orthographic transcription was completed, these bullets were coded using Codes for the Human Analysis of Transcripts (CHAT), which allows for a variety of analyses using different tiers for each parameter of interest (MacWhinney, 2000). Coding and transcription procedures followed the CHAT manual (MacWhinney, 2007), along with additional coding conventions particular to this study. Each utterance was precoded for language (English, Spanish, mixed, or unassigned). When a speaker produced a single-word utterance that could be from either language (e.g. no), these instances were precoded in one of two ways. If the utterances immediately before and after were in the same language, the ambiguous utterance was coded in that language (e.g. You want some food? No? Why not?: all precoded as English). However, if there were two different languages preceding and following, the ambiguous utterance was marked as unassigned.
Code-switches were marked on a dependent tier. As sentences to young children are often quite short, switches of any size, including single word switches, were considered CS, following the MLF model (Myers-Scotton, 1997). However, proper names (e.g. Mamá, Mrs Potato Head) and words in the other language that functioned as a proper name (e.g. Tía used as the name of an aunt) were not considered CS. Switches from the matrix language to the embedded language could occur either between words within a sentence (intra-sentential CS) or at sentence boundaries (inter-sentential CS), and each was marked with a separate identifier. Parents often spoke to their children using two-word sentences, sometimes code-switching between the determiner and noun (e.g. El doggie.). We classified these instances as intra-sentential CS. However, if a parent switched languages and produced a one-word sentence without a determiner (e.g. ¿Qué es esto? Doggie.), we classified this as an inter-sentential switch.
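The precoding rule for ambiguous single-word utterances can be sketched as a small function. This is a hedged illustration rather than the coders' actual procedure, which was carried out by hand in CLAN: the labels `"eng"`, `"spa"` and `"unassigned"`, the use of `None` for an ambiguous utterance, and the treatment of edge cases not covered in the text (an ambiguous utterance at the start or end of a transcript, or next to a mixed or ambiguous neighbour) are assumptions made here.

```python
from typing import List, Optional

SINGLE_LANGUAGES = ("eng", "spa")  # hypothetical labels for English and Spanish


def precode(langs: List[Optional[str]]) -> List[str]:
    """Resolve ambiguous utterances (given as None) from their immediate neighbours."""
    resolved = []
    for i, lang in enumerate(langs):
        if lang is not None:
            resolved.append(lang)
            continue
        before = langs[i - 1] if i > 0 else None
        after = langs[i + 1] if i + 1 < len(langs) else None
        if before in SINGLE_LANGUAGES and before == after:
            # Same language on both sides: assign the ambiguous utterance to it.
            resolved.append(before)
        else:
            # Different languages on each side (or, by assumption here, a missing,
            # mixed, or ambiguous neighbour): leave the utterance unassigned.
            resolved.append("unassigned")
    return resolved


# "You want some food? No? Why not?" with an ambiguous "No?" in the middle.
print(precode(["eng", None, "eng"]))  # ['eng', 'eng', 'eng']
# The same "no" between an English and a Spanish utterance.
print(precode(["eng", None, "spa"]))  # ['eng', 'unassigned', 'spa']
```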
CS were marked separately based on whether they were inter-sentential or intra-sentential. For inter-sentential switches, we coded each transcript in two ways: first following a matrix approach, and second based on parameters from the switching model. Based on the matrix approach, consecutive embedded-language utterances were each marked as a CS. Based on the switching approach, each alternation between languages was also coded as a CS. We then used these coding conventions to calculate: (i) the total number of CS of each type, (ii) the percentage of intra-sentential and inter-sentential CS relative to the total number of (assigned) utterances, and (iii) the number of times a caregiver repeated the same word in the other language in an adjacent utterance.
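Measure (iii), cross-language repetition in adjacent utterances, can be sketched as follows. The sketch rests on several assumptions that are not part of the study: utterances are represented as a language label plus a list of lower-cased words, the translation pairs come from a toy lexicon invented for illustration, and each adjacent pair of utterances is counted at most once. In the study itself these repetitions were identified by bilingual coders, not by dictionary lookup.

```python
from typing import Dict, List, Tuple

Utterance = Tuple[str, List[str]]  # (language label, lower-cased words)

# Toy translation pairs, hypothetical and for illustration only.
TRANSLATIONS: Dict[str, str] = {"look": "mira", "kitty": "gatito"}


def count_cross_language_repeats(
    utterances: List[Utterance], translations: Dict[str, str]
) -> int:
    """Count adjacent utterance pairs in which a word is repeated in the other language."""
    # Make the lookup symmetric so it works in both switching directions.
    pairs = dict(translations)
    pairs.update({target: source for source, target in translations.items()})

    count = 0
    for (lang_a, words_a), (lang_b, words_b) in zip(utterances, utterances[1:]):
        if lang_a == lang_b:
            continue  # same-language repetition is not a code-switch
        if any(pairs.get(word) in words_b for word in words_a):
            count += 1
    return count


sample = [
    ("eng", ["look", "it's", "a", "kitty"]),
    ("spa", ["el", "gatito"]),
    ("spa", ["mira"]),
    ("eng", ["look", "look"]),
]
print(count_cross_language_repeats(sample, TRANSLATIONS))  # 2
```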
Inter-rater reliability
The ten middle utterances from ten of the transcripts were transcribed and coded by another researcher, using the same transcription and coding conventions. Cohen's kappa coefficients were calculated to measure agreement between the two coders for each transcript. The average kappa coefficient for language assignment was 0·79; and for type of CS was 0·68. These coefficients are considered to represent ‘substantial agreement’ (ranging from 0·61 to 0·8) (Landis & Koch, 1977). Some of the main sources of disagreements across coders included difficulty hearing/understanding words in the audio recordings, as well as some difficulty determining where to separate the utterances. Most of these discrepancies were resolved through discussion between the coders.
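For readers unfamiliar with the statistic, Cohen's kappa compares the observed proportion of agreement with the agreement expected by chance from each coder's label frequencies. The sketch below is a minimal illustration with invented labels for ten utterances; it is not the study's data, and the function names are assumptions. The interpretive bands follow Landis and Koch (1977).

```python
from collections import Counter
from typing import Sequence


def cohens_kappa(coder_a: Sequence[str], coder_b: Sequence[str]) -> float:
    """Cohen's kappa for two coders who labelled the same items."""
    if len(coder_a) != len(coder_b) or len(coder_a) == 0:
        raise ValueError("Both coders must label the same non-empty set of items.")
    n = len(coder_a)
    # Observed agreement: proportion of items given the same label by both coders.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement: product of the coders' marginal proportions, summed over labels.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        raise ValueError("Kappa is undefined when chance agreement equals 1.")
    return (observed - expected) / (1 - expected)


def landis_koch(kappa: float) -> str:
    """Verbal label for a kappa value, following the Landis and Koch bands."""
    if kappa < 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Invented language assignments for ten utterances (not data from the study).
coder_a = ["eng"] * 5 + ["spa"] * 4 + ["mixed"]
coder_b = ["eng"] * 5 + ["spa"] * 3 + ["mixed"] * 2
print(round(cohens_kappa(coder_a, coder_b), 3))  # 0.836
print(landis_koch(0.79), landis_koch(0.68))      # substantial substantial
```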
RESULTS AND DISCUSSION
Question (i): How much do parents code-switch with their young children?
All of the parents in our sample code-switched at least one time during the play session. We had predicted that CS would occur, but that every parent did so, in only a 13-minute play session, suggests that CS may not be an uncommon occurrence in speech to young children.
However, the amount of code-switching did vary greatly from parent to parent, and there was also significant variation between types of code-switching. The exact amount of CS depends on the approach used to compute this value. The range for intra-sentential (within-sentence) code-switches was 0–43 (M = 7·5, SD = 10·1, median = 2·5), as shown in Figure 1. Overall, an average of 3·6% (SD = 4·93) of parents' language-assigned utterances contained a within-sentence CS, as did 3·9% (SD = 5·24) of their multiword utterances. The range for inter-sentential CS was 0–143 using a matrix approach (M = 25·4, SD = 37·14, median = 6·5) and 0–116 using a switching approach (M = 25·4, SD = 32·25, median = 13). An average of 12·12% (SD = 15·76) of utterances contained inter-sentential CS by the matrix approach, and an average of 12·09% (SD = 12·54) of utterances contained inter-sentential CS by the switching approach. Figure 2 shows a histogram of inter-sentential CS using the switching approach. Furthermore, there was one caregiver who switched languages between more than a third of her utterances (i.e. 90 switches out of 246 total utterances, or 36·6%).
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160922034250-08158-mediumThumb-S0305000914000695_fig1g.jpg?pub-status=live)
Fig. 1. Histogram of intra-sentential code-switching frequency across parents.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160922034250-56431-mediumThumb-S0305000914000695_fig2g.jpg?pub-status=live)
Fig. 2. Histogram of inter-sentential code-switching frequency across parents.
Combining across types of CS, an average of 15·8% (SD = 16·9; range: 0·4–58·5%) of utterances contained CS by the matrix approach, and 15·7% (SD = 14·9; range: 0·4–45·8%) of utterances contained CS by the switching approach. These values suggest that many infants are likely to be hearing a substantial proportion of CS on a daily basis. Certainly, this suggests that parents are either not attempting to avoid CS, or not successful at doing so.
Question (ii): Do parents primarily code-switch between utterances rather than within utterances, perhaps indicating avoidance of multi-language utterances? And if they CS within utterances, where do these code-switches occur?
As noted above, parents had within-sentence code-switches on an average of 3·9% of their multiword utterances, but there was a substantial range across parents. One parent had no intra-sentential code-switches at all; another had a CS within more than 20% of her utterances. Most parents (15 out of 24), however, had more CS between utterances than within utterances (t(23) = 2·46, p = ·02, although it is important to note that these proportions are calculated over different values). Nevertheless, this difference was not consistent enough to suggest that parents were specifically avoiding intra-sentential CS. There was no correlation between the proportion of utterances with an inter-sentential vs. intra-sentential CS (r = 0·10), although there was a slight (non-significant) correlation between the actual number of CS of each type (r = 0·33; p = ·12), which could be related to general talkativeness. This might suggest that the two types of CS occur for different reasons or based on different conversational pressures. Regardless, it appears that at least some children are exposed to within-sentence CS quite often.
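The reported degrees of freedom (23, with 24 parents) are consistent with a within-parent comparison of the two proportions. Assuming a paired t-test, the statistic can be sketched as below; the four data points are invented for illustration and are not the study's values, and the function name is hypothetical. No p-value is computed, since that would require the t distribution from a statistics library.

```python
import math
from statistics import mean, stdev
from typing import Sequence, Tuple


def paired_t(x: Sequence[float], y: Sequence[float]) -> Tuple[float, int]:
    """Paired-samples t statistic and its degrees of freedom."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("Need two equally long samples with at least two pairs.")
    # Work on the per-parent differences between the two measures.
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    # t = mean difference divided by its standard error (sample SD / sqrt(n)).
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1


# Invented percentages of utterances with inter- and intra-sentential CS
# for four hypothetical parents.
inter = [12.0, 5.0, 20.0, 3.0]
intra = [4.0, 6.0, 8.0, 1.0]
t_value, df = paired_t(inter, intra)
print(f"t({df}) = {t_value:.2f}")
```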
We then identified all instances of such intra-sentential CS, and classified where in the sentence the language switch took place. Over half of the CS examples contained a switch occurring between a determiner and an immediately subsequent noun (e.g. can I have a beso?, dame el apple!). Many others occurred between either a pronoun or an adjective and an immediately subsequent noun (el otro fishy, go get your huevo). Thus, the vast majority of examples involved a switch at the noun itself. Research on adult-directed CS suggests that switches typically occur at points of structural equivalence across languages, where a switch would not violate either language's syntactic rules (Poplack, 1980). According to this ‘equivalence constraint’, CS could occur either before the noun or before the determiner (e.g. dame the apple, or dame el apple); the bias for having the switch within the noun phrase has not been clearly predicted, and future work should explore whether this bias is unique to child-directed speech. Interestingly, there were a number of instances in which the CS went from a Spanish determiner to an English noun, but where the determiner was not the appropriate gender for what the word would have been in Spanish. That is, parents would say things such as el butterfly and mira un orange, where the determiner was masculine but the Spanish words for butterfly (mariposa) and orange (naranja) are both feminine. Although the number of cases of such atypical gender use is relatively small, this is an intriguing direction for future study.
Question (iii): Do parents repeat words across languages, possibly so as to create translation equivalences?
We next examined whether parents CS in order to repeat words in adjacent utterances. Monolingual parents often repeat words when speaking with young children, particularly when attempting to teach new words (e.g. Bard & Anderson, 1994; Broen, 1972; Phillips, 1973; Snow, 1972). Here we examine how often this occurred across languages.
On average, parents CS in a way that resulted in words being repeated in adjacent utterances 6 times per dyad (mean = 6·25; SD = 10·63), or on 2·4% of utterances, but here, too, there was substantial variability across parents. Most parents did this 5 times or fewer (n = 19); but 5 parents did this quite often (ranging from 10 to 42 times per parent). Proportioned over the number of utterances a parent produced, this ranged from 0 to 17%.
Repetition is quite common in speech to young children generally, and as a result these proportions may simply be reflecting the frequency of code-switching combined with frequent repetition. That is, there is no clear indication that parents are purposefully attempting to provide translation equivalences to their young children. Nonetheless, if children are hearing such repetition across languages in their input, they might use this to help learn the relationships between different vocabulary sets, a point we return to in question (v). However, the repetitions identified in the current work did not appear to be limited to words likely to be novel to the children, suggesting that these types of CS might instead (or additionally) be used as an attention-getting device. Words like look and come here were often code-switched and sometimes repeated again in the other language (e.g. Look at this! ¡Mira! Look look!).
Question (iv): Does parental fluency or education predict the degree of code-switching?
We collected two measures of parental fluency in each language (see Table 1). First, we identified the age at which parents learned their second language, or age of acquisition (AOA). Second, we asked parents to rate their proficiency in each language. Finally, we gathered information on parental education, as well. We correlated each of these measures both with how often parents CS inter-sententially and how often they did so intra-sententially. Three parents did not provide complete information and were excluded from the relevant analyses.
Table 1. Demographic data related to parental fluency
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160922034250-59170-mediumThumb-S0305000914000695_tab1.jpg?pub-status=live)
note: (a) Data was only available for twenty-one participants.
The first variable, age of acquisition of the second language, varied greatly in our sample. Some parents reported learning both languages from birth, whereas others acquired their second language during elementary or secondary school, or later in life. The correlation between the age of acquisition and proportion of intra-sentential CS did not reach significance (r(20) = –0·21, p = ·36). These results were similar for inter-sentential CS by both the matrix approach (r(20) = –0·24, p = ·29) and the switching approach (r(20) = –0·23, p = ·32). Prior research has suggested that bilinguals who are more fluent are more likely to show code-switching within a sentence, while less fluent bilingual adults are more likely to CS between sentences (Poplack, Reference Poplack1980). This might suggest that we would find a correlation between greater fluency (indicated by a younger age of acquisition) and more CS, at least for intra-sentential CS. While the correlations here were in the appropriate direction, they did not reach significance in this sample, and in fact were quite similar for intra-sentential and inter-sentential CS. However, Poplack's claim was actually regarding the proportion of intra-sentential CS out of all CS, not the proportion of intra-sentential CS out of all utterances. In fact, we did not find this effect either: the correlation between the proportion of intra-sentential CS (out of all CS) and the parent's AOA was quite weak (r(20) = 0·05, p = ·82). It is not clear whether this is simply the result of a lack of power, or indicative of different patterns of CS when speaking to children vs. adults.
The next variable, self-reported proficiency, was quantified using a rating scale from 1 to 7 (1 = little to no knowledge, 7 = native-like). The majority of parents rated themselves as being equally fluent in both languages or almost equally fluent. All but one parent reported native-like proficiency in Spanish. This particular parent rated her Spanish ability as a 4, although she reported learning both languages from birth. There was more variety in parents' rated proficiency in English, with a third of the parents reporting only moderate fluency. Surprisingly, though, this also did not predict code-switching; there was no relationship between proficiency and intra-sentential switches (r_s(20) = 0·11, p = ·33) or inter-sentential CS (matrix: r_s(20) = 0·12, p = ·30; switching: r_s(20) = 0·13, p = ·28). It is possible that this 7-point rating scale was not sensitive enough to differences in fluency among the caregivers.
The third variable, years of education, varied considerably in our sample, ranging from an eighth-grade education through a master's degree. Education likewise did not correlate significantly with intra-sentential CS (r(23) = –0·32, p = ·13) or inter-sentential CS (matrix: r(23) = 0·07, p = ·75; switching: r(23) = –0·30, p = ·15), although there was a weak, non-significant trend towards more language switching with lower education levels.
In general, though, while parents vary in the extent to which they code-switch with their children, this does not appear to be tied to general demographic patterns. Of the three demographic variables, age of acquisition, rated proficiency, and education, none correlated with amount of CS in our sample.
Question (v): Do parents' ratings of their CS behavior match their CS in the play session?
Although we did not initially collect information from parents regarding how often they judged that they code-switched when speaking to their children, we added this question to our language questionnaire half-way through the study, and have this data for fifteen of the twenty-four parents. Questions about CS behavior were asked in a number of ways: we asked parents how often they found themselves starting a sentence in English and switching to Spanish part-way through, how often they did the reverse, how often they ‘borrowed’ a word from Spanish when using English (or the reverse), and how often they mixed the two languages. Parents rated each question on a 1 to 7 scale.
Interestingly, four parents reported that they never mixed languages when speaking with their children. Although three of these parents CS relatively infrequently, each did so, despite the relatively short period of time in the lab, and one did so quite often (13 intra-sentential code-switches). This supports prior work by Goodz (Reference Goodz1989) that suggests that even parents committed to maintaining a one-parent-one-language distinction nonetheless use both languages with their children.
We then correlated parental responses to the question on mixing languages with their observed CS behavior (see Table 2). We found no correlation for intra-sentential CS (r_s(14) = 0·27, p = ·33) but a significant correlation for inter-sentential CS (r_s(14) = 0·56, p = ·03 by the matrix approach, r_s(14) = 0·54, p = ·04 by the switching approach). Effects were stronger for parents' ratings of how often they switched between languages and of how often they borrowed words across languages.
Table 2. Correlations between parental reports of CS and actual CS behavior (n = 15)
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160922034250-87390-mediumThumb-S0305000914000695_tab2.jpg?pub-status=live)
notes: All correlations are Spearman's, since data consists of ratings; * = p < ·05; ** = p < ·01.
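As an illustration of how rank-based correlations such as those in Table 2 can be computed, the following Python sketch implements Spearman's coefficient as a Pearson correlation on tie-averaged ranks. It is not the analysis code used in this study; the function names and the example ratings and proportions are hypothetical and chosen only to show the calculation.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of tied values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def spearman(x, y):
    """Spearman's rank correlation: Pearson's r computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical data (not from the study): 1-7 self-ratings of language mixing
# and observed proportions of inter-sentential CS for eight parents.
ratings = [1, 1, 2, 3, 3, 4, 5, 7]
observed = [0.00, 0.02, 0.01, 0.05, 0.04, 0.10, 0.08, 0.20]
print(round(spearman(ratings, observed), 3))
```

Averaging the ranks of tied values matters here because a 7-point rating scale produces many ties across parents.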
These results suggest two things. First, even children whose parents attempt to use a strict one-parent-one-language approach to communication may still hear code-switching on a relatively frequent basis (see also Goodz, Reference Goodz1989). That is, parents may actually code-switch even when they state this does not happen. But second, those parents who report more mixing between languages are generally accurate in their self-report. This bodes well for future research on this topic, in that it suggests that parental self-report may be fairly accurate. However, given the small size of this sample, we would nonetheless suggest that future work continue to explore speech samples actually produced by caregivers. Finally, the significant correlations between parental ratings and actual behavior seem to be limited to inter-sentential code-switching. Parents' self-reports of ‘language mixing’ do not appear to be an accurate indicator of their within-sentence code-switches.
Question (vi): Does the degree of code-switching predict children's vocabulary size?
Our final analyses examined whether any of these CS measures (intra-sentential, inter-sentential, or adjacent) have implications for children's vocabulary development. As noted earlier, based on adult literature there is reason to believe that CS might entail processing costs. Proverbio et al. (Reference Proverbio, Leoni and Zani2004) found that Italian–English interpreters were slower to make judgments about mixed-language sentences than unmixed ones. Similarly, a number of studies have reported evidence of switching costs at the neural level, through the recording of event-related potentials (ERPs) and functional magnetic resonance imaging (fMRI) (Abutalebi et al., Reference Abutalebi, Brambati, Annoni, Moro, Cappa and Perani2007; Chauncey, Grainger & Holcomb, Reference Chauncey, Grainger and Holcomb2008; Duñabeitia, Dimitropoulou, Uribe-Etxebarria, Laka & Carreiras, Reference Duñabeitia, Dimitropoulou, Uribe-Etxebarria, Laka and Carreiras2010; Proverbio et al., Reference Proverbio, Leoni and Zani2004). Given these findings, we would expect that young children would experience even greater processing costs associated with CS, as toddlers have fewer resources available to use since they are still learning each language. The presence of CS in the input could potentially be confusing or challenging for the child and reduce the cognitive resources available to learn words from such sentences, resulting in smaller vocabularies.
On the other hand, it is quite possible that CS actually provides explicit cues as to the relationships between languages, particularly for parents who frequently repeat words across languages. Frequency of presentation has been shown to have a significant effect in monolingual vocabulary acquisition; for example, Goodman, Dale, and Li (Reference Goodman, Dale and Li2008) found a positive correlation between how often parents used a set of test words and the age at which those are typically acquired. Vosoughi, Roy, Frank, and Roy (Reference Vosoughi, Roy, Frank and Roy2010) assessed 690 hours of input to a particular child, and found that a word's recurrence (i.e. how often a word was repeated within a short time span – approximately one minute) correlated strongly with the age of acquisition for that particular word. Thus, it seems that children in the early stages of lexical acquisition benefit greatly from repetition in general. When parents provide repeated exposure to new words, this seems to aid in the storing and accessing of words (Hoff & Naigles, Reference Hoff and Naigles2002). Under this account, children who hear more CS that includes words being repeated in adjacent utterances might have more translation equivalents in their vocabularies, as well as larger vocabularies overall.
Total vocabulary measures
Spanish and English MCDI scores were used to quantify children's vocabulary. Two vocabulary counts were calculated: total vocabulary (TV: total number of word forms known in Spanish and in English combined) and total conceptual vocabulary (TCV: total vocabulary minus overlapping vocabulary, which reflects the total number of concepts for which the child has at least one word). These numbers would differ if the child had a large number of translation equivalents in his or her vocabulary (e.g. perro and dog), since the TCV would count such word pairs only once, but the TV would count them twice.
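The relationship between the two counts can be made concrete with a short Python sketch. It assumes, purely for illustration, that each reported word form has been hand-labelled with the concept it names; the function name, the concept labels, and the example words are hypothetical, and the cognate adjustment is deliberately left out.

```python
def vocabulary_counts(spanish, english):
    """Return (TV, TCV, number of translation equivalents).

    `spanish` and `english` map each reported word form to a concept label.
    """
    spanish_concepts = set(spanish.values())
    english_concepts = set(english.values())
    # A translation equivalent is a concept named in both languages.
    translation_equivalents = spanish_concepts & english_concepts
    # TV counts every word form in either language.
    total_vocabulary = len(spanish) + len(english)
    # TCV counts each concept once, however many words name it.
    total_conceptual = len(spanish_concepts | english_concepts)
    return total_vocabulary, total_conceptual, len(translation_equivalents)

# Hypothetical child: knows perro and dog for the same concept.
spanish = {"perro": "DOG", "leche": "MILK", "agua": "WATER"}
english = {"dog": "DOG", "ball": "BALL"}
tv, tcv, te = vocabulary_counts(spanish, english)
print(tv, tcv, te)
```

When a child has at most one word per concept in each language, as in this example, TCV equals TV minus the number of translation equivalents.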
One concern is that the presence of cognates could inflate the child's apparent vocabulary. Many parents had marked cognate words (e.g. tren and train) on both the Spanish and English forms, despite the fact that it was nearly impossible to discern whether the children truly possessed separate lexical representations for these cognate words. However, cognates made up a relatively small percentage of children's vocabularies, with an average count of only 2·8 cognates per child (range 0–19); given this small number, we simply removed cognates from the total vocabulary count in all cases (but not from total conceptual vocabulary, since the child clearly had at least one word for this concept).
These two vocabulary totals (TV and TCV) were used in correlational analyses with the amount of code-switching in parents' speech. One of the parents completed the vocabulary inventory incorrectly, marking all of the child's words (in both languages) on one MCDI, rather than differentiating between the Spanish and English vocabularies. Since we were unable to determine whether this child had any translation equivalents in her vocabulary, her data is used only in the measures involving TCV.
There was a large amount of variation in children's vocabulary counts. Total conceptual vocabulary ranged from 3 to 396 words (TCV: M = 92·46, SD = 121·72, median = 38·5). The children varied in age over a 6-month span, and as a result, age correlated with vocabulary (r(23) = 0·46, p = ·029; see Table 3). We therefore explored effects of vocabulary in the analyses below both when ignoring age, and when using age as a covariate. Total vocabulary (including each word form, but ignoring cognates) ranged from 3 to 509 words (TV: M = 114·08, SD = 155·2, median = 41).
Table 3. Correlation matrix between parental input and child outcomes
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160922034250-92799-mediumThumb-S0305000914000695_tab3.jpg?pub-status=live)
notes: All correlations in this table are Pearson's, despite some factors being parental ratings which are, by nature, ordinal data rather than interval. * = p < ·05; ** = p < ·01; Significant correlations are in gray, to aid in clarity. For proportion of intra-sentential CS, we proportioned over the total of multiword utterances, rather than over all utterances, since single-word utterances cannot logically contain an intra-sentential CS. For inter-sentential CS, we proportioned over all utterances.
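The two denominators described in the notes to Table 3 can be sketched as follows. This is a minimal Python illustration, not the coding scheme used in the study: the utterance records, their field names, and the choice to count each utterance at most once per CS type are assumptions made for the example.

```python
def cs_proportions(utterances):
    """Return (intra-sentential proportion, inter-sentential proportion).

    Each utterance is a dict with 'words' (list of str) and two hypothetical
    boolean flags, 'intra' and 'inter', marking the coded type of CS.
    """
    # Intra-sentential CS is proportioned over multiword utterances only,
    # since a single-word utterance cannot contain a within-sentence switch.
    multiword = [u for u in utterances if len(u["words"]) > 1]
    intra = sum(u["intra"] for u in multiword) / len(multiword) if multiword else 0.0
    # Inter-sentential CS is proportioned over all utterances.
    inter = sum(u["inter"] for u in utterances) / len(utterances) if utterances else 0.0
    return intra, inter

# Hypothetical transcript fragment.
example = [
    {"words": ["mira"], "intra": False, "inter": True},
    {"words": ["look", "at", "the", "perro"], "intra": True, "inter": False},
    {"words": ["it's", "a", "dog"], "intra": False, "inter": False},
    {"words": ["ven", "aquí"], "intra": False, "inter": True},
]
print(cs_proportions(example))
```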
Many of the children in the present study were reported to have surprisingly low vocabularies. Indeed, fourteen of the twenty-four children had total vocabularies (TV) of fewer than 50 words. It is not clear whether this is an indication of poor vocabulary estimation on the part of the parents, or whether the children's lexical development was actually delayed; this may indicate a need for follow-up work with a larger sample of children. Prior research has suggested that monolingual lower-SES families may tend to overestimate vocabulary rather than underestimate it (Fenson et al., Reference Fenson, Dale, Reznick, Bates, Thal and Pethick1994), but this may differ for bilingual families. These findings may also be related to parent education level and its effects on child vocabulary. A large number of studies have demonstrated that SES is related to vocabulary outcome (see Hoff, Reference Hoff2013; Miser & Hupp, Reference Miser and Hupp2012; Rowe, Reference Rowe2008, for some recent reports) and education level is frequently used as a proxy measure for SES in these studies. In the current study, 46% of mothers and 46% of fathers had a 2-year college degree or below. This is a lower education level than that found in many studies of child language, which tend to enroll middle- to upper-class households, but is not substantially lower than that from the MCDI norming studies (Fenson et al., Reference Fenson, Dale, Reznick, Bates, Thal and Pethick1994).
Correlations were conducted between each type of CS (both raw counts of CS and proportions of CS relative to total number of utterances) and each measure of vocabulary. Results were similar when based on raw counts of CS vs. proportions of CS relative to the total number of utterances; we therefore report proportional data. Results likewise did not change when age was controlled. Results of these correlations are presented in Table 3, in the lower left quadrant. We found a significant correlation between intra-sentential CS and vocabulary (r = 0·49, p = ·018 for total vocabulary, r = 0·50, p = ·013 for total conceptual vocabulary); results are slightly stronger when age is controlled for (r = 0·54 and r = 0·55, respectively, both p < ·01). A greater amount of intra-sentential CS from the parent was associated with children having a larger productive vocabulary, an effect in the opposite direction to our initial prediction.
One concern is that this effect may have been driven by an apparent outlier, a child with a total conceptual vocabulary of 396 words and total vocabulary of over 500. This child's vocabulary was more than 2 SD beyond the mean of the group, generally considered to be indicative of an outlier. However, the real concern with an outlier is whether that particular datapoint might be having too much influence on the overall results. To evaluate this, we calculated Cook's D, a measure of influence, from a regression model predicting total vocabulary from proportion of intra-sentential CS and age. No entry had a Cook's D above 0·32; generally, values on this statistic above 1·0 are worthy of concern (Cook & Weisberg, Reference Cook and Weisberg1982). This result suggests that no one child was having too much influence on the overall regression. Thus, while the child may be an outlier in terms of vocabulary, he or she was not the primary cause of the correlation as a whole. (Indeed, while removing this child's data from the correlation weakens the apparent effect somewhat, the trend remains. Controlling for age, correlations between intra-sentential CS and vocabulary are r = 0·41, p = ·064 for total vocabulary, and r = 0·42, p = ·056 for total conceptual vocabulary.) Based on the above justification, this child was included in the sample and all the subsequent analyses. Taken together, this set of data suggests that there was a correlation between parental code-switching and child vocabulary. However, the effect is such that more code-switching was related to a larger child vocabulary, not a smaller one. There was certainly no evidence to suggest that parents' code-switching has any negative impact on children's vocabulary. This data fails to support suggestions that code-switching would be detrimental to lexical acquisition.
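The text does not specify how age was controlled. One common approach is a first-order partial correlation, which removes the linear association of each variable with age before correlating them. The Python sketch below shows that calculation under this assumption; the function names and the example values are hypothetical and are not the study's data.

```python
from math import sqrt

def pearson(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def partial_correlation(x, y, z):
    """Correlation between x and y with the linear effect of z removed."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical values: proportion of intra-sentential CS, vocabulary size,
# and child age in months for six dyads.
cs_proportion = [0.00, 0.01, 0.03, 0.02, 0.06, 0.10]
vocabulary = [5, 20, 45, 30, 150, 300]
age_months = [18, 19, 21, 20, 22, 24]
print(round(partial_correlation(cs_proportion, vocabulary, age_months), 3))
```

If the covariate is unrelated to both variables, the partial correlation reduces to the ordinary Pearson correlation.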
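For readers unfamiliar with Cook's D, the following Python sketch shows how the statistic can be computed for an ordinary least-squares regression with an intercept. It uses the standard closed form, D_i = e_i² / (p · MSE) · h_ii / (1 − h_ii)², where e_i is the residual, h_ii the leverage, and p the number of fitted coefficients. It is not the analysis code used here; the helper names and the example values are hypothetical.

```python
def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * w for v, w in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def cooks_distance(predictors, outcome):
    """Cook's D for each observation; `predictors` holds one row per case."""
    X = [[1.0] + list(row) for row in predictors]  # prepend intercept column
    n, p = len(X), len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(X, outcome)) for i in range(p)]
    beta = solve(xtx, xty)  # least-squares coefficients
    residuals = [y - sum(b * v for b, v in zip(beta, r)) for r, y in zip(X, outcome)]
    mse = sum(e * e for e in residuals) / (n - p)
    distances = []
    for row, e in zip(X, residuals):
        w = solve(xtx, row)                           # (X'X)^-1 x_i
        leverage = sum(a * b for a, b in zip(row, w))  # h_ii
        distances.append(e * e / (p * mse) * leverage / (1 - leverage) ** 2)
    return distances

# Hypothetical values: proportion of intra-sentential CS, age in months,
# and total vocabulary for eight children.
cs = [0.00, 0.01, 0.02, 0.03, 0.05, 0.06, 0.08, 0.12]
age = [18, 20, 19, 22, 21, 23, 20, 24]
vocab = [4, 10, 25, 30, 60, 90, 150, 400]
distances = cooks_distance(list(zip(cs, age)), vocab)
flagged = [i for i, d in enumerate(distances) if d > 1.0]  # 1.0 threshold as in the text
print(max(distances), flagged)
```

Because the statistic combines residual size with leverage, a child can be extreme on vocabulary alone without exerting undue influence on the fitted regression.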
What might explain this relationship? We cannot be certain, but one interesting pattern emerges: a large number of children in this study had quite low vocabulary scores (13 children had total vocabularies of under 50 words). Most of these parents also had low proportions of intra-sentential code-switching (mean = 2·6%; 9 parents had 4 or fewer examples). Perhaps parents of these low-vocabulary children were either avoiding code-switching within sentences, or were simply speaking to their children in shorter, simpler sentences (which left fewer opportunities for intra-sentential CS). Or, to put it another way, perhaps parents only begin to code-switch intra-sententially once the child acquires a sufficiently large productive vocabulary. Future work, particularly longitudinal work, could explore this possibility directly.
Translation equivalents
Our final analysis examined the number of translation equivalents in children's vocabulary. This was calculated by counting the number of referents for which the child had two lexical items, one in each language. This was proportioned against total vocabulary, and both the raw and proportioned TE counts were used in correlational analyses with the number of times the caregiver translated a word in an adjacent utterance.
Given the relatively low vocabulary counts overall, it is not surprising that few children had many translation equivalents. Indeed, this formed a bimodal distribution, with 5 children having 50 or more translation equivalents, and 18 having fewer than 10 (mean = 18; range = 0–113). Based on the proportion of total vocabulary, children ranged from 0% translation equivalents to 23·5% (mean = 6·74%). Controlling for age, there was no correlation between translation equivalents and parents' repetitions across languages, based either on raw counts of repetitions (r = 0·05) or on proportions across utterances (r = 0·02). There was likewise no correlation between the proportion of translation equivalents in the child's vocabulary and parents' repetitions across languages, based either on raw counts (r = 0·15) or on proportions across utterances (r = 0·20). Thus, there is no evidence that hearing more CS of any type (immediately adjacent or not) aids in the development of translation equivalents. It is not clear what to make of this finding, since it relies on a very small N (only 5 children had large numbers of TEs). However, in general, while there are substantial differences in children's vocabulary patterns, these do not appear to be accounted for by our measure of parental word repetition across languages.
A complete correlation matrix of all data can be found in Table 3.
FINAL DISCUSSION
The present study investigated characteristics of parents' code-switching behavior when addressing their children, and the relationship between this CS and children's vocabulary development. Among our sample, parents varied in the amount and types of CS, but each parent code-switched at least one time during a short play session. It is possible that parents only code-switch occasionally or for specific words, but nonetheless, it appears that code-switching may be more customary in speech to young children than previously predicted. The fact that parents CS so frequently when speaking to their infants suggests that this is in fact a relevant phenomenon in bilingual language acquisition, and that there is a need for future exploration of this topic, so that we can better understand the implications of this behavior.
Parents were found to CS more often inter-sententially than intra-sententially. We measured inter-sentential CS according to two different theoretical perspectives, and by either approach, CS occurred between sentences more frequently than within sentences, even when single-word sentences were excluded from analysis. The differences were significant, but it is unclear what is driving this preference for code-switching inter-sententially. There are several possible explanations. The first is that parents CS more frequently between sentences because it could potentially be less disruptive to the child. That is, a switch that occurs at a logical boundary might be less disruptive than one that occurs inside a sentence. Intra-sentential CS generally consists of one or a few words from the second language being inserted into an utterance in the first language. Our presumption had been that this type of CS would be more disruptive to processing, and potentially more confusing for children. This possibility could account for why parents do not switch inside a sentence as often. However, 63% of parents produced intra-sentential CS more than once (83% at least once), suggesting that parents were not generally trying to avoid these altogether, despite the potential confusion this might cause young language learners.
Another possible reason for a preference for inter-sentential CS could be related to the language proficiency of the parents. Intra-sentential CS is argued to require a higher level of mastery of both languages' grammars in order to CS easily and appropriately (Poplack, Reference Poplack1980). The majority of parents in our sample were not balanced bilinguals, and could have had more difficulty code-switching within sentences. If this were true, we would expect more intra-sentential CS compared to inter-sentential CS in those parents who were more skilled in both languages. However, we did not find that self-rated proficiency was related to intra-sentential CS in our sample.
As already noted, we measured inter-sentential CS according to two different theoretical perspectives, one based on ‘switches’ between languages, and the other based on a matrix or frame language into which other speech can be inserted. It is not clear which of these approaches would have greater psychological reality for the child; indeed, this may change with the child's own linguistic development. For example, for a young infant, any change in the continuing sound pattern may be striking, regardless of direction. In contrast, a child who has begun to acquire a dominant language may be processing input in a way more comparable to the matrix approach. The children in this study may themselves differ in this respect. However, despite the theoretical distinctions between these code-switching perspectives, they did not lead to any substantive differences in the current results; the number of code-switches tended to be quite comparable in the two approaches for most parents (r = 0·95). There were a few individual cases, however, where parents had rather different numbers depending on the approach. For example, one parent had 5 inter-sentential CS by a matrix approach, but 22 by a switching approach, whereas another had 143 by a matrix approach but only 90 by a switching approach. Thus despite generally comparable results, these theoretical distinctions may be important in individual cases.
A substantial number of the parents' code-switches were translations of words that they had previously said in the other language and included repetitions of words that were not likely to be unfamiliar to the child (e.g. Look!, ¡Mira! It's a fishy). This repetition across languages is not a behavior that is commonly seen in adult-to-adult speech, unless the speaker is trying to emphasize or to clarify a word with which the listener does not seem to be familiar. However, repetition even within a language is far more common in child-directed speech, so this CS across languages may simply be an outcome of the increase in repetition more generally.
Most CS occurred immediately prior to a noun: either subsequent to a determiner (un fishy) or following an adjective or preposition. Moreover, there were a number of instances where a masculine Spanish determiner was used, even though the subsequent English word would have been feminine had it been spoken in Spanish. These patterns have not been previously described in the adult-directed CS literature, and should be investigated in more depth in the future.
One of the main findings is that there was no evidence that the degree of parental CS had a negative impact on children's vocabulary development. In fact, there was a significant positive correlation between intra-sentential CS and children's vocabulary, such that parents who CS more often within sentences had children with larger (not smaller) vocabularies. This could be an indication that parents tend not to code-switch within sentences until their children reach a certain level of vocabulary skill; a longitudinal study would be needed in order to explore this possibility in more depth.
The finding of a positive relationship contradicts a recent study by Byers-Heinlein (Reference Byers-Heinlein2013) that found a negative relationship between CS and vocabulary development. This may be the result of methodological differences; for example, Byers-Heinlein used an English-only measure of receptive vocabulary, whereas we used an expressive measure collected across the children's two languages. Moreover, Byers-Heinlein's measure of code-switching was based on parent report; such reports theoretically capture the general pattern of parental speech in a way that a single 13-minute lab session cannot. The current results suggest that while absolute numbers of CS are not accurate (some parents claim to never CS but still do), there remain strong correlations between parents' self-report of CS and actual measures of CS – at least inter-sententially. This in turn suggests that the methodological difference is unlikely to be the cause of the difference between our results. Future work is clearly needed to help explain the discrepant results across studies. However, it is worth noting that parents' reports of their mixing behavior did not correlate with their within-sentence CS, which is where the positive relationship with vocabulary was found. Thus, use of parents' self-report may not be an accurate measure of how often parents present their children with dual-language sentences.
In the current study, however, there was no indication of impaired vocabulary as a result of code-switching. This is an encouraging finding, as it indicates that using more than one language when speaking with young children may not be detrimental to their language development, as some theories have suggested (e.g. Barron-Hauwaert, Reference Barron-Hauwaert2004). Parents were observed to translate words in adjacent utterances, but this behavior, too, did not correlate with the number of translation equivalents in children's vocabularies. These null results suggest that parents need not overly worry about their use of CS having negative impacts on their children's vocabulary acquisition, although future work with a larger, more representative sample of bilingual families is clearly warranted.
Future work on this topic is needed, and could take several directions. First, such work could explore whether the amount and types of CS vary depending on the type of situation. For example, parents may CS to differing degrees when attempting to teach new words vs. when they are simply labeling familiar objects. Moreover, both of these may differ from situations in which the parent is attempting to gain their child's attention or instruct them in a particular task (Let's put away your toys!). Recordings of parent–child interactions in the home across multiple settings might provide greater insight into when, and why, parents CS when talking with their children.
A second direction for future work would explore whether the amount and types of CS vary depending upon the age of the child, in order to explore whether bilingual parents CS more (or less) as children become more proficient in the language. Previous research examining the input that monolingual parents produce when addressing their children suggests that parents do in fact adjust their speech depending on the age, cognitive development, or language proficiency of the child (e.g. Bernstein Ratner, Reference Bernstein Ratner1984; Kitamura & Burnham, Reference Kitamura and Burnham2003; Kitamura, Thanavishuth, Burnham & Lusaneeyanawin, Reference Kitamura, Thanavishuth, Burnham and Lusaneeyanawin2002; Liu, Tsao & Kuhl, Reference Liu, Tsao and Kuhl2009), and child age or linguistic development may influence CS as well. Although child age did not relate to parental CS in the current study, the participants represented a fairly narrow age range, and future work should investigate this more directly. It would be particularly useful to explore CS behaviors longitudinally, exploring changes in parents' CS behavior as their children aged. This would allow for an examination of both how the child's vocabulary influences the parents' CS and how CS input affects vocabulary growth.
It would also be useful to further explore differences in CS behavior depending on parent language experience. Although we did not find effects of parental fluency on CS in this study, the parents were a mixed set of bilinguals, with some being sequential bilinguals (with an age of acquisition later in life) and others being early bilinguals. This difference among the participants may have masked more subtle effects of language fluency.
There may also be differences in CS for different language combinations; the current study explored only Spanish and English, two languages that are from different rhythmic classes but which are both SVO; patterns of CS could differ for children learning a tonal and a non-tonal language, as one example, or languages with different word order patterns. Finally, future work could explore aspects of language development beyond parent-reported vocabulary. One such measure would be on-line comprehension, either of sentences produced in a single language or of multilanguage utterances. Children who are exposed to more CS might be more familiar with this form of speech and show fewer processing delays than children who hear intra-sentential CS less often.
This study had several limitations, beyond its small sample size. Our analyses were correlational in nature, making it impossible to identify causal relationships. This was shown most clearly in the relationship between parental CS and children's vocabulary: had the relationship been negative, it would have been easy to suggest that CS caused confusion and thus hampered children's lexical acquisition. With a positive relationship, we suggested instead that perhaps parents' CS is dependent on the vocabulary level of their children. In reality, we cannot know whether parental CS leads to enhanced children's vocabulary, enhanced children's vocabulary leads to more CS, or both are instead related to some third variable altogether. Exploring CS longitudinally may help in this regard. In addition, while language mixing was directly observed, it was observed for a very short time window, in an unfamiliar situation in which parents were aware they were being observed. Whether behavior in this type of laboratory task is comparable to that in the home environment remains to be seen.
In conclusion, the present study explored CS among a group of twenty-four bilingual parent–toddler dyads. All parents CS when speaking to their young children, suggesting that it is not uncommon for children to be exposed to mixed-language sentences. Children who heard more such sentences actually demonstrated larger vocabularies, rather than smaller ones, providing no evidence that exposure to mixed-language sentences hinders lexical acquisition. Which parents CS more often was not predictable on the basis of parental language fluency or education, and CS seemed particularly biased towards occurring immediately before a noun (within a noun phrase). Finally, parents often repeated words across their two languages, but this did not appear to increase the likelihood of children having translation equivalents in their vocabulary.