Considerable attention has been devoted to the question of how infants acquire native-language sound categories in the face of variable speech input. The mapping from speech acoustics to linguistic representations is many-to-one (Liberman, Cooper, Shankweiler, & Studdert-Kennedy, 1967), leaving infants with the challenge of reconciling various sources of phonetic variability in order to form stable sound categories. For example, speech from a single talker varies with speech context and speaking rate (Hillenbrand, Getty, Clark, & Wheeler, 1995; Liberman, 1957; Lisker & Abramson, 1964). Across talkers, infants encounter additional variability from speech styles (e.g., infant-directed, adult-directed speech), idiosyncratic speech patterns, dialect, and foreign-accented speech. Thus, infants must develop representations that respect the stable properties of the native language while maintaining perceptual flexibility to accommodate differences inherent in speech. However, there are limits to the extent to which phonetic variability should be accommodated. For example, when the variation in phonetic realizations signals to infants that they are listening to a different language, they should not attempt to process the signal as part of their native language. Treating such input as native would place increased demands on a limited system and make it difficult to acquire relevant categories within their native language. In the current studies, we examine the conditions under which infants are sensitive to these limits and the effect that changes in accommodation may have on the learning of new words.
Early in development, infants have difficulty processing unfamiliar speech patterns, but, with age, become increasingly able to accommodate speech patterns that deviate from their typical input. Nine-month-olds familiarized to isolated words spoken by one talker will recognize these target words in passages spoken by a different talker if both talkers share the same dialect (e.g., American Midwest English), but do not recognize target words across talkers with different accents (e.g., American Midwest and Spanish English) (Schmale & Seidl, 2009) or dialects (e.g., American Midwest and Canadian English) until 12 months of age (Schmale, Cristià, Seidl, & Johnson, 2010). The greater the differences between native and non-native speech patterns, the less likely younger infants are to accommodate variant forms. For example, 19-month-olds show a preference to attend to highly frequent word-forms spoken in both a familiar dialect (Connecticut English) and unfamiliar dialect (Jamaican English), while 15-month-olds only show a preference for words spoken in their native dialect (Best, Tyler, Gooding, Orlando, & Quann, 2009). Thus, with development, infants show greater flexibility in their ability to recognize familiar word forms that are produced in non-native dialects or accents.
However, development is not the only factor that can influence infants’ accommodation of phonetic variability. There is also emerging evidence that infants’ flexibility with accent and dialectal differences may vary with exposure to phonetic variability and task demands. In contexts of word learning, 12-month-olds will map objects to words that vary phonetically from their native language of English, such as Japanese [ʃi̥ka], but will not accept forms that violate English phonotactics, such as Czech [ptak] (MacKenzie, Curtin, & Graham, 2012). Toddlers of 18 to 20 months adapt to a newly introduced sound pattern after being trained on three familiar label–object pairings (e.g., dog) containing a sound change (e.g., [dæg] while seeing a dog), and generalize this sound change to a newly presented familiar object, such as [sæk] for [sak] (sock) (White & Aslin, 2011). Similarly, 24-month-olds will flexibly accommodate non-native pronunciations of newly learned words, but only when words are initially learned in the non-native accent (Schmale, Hollich, & Seidl, 2011) or if they receive prior exposure to speech in a variable accent or dialect (Schmale, Cristià, & Seidl, 2012; van der Feest & Johnson, 2016). Thus, even short exposures to variable pronunciations can result in accommodation (see also Creel, 2012; Iverson & Kuhl, 1996). Together, these studies demonstrate that accommodation to deviations in speech input is flexible and depends on age, lexical knowledge, and the amount of variable input within the task (Werker & Curtin, 2005).
What remains unclear is how infants shift from accepting phonetic variability within their native language to using phonetic realization differences to distinguish between their own and other languages. This is an important issue for understanding the pressures upon the perceptual systems for language development because it addresses the central interplay between stability and flexibility. On the one hand, infants are developing stable representations of the native language environment (i.e., phonemes) and using these to direct learning of new information (e.g., via selective attention to speech input that maps onto existing phonemic representations). On the other hand, infants must maintain enough perceptual flexibility to accommodate deviations (e.g., accent variations) that are inherent in the speech environment but do not cross relevant categorical boundaries. This issue of how emerging levels of representation interact to constrain perception of speech input is core to theoretical questions underlying language development (Curtin, Byers-Heinlein, & Werker, 2011; Werker & Curtin, 2005).
Here, we employ a different approach to examine the contextual cues that may shift infants’ accommodation of non-native word-forms. First, we replicate a procedure where 12-month-old English-learning infants have been shown to accommodate Japanese forms as potential labels for novel objects (MacKenzie et al., 2012) and use this to examine whether 17-month-olds will continue to map Japanese forms to novel objects (Experiment 1). Second, we examine whether exposure to a non-native language may shift infants’ accommodation of Japanese forms (Experiments 2 and 3). Although previous research has demonstrated that prior exposure to accent variations within their own language facilitates infants’ accommodation of novel word forms (Schmale et al., 2012; van der Feest & Johnson, 2016), we predicted that exposure to a non-native language would have the opposite effect on infants’ word learning. That is, hearing speech input that does not readily map onto the phonemic representations of their native language may shift infants’ perception to be less accommodating of variant word forms.
Experiment 1
We examine whether English-learning 17-month-olds will accept a Japanese word as a possible label for a novel object. Given that children become more flexible in their acceptance of dialectal and accent variations with age, it is possible that 17-month-olds will continue to map Japanese forms to novel objects if they perceive these forms as accented variants of English. Alternatively, if, with age, infants become more selective in their attention towards the phonetic realizations of their native language, then 17-month-olds may not associate Japanese forms with novel objects.
Method
Participants
Twenty-two 17-month-olds were included in the final analyses (see Table 1 for a summary of age and gender). An additional 14 infants were recruited but excluded from the final sample due to fussiness (n = 5), not completing the study (n = 5), failure to recover attention at post-test (n = 2), technical error (n = 1), and parental interference (n = 1).
Table 1. Summary of Age and Gender Distribution for All Experiments

Stimuli
Visual stimuli consisted of videos that were 20 seconds in length. A video of a pinwheel spinning clockwise was used for both pre- and post-test trials. Two videos were presented during the habituation phase, each depicting a different novel object moving back and forth horizontally on the screen.
Auditory stimuli consisted of three Japanese words that were recorded by a native female speaker of Japanese. The Japanese words sika [ʃi̥ka] and hashi [haʃi] were presented during habituation and test phases. These words were chosen because the segments and syllable structure are legal in English, while the phonetic realization of the forms is not part of the infants’ native dialect (e.g., devoiced vowels). A novel word was presented during pre- and post-test trials (tega [tega]). On each trial, infants were presented with nine tokens of a single word that were separated by a 1-second inter-stimulus interval.
Apparatus
Throughout the procedure, infants sat on a caregiver's lap or in a high chair adjacent to their parent. Caregivers wore noise-cancelling headphones during the task to mask auditory input and reduce the likelihood of interference. Infants faced a 122 × 91.5 cm monitor where images were presented. Auditory stimuli were played from speakers located directly below the monitor. All sounds were presented at 65 dB (±5 dB). Infants’ attention to the screen was monitored and recorded using a digital video camera. Administration of stimuli was controlled using Habit X 1.0 software (Cohen, Atkinson, & Chaput, 2004).
Procedures
To assess infants’ formation of word–object associations, a Switch Task procedure was used (Werker, Cohen, Lloyd, Casasola, & Stager, 1998; see Figure 1 for an overview of the procedure). First, the pre-test trial was presented to assess baseline attention to the screen. Next, infants were habituated to two novel word–object pairs (e.g., Object A paired with sika, Object B paired with hashi) that were repeatedly presented in semi-random order on sequential trials. The specific combination of word–object pairs was counterbalanced across different test orders. Trials were presented in blocks of four for a maximum of 24 trials or until infants’ looking time reached a habituation criterion of 65% of their looking time on the first block of trials. All infants included in the final sample met the criterion for habituation. Immediately following habituation, infants were presented with two test trials: (1) a same trial in which a familiar word–object pair was repeated from the habituation phase, and (2) a switch trial in which a familiar object and word from the habituation phase were paired in a novel combination. The order in which test trials were presented was counterbalanced across test orders. The post-test trial was presented following the test phase.
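The habituation rule described in this section can be sketched in code. The following Python fragment is an illustration only and is not the Habit X implementation. It assumes that blocks are fixed, non-overlapping sets of four trials, that looking time is summed within each block, and that the criterion is met when a later block's total falls to 65% or less of the first block's total. The function name and the example looking times are hypothetical.

```python
from typing import Optional, Sequence

BLOCK_SIZE = 4     # trials per block (from the procedure)
MAX_TRIALS = 24    # habituation phase ends after at most 24 trials
CRITERION = 0.65   # proportion of first-block looking time

def trials_to_habituation(looking_times: Sequence[float]) -> Optional[int]:
    """Return the trial count at which the criterion is met, or None.

    Assumption: blocks are non-overlapping and compared by summed
    looking time; the source does not specify these details.
    """
    times = list(looking_times)[:MAX_TRIALS]
    if len(times) < BLOCK_SIZE:
        return None
    baseline = sum(times[:BLOCK_SIZE])
    # Check each complete block after the first one.
    for end in range(2 * BLOCK_SIZE, len(times) + 1, BLOCK_SIZE):
        block_total = sum(times[end - BLOCK_SIZE:end])
        if block_total <= CRITERION * baseline:
            return end
    return None

# Hypothetical looking times in seconds (not data from the study).
example = [10, 10, 10, 10, 9, 9, 9, 9, 6, 6, 6, 6]
print(trials_to_habituation(example))
```

In this hypothetical case the first block totals 40 s, so the threshold is 26 s; the second block (36 s) does not meet it and the third block (24 s) does, so habituation ends after 12 trials. An infant whose looking never drops to the threshold within 24 trials would yield None, which corresponds to a failure to habituate.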

Figure 1. Schematic of Switch task procedures.
Coding
Online coding was used to assess if and when infants met the criterion for habituation in order to proceed to the test phase. For statistical analyses, looking times were coded offline using SuperCoder software (Hollich, 2008). A second rater coded approximately 20% of the data (n = 4) to assess inter-rater reliability. All raters were blind to test trial type during coding. The intra-class correlation (ICC) was high (ICC = .997, p < .001).
Results and discussion
Infant looking times during the pre-test, post-test, and final habituation block were first compared to ensure that (a) infants’ attention during the final block of habituation had significantly decreased from their baseline attention during the pre-test trial, and (b) infants’ attention recovered between pre- and post-test. See Table 2 for a summary of means. Results confirmed that looking times differed across all three trial types (F(2,42) = 117.63, p < .001, partial η² = .85). Infants’ attention significantly declined between pre-test and the last block of habituation trials (p < .001) but looking times did not significantly differ between the pre-test and post-test trials (p = .68).
Table 2. Mean Looking Times for All Trial Types by Study and Condition

Note. Standard deviations are shown in parentheses.
The primary analysis was to assess whether infants’ looking times differed between switch and same test trials (see Figure 2 for a graph of mean comparisons). Results of a paired sample t-test indicated that infants looked significantly longer at novel pairings on the switch trial (M = 10.45, SD = 4.55) than at familiar pairings on the same trial (M = 7.93, SD = 3.00) (p = .03, d = .66). The majority of infants (63.63%) showed longer looks on the switch test trial than the same test trial.
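As a rough check on the reported effect size, Cohen's d can be approximated from the summary statistics alone. The source does not state which formula was used, so the sketch below assumes one common convention: the mean difference divided by the root mean square of the two standard deviations. The function name is hypothetical.

```python
import math

def cohens_d_average_sd(mean_a: float, sd_a: float,
                        mean_b: float, sd_b: float) -> float:
    """Cohen's d using the root mean square of the two SDs.

    Assumption: this denominator is one convention among several;
    the source does not say which one the authors used.
    """
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return (mean_a - mean_b) / pooled_sd

# Switch trial vs. same trial, values as reported for Experiment 1.
d = cohens_d_average_sd(10.45, 4.55, 7.93, 3.00)
print(round(d, 2))
```

Under this assumption the value is approximately 0.65, close to the reported d = .66. The small gap may reflect rounding in the reported means and standard deviations or a different choice of denominator.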

Figure 2. Mean looking times on same and switch trials for all experiments.
The results indicate that 17-month-olds accept Japanese words as potential labels for objects. These findings replicate previous findings with 12-month-olds (MacKenzie et al., 2012). What might account for this pattern? One possibility is that the 17-month-olds in Experiment 1 did not classify the Japanese labels as non-English words. As previous research has indicated, infants between 15 and 19 months become increasingly flexible in representing and recognizing variations of words spoken in different accents (Mulak, Best, Tyler, Kitamura, & Irwin, 2013). In this experiment, it is possible that infants perceived the Japanese words as potential English labels, rather than words from an unfamiliar language. We examined this possibility in Experiment 2.
Experiment 2
In this experiment, we presented 17-month-olds with a brief Japanese passage prior to habituation. The passage was intended to signal to infants that they were hearing a language other than English. Brief exposure to another language may thereby shift infants from attending to the word-learning task more broadly and/or make it more challenging for them to understand the task (Curtin et al., 2011; Werker & Curtin, 2005). Thus, if infants are less likely to perceive Japanese word-forms as English following exposure to Japanese, they should no longer map objects to these labels. In addition to the passage, we manipulated whether the same talker was heard during both the passage and habituation or whether a different talker was heard during habituation. The purpose of this manipulation was to clarify whether exposure to the phonetic realizations of Japanese would affect infants’ perception of forms produced by an individual talker or whether it would shift infants’ perception more broadly to generalize across different talkers.
Method
Participants
Forty-four children (M age = 17.6 months, 21 female, range = 16.8–18.2 months) were included and randomly assigned to either the same talker or different talker condition. An additional 17 infants were excluded for the following reasons: failure to complete the experimental trials (n = 4), fussiness (n = 3), attending for less than 1 second on a test trial (n = 3), failing to recover attention at post-test (n = 2), failure to habituate (n = 2), technical error (n = 2), and parental interference (n = 1).
Stimuli and apparatus
Two changes were made to the stimuli and apparatus of Experiment 1: (1) a different novel word was presented during pre- and post-test trials (dona [donə]) to avoid any potential phonetic similarity with the Japanese passage, and (2) a spoken passage trial was included. The passage was 16 seconds in length and consisted of a Japanese translation of the nursery rhyme Twinkle Twinkle Little Star. All auditory stimuli, including habituation and pre- and post-test words, were pre-recorded by two female native speakers of Japanese. The passage was presented along with a still image of a field of flowers.
Procedure
Procedures were the same as in Experiment 1 except that infants were presented with the passage prior to the pre-test trial. Depending on the condition to which infants were randomly assigned, subsequent habituation and test trials were presented with a talker who was either the same or different from the one heard during the passage.
Coding
Looking times were coded offline as in Experiment 1. Approximately 20% of the data (n = 8) was coded by a second rater to assess for inter-rater reliability. The intra-class correlation indicated strong reliability between coders (ICC = .993, p < .001).
Results and Discussion
Results of a preliminary 2 (talker condition: same vs. different) × 3 (trial type: pre-test, post-test, last habituation block) mixed model ANOVA indicated a significant main effect of trial type (F(2,84) = 104.36, p < .001, partial η² = .71), but no significant interaction between talker condition and trial type (p = .44) (see Table 2). Infants looked significantly less during the last block of habituation than during the pre-test (p < .001) and post-test (p < .001) trials but showed no difference in looking times between pre- and post-test trials (p = .18).
The primary analysis was a 2 (talker condition: same vs. different) × 2 (test trial: same vs. switch) mixed model ANOVA to examine whether looking times on same and switch test trials differed between talker conditions. Results indicated no significant main effects of talker condition (p = .39) or test trial (p = .59), nor was there a significant interaction between these two variables (p = .82). Thus, 17-month-olds did not, on average, look significantly longer on the switch trials (M = 8.81, SD = 4.96) than on same trials (M = 8.37, SD = 5.20) (see Figure 2). Half of the infants (50%) showed longer looks on the switch test trial than the same test trial.
Unlike the results of Experiment 1, 17-month-olds did not form associations between objects and Japanese labels following a brief period of exposure to Japanese. These findings suggest that brief exposure to Japanese may have shifted infants away from accommodating Japanese word-forms. However, it is possible that this perceptual shift is not exclusively due to infants’ brief experience with the phonetic realizations of Japanese, but rather to their more general experience with phonetic variability prior to learning. That is, exposure to any speech input that is sufficiently distinct from their native language could lead infants to constrain their perception and become less accepting of variant word forms. A third experiment was conducted to explore this possibility and assess whether infants would demonstrate the same shift in perception after hearing another non-native language (Spanish) that was phonetically distinct from both English and Japanese.
Experiment 3
In previous research, pre-exposure to accented speech has been shown to facilitate children's recognition of novel words (Schmale et al., 2011; Schmale et al., 2012; van Heugten & Johnson, 2014). In these contexts, experiencing accent variations within one's native language is said to facilitate the accommodation of novel word forms via the expansion of existing phonemic categories (see Schmale et al., 2012). In the current study, we examine how exposure to a non-native language may differentially affect infants’ accommodation of non-native word-forms. Namely, we ask whether exposure to a language (Spanish) that deviates from the phonetic norms of the infants’ native language (English) would make them less likely to accommodate Japanese word-forms. Although it is important for infants to learn to accommodate phonetic variability within their native language, phonetic variability between different languages may also serve as important category boundaries that can help infants selectively direct their attention towards relevant speech input (i.e., input from the language they will primarily be learning). Thus, exposure to a different language may cue infants to whether they are perceiving input from their native language and inform their subsequent learning of novel word-forms.
Method
Participants
Twenty-three 17-month-olds were included in the final sample (see Table 1). An additional 8 infants were recruited but excluded from analyses due to fussiness (n = 5), failure to habituate (n = 2), and parental interference (n = 1).
Stimuli and apparatus
Stimuli and apparatus were identical to those in Experiment 2 except that the passage was a Spanish translation of Twinkle Twinkle Little Star. The passage was recorded by a native female speaker of Spanish.
Procedure and coding
Procedures were the same as in Experiment 2. Looking times were coded offline by two blind coders. Eighteen percent of the data (n = 4) was double-coded to assess for inter-rater reliability. The intra-class correlation indicated strong reliability between coders (ICC = .985, p < .001).
Results and discussion
Preliminary analyses indicated significant differences in looking times between pre-test, post-test, and the final block of habituation (F(2,44) = 87.01, p < .001, partial η² = .80). Infants looked significantly less during the final block of habituation than during the pre-test (p < .001) and post-test (p < .001) trials, but looking times did not significantly differ between pre-test and post-test trials (p = .25), suggesting recovery of attention during the post-test trial.
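The partial eta squared values reported for the preliminary analyses can be recovered from each F statistic and its degrees of freedom, using the standard identity partial η² = (F × df_effect) / (F × df_effect + df_error). The short Python sketch below applies that identity to the values reported in Experiments 1 to 3; the function name is hypothetical and the snippet is offered only as a consistency check, not as the authors' analysis code.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared recovered from an F statistic and its dfs."""
    numerator = f_value * df_effect
    return numerator / (numerator + df_error)

# (F, df_effect, df_error) as reported for the trial-type effects.
reported = {
    "Experiment 1": (117.63, 2, 42),
    "Experiment 2": (104.36, 2, 84),
    "Experiment 3": (87.01, 2, 44),
}

for label, (f_value, df_effect, df_error) in reported.items():
    print(label, round(partial_eta_squared(f_value, df_effect, df_error), 2))
```

The computed values round to .85, .71, and .80, matching the effect sizes reported in the text.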
Results of a paired sample t-test indicated that infants did not look significantly longer during switch test trials (M = 7.85, SD = 5.32) than during same test trials (M = 7.04, SD = 3.81) (p = .51). Thus, as in Experiment 2, infants did not form novel word–object associations following exposure to the Spanish passage. Overall, these results suggest that preliminary exposure to any language that is phonetically distinct from their native language of English produces a shift in infants’ perception and mapping of Japanese labels.
General discussion
The current studies were designed to examine the conditions under which 17-month-old English learners would accommodate Japanese forms as potential labels for novel objects. Results from Experiment 1 demonstrated that, in the absence of prior exposure to Japanese, English-learning 17-month-olds formed novel word–object associations using Japanese forms. Despite being older and having accumulated more English language experience than the 12-month-olds in MacKenzie et al.’s (2012) study, infants in the current study continued to accommodate Japanese forms. This finding is not surprising given that between 15 and 19 months of age, infants become more flexible in their accommodation of phonetic variations within their own language (Best et al., 2009; Mulak et al., 2013). Thus, it should be expected that 17-month-olds would accept word forms that vary slightly from their typical input. Previous research has also indicated that infants typically do not map meaning to forms that violate the phonotactics of their native language (Graf Estes, Edwards, & Saffran, 2011; MacKenzie et al., 2012; Vukatana, Curtin, & Graham, 2016) or that include non-native phonemes (May & Werker, 2014). Accommodation of variant forms is therefore dependent on the degree to which these forms vary from the structure and representations of one's native language. In the current study, the Japanese forms were phonetically distinct from English but did not include illegal sound combinations or non-native phonemes, and therefore may have been perceived as accented variants of English.
Findings from Experiments 2 and 3, however, also demonstrate that 17-month-old infants are sensitive to language contexts that are indicative of whether or not they are perceiving forms from their native language. Following brief exposure to either Japanese (Experiment 2) or Spanish (Experiment 3), infants no longer associated Japanese labels with novel objects. These results contribute to a larger body of research examining the role of experience in infants’ accommodation of phonetic variation (e.g., Schmale et al., 2011; Schmale et al., 2012). In previous research, prior exposure to accent variations has typically been shown to increase the likelihood that infants will generalize words in their native language across different talkers. Experience with accent variability therefore increases infants’ accommodation of phonetic variations that may occur within their own language. In our study, infants were less likely to form novel word–object associations after being exposed to language that deviated from the phonetic norms of their native language. Thus, hearing a language that was phonetically distinct from their own reduced infants’ accommodation of phonetic variability. This remarkable ability to flexibly adapt their perception of variable speech is an important asset to infants, particularly within the domain of word learning. By shifting their attention away from word forms that do not exist within their own language, infants can better allocate cognitive resources towards the acquisition of forms that they are more likely to use in their everyday interactions.
Overall, these findings help to elucidate how the system of stable phonological representations can help to incorporate and accommodate variation in the signal. In the absence of contradictory information, infants will accept forms that subtly vary as potential labels. However, the system is constrained when additional information is provided. That is, when infants are presented with a passage containing no recognizable forms, they do not attempt to accommodate non-native forms into their linguistic system. This demonstrates the sophisticated nature of the representational system, which allows flexibility when no evidence to the contrary is presented but constrains accommodation when the context suggests the input is not part of the system they are learning. The various emerging levels of representation are thus working together to determine how information is interpreted and integrated into the infant's developing system (Curtin, Byers-Heinlein, & Werker, 2011; Werker & Curtin, 2005).
Acknowledgements
This work was supported by funds from the Canada Foundation for Innovation, the Canada Research Chairs program, and the University of Calgary awarded to SG, and by an operating grant from the Social Sciences and Humanities Research Council of Canada (Grant #: 435-2012-0124) awarded to SC and SG, and a Discovery grant from the Natural Sciences and Engineering Research Council of Canada awarded to SG (Grant #194530-2011). VSJ was supported by a postdoctoral fellowship from the Social Sciences and Humanities Research Council of Canada (Grant #: 756-2014-0062) and an Eyes High Postdoctoral fellowship from the University of Calgary. We thank Natalia Czarnecki and Summer Abdalla for their assistance with this research.