Over the last decades, much research on bilingual word recognition has focused on the question of whether lexical access is language-selective. By now, there is evidence from the visual domain (e.g., Dijkstra & Van Heuven, 1998; Duyck, 2005; Duyck, Van Assche, Drieghe & Hartsuiker, 2007; Van Assche, Duyck, Hartsuiker & Diependaele, 2009) and, to a much lesser extent, from the auditory domain (e.g., Ju & Luce, 2004; Lagrou, Hartsuiker & Duyck, 2011; Marian & Spivey, 2003; Schulpen, Dijkstra, Schriefers & Hasper, 2003; Spivey & Marian, 1999; Weber & Cutler, 2004) in favor of a language-nonselective account of lexical access. According to this account, lexical representations from both lexicons are activated at least to some degree during word recognition, even when only one language is task-relevant. It is less clear, however, whether there are factors that can constrain language-nonselective lexical access, such as the context of the to-be-recognized words. In the visual domain, a few studies have recently addressed this question (Duyck et al., 2007; Libben & Titone, 2009; Schwartz & Kroll, 2006; Titone, Libben, Mercier, Whitford & Pivneva, 2011; Van Assche et al., 2009; Van Assche, Drieghe, Duyck, Welvaert & Hartsuiker, 2011; Van Hell & de Groot, 2008), whereas such evidence is almost completely lacking for the auditory domain. In the present study, we therefore investigated whether the auditory presentation of a meaningful sentence context is a factor that can constrain lexical access to the currently relevant lexicon. Moreover, we examined whether the semantic predictability of target words in the sentence is a restricting factor, and we investigated the influence of sub-phonemic cues, inherent to the native accent of the speaker, on parallel language activation.
Bilingual word recognition in isolation
Evidence for language-nonselective lexical access in bilingual auditory word recognition was first reported by Marian and colleagues.Footnote 1 In the eye-tracking study of Spivey and Marian (1999), late Russian–English bilinguals, who were very proficient in their L2 and were living in an L2-dominant environment, were instructed in their L2 to pick up a real-life object (e.g., Pick up the marker). The participants fixated more on competitor items with a name in the irrelevant L1 that was phonologically similar to the target (e.g., a stamp; marka in Russian) than on distracter objects with a name in L1 that was phonologically unrelated to the L2 target. Additionally, there was evidence for language nonselectivity even when participants were listening in L1. When Russian–English bilinguals received an instruction such as Podnimi marku "Pick up the stamp", they looked more often at interlingual competitor objects (marker) than at distracter objects. Analogous to the findings with the English instructions, this can be explained by the fact that the English name of the competitor object, marker, is phonologically more similar to the Russian target word marku than the distracters' names are.
These results were partly replicated by Weber and Cutler (2004) in a later study with Dutch–English bilinguals. These bilinguals, who were living in an L1-dominant environment, were instructed to click on one of four pictures presented in a display and move it to another location on the computer screen (e.g., Pick up the desk and put it on the circle). There were more fixations on competitor objects whose name had a phonetically similar L1 onset than on distracter objects (e.g., when instructed to pick up the desk, there were more fixations on a picture of a lid than on control items, because lid is the translation equivalent of the Dutch word deksel, which phonologically overlaps with the L2 target desk). However, when these participants were instructed in their L1 (e.g., target deksel), competitor items (desk) were not fixated longer than control items, which suggests that non-target language representations in L2 are not activated strongly enough to influence L1 recognition.
In one of Schulpen et al.'s (2003) experiments, Dutch–English bilinguals completed a cross-modal priming task in which primes were presented auditorily and targets visually. Visual lexical decision times were longer when the target was preceded by an interlingual homophone than when it was preceded by a monolingual control. For instance, responses after the pair /liːs/ – LEASE were slower than after /freɪm/ – FRAME (/liːs/ is also a Dutch word meaning "groin"). The longer reaction times after interlingual homophone pairs suggested that bilinguals activated both the Dutch and the English meaning of the homophone. Furthermore, the authors observed that the auditory presentation of the English pronunciation of the interlingual homophone led to faster decision times on the related English target word than the Dutch pronunciation of that homophone did. This indicates that subtle differences between the two pronunciations of a homophone may affect the degree of cross-lingual activation spreading, which will turn out to be important for the present study. These differences are most likely situated at the sub-phonemic level (e.g., languages often differ in the length of voice onset time (VOT)), but it is possible that there are suprasegmental differences too (e.g., Lee & Nusbaum, 1993).
Further studies on the influence of sub-phonemic cues on lexical access in bilinguals were reported by Lagrou et al. (2011) and Ju and Luce (2004). Lagrou et al. conducted a lexical decision experiment in L2 or L1 with Dutch–English bilinguals living in an L1-dominant environment. The participants responded more slowly to homophones (e.g., lief "sweet" – leaf /liːf/) than to matched control words, both in L2 and L1, whereas a monolingual English control group showed no effect. Moreover, this study investigated whether the listener's selectivity of lexical access is influenced by the speaker's L1. With this aim, targets were pronounced by a native Dutch speaker with English as L2 or by a native English speaker with Dutch as L2. Although the speaker's accent contains language cues that might affect the activation of target and non-target languages (Schulpen et al., 2003), there was no interaction between the homophone effect and the native language of the speaker. In sum, the results of this study suggest that bilinguals do not use these language- and speaker-specific sub-phonemic cues to restrict lexical access to only one lexicon, even though ignoring such cues makes lexical search less efficient.
Ju and Luce (2004) also found evidence for language-nonselective lexical access. Here, however, the effect was modulated by sub-phonemic information related to language-specific voice onset times (VOTs). In a visual world eye-tracking study, Spanish–English bilinguals fixated pictures of interlingual competitors (non-target pictures whose English names, e.g., pliers, were phonologically similar to the Spanish targets, e.g., playa "beach") more frequently than control distracters. However, this effect was only found when the Spanish target words were altered to contain English-appropriate voice onset times. When the Spanish targets had Spanish VOTs, no L1 interference was found. The results of this study suggest that bilingual listeners may still use fine-grained, sub-phonemic, acoustic information related to language-specific VOT to regulate cross-lingual lexical activation. At first sight, this is in contrast with the results of Lagrou et al. (2011). However, in the Ju and Luce study, a salient acoustic feature (voicing) was manipulated systematically, so that this artificial cue was a reliable and consistent predictor of language membership, whereas the stimuli in the Lagrou et al. study differed on a wider range of acoustic parameters (i.e., all sub-phonemic cues related to the native accent of the speaker). Moreover, in the study of Ju and Luce all stimuli started with voiceless stops, whereas the stimuli of Lagrou et al. started with a variety of sounds (i.e., nasals and fricatives).
Bilingual word recognition in a sentence context
Monolingual studies have demonstrated that contextual information is used to facilitate word recognition in the native language. For example, when reading ambiguous words, context helps to select the correct interpretation (e.g., Binder & Rayner, 1998; Onifer & Swinney, 1981; Rayner & Frazier, 1989). Moreover, predictable words are processed faster than non-predictable words (e.g., Schwanenflugel & LaCount, 1988; Schwanenflugel & Shoben, 1985; Stanovich & West, 1983). Semantic information provided by sentence context may also influence lexical selection in bilingual visual word recognition. For example, Schwartz and Kroll (2006) and Van Hell and de Groot (2008) found that the cognate facilitation effect (i.e., faster RTs on cognates than on matched control items) and the homograph effect (i.e., slower RTs on interlingual homographs than on matched control items), markers of cross-lingual lexical interactions in the visual domain, were annulled or diminished when participants read high-constraining sentences. In the study by Van Assche et al. (2011), cross-lingual interactions in high-constraining sentences were significant on both early and late reading time measures, whereas in a study by Libben and Titone (2009), this was only the case on the early reading time measures and not on the late comprehension measures. According to Titone et al. (2011), a possible explanation for the differences between studies could be that bilinguals differ in the relative degree of L2 proficiency or in other variables that were not taken into account. Taken together, these studies indicate that semantic constraint influences, but does not annul, the co-activation of representations from both languages in the bilingual lexicon, at least in visual language processing.
In the auditory domain, research on bilingual word recognition in a sentence context is scarcer. Chambers and Cooke (2009) investigated whether interlingual lexical competition is influenced by the prior sentence context. In this visual world study, English–French bilinguals with varying proficiency levels listened to L2 sentences and were instructed to click on the image that represented the sentence-final word. Each display contained an image of the final noun target (e.g., chicken), an interlingual near-homophone (e.g., pool) whose name in English is phonologically similar to the French target (e.g., poule "chicken"), and two unrelated distracter items. The interlingual competitors were fixated more than unrelated distracter items when the prior sentence information was compatible with the competitor (i.e., both the French target and the interlingual near-homophone are plausible in the sentence context, e.g., Marie va décrire la poule "Marie will describe the chicken"), but not when this sentence information was incompatible with the competitor (i.e., only the French target, but not the interlingual near-homophone, is plausible in the sentence context, e.g., Marie va nourrir la poule "Marie will feed the chicken"). These findings suggest that semantic constraints imposed by a sentence context may override activation of non-target language lexical competitors in the auditory domain.
FitzPatrick and Indefrey (2010) recorded EEGs from Dutch–English bilinguals listening to L2 sentences containing semantic incongruities, which typically elicit an N400 component. The N400 to an incongruity in L2 is delayed in comparison with the N400 to an incongruity in L1. In one condition of this study, the last word of the sentence had initial overlap with an L1 translation equivalent of the most probable sentence completion (e.g., "My Christmas present came in a bright-orange doughnut", in which doughnut initially overlaps with doos, Dutch for box). There was an N400 effect to these sentence-final words, which initially overlapped with L1 translation equivalents that were congruent with the sentence context. Importantly, this N400 had the same timing as the N400 in response to a semantic incongruity whose translation equivalent did not have such initial congruence. Thus, when listening to sentences in L2, L1 competitors were not activated (or at least these L1 competitors were not considered for semantic integration). Because these sentences were quite semantically constraining towards the targets, FitzPatrick and Indefrey argued that sentences that bias towards specific lexical representations in the target language yield no cross-lingual effects.
Although both of the studies above used meaningful sentences, there have been no bilingual studies on auditory word recognition that directly manipulated the degree of semantic sentence constraint within-study, assessing its influence on cross-lingual interactions. The results of Chambers and Cooke (2009) and FitzPatrick and Indefrey (2010) suggest that contextual factors may have a larger impact on the degree of language selectivity in the auditory domain than in visual word recognition. However, it may also be the case that modulations by semantic constraint are more pronounced for words with interlingual form overlap only (i.e., homographs and homophones, as used here) than for the cognates typically used in the visual studies (e.g., Van Assche et al., 2011), because such a constraint is compatible with the (shared) meaning of the L1 and L2 readings of cognates but only with one of the two readings of a homograph/homophone. As such, a suggested interaction between sentence constraint and modality (visual vs. auditory) may be a by-product of the type of critical stimuli used to assess cross-lingual interactions. For the auditory domain, it remains possible that under high constraint, only one homophone meaning is considered, rendering the stimulus similar to one without form overlap.
The present study
Our goal was to address three questions. First, we investigated whether there is parallel language activation when listening to meaningful sentences in L2. Second, we investigated the influence of semantic constraint on lexical access when listening in L2. Third, we tested whether sub-phonemic cues, provided by the native accent of a speaker, are used to restrict lexical access when listening to sentences. Our previous study (Lagrou et al., 2011) suggested that the sub-phonemic cues inherent to the native accent of the speaker are not used to restrict lexical access to the currently relevant lexicon when listening to words in isolation. However, in daily life people usually do not listen to isolated words, but have conversations with other people. Of course, a continuous stream of auditory input contains far more cues that could provide the listener with information about the language in use, which makes it more likely that such cues are indeed used to restrict lexical access when the input consists of sentences rather than isolated words. On the other hand, both in real life and in our experiments, speakers sometimes speak in a language that is their L2, so that the cues picked up from the speaker's accent may be misleading with respect to language membership. Because cues based on speaker accent are not always valid indicators of the language for recognition, it is possible that listeners still do not exploit them to regulate the degree of language selectivity. The present design may reveal which of these two hypotheses is correct.
To summarize, we investigated whether L1 knowledge influences lexical access when listening to sentences in L2. With this aim, Dutch–English bilinguals completed an English lexical decision task on the last word of spoken sentences. In critical trials, this last word was either an interlingual homophone or a matched control word. To investigate the influence of sentence constraint, sentences were either low- or high-constraining. To test whether lexical access is sensitive to cues related to the native accent of the speaker, sentences were pronounced by a native Dutch speaker or by a native English speaker.
Method
Participants
Sixty-four students from Ghent University participated in the experiment for course credit or a monetary fee. All were native Dutch speakers and reported English as their L2.Footnote 2 They started to learn English around the age of 14 years at secondary school, and because they were regularly exposed to their L2 through popular media, entertainment, and English university textbooks, they were all quite proficient in their L2, even though they lived in a clearly L1-dominant environment.Footnote 3 After the experiment, participants were asked to rate their L1 (Dutch) and L2 (English) proficiency with respect to several skills (reading, writing, speaking, understanding, and general proficiency) on a seven-point Likert scale ranging from 1 (very bad) to 7 (very good). We also assessed general L3 (French) proficiency. Means are reported in Table 1. Mean self-reported L1 (M = 6.03), L2 (M = 5.03), and L3 (M = 4.00) general proficiency ratings differed significantly (dependent samples t-tests yielded ps < .001).
Participants were not informed that their L1 knowledge would be of any relevance to the experiment. Thirty-three participants listened to the sentences pronounced by the native Dutch speaker, and 31 participants listened to the sentences pronounced by the native English speaker. One participant who made more than 20% errors was excluded from all analyses.
Stimulus materials
The target set consisted of 240 stimuli: 30 interlingual Dutch–English homophones (e.g., lief "sweet" – leaf /liːf/), 30 matched English control words, 60 English filler words, and 120 nonwords. All targets were selected from the stimulus list of Lagrou et al. (2011), in order to increase comparability across studies and therefore make it possible to assess the context effects while keeping stimuli constant. Targets were between three and seven phonemes long, and control words were matched with the homophones with respect to number of phonemes and English frequency as reported in the CELEX database (ps > .32). Nonwords were created with the WordGen stimulus software (Duyck, Desmet, Verbeke & Brysbaert, 2004). They were phoneme strings with no Dutch or English meaning, but with a legal English phonology, and they were matched with the interlingual homophones and control words with respect to word length. For each target, a low- and a high-constraining sentence were constructed, resulting in 480 sentences. Sentences were matched in terms of number of words and syntactic structure. Targets always appeared in the final position of the sentence. To ensure that participants would not hear the same target twice, sentences were divided across two lists. The low- and high-constraining sentences for each homophone–control pair are included in the Supplementary Materials accompanying this paper on the online pages of the journal (http://journals.cambridge.org/BIL). Sentences were pronounced by a native Dutch speaker who was also a highly proficient English speaker, or by a native English speaker who was also a highly proficient Dutch speaker. Using WaveLab software, stimulus materials were recorded in a sound-attenuated booth by means of an SE Electronics USB1000A microphone with a sampling rate of 44.1 kHz and a 16-bit sample size. Sentence and target durations were measured with WaveLab software.
Sentence completion
To verify the constraint manipulation of the sentences containing an interlingual homophone or control word, a sentence completion study was conducted with 20 further participants. Participants saw each sentence without the interlingual homophone/control word, and were instructed to complete the sentence with the first word that came to mind when reading the sentence. Production probabilities for interlingual homophones and control words were extremely low for low-constraining sentences, and were very high for high-constraining sentences. Production probabilities for the irrelevant L1 translation equivalents of the homophone were extremely low for low- and high-constraining sentences (see Table 2).
Additionally, another 15 participants were asked to rate the plausibility of the low-constraining sentences on a scale from 1 (not at all plausible) to 9 (very plausible). A paired t-test demonstrated that plausibility ratings for homophone sentences (M = 5.79, SD = 0.50) did not differ from ratings for control word sentences (M = 6.06, SD = 1.51), t(29) = –0.76, p = .46.
Speakers
The native Dutch speaker was a 25-year-old female with Dutch as L1 and English as L2. She had 12 years of L2 experience. Her English was very fluent but characterized by a clear Dutch accent. The native English speaker was a 45-year-old female with English as L1 and Dutch as L2. She had gained her L2 experience since moving to the Dutch-speaking part of Belgium 15 years earlier. Her Dutch was very fluent but characterized by a clear English accent.
Procedure
Participants received written instructions in English (their L2) to perform an English lexical decision task on the last word of each sentence. They wore headphones through which the sentences were presented auditorily. Before the experiment, a practice session of 24 trials was completed. Each trial started with a 500 ms presentation of a fixation cross in the center of the screen. After another 200 ms, the sentence was presented. Then participants had to decide whether the last word was an English word or a nonword. When a word (nonword) was presented, participants used their right (left) index finger to press the right (left) button of a response box. Visual feedback was presented on the screen for 200 ms (i.e., when an error was made the screen turned red; when the response was correct, "OK!" appeared). The next trial started 500 ms later. After the experiment, participants completed a questionnaire assessing self-ratings of L1 and L2 proficiency (reading, speaking, writing, understanding, and general proficiency) and general L3 proficiency on a seven-point Likert scale (1 = very bad, 7 = very good), and a backward translation test to verify that they knew the L2 words.
Results
On average, participants made 6.54% errors (SD = 2.30). Errors, trials with RTs faster than 300 ms after target onset, and trials (for word targets) with RTs more than 2.5 standard deviations above the participant's mean RT measured from target offset were excluded from the analyses. As a result, 8.29% of the data were excluded. All reported latency analyses are based on RTs measured from (auditory) target offset.Footnote 4 We report these measures because the native and non-native speaker differed in pronunciation duration (p < .01). Importantly, the pronunciation durations did not differ systematically between low- and high-constraining sentences (p > .15) (see Table 3).
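For illustration only, the trimming procedure described above could be implemented roughly as follows. This is a minimal sketch, not the authors' analysis script; the file name and column names (subject, is_word, correct, rt_from_onset, rt_from_offset) are hypothetical.

```python
import pandas as pd

# Hypothetical trial-level data; file and column names are assumptions for illustration.
trials = pd.read_csv("lexical_decision_trials.csv")

# Keep correct responses to word targets only.
words = trials[trials["is_word"] & trials["correct"]]

# Exclude anticipations: RTs faster than 300 ms after target onset.
words = words[words["rt_from_onset"] >= 300]

# Exclude RTs more than 2.5 SDs above each participant's mean RT from target offset.
def trim(group, n_sd=2.5):
    cutoff = group["rt_from_offset"].mean() + n_sd * group["rt_from_offset"].std()
    return group[group["rt_from_offset"] <= cutoff]

trimmed = words.groupby("subject", group_keys=False).apply(trim)
```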
An ANOVA on the reaction times (see Figure 1 and Table 4) with target type (interlingual homophone vs. control) and sentence constraint (low vs. high) as independent within-subjects variables and speaker (native Dutch vs. native English) as the independent between-subjects variable revealed a main effect of target type, F1(1,61) = 234.50, p < .001, ηp2 = .79; F2(1,29) = 27.01, p < .001, ηp2 = .48, indicating that reaction times were significantly slower on interlingual homophones than on control words. Importantly, the main effect of sentence constraint was significant, F1(1,61) = 325.92, p < .001, ηp2 = .84; F2(1,29) = 152.36, p < .001, ηp2 = .84, indicating that participants responded significantly faster to targets preceded by a high-constraining sentence context than to targets preceded by a low-constraining sentence context. This confirms the validity of the constraint manipulation. The main effect of speaker was also significant, F1(1,61) = 10.24, p < .01, ηp2 = .14; F2(1,29) = 80.23, p < .001, ηp2 = .74, indicating that participants responded faster when the sentences were pronounced by the native English speaker than when they were pronounced by the native Dutch speaker. Moreover, the interaction between sentence constraint and target type was also significant, F1(1,61) = 28.49, p < .001, ηp2 = .32; F2(1,29) = 9.30, p < .01, ηp2 = .24, showing a larger homophone effect in the low-constraining condition. Planned comparisons demonstrated that the homophone effect was significant when the target was preceded by a low-constraining sentence, F1(1,61) = 173.23, p < .001, ηp2 = .74; F2(1,29) = 46.68, p < .001, ηp2 = .62, but also when the target was preceded by a high-constraining sentence, F1(1,61) = 56.85, p < .001, ηp2 = .48; F2(1,29) = 6.32, p < .05, ηp2 = .18. The interaction between target type and speaker was significant in the by-subjects analysis only, F1(1,61) = 6.84, p < .05, ηp2 = .10; F2(1,29) = 3.06, p = .09, ηp2 = .10, with a larger homophone effect for the native Dutch speaker. Planned comparisons demonstrated that the homophone effect was significant when sentences were pronounced by the native Dutch speaker, F1(1,61) = 168.76, p < .001, ηp2 = .80; F2(1,29) = 23.26, p < .001, ηp2 = .42, but also when sentences were pronounced by the native English speaker, F1(1,61) = 76.95, p < .001, ηp2 = .82; F2(1,29) = 21.10, p < .001, ηp2 = .45. No further interaction was significant, all F1s < 1, F2s < 1.
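As a rough illustration of how the by-subjects (F1) part of such an analysis can be set up, the two within-subjects factors could be tested with a repeated-measures ANOVA as sketched below. This continues the hypothetical data frame from the previous sketch, is not the authors' script, and omits the between-subjects speaker factor, which would require a full mixed-design ANOVA (e.g., run separately per speaker group or in dedicated software).

```python
from statsmodels.stats.anova import AnovaRM

# Aggregate the trimmed trial data (see previous sketch) to one mean RT per
# participant and condition, then run a 2 (target type) x 2 (constraint)
# repeated-measures ANOVA by subjects. Column names are illustrative assumptions.
cell_means = (trimmed
              .groupby(["subject", "target_type", "constraint"], as_index=False)["rt_from_offset"]
              .mean())

f1_anova = AnovaRM(cell_means, depvar="rt_from_offset", subject="subject",
                   within=["target_type", "constraint"]).fit()
print(f1_anova)
```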
It is conceivable that the interactions of the interlingual homophone effect with semantic constraint and speaker accent are influenced by the overall faster reaction times on high-constraining sentences and on sentences pronounced by the native English speaker, yielding smaller homophone effects. On this account, semantic information and cues inherent to the native accent of the speaker speed up word recognition, but are not used as a strict cue to restrict lexical search to a single language, and therefore do not modulate the degree of language nonselectivity. To test this hypothesis while taking into account baseline RT differences across constraint conditions, we first calculated the percentage homophone interference score for each semantic context (see the formula below).Footnote 5 For each participant, and for both the low- and high-constraining sentences, the difference between the reaction times on homophone sentences and the reaction times on control sentences was divided by the reaction times on control sentences. A paired t-test demonstrated that this interference score did not differ significantly between low- and high-constraining sentences, p = .26. Second, we also calculated the percentage homophone interference score for the Dutch and the English speaker separately. Again, a paired t-test revealed that the interference score did not differ significantly between the two speakers, p = .87. This analysis supports the possibility that the interactions of the homophone effect with both semantic constraint and the native language of the speaker reflect overall RT differences. In any case, in each of the separate conditions, the homophone effect, as a marker of cross-lingual lexical interactions, was significant.
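In formula form (our notation, not the authors'), the interference score for a given participant and condition is

interference = (RT_homophone − RT_control) / RT_control,

where RT_homophone and RT_control are that participant's mean reaction times on homophone and control sentences in that condition; the "percentage" label implies that this ratio is multiplied by 100.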
Because the results of the plausibility ratings demonstrated that some of the low-constraining sentences may not have been very plausible, we ran an additional analysis in which we excluded the low-constraining sentences whose homophone or control word had a plausibility score lower than 4 on the scale from 1 (not at all plausible) to 9 (very plausible), together with their high-constraining counterparts. As a consequence, ten sentences were excluded from this analysis. (These sentences are marked in the Supplementary Materials with an asterisk.) The exclusion of these sentences did not change the pattern of results, except that the interaction between sentence constraint and target type was only significant in the analysis by subjects, F1(1,61) = 15.23, p < .001, ηp2 = .28, but not in the analysis by items, F2 < 1, probably because the exclusion considerably limited the number of critical stimuli.
Discussion
The present study investigated whether lexical access is language-nonselective when listening to words that are embedded in meaningful sentences in L2. Furthermore, we examined whether the degree of language nonselectivity is modulated by the semantic constraint of the sentences and by the (native or non-native) accent of the speaker of the sentences. Dutch–English late bilinguals, immersed in an L1-dominant environment, completed an L2 auditory lexical decision task on the last word of low- and high-constraining sentences that were pronounced by a native Dutch or by a native English speaker. The results showed that reaction times were significantly slower on interlingual homophones (e.g., lief "sweet" – leaf /liːf/) than on matched control words. This indicates that our bilingual listeners activated both the L2 and the L1 representation of the homophones, and it implies that sub-phonemic cues provided by the stream of speech in a sentence are not used to restrict lexical access to a single lexicon. We found this effect even though the participants in this study were late bilinguals who are moderately proficient in their L2 and typically use it less than 5% of the time (for a quantification of language dominance in this homogeneous population, see Duyck & Warlop, 2009). A question for future research is whether these cross-lingual effects interact with L2 proficiency levels lower or higher than those of the current study. The current results extend, for example, the monolingual finding of Frazier and Rayner (1990), who reported that intralingual homophones are recognized more slowly than non-homophones. The present study also extends previous work on isolated auditory word recognition (e.g., Lagrou et al., 2011; Marian & Spivey, 2003; Schulpen et al., 2003; Spivey & Marian, 1999; Weber & Cutler, 2004) to word recognition in more ecologically valid contexts, namely sentences.
Second, we considered the influence of factors potentially modulating cross-lingual activation spreading, namely the semantic constraint of the sentence and the speaker's accent. The main effect of sentence constraint is consistent with earlier findings in monolingual and bilingual studies of visual word recognition. In the monolingual domain (e.g., Schwanenflugel & LaCount, 1988; Schwanenflugel & Shoben, 1985; Stanovich & West, 1983), participants are typically faster to recognize words that are highly predictable from the preceding sentence context. In the bilingual visual domain, faster reading times for high-constraining sentences than for low-constraining sentences were also found, for example, by Van Assche et al. (2011). Here, we generalize this effect to auditory bilingual word recognition. Importantly, the interaction between the homophone effect and the semantic constraint of the sentence was significant, which indicates that the homophone effect was smaller, but not completely annulled, when the preceding sentence context was highly constraining towards the target. This suggests that the semantic constraint of the sentence affects the activation level of representations from both the native and the non-native language in the bilingual language system, but does not completely eliminate cross-language activation as such. Note that studies in the domain of visual word recognition show a mixed pattern of results, with some studies finding that semantic constraint eliminates cross-lingual effects (Schwartz & Kroll, 2006; Van Hell & de Groot, 2008) and other studies showing that such effects survive a highly semantically constraining sentence (Libben & Titone, 2009; Titone et al., 2011; Van Assche et al., 2011). In the bilingual auditory domain, the results of Chambers and Cooke (2009) and FitzPatrick and Indefrey (2010) suggest that semantic constraints imposed by a sentence context may annul activation of non-target language lexical competitors. The findings from the present study, however, demonstrate that such a constraint may influence word recognition, but does not necessarily eliminate cross-lingual lexical interactions. A possible explanation for these divergent results could be that we used interlingual homophones, for which the lexical and phonological overlap is maximal. In contrast with our stimuli, Chambers and Cooke used near-homophones, and the stimuli of FitzPatrick and Indefrey shared only an overlapping onset. Hence, cross-lingual activation spreading in those studies was much weaker and therefore probably also more easily overridden by the semantic context manipulation.
Third, we tested whether the homophone interference effect was modulated by the native accent of the speaker. The results showed that participants were faster when sentences were pronounced by the native English speaker than when they were pronounced by the native Dutch speaker. It is possible that the threshold for word recognition is reached sooner when the pronunciation provides a closer match to the listener's stored representation, which is indeed the case when English sentences are pronounced by a native English speaker.Footnote 6 This explanation is also compatible with the results of Adank, Evans, Stuart-Smith and Scott (2009), who demonstrated that listeners have difficulties processing speech with a non-native accent. At the very least, the fact that reaction times were influenced by the native accent of the speaker demonstrates that the different accents of our speakers indeed contained language-specific acoustic information, which constitutes a valid manipulation check for the assumed sub-phonemic differences between languages. In future research, it would be interesting to investigate in more detail whether this accent effect arises because Dutch-accented speech increases the salience of Dutch or because accented speech is less intelligible overall. The results also showed that the homophone effect was reduced (but not eliminated) when sentences were pronounced by the native English speaker. This suggests that sub-phonemic cues inherent to the speaker's L1 are used to some extent as a cue to restrict lexical search to a single lexicon. These findings are consistent with Schulpen et al. (2003), who reported that the English pronunciation of (auditorily presented) interlingual homophones led to stronger priming of the English target than the Dutch pronunciation of that same homophone. They are also partly consistent with Ju and Luce (2004), who found that L1 recognition (Spanish) was influenced by L2 (English) competitors if L1 materials contained L2 sub-phonemic features (i.e., English VOTs). Note, however, that the salient acoustic feature (voicing) in that study was manipulated systematically, whereas the present stimuli differed on a wider range of acoustic parameters, so that such information is less reliable as a cue for lexical selection.
These findings have several theoretical implications. First, this study demonstrates that the language-nonselective nature of lexical access is not fundamentally altered by a preceding (low-constraining) sentence context: even a unilingual language context is not used as a restrictive lexical cue, even though this might be an efficient strategy to speed up word recognition, as it would surely eliminate a sizable proportion of the considered lexical candidates. Note, however, that Vitevitch (2012) conducted a corpus analysis that challenges the assumption that many lexical candidates are active at the same time.
Second, the current results show that a high-constraining sentence context does influence the language nonselectivity of lexical access in the bilingual language system. Nevertheless, it does not prevent activation of lexical representations in the non-target language, not even when these representations do not meet the semantic restrictions (the critical stimuli were interlingual homophones, which only overlap in form across languages).
Third, the results of the present study also demonstrate that speech-specific cues provided by the native accent of the speaker are used to some extent to modulate the language-nonselective nature of bilingual lexical access. However, the fact that the homophone effect remained significant when sentences were pronounced by the native Dutch speaker demonstrates that these sub-phonemic cues are not applied to completely restrict lexical access to the currently relevant lexicon.
Our interlingual homophone effects can be explained by monolingual models of auditory word recognition such as the Distributed Model of Speech Perception (Gaskell & Marslen-Wilson, 1997), NAM (Luce & Pisoni, 1998), Shortlist (Norris, 1994; Norris, McQueen, Cutler & Butterfield, 1997), and TRACE (Elman & McClelland, 1988; McClelland & Elman, 1986), provided that these models are extended with the assumption that L2 representations are part of the same system as, and interact with, L1 representations. The results of the present study also demonstrate that top–down factors, such as the semantic constraint of the sentence or the sub-phonemic information provided by the native accent of the speaker, can inhibit lexical representations belonging to a particular language. Thus, at a theoretical level, the results of the present study are compatible with a model of bilingual auditory word recognition that assumes language-nonselective bottom–up activation with a role for top–down connections that does not result in a functionally language-selective system. Because the homophone effect was reduced, but did not disappear, in the high-constraining condition and in the condition in which sentences were pronounced by the native English speaker, we can conclude that this role is limited. These findings are partly in line with the visual domain, for which there is a dominant model of bilingual word recognition, the BIA+ model (Dijkstra & Van Heuven, 2002). This model includes language nodes, which act as language membership representations within the word identification system, but these nodes do not have top–down connections that regulate cross-lingual activation.
In sum, the present study provides evidence that lexical access is language-nonselective when listening to sentences in L2. However, when the semantic context is highly constraining and when the native accent of the speaker is compatible with the target language, these semantic and accent-specific cues reduce (but do not eliminate) cross-lingual interactions.