Bilinguals occasionally have the experience of thinking that the spoken words they hear belong to one known language when in fact they belong to another. Such confusions likely arise because bilinguals simultaneously activate multiple languages during spoken word recognition (e.g. Blumenfeld & Marian, 2007; Canseco-Gonzalez, Brehm, Brick, Brown-Schmidt, Fischer & Wagner, 2010; Marian & Spivey, 2003a, b; Marian, Spivey & Hirsch, 2003; Shook & Marian, 2012), as well as during reading (De Groot & Nas, 1991; Dijkstra, De Bruijn, Schriefers & Brinke, 2000; Duyck, Van Assche, Drieghe & Hartsuiker, 2007; Libben & Titone, 2009) and speaking (Costa & Santesteban, 2004; Costa, Santesteban & Ivanova, 2006; Grainger & Beauvillain, 1987; Jackson, Swainson, Cunnington & Jackson, 2001; Kroll & Gollan, 2014; Meuter & Allport, 1999; Thomas & Allport, 2000).
The parallel activation of words in a bilingual's two languages, combined with the challenges of second language (L2) phoneme discrimination (Cutler, Weber & Otake, 2006; Weber & Cutler, 2004), increases the normal demands of spoken word processing (Luce & Pisoni, 1998; Marslen-Wilson & Welsh, 1978; McClelland & Elman, 1986; McQueen & Cutler, 2001; Norris, 1994), particularly for L2 vs. first language (L1) speech (Canseco-Gonzalez et al., 2010).
Of relevance to the present study, the amount of cross-language activation experienced at any given point in time may vary as a function of the particular language context bilinguals find themselves in, and whether it follows an entirely different language context or not. For example, someone returning to their L1-English-speaking home from their L2-French-speaking workplace might experience a different pattern of cross-language activation than someone who did not experience the same language switch, or who experienced the switch in an L1 to L2 direction. Here, we investigate precisely this issue by determining whether cross- and within-language activation during spoken language processing varies as a function of whether a spoken comprehension task follows a conversation in the same or different language. To set the stage, we first review what is known about bilingual spoken language processing, the notions of language mode and language switching, and the idea that bilinguals have the capacity to inhibit activation of lexical representations of an entire language as well as specific lexical representations corresponding to individual words.
Studies using varied approaches have investigated cross-language activation during bilingual spoken language processing (e.g. Costa & Santesteban, 2004; Costa et al., 2006; FitzPatrick & Indefrey, 2010; Jackson et al., 2001; Meuter & Allport, 1999; Phillips, Klein, Mercier & De Boysson, 2006; Thierry & Wu, 2007). Some studies focus on real-world aspects of spoken language processing in the context of naturalistic language interactions, when bilinguals simultaneously produce speech, self-monitor their own output, and decode the speech of another person (e.g. Pivneva, Palmer & Titone, 2012). Others focus more specifically on spoken language comprehension where the primary demand is to decode a speech signal (e.g. Blumenfeld & Marian, 2007; Canseco-Gonzalez et al., 2010; Cutler et al., 2006; Marian & Spivey, 2003a, b; Marian et al., 2003; Mercier, Pivneva & Titone, 2014; Shook & Marian, 2012; Spivey & Marian, 1999; Titone, Mercier, Sudarshan, Pivneva & Baum, 2014; Weber & Cutler, 2004).
Many comprehension studies have used naturalistic tasks such as the visual world method, which allows assessment of moment-by-moment “activation” of a target word relative to word-onset competitors and unrelated words, thus offering good temporal resolution in a relatively natural and undemanding task (Allopenna, Magnuson & Tanenhaus, 1998; Cooper, 1974; Dahan, Magnuson & Tanenhaus, 2001; Tanenhaus, Spivey-Knowlton, Eberhard & Sedivy, 1995; see Huettig, Rommers & Meyer, 2011, for a review and critical evaluation).
In a now-classic visual world experiment, Spivey and Marian (1999) presented Russian–English bilinguals with an array of objects including the target (an object called “speaker” in English), a phonologically similar cross-language competitor (spichki “matches” in Russian), and two phonologically and semantically unrelated distracters, and monitored their eye movements as they heard spoken instructions to move the target object (speaker). Consistent with the idea that bilinguals automatically activate a non-target language, cross-language competitor pictures were fixated significantly more often than unrelated pictures. However, subsequent studies showed this cross-language effect to be significantly smaller than that found for within-language competition (e.g. “spear” given the target word “speaker”). Furthermore, the size of the cross-language effect varied with factors such as whether the task occurred in the L1 or L2 (Marian & Spivey, 2003a), age of acquisition, proficiency, and frequency of exposure to the target and non-target languages (Blumenfeld & Marian, 2007; Canseco-Gonzalez et al., 2010; Mercier et al., 2014; Titone et al., 2014). Other work suggests that cross-language differences in subtle acoustic cues can undermine cross-language competition, for example, when the voice onset time of particular consonants systematically differs across two languages (Ju & Luce, 2004).
Presumably, cross-language competition adds to the cognitive burden of language processing for bilinguals, who must recruit domain-general capacities such as inhibitory control to help resolve cross-language competition. Consistent with this view, we previously found that bilinguals who had better inhibitory control, as measured by a battery of domain-general tasks, showed less within- and cross-language competition than bilinguals who had weaker inhibitory control (Mercier et al., 2014; see also Pivneva, Mercier & Titone, 2014, for reading, and Pivneva & Titone, 2014, for spoken production). These findings cohere with work suggesting that cognitive control is recruited specifically to inhibit within-language competitors during spoken language processing by bilinguals, but not monolinguals (Blumenfeld & Marian, 2011).
While it is fairly clear that bilinguals can modulate activation of specific lexical forms, either to inhibit or enhance activation depending on context (see Bialystok, Craik, Green & Gollan, 2009, for a discussion), less understood is whether and how they modulate activation of an entire language as a function of prior language exposure. There appear to be two distinct possibilities given the literature. Based on the “language mode” hypothesis (Grosjean, 2001, 2008), one possibility is that prior exclusive use of a particular language (e.g. French at work) increases baseline activation of all words in that language (i.e. French) by putting people into a monolingual mode of processing. This would lead to less cross-language competition in a subsequent communicative context that remains in that language (e.g. French at home), but presumably more cross-language competition in a subsequent context that switches to a different language (e.g. English at home). This possibility is consistent with findings showing that a preceding task, when performed in the same language as the target task, helps participants focus on the target language (e.g. Canseco-Gonzalez et al., 2010; Elston-Güttler, Gunter & Kotz, 2005), but when performed in a different language, leads to greater cross-language competition, at least early on (Elston-Güttler et al., 2005). However, when the preceding context does not focus on a particular target language, cross-language competition is enhanced (Canseco-Gonzalez et al., 2010; Elston-Güttler & Gunter, 2008; Elston-Güttler et al., 2005).
For example, Canseco-Gonzalez et al. (2010) found greater Spanish cross-language competition during an all-English visual world task in participants who had used both English and Spanish prior to the visual world experiment.
In contrast, a second possibility is that prior use of a particular language (e.g. French at work) causes bilinguals to actively and globally inhibit that language when they switch to a subsequent task involving a different language (e.g. English at home) to pre-emptively reduce interference (De Groot & Christoffels, 2006; Green, 1998; Koch, Gade, Schuch & Philipp, 2010). This would have the effect of reducing cross-language activation in a switched language context, perhaps even over and above what might be observed in a non-switched context or a context requiring the use of both languages. Consistent with this view, studies examining language switching in blocked designs show that bilinguals can restrict interference by globally suppressing a non-target language (e.g. Gerfen, Tam, McClain, Linck & Kroll, 2015; Levy, McVeigh, Marful & Anderson, 2007; Linck, Kroll & Sunderman, 2009; Misra, Guo, Bobb & Kroll, 2012; Philipp, Gade & Koch, 2007; Philipp & Koch, 2009). As well, functional neuroimaging evidence reveals differential brain engagement in blocked as opposed to mixed language contexts and supports the notion that bilinguals have the ability to globally suppress an entire language system following a blocked language switch (Abutalebi, Annoni, Seghier, Zimine, Lee-Jahnke, Lazeyras, Cappa & Khateb, 2008; Guo, Liu, Misra & Kroll, 2011; Kovelman, Baker & Petitto, 2008; Rodriguez-Fornells, Rotte, Heinze, Nösselt & Münte, 2002).
Of note, these studies have shown this ability to be asymmetrical; that is, the stronger L1 is inhibited more than the weaker L2, presumably to facilitate the production of the weaker L2 (e.g. Meuter & Allport, 1999).
Thus, while bilinguals seem able to modulate activation of an entire language system as a function of a prior language context, an open question concerns the factors that lead to more or less cross-language activation in a language switching context. For example, one way prior studies differ is in whether a clear-cut switch occurs between two generally monolingual task contexts, each calling for the exclusive use of one language (as opposed to tasks requiring some language co-activation and use), a situation that would enable bilinguals to proactively, and globally, suppress the non-target language. A second potential factor may be the degree of active engagement within a task, and whether a particular communication task is more passive (in the case of comprehension) or active (in the case of production). During “passive” comprehension tasks (e.g. listening to the radio), the activation of the target language rises, and cross-language competitors may only be sporadically activated. Consequently, a more local or reactive kind of inhibition may be sufficient for competitors to be suppressed and word recognition to take place. In contrast, during language production, the communication task may be hindered by the morphosyntax, semantics and phonology of a non-target language, and may thus benefit from a more active and global inhibition of a non-target language. Such a global non-target language inhibition strategy may be particularly important when the non-target language is a bilingual's dominant or first language, given that producing speech in the L2 is harder than in the L1 (e.g. Hanulová, Davidson & Indefrey, 2011; Runnqvist, Strijkers, Sadat & Costa, 2011), and more susceptible to cross-language interference.
However, because both languages are affected by a speaker's bilingualism (Kroll & Gollan, 2014), suppressing the non-target language during production may be advantageous in highly proficient bilinguals regardless of whether they are using L1 or L2.
Thus, bilinguals may be more likely to actively engage a global inhibition strategy when switching between monolingual tasks clearly requiring the exclusive use of one language, especially when needing to exert more control over cross-language interference, such as when they are speaking the language. This distinction between global and local inhibition is consistent with work examining language control specifically (Abutalebi & Green, 2007; De Groot & Christoffels, 2006; Green, 1998) but also with a domain-general cognitive control view contrasting proactive and reactive cognitive control (Braver, 2012). This is also consistent with previous work showing an enhancement of cross-language competition using language context manipulations that were either passive (e.g. watching a movie; Elston-Güttler et al., 2005) or active but bilingual, thus requiring both languages to be active (e.g. conversing with the experimenter in both the target and non-target language; Canseco-Gonzalez et al., 2010). In contrast, studies showing evidence for global language inhibition used language production tasks in alternating language blocks (Gerfen et al., 2015; Linck et al., 2009; Misra et al., 2012).
To investigate these issues, we examined whether and how speaking in one language (English or French) modulates within- and cross-language lexical competition in a subsequent English spoken word recognition task. Specifically, participants engaged in a spontaneous speech production task (as part of another study) in either the same language as the ensuing visual world task (i.e. English; no-switch condition) or a different non-target language (i.e. French; switch condition). Similar to Pivneva et al. (2012), we used a modified version of the map task (Anderson, Bader, Bard, Boyle, Doherty, Garrod, Isard, Kowtko, McAllister, Miller, Sotillo, Thompson & Weinert, 1991), a production task used to examine spontaneous speech in the context of relatively natural dialogues (Brown & Miller, 1980; Macafee, 1983; Macaulay, 1985). Following the spontaneous production task, participants engaged in a visual world task (Allopenna et al., 1998) where they listened to instructions (“Click on the field”) while viewing displays that included pictures of the target word (field), a within- or cross-language word-onset lexical competitor (respectively feet or fille “girl” in French), and two unrelated words (i.e. car and church), or control displays that included pictures of the target word and three unrelated words. The language production and visual world tasks were carried out by two different experimenters so the switch between the tasks (and languages) could be interpreted as being between two monolingual contexts (i.e. French, then English). We examined participants’ fixations to the pictures of target and competitor words as well as their latency in selecting the target picture in response to spoken instructions.
According to the language mode view, prior participation in the production task, which was performed in a relatively monolingual context, should increase the baseline activation of words in this language. Participants in the no-switch condition, especially those less proficient in English, should benefit from this prior task due to the relative boost in activation of English vs. French words. This should in turn lead them to experience reduced cross-language competition from French in the subsequent visual world task. Participants in the switch condition, in contrast, should start the visual world task with relatively high levels of French activation. As a result, they should face greater levels of cross-language competition, at least until the activation levels of their languages are tipped in favor of English. This may be especially true in native French bilinguals less proficient in the task language, for whom French words may already have relatively high resting activation levels.
According to the language switching view, however, the language switch between two relatively monolingual tasks should prompt bilinguals to suppress the non-target language (French) in the subsequent visual world task. This might be particularly true to the extent that the non-target language is more likely to interfere because its recent use generated relatively high activation levels, but also when it is dominant and benefits from generally higher baseline activation levels (e.g. Abutalebi & Green, 2007; De Groot & Christoffels, 2006; Green, 1998). To the extent that bilinguals are efficient in using context, we would expect inhibition of the non-target language and, thus, more limited cross-language competition in the switch condition vs. the no-switch condition. Specifically, participants in the no-switch condition should have no incentive to suppress French beyond that of being in an English-speaking environment. In contrast, participants in the switch condition may need to actively inhibit French when performing the English visual world task to limit its interference. This global inhibition of the non-target language, paired with the reactivation of the previously suppressed, now target language, may divert cognitive resources away from the task at hand and be reflected in a cost, especially for native French bilinguals performing the task in a less proficient language.
Methods
Participants
Fifty-seven English–French bilinguals participated for credit or monetary compensation ($10/hour). All participants were between 18 and 35 years old, reported normal or corrected-to-normal vision, and had no speech or hearing disorders. Participants were also excluded if they had a history of attention, learning, or neurological disorders. Twenty-nine had English as their native and dominant language (M = 21.9 years, SD = 1.0 years, 23 women, 6 men) and 28 had French as their native language (M = 22.7 years, SD = 2.9 years, 24 women, 4 men), six of whom reported now being dominant in English. Half of the participants in each group were assigned to a language production task in the same language as the ensuing spoken language task (English; no-switch condition), and the other half were assigned to a language production task in a different language from the ensuing spoken language task (French; switch condition). Table 1 presents background information on the resulting four bilingual groups.
EN = native English bilinguals, no-switch condition; ES = native English bilinguals, switch condition; FN = native French bilinguals, no-switch condition; FS = native French bilinguals, switch condition
Note: The four groups were compared on all measures with one-way ANOVAs (* indicates significant differences between the groups), followed by pairwise comparisons (see methods for details).
Individual difference measures
Individual differences among participants were measured using an L1 vocabulary task (Wechsler Abbreviated Scale of Intelligence (WASI); Wechsler, 1999), a language experience questionnaire (modified version of Language Proficiency and Experience Questionnaire (LEAP-Q); Marian, Blumenfeld & Kaushanskaya, 2007), and several non-verbal measures of inhibitory control. The inhibitory control measures included the Simon task (Simon & Rudell, 1967; non-verbal version developed by Blumenfeld & Marian, 2011), two Stroop tasks (Stroop, 1935; one using arrows adapted from Liu, Banich, Jacobson & Tanabe, 2004; one using numbers), and an anti-saccade task (Hallett, 1978). No significant differences were found between the groups on any of the inhibitory tasks (see supplementary online material for mean group performance on each measure; see Mercier et al., 2014, for details on the tasks).
Table 1 presents mean results for the language measures. Although the groups were more similar than different, native English bilinguals were overall less exposed to their L2 than native French bilinguals, and native English bilinguals in the switch condition self-rated some of their L2 abilities as lower than native French bilinguals. In subsequent analyses, we thus controlled for the ratio of English-to-French nativeness, a measure that correlated with most of the other L2 exposure and proficiency measures (see Table 2). This measure and another, the ratio of English-to-French foreign accent, are objective measures of proficiency in the target (English) and non-target (French) language that we derived from the language production task. Specifically, independent raters coded participants’ speech in the monologue portion of the task (see “language production task” section below for details) in terms of how native they sounded and how strong an accent they had in English and French (see Pivneva et al., 2012). We then computed the ratio of English-to-French nativeness and foreign accent on the monologue portion of the conversation task (see Table 1).
* p < .05; ** p < .01; *** p < .001
Materials
There were 20 item sets that included spoken target and control words, and six pictures (target, within- and cross-language lexical competitors, and three distracter controls, one of them named in the control conditions). We matched pictures within each set in terms of color and complexity to the extent possible based on an independent norming procedure. Twenty-six young adults, who did not participate in the main experiment, rated visual complexity and appeal of the pictures using five-point scales (1 = low, 5 = high). No differences were found for target, within-language competitor, cross-language competitor, and control-word pictures in visual appeal (overall M = 3.07, SD = 0.65). Cross-language competitor pictures were rated as visually less complex (M = 4.02, SD = 0.59) than control pictures (M = 4.33, SD = 0.22; p = .01), but no other difference in visual complexity was found (overall M = 4.17, SD = 0.44). The materials were the same as those used by Mercier et al. (2014), who provide a list of the specific item sets and further details on their norming.
Within- and cross-language target–competitor pairs overlapped by an average of 2.05 phonemes (SD = 0.22). The degree of target/competitor phonological overlap did not significantly differ across conditions (p > .05). There were also no differences in length (in milliseconds) between targets (M = 728.20, SD = 101.98) and control words (M = 749.95, SD = 118.62, p = .53). The mean duration of word-initial overlap between target and within-language lexical competitors was 298.85 ms, and that between target and cross-language lexical competitors was 290.40 ms. Targets were also comparable in word frequency (M = 20.40, SD = 31.26) to within-language (M = 30.20, SD = 63.35, p > .05) and cross-language (M = 20.50, SD = 60.71, p > .05) lexical competitors, based on CELEX (Baayen, Piepenbrock & van Rijn, 1995) and, for French words, Lexique (New, Pallier, Brysbaert & Ferrand, 2004). Spoken instructions were digitally recorded by a female native speaker of Canadian English, and down-sampled to 16 kHz using Sony Sound Forge 8.0 (see Figure 1 below for timing details).
The two different spoken stimuli consisted of the target (field) and control (car) words, which were preceded by instructions directing participants to click on the named picture (“Click on the field”). The four different display types consisted of the within-language lexical competitor display (EC, where the picture of an English lexical competitor was presented, e.g. feet), the cross-language lexical competitor display (FC, where the picture of a French lexical competitor was presented, e.g. fille), the combined within- and cross-language lexical competitor display (EFC, presenting both competitor types), and the control display (NC, where no picture represented a lexical competitor). Thus, within each item set, the same target and named control pictures were presented (e.g. field and car) across the four different display types (EC, FC, EFC, and NC), which were distinguished by the presence and type of competitor picture(s) presented. With the exception of the EFC display, which included pictures of the target, named control, within-language competitor, and cross-language competitor, displays included additional matched control pictures. Consequently, for each of the 20 stimulus sets, there were four display types, each heard in two spoken word conditions, for a total of eight cells per stimulus set, five of which were presented without functional word onset competitors given the word heard. Of the resulting 160 trials, 62.5% thus occurred without word onset competitors in the display. The control trials were presented to minimize the likelihood of participants noticing the competitors (see Tanenhaus, 2007). Of note, the EFC display was not included in the analyses as it functioned similarly to the EC displays, and will not be discussed further.
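The cell and trial counts reported above can be checked with a short enumeration. The following is a minimal sketch in Python and not part of the original materials: the names `DISPLAYS`, `WORDS`, and `has_functional_competitor` are illustrative assumptions, and the rule that a competitor is functional only when the target word is heard in a display containing a competitor picture is our reading of the design description.

```python
from itertools import product

# Design factors as described in the text (labels are illustrative).
DISPLAYS = ("EC", "FC", "EFC", "NC")   # display types
WORDS = ("target", "control")          # spoken word conditions
N_SETS = 20                            # stimulus sets

def has_functional_competitor(display: str, word: str) -> bool:
    """Assumed rule: a competitor picture competes only when the target
    word is heard and the display contains a competitor (EC, FC, EFC)."""
    return word == "target" and display != "NC"

# Eight display-by-word cells per stimulus set.
cells = list(product(DISPLAYS, WORDS))
cells_without_competitor = [c for c in cells if not has_functional_competitor(*c)]

n_trials = N_SETS * len(cells)
share_without_competitor = N_SETS * len(cells_without_competitor) / n_trials

print(len(cells), len(cells_without_competitor), n_trials, share_without_competitor)
```

Under these assumptions, the enumeration yields eight cells per set, five of them without a functional competitor, and 100 of 160 trials (62.5%) without a word onset competitor, which matches the figures given in the text.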
The inner corner of each picture was 2.33 degrees of visual angle away from the center of the screen. Picture height and width varied but had an approximate surface area of 51 cm². Four pseudo-randomized trial lists were created so that the target, competitor and distracter pictures were presented in all four quadrants within and across participants in a counterbalanced fashion. Displays from each stimulus set were distributed in the lists so they were equally spaced in time, and the same display was presented once in each half of the task (once with each word heard).
Apparatus
Eye movement data were acquired with an EyeLink 1000 tower-mounted system (SR Research, Ontario, Canada) with a sampling rate of 1 kHz using Experiment Builder (SR Research, Ontario, Canada). Viewing was binocular, but eye movements were recorded from the right eye only. Calibration consisted of a standard five-point grid. The stimuli were presented on a 21-inch (50.8-cm) ViewSonic CRT monitor with a refresh rate of 60 Hz located 71 cm away from participants.
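For readers who wish to relate the reported picture eccentricity (2.33 degrees of visual angle) to an on-screen distance at the 71 cm viewing distance, the standard conversion can be sketched as follows. This is an illustrative calculation only: the function name `eccentricity_to_cm` is hypothetical, and the sketch assumes that the eye is aligned with the center of the screen and that the screen is flat and perpendicular to the line of sight.

```python
import math

def eccentricity_to_cm(angle_deg: float, viewing_distance_cm: float) -> float:
    """Distance on a flat screen (cm) from the fixated center to a point
    lying angle_deg of visual angle away, at the given viewing distance."""
    return viewing_distance_cm * math.tan(math.radians(angle_deg))

# Values taken from the text: 2.33 degrees at a 71 cm viewing distance.
offset_cm = eccentricity_to_cm(2.33, 71.0)
print(f"{offset_cm:.2f} cm")
```

Under these assumptions, the inner corner of each picture would lie roughly 2.9 cm from the center of the screen.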
Procedure
Participants first engaged in a language production task, which they performed in either the same language as the ensuing visual world task (English; no-switch condition) or the non-target language (French; switch condition). Participants then performed the visual world task, followed by the second part of the language production task (this part was performed in the language not used prior to the visual world task), and inhibitory function measures. All instructions were given in English until after the visual world task was completed, except for the language production task for participants in the switch condition. Language measures (i.e. vocabulary task, language questionnaire) were administered last to avoid inadvertently activating the non-target language in participants in the no-switch condition prior to the visual world task.
Language production task
We used a modified version of the map task (Anderson et al., 1991) to create a “monolingual language” context and drive participants to adopt either a monolingual English (no-switch condition) or French (switch condition) mode of processing prior to their engagement in the visual world task (the results of this task are reported in Pivneva et al., 2012).
In the map task, two conversational partners each receive a map the other cannot see. Normally, one partner assumes the role of instruction giver, while the other assumes that of instruction follower. Each map contains a starting point, and black and white drawings of landmarks, each accompanied by a word label. Because these word labels periodically mismatch across the instruction giver's and follower's map versions, partners have to discuss these discrepancies as the instruction giver verbally describes the route traced on their map so that the instruction follower can reconstruct this route on their own map. In our task, participants served as instruction givers in two different versions of the map task, first in the context of a monologue (to a “hypothetical” instruction follower), then in the context of a dialogue (to a conversational partner, a confederate of the experiment). The language of the instructions and map labels matched the monologue or dialogue language such that the testing situation was monolingual (i.e. in the same language as the ensuing visual world task, English, in the no-switch condition, or in the non-target language, French, in the switch condition). Participants were implicitly discouraged from switching languages during the map task in that the confederate failed to answer if they did not provide instructions in the required language. The portion of the map task administered prior to the visual world task lasted approximately 25 minutes (Table 1 presents the duration of participants’ speech output prior to the visual world task).
Visual world task
Each trial began with the presentation of a calibration point in the middle of the screen, which participants were asked to fixate (see Figure 1), followed by a centrally located red square. Participants moved the cursor (green circle) and clicked on this red square to trigger the appearance of a picture display and the beginning of the recorded instructions. The cursor appeared in the center of the screen to ensure participants would fixate there at the beginning of each trial. Participants were asked to naturally scan the pictures and click on the appropriate one after hearing the instructions. The pictures disappeared following a click on any one of the pictures, and the calibration circle reappeared, signaling the beginning of a new trial. The visual world task lasted approximately 25 minutes.
Analyses
We operationally defined lexical competition in three distinct but interrelated ways, which compared fixations or mouse-click latency across trials of different conditions. We first assessed lexical competition as the proportion of fixations to the lexical competitor as a function of the spoken word heard (target or control word). Here, greater fixations to a lexical competitor are expected to the extent that the competitor picture represents a lexical competitor to the spoken word (i.e. target word > control word). Second, we assessed lexical competition as the proportion of fixations to the target picture as a function of whether a competitor appears in the display. Here, fewer fixations to the target picture are expected to the extent that the competitor picture interferes with the processing of the target word (i.e. EC < NC, or FC < NC). Third, we assessed lexical competition as the time participants took to correctly click on the target picture in “competitor” displays based on the word they heard (target or control word). Here, to the extent that a functional competitor is present only when the target is heard, longer latencies are expected only when the target word is heard (i.e. target word > control word).
Using these three different operational definitions, we conducted a series of planned analyses using linear mixed effects (LME) models. Our first step was always to use omnibus models to examine whether lexical competition interacted with the language context manipulation, the task language relative to participants’ language background, and the point in time during the visual world experiment when the measurement took place. We then conducted further analyses to separately examine participants performing the task in L1 and L2. Then, when interactions involving time were found, we further split the analyses based on the language context manipulation to examine whether lexical competition effects were similarly modulated as the experiment progressed across language context manipulation conditions.
Results
The data were analyzed using LME models within the lme4 package of R (version 3.0 for Mac OS X; Baayen, Davidson & Bates, Reference Baayen, Davidson and Bates2008; Bates, Reference Bates2007; R Development Core Team, 2012). We created separate models for each dependent variable and type of lexical competition examined (i.e. within- and cross-language). Dependent variables included the proportion of fixations to the target picture when competitors were absent vs. present in displays, the proportion of fixations to lexical competitors (within- or cross-language) as a function of the spoken word heard (target vs. control word), and the latency of response (i.e. the mouse click to the target picture). The onset of each fixation was measured from the onset of the saccade leading to it (Altmann & Kamide, Reference Altmann and Kamide1999). We discarded trials where participants clicked on a wrong picture, made no fixations to the named picture, or took less than 200 ms to respond (3.4% of the data). Fixations were calculated separately for each participant, for each trial, as a proportion of total fixations, where the numerator represented the fixations to a specific picture (either the target, within-language competitor, or cross-language competitor) and the denominator represented the summed fixations to any of the pictures present in the display (e.g. Dahan & Tanenhaus, Reference Dahan and Tanenhaus2004). Overall, accuracy was above 95% (see Table 3).
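The trial exclusion and fixation-proportion steps described above can be illustrated with a minimal sketch. It assumes that fixations are counted (rather than weighted by duration) and that each trial is stored as a list of fixated picture labels; the function names, labels, and example trial are hypothetical and are not taken from the analysis scripts used in the study.

```python
from collections import Counter


def keep_trial(clicked, target, fixated_pictures, latency_ms):
    """Return True if a trial passes the three exclusion criteria in the text."""
    if clicked != target:                 # participant clicked on a wrong picture
        return False
    if target not in fixated_pictures:    # named picture was never fixated
        return False
    if latency_ms < 200:                  # response faster than 200 ms
        return False
    return True


def fixation_proportions(fixated_pictures):
    """Fixations to each picture divided by fixations to any picture in the display."""
    counts = Counter(fixated_pictures)
    total = sum(counts.values())
    return {picture: n / total for picture, n in counts.items()}


# Hypothetical trial: five fixations on pictures in the display.
example_fixations = ["target", "competitor", "target", "filler", "target"]
example_kept = keep_trial("target", "target", example_fixations, latency_ms=1350)
example_props = fixation_proportions(example_fixations)
```

In this toy trial, three of the five fixations land on the target, so the target proportion is 3/5 and the competitor proportion is 1/5.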
EN = native English bilinguals, no-switch condition; ES = native English bilinguals, switch condition; FN = native French bilinguals, no-switch condition; FS = native French bilinguals, switch condition
Note: The four groups were compared on all measures with one-way ANOVAs (* indicates significant differences between the groups), followed with pairwise comparisons (see methods for details). Trials responded to incorrectly were discarded from subsequent analyses.
Given the strong role of the proportion of phonological overlap between target and competitor words in our previous studies (Mercier et al., Reference Mercier, Pivneva and Titone2014), we restricted our analyses to sets of items where the proportion of target/competitor phonological overlap (in ms duration) was greater than 0.30 to maximize the likelihood of observing lexical competition effects (16 different sets for the within-language competitor condition, 17 for the cross-language competitor condition; see Footnote 2). The resulting mean proportion of competitor/target phonological overlap (within-language: M = 0.45, SD = 0.09; cross-language: M = 0.42, SD = 0.08) was thus increased relative to the full stimulus set (within-language: M = 0.40, SD = 0.12; cross-language: M = 0.39, SD = 0.11), and overlap did not differ significantly between within- and cross-language stimuli (p = .15).
For models examining the proportion of fixations to lexical competitors, and participants’ response latency in correctly clicking on target pictures, the comparison involved the word heard (target vs. control word, baseline: control) in the presence of a competitor display. For models examining the proportion of fixations to the target picture, the comparison involved the display viewed (competitor vs. control display, baseline: control display) only when the target word was heard. In the main models, we examined whether lexical competition (word heard or display viewed) interacted with the language context manipulation (no-switch vs. switch; baseline: no-switch), the task language relative to participants’ language background (L1 vs. L2; baseline: L1), and trial order (continuous) to examine how lexical competition changed over the experiment.
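The role of the stated baselines can be made concrete with a small sketch. It assumes treatment (dummy) coding, which the baselines listed above suggest but which the text does not state explicitly; the function and key names are hypothetical. Under this coding, each baseline level maps to 0, so a lower-order coefficient (e.g. for word heard) describes the effect at the baseline levels of the other predictors, and the four-way interaction term is the product of the four coded predictors.

```python
def code_trial(word_heard, context, task_language, trial_order):
    """Treatment-code one trial; every baseline level is coded as 0."""
    word = 1 if word_heard == "target" else 0      # baseline: control word
    switch = 1 if context == "switch" else 0       # baseline: no-switch
    l2 = 1 if task_language == "L2" else 0         # baseline: L1
    return {
        "word": word,
        "switch": switch,
        "l2": l2,
        "trial": trial_order,                      # continuous predictor
        "word:switch": word * switch,
        "word:switch:l2:trial": word * switch * l2 * trial_order,
    }


# Hypothetical trials: one at all baseline levels, one at all non-baseline levels.
baseline_row = code_trial("control", "no-switch", "L1", trial_order=10)
full_row = code_trial("target", "switch", "L2", trial_order=10)
```

Only the second row contributes to the four-way interaction term, which is why a significant four-way interaction is followed up by splitting the data on one of the factors.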
Finally, we also examined the time course of lexical competition early within a trial, as the spoken word unfolded, in terms of the proportion of fixations to target and competitor pictures (two consecutive time bins, 200–600 ms, and 600–1000 ms), and later within a trial, at the end point of word recognition, in terms of correct response latency clicking on the target picture. The ratio of English-to-French nativeness (continuous) was included as a fixed effect to account for participants’ language background and relative proficiency in the target vs. non-target languages (see Footnote 1). Tables presenting details on these models can be obtained in the supplementary materials online.
When a significant four-way interaction was found, we conducted sub-analyses split on whether the visual world task was performed in a participant's L1 or L2. We also tested whether an interaction with the trial order clarified the results, and we kept it as a control variable in the model on all other occasions (Baayen, Reference Baayen2008; see Footnote 3). To reduce collinearity, the ratio of English-to-French nativeness was standardized and centered. Across all models, participants and items were random factors (random intercepts only), and Markov chain Monte Carlo (MCMC) sampling tests (n = 10,000) were used to obtain p-values for all fixed factors. The correlation among fixed effects did not exceed 0.810 in any model. Although this collinearity appears high, it decreased markedly when trial order was standardized and centered (the correlation of 0.810, for instance, was lowered to 0.202), without substantially altering the overall pattern of results. We report the results without standardizing/centering this variable, however, to enable easier interpretation of the plots and comparisons with our previous work.
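Why standardizing and centering trial order lowers collinearity can be shown with a toy example. The sketch below is not a reanalysis of the reported data: it uses 100 hypothetical trials and a hypothetical binary condition, and it correlates the raw predictors rather than the estimated fixed effects reported above. It shows only the general mechanism, namely that a condition-by-trial-order product term is strongly correlated with the condition term when trial order is left on its raw scale, and much less so once trial order is standardized.

```python
from statistics import mean, pstdev


def pearson(x, y):
    """Pearson correlation using population moments."""
    mx, my = mean(x), mean(y)
    cov = mean([(a - mx) * (b - my) for a, b in zip(x, y)])
    return cov / (pstdev(x) * pstdev(y))


def standardize(values):
    """Center on the mean and scale by the standard deviation."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


# Hypothetical design: 100 trials, condition alternating between 0 and 1.
trial_order = list(range(1, 101))
condition = [t % 2 for t in trial_order]

# Interaction term built from raw vs. standardized trial order.
raw_interaction = [c * t for c, t in zip(condition, trial_order)]
std_interaction = [c * t for c, t in zip(condition, standardize(trial_order))]

r_raw = pearson(condition, raw_interaction)
r_std = pearson(condition, std_interaction)
```

With raw trial order the product term is large whenever the condition is coded 1 and zero otherwise, so it largely duplicates the condition term; after standardization the product term varies around zero within the coded condition and the two predictors are nearly uncorrelated.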
Cross-language lexical competition
Fixation proportions to target picture as a function of the display viewed (competitor vs. control)
We computed the average proportion of fixations to each picture for the cross-language lexical competitor (FC) and control (NC) displays (see Figure 2 for the proportion of fixations to the target and cross-language competitor in the conditions compared in subsequent analyses). We chose 200 ms as a lower boundary for the analysis based on the visual examination of the fixations over time, taking into consideration the time needed to program and launch a saccade (e.g. Altmann & Kamide, Reference Altmann, Kamide, Henderson and Ferreira2004; Fischer, Reference Fischer1992; Hallett, Reference Hallett1978; Matin, Shao & Boff, Reference Matin, Shao and Boff1993; Saslow, Reference Saslow1967) and following other visual world studies (e.g. Dahan et al., Reference Dahan, Magnuson and Tanenhaus2001; Magnuson, Dixon, Tanenhaus & Aslin, Reference Magnuson, Dixon, Tanenhaus and Aslin2007; but see Altmann, Reference Altmann2011, for a saccade programming estimate of approximately 100 ms). The 1000 ms upper boundary was chosen following both inspection of the data and previous visual world studies showing fixations to targets reaching an asymptote at this time (e.g. Allopenna et al., Reference Allopenna, Magnuson and Tanenhaus1998). For subsequent analyses, we computed the average proportion of fixations to each picture for each trial and participant for time bins extending from 200 ms to 600 ms and 600 ms to 1000 ms.
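The two analysis windows can be illustrated with a minimal sketch. It assumes that each fixation is assigned to the bin containing its onset (measured in ms from spoken word onset) and that bins are half-open, so a fixation starting at exactly 600 ms falls in the later bin; neither detail is specified in the text. The names and the example trial are hypothetical.

```python
# Analysis windows in ms from word onset; half-open intervals [start, end).
TIME_BINS = {"200-600": (200, 600), "600-1000": (600, 1000)}


def binned_proportions(fixations, picture):
    """Proportion of fixations to `picture` within each time bin.

    `fixations` is a list of (onset_ms, picture_label) pairs for one trial.
    A bin with no fixations yields None rather than a proportion.
    """
    result = {}
    for name, (start, end) in TIME_BINS.items():
        in_bin = [label for onset, label in fixations if start <= onset < end]
        result[name] = in_bin.count(picture) / len(in_bin) if in_bin else None
    return result


# Hypothetical trial; the fixation at 150 ms falls before the first window.
example_trial = [
    (150, "filler"),
    (250, "competitor"),
    (400, "target"),
    (650, "target"),
    (800, "target"),
    (900, "competitor"),
]
target_by_bin = binned_proportions(example_trial, "target")
```

In this toy trial the target receives one of two fixations in the early bin and two of three in the late bin, mirroring the rise in target fixations as the word unfolds.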
First, we examined whether cross-language competition interacted with language context (no-switch vs. switch; baseline: no-switch), participants’ task language (L1 vs. L2; baseline: L1), and trial order (continuous). Between 200 ms and 600 ms, a significant four-way interaction between display viewed (FC vs. NC display), language context manipulation (no-switch vs. switch), task language (L1 vs. L2), and trial order (continuous) was found (b = 0.004, SE = 0.002, p MCMC = .0328). To understand this interaction, we computed separate submodels for people doing the task in their L1 and L2.
For participants who performed the task in their L1, we found an interaction between display viewed and language context manipulation (b = 0.179, SE = 0.054, p MCMC = .0004) for models that included trial order as a main effect (the effect was in the same direction when trial order was part of the interaction). As can be seen in the upper panels of Figure 2, participants performing the task in their L1 showed evidence of cross-language competition when tested in the no-switch condition (left panel; main effect of display viewed, b = –0.089, SE = 0.038, p MCMC = .0190) but not in the switch condition (right panel).
For participants who performed the task in their L2, we found an interaction between display viewed, language context manipulation, and trial order (b = 0.004, SE = 0.001, p MCMC = .0028). As can be seen in the lower panel of Figure 2, participants performing the task in their L2 showed evidence of cross-language competition only in the no-switch condition (left panel). However, there was a significant interaction between display viewed and trial order for participants in the no-switch condition (b = −0.002, SE = 0.001, p MCMC = .007). As seen in Figure 3, cross-language competition, as indexed by reduced target fixations in the presence of a cross-language competitor, grew as the experiment progressed in the no-switch condition (left panel). In contrast, no effects were found for the switch condition (Figure 3, right panel).
Between 600 ms and 1000 ms, there was a significant interaction between display viewed (FC vs. NC display) and language context manipulation (no-switch vs. switch) (b = 0.074, SE = 0.032, p MCMC = .0232; only if entirely removing trial order from the model). As can be seen in Figure 2 above, both L1 and L2 participants showed evidence for cross-language competition, as indexed by reduced target fixation in the presence of a cross-language competitor, when tested in the no-switch condition (left panel; main effect of display viewed, b = –0.059, SE = 0.021, p MCMC = .0042) but not in the switch condition (right panel).
Fixation proportions to cross-language competitor pictures as a function of the word heard (target vs. control)
We next examined whether cross-language competition interacted with the language context manipulation (no-switch vs. switch; baseline: no-switch), the task language relative to participants’ language background (L1 vs. L2; baseline: L1), and trial order (continuous). Here, cross-language competition was measured as the comparison between fixations to the cross-language competitor pictures when the target vs. control words were heard (i.e. word heard; baseline: control word).
Between 200 ms and 600 ms, we found a trend towards an interaction between word heard (target vs. control word) and language context (no-switch vs. switch) (b = 0.141, SE = 0.087, p MCMC = .1016). As seen in Figure 2, there was a modest cross-language competition effect in the no-switch condition that reversed in the switch condition. When L2 participants were analyzed alone, we found a significant interaction between word heard and language context, after removing trial order from the model (b = –0.103, SE = 0.038, p MCMC = .0058; the effect was in the same direction when keeping trial order in the model). As seen in Figure 2, L2 participants looked at cross-language competitor pictures more when hearing the target than control word in the no-switch condition (left panel; main effect of word heard; b = 0.060, SE = 0.027, p MCMC = .0272), whereas a trend towards the opposite effect was found in the switch condition (right panel; trend towards main effect of word heard; b = –0.045, SE = 0.025, p MCMC = .0738). When participants performed the task in L1, however, a trend towards a word heard by language context interaction was also found (b = 0.164, SE = 0.089, p MCMC = .0668), but there were no reliable effects if examining each language context condition individually.
Between 600 ms and 1000 ms, we found a significant interaction between word heard (target vs. control word) and language background (L1 vs. L2; baseline: L1) (b = 0.041, SE = 0.017, p MCMC = .0148) when trial order was removed from the interaction. As can be seen in Figure 2, when examining the two groups separately, there was only a main effect of word heard for participants performing the task in L2 (b = 0.045, SE = 0.014, p MCMC = .0018), which was only reliable in the no-switch condition (b = 0.048, SE = 0.014, p MCMC = .0002).
Correct response latency (i.e. mouse click on target picture) in cross-language lexical competition trials as a function of the word heard (target vs. control)
Finally, we examined whether cross-language competition interacted with language context (no-switch vs. switch; baseline: no-switch), the task language (L1 vs. L2; baseline: L1), and trial order (continuous). Here, cross-language competition was measured as the comparison between response latencies to the cross-language competitor displays when the target vs. control words were heard (i.e. word heard; baseline: control word). There was an interaction between word heard and language context (b = –128.33, SE = 45.82, p MCMC = .0044) if trial order was removed from the model. Examining L1 and L2 participants separately, we found a similar interaction for L1 participants (b = –128.55, SE = 27.95, p MCMC = .0001), but we only found a main effect of word heard for L2 participants (b = 185.10, SE = 42.99, p MCMC = .0001). As seen in the upper panel of Figure 4, all participants (with the exception of L1 participants tested in the switch condition) took longer to respond when they heard a target word that phonologically overlapped with a cross-language competitor in the display than a control word without a functional competitor in the display.
Summary of cross-language competition effects
Taken together, these results suggest that when participants performed the visual world task in a language that was different from the one they had just used for production (the switch condition), they globally suppressed the now irrelevant non-target language, thus reducing the degree of cross-language competition experienced. This appeared to be the case whether participants switched from L1 to L2 or vice versa.
In fact, L1 participants only showed evidence for cross-language competition when they were tested in the no-switch condition (in terms of both reduced target fixations to the FC vs. NC displays, as seen in Figure 2, and longer response click latencies to the FC displays when hearing the target vs. control words, see upper panel of Figure 4). Perhaps unsurprisingly, L2 participants generally experienced more cross-language competition (in terms of reduced target fixations to the FC vs. NC displays, greater cross-language competitor fixation when hearing the target vs. control words, and longer response click latencies to the FC displays when hearing the target vs. control words) and showed evidence for incomplete suppression of cross-language competition in the switch condition. Indeed, despite the absence of cross-language competition effects in terms of fixation patterns, they showed some cross-language competition in terms of their latency of response (upper panel of Figure 4).
Within-language lexical competition
Fixation proportions to target pictures as a function of the display viewed (competitor vs. control)
We computed the average proportion of fixations to each picture as the spoken word unfolded for the within-language lexical competitor (EC) and control (NC) displays (see Figure 5 for the proportion of fixations to the target and within-language competitor in the conditions compared in subsequent analyses). We adopted the same approach as that used for cross-language competitor trials and computed the average proportion of fixations to each picture individually for each participant and for each trial for time bins extending from 200 ms to 600 ms, and 600 ms to 1000 ms.
We first examined whether within-language competition interacted with the language context manipulation (no-switch vs. switch; baseline: no-switch), the task language relative to participants’ language background (L1 vs. L2; baseline: L1), and trial order (continuous). Here, within-language competition was measured as the difference between fixations to the target picture when the within-language competitor vs. control displays were viewed (i.e. display viewed; baseline: control display).
Between 200 ms and 600 ms, we found a trend for a main effect of display viewed (EC vs. NC display; b = –0.150, SE = 0.088, p MCMC = .0870), and for a three-way interaction between language context (no-switch vs. switch), task language (L1 vs. L2), and trial order (continuous; b = –0.002, SE = 0.001, p MCMC = .0788). To further understand these effects, we examined separately participants performing the task in the L1 and L2. In both groups, we found only trends for main effects of display viewed (L1: b = –0.146, SE = 0.090, p MCMC = .1146; L2: b = –0.151, SE = 0.092, p MCMC = .1046). As seen in Figure 5, all participants showed evidence for within-language competition, as indexed by reduced target fixations when the competitor display was viewed. Although L2 participants showed a larger and later effect, there was no interaction between display viewed and task language.
Between 600 ms and 1000 ms, we again found a main effect of display viewed, though only if trial order was removed from the interaction (b = –0.073, SE = 0.024, p MCMC = .0032). As seen in Figure 5, within-language competition (indexed by reduced target fixations in the presence of a within-language competitor) was experienced by all participants, although it appears larger when people did the task in their L2 (lower panel) vs. their L1 (upper panel).
Fixation proportions to within-language competitor pictures as a function of the word heard (target vs. control)
We examined whether within-language competition interacted with language context (no-switch vs. switch; baseline: no-switch), the task language (L1 vs. L2; baseline: L1), and trial order (continuous). Here, within-language competition was measured as the comparison between fixations to the within-language competitor pictures when the target vs. control words were heard (i.e. word heard; baseline: control word).
Between 200 ms and 600 ms, a significant four-way interaction between word heard (target vs. control word), language context manipulation (no-switch vs. switch), task language (L1 vs. L2), and trial order (continuous) was found (b = 0.027, SE = 0.001, p MCMC = .0420). To further understand this interaction, we examined separately participants performing the task in L1 and L2. When participants performed the task in their L1, we found a significant interaction between word heard, language context manipulation, and trial order (b = –0.002, SE = 0.001, p MCMC = .0288). After further splitting this group into the no-switch and switch conditions separately, we found a main effect of word heard in the no-switch group if removing trial order from the interaction (b = 0.131, SE = 0.028, p MCMC = .0001), but an interaction between word heard and trial order for participants tested under the switch condition (b = –0.001, SE = 0.001, p MCMC = .0001). As seen in Figure 6, participants performing the task in their L1 showed decreasing within-language competition (indexed by reduced competitor fixation) in the switch condition as the experiment progressed (right panel).
When participants performed the task in their L2, we only found a main effect of word heard (b = 0.175, SE = 0.069, p MCMC = .0114) despite the appearance of greater within-language competition for participants in the switch condition compared to those in the no-switch condition visible in Figure 5. This suggests that native French bilinguals experienced within-language competition regardless of the prior language context or stage of the experiment.
Between 600 ms and 1000 ms, the four-way interaction between word heard (target vs. control word), language context manipulation (no-switch vs. switch), task language (L1 vs. L2), and trial order (continuous) that was significant between 200 ms and 600 ms was only a trend (b = 0.001, SE = 0.001, p MCMC = .0774; if removing trial order from the interaction, a significant main effect of word heard was found: b = 0.041, SE = 0.015, p MCMC = .0092). When we examined participants who performed the task in their L1 and L2 separately, we only found main effects of word heard (L1: b = 0.042, SE = 0.014, p MCMC = .0036; L2: b = 0.059, SE = 0.017, p MCMC = .0001; removing trial order from the interaction). These results suggest that L1 and L2 groups experienced similar within-language competition regardless of the language context, despite the suggestion in Figure 5 that within-language competition was greater between 600 ms and 1000 ms for both L1 and L2 participants in the switch condition.
Correct response latency (i.e. mouse click on target picture) in within-language lexical competition trials as a function of the word heard (target vs. control)
We examined whether within-language competition interacted with the language context manipulation (no-switch vs. switch; baseline: no-switch), the task language relative to participants’ language background (L1 vs. L2; baseline: L1), and trial order (continuous). Here, within-language competition was measured as the comparison between response latencies to the within-language competitor displays when the target vs. control words were heard (i.e. word heard; baseline: control word). We found a trend for a four-way interaction involving word heard (target vs. control word), language context manipulation (no-switch vs. switch), task language (L1 vs. L2), and trial order (continuous; b = –3.142, SE = 1.840, p MCMC = .0852). As shown in the lower panel of Figure 4, all participants took somewhat longer to respond when they heard the target than the control words. Figure 7 further shows that participants performing the task in L2 and tested under the switch condition experienced greater within-language competition, especially at the beginning of the experiment.
To further understand this interaction, we then examined participants performing the task in L1 and L2 separately. When doing so, however, we only found main effects of word heard, and only if trial order was removed from the models (participants performing task in L1: b = 83.69, SE = 24.73, p MCMC = .0004; participants performing task in L2: b = 134.71, SE = 53.91, p MCMC = .0114).
Summary of within-language competition effects
In contrast with cross-language competition, which (although subtle) was significantly reduced in the switch condition compared to the no-switch condition, within-language competition was more robust and appeared across language groups and language context conditions. Interestingly, both language groups showed greater within-language competition early in the course of the experiment than later in its course. This, however, was apparent at different time points during word recognition. Native-English participants who switched from their L2 (French) into their L1 (English) showed greater within-language competition earlier in the time course of word recognition than those switching from their L1 into their L2. Specifically, native-English participants showed this in terms of the proportion of fixations to the within-language competitor pictures immediately after hearing the target vs. control words (i.e. 200 ms to 600 ms post word onset; Figure 6). In contrast, native-French participants switching from L1 (French) into their L2 (English) experienced greater within-language competition later in the time course of word recognition, in terms of their latency selecting the correct picture (Figure 7).
Taken together, these results suggest that switching languages was associated with a reduced ability to resolve within-language competition, irrespective of whether the switch was in the L2-to-L1 direction or vice versa. However, the within-language competition effect was greatest at the beginning of the experiment and gradually diminished over its course for both native-English and native-French participants.
Discussion
We investigated whether and how speaking in one language (English or French) modulates within- and cross-language lexical competition when listening to English in a subsequent spoken word processing task. We expected that the impact of language switch would be most apparent in terms of cross-language competition during the spoken word recognition task, but that it may also be associated with greater within-language competition due to the cost incurred by adjusting to the target language (English) after having engaged in the non-target language (French).
As reviewed previously, past research suggests that using a language prior to a task performed in a different (vs. the same) language can have one of two effects on the cross-language competition experienced during that task. One possibility, suggested by the language mode view (Grosjean, Reference Grosjean and Nicol2001, Reference Grosjean2008), is that more cross-language competition should be observed in participants in the switch (French to English) group than in those in the no-switch (English to English) group. Indeed, participants in the switch group should begin the English task with higher levels of activation of French words, and thus face greater cross-language competition, until activation levels are adjusted from a relatively monolingual French to a relatively monolingual English mode. Another possibility, suggested by the language switching literature (De Groot & Christoffels, Reference De Groot and Christoffels2006; Green, Reference Green1998; Koch et al., Reference Koch, Gade, Schuch and Philipp2010), is that experiencing a clear switch between two monolingual contexts should lead to an active and global inhibition of the non-target language (French), leading participants in the switch group to experience less cross-language competition than those in the no-switch group. This, however, might be associated with a processing cost to the task at hand (English visual world task).
We believe that the overall pattern of results was most consistent with the language switching view, where a proactive inhibitory control mechanism is presumed to play a key role. Specifically, bilinguals in the no-switch condition showed more evidence of cross-language competition (as indexed by decreased fixations to the target picture in the presence of a cross-language competitor, and greater fixations to the competitor picture when hearing the target word than when hearing the control word), consistent with our prior findings using the same task and stimuli (Mercier et al., Reference Mercier, Pivneva and Titone2014). However, bilinguals in the switch condition showed less evidence of cross-language competition, despite having just produced monologues and dialogues in French. Indeed, L1 participants showed no evidence for cross-language competition in the switch condition, and L2 participants only showed it in terms of response latency selecting the target picture.
These results cannot be due to differences in L2 proficiency between the groups, as proficiency was statistically controlled. Instead, they suggest that bilinguals who experienced a clear switch between speaking exclusively in one language (French) and listening to words in the other language (English) used this explicit shift to actively and globally suppress the non-target language (French). In contrast, bilinguals who did not switch from French to English, and thus did not experience a clear cue to suppress the non-target language, appeared to have been caught off-guard during the subsequent task, and thus experienced some degree of cross-language competition. Given previous findings (e.g. Blumenfeld & Marian, Reference Blumenfeld and Marian2011; Mercier et al., Reference Mercier, Pivneva and Titone2014), it is possible that they might have relied on local, reactive inhibition to cope with transiently activated competition pertaining to both the target and non-target languages.
Interestingly, the global inhibition of the non-target language seemed to come at a cost with respect to managing within-language competition. Specifically, bilinguals who switched from their L2 to their L1 experienced greater within-language competition early within a trial after hearing spoken words (i.e. the effect was in terms of fixations to competitor pictures) than those who did not switch, whereas bilinguals who switched from their L1 to their L2 faced greater within-language competition later within a trial when they had to click on the correct picture (i.e. the effect was in terms of latency clicking target pictures).
We first discuss the results pertaining to cross-language competition in more detail. When evidence for cross-language competition was found, for bilinguals tested in the no-switch condition, it was significant but modest in absolute size, consistent with a growing number of visual world studies of bilingual spoken language processing (e.g. Blumenfeld & Marian, Reference Blumenfeld and Marian2007; Canseco-Gonzalez et al., Reference Canseco-Gonzalez, Brehm, Brick, Brown-Schmidt, Fischer and Wagner2010; Mercier et al., Reference Mercier, Pivneva and Titone2014). As briefly reviewed in the introduction, the magnitude of cross-language competition observed using the visual world method has been shown to be quite small since it was first documented by Spivey and Marian (Reference Spivey and Marian1999), and also has been shown to vary with several factors (Blumenfeld & Marian, Reference Blumenfeld and Marian2007; Canseco-Gonzalez et al., Reference Canseco-Gonzalez, Brehm, Brick, Brown-Schmidt, Fischer and Wagner2010; Mercier et al., Reference Mercier, Pivneva and Titone2014). It is possible that bilingual studies using the visual world task have shown more limited cross-language activation because they track the “activation” of a target word relative to cross-language competitors without relying on the use of words that share representations across languages (e.g. interlingual homophones, cognates). Indeed, such words may inadvertently induce a bilingual language mode of processing and, thus, more balanced activation levels in bilinguals’ two languages, and comparatively greater activation of the non-target language (Wu & Thierry, published online November 1, 2010).
The most notable aspect of our results is the absence of cross-language competition for participants in the switch condition. Indeed, with the exception of L2 participants, who generally experienced more cross-language competition and showed incomplete suppression of cross-language competition in terms of response latency, we found no evidence for cross-language competition in participants who had previously switched from a French task to an English task. This can be interpreted as supporting the notion that when a clear switch in language context signals that a particular language is no longer relevant, an entire language system can be inhibited, particularly if it is likely to interfere with the target language (De Groot & Christoffels, Reference De Groot and Christoffels2006). This is consistent with an accumulating body of results showing that competitors from L1, or from a more dominant language, are inhibited when the L2 or less dominant language is spoken (Abutalebi et al., Reference Abutalebi, Annoni, Seghier, Zimine, Lee-Jahnke, Lazeyras, Cappa and Khateb2008; Gerfen et al., 2015; Guo et al., Reference Guo, Liu, Misra and Kroll2011; Levy et al., Reference Levy, McVeigh, Marful and Anderson2007; Linck et al., Reference Linck, Kroll and Sunderman2009; Misra et al., Reference Misra, Guo, Bobb and Kroll2012).
Our findings extend these results by showing that highly proficient bilinguals have the capacity to globally suppress a non-target language, whether it is the L1 or L2, even in a task that does not require speech production (but instead follows a language production task). Importantly, since Meuter and Allport's (Reference Meuter and Allport1999) seminal study showing that bilinguals suffer from an asymmetric switch cost (greater difficulty switching from L2 to L1 than the reverse), the reliability of this asymmetry has been challenged (Costa & Santesteban, Reference Costa and Santesteban2004; Costa et al., Reference Costa, Santesteban and Ivanova2006; Gollan & Ferreira, Reference Gollan and Ferreira2007). It appears that bilinguals, if proficient enough and regardless of the age at which they acquired both languages, can experience switching costs of similar magnitude in both directions (Costa & Santesteban, Reference Costa and Santesteban2004; Costa et al., Reference Costa, Santesteban and Ivanova2006). Taken together, then, our results suggest that in proficient bilinguals, global inhibition can be recruited to control the activation of cross-language competitors regardless of whether the task is performed in the L1 or L2, or the dominant or non-dominant language.
Our findings are thus consistent with the view that two types of inhibitory control can be recruited during bilingual language processing – local vs. global – where “local” control is involved when more restricted lexical representations need to be acted on, while “global” control is called upon when complete language systems are activated or inhibited (De Groot & Christoffels, Reference De Groot and Christoffels2006). They are also consistent with Green's inhibitory control model (Abutalebi & Green, Reference Abutalebi and Green2007; Green, Reference Green1998) and a more recent, domain-general cognitive control view contrasting reactive and proactive cognitive control (Braver, Reference Braver2012). Specifically, according to Braver (Reference Braver2012), a proactive mechanism can be recruited to optimally tune attention, perception, and action systems when upcoming events associated with an elevated cognitive demand can be anticipated, and reactive control recruited to provide late corrections based on the bottom-up signal.
Our results somewhat conflict with those of Elston-Güttler et al. (Reference Elston-Güttler, Gunter and Kotz2005), who used a relatively similar “language context” manipulation, in that they exposed bilinguals to a language for a block of time before having them perform the task of interest. Specifically, they had German–English bilinguals exposed to the target (English) or non-target (German) language via a 20-minute film prior to a reading task in which interlingual words were found. Their results suggested that prior exposure to a film narrated in German boosted cross-language activation during the first half of the following English reading task, whereas prior exposure to a film narrated in English eliminated evidence for cross-language activation. These results were taken as evidence that bilinguals could adapt to a target language by reducing cross-language competition when given enough time in a monolingual context through a gradual adaptation of the activation levels of each language. These results contrast with ours in that Elston-Güttler et al. (Reference Elston-Güttler, Gunter and Kotz2005) found that a prior language switch (from German to English) led to greater cross-language competition in a subsequent English task, whereas we found that a prior language switch (from French to English) led to reduced cross-language competition in a subsequent English task.
However, several differences between their study and ours may explain the discrepancy in results. First, Elston-Güttler et al.'s (Reference Elston-Güttler, Gunter and Kotz2005) participants were overall less proficient in the target language (English), and lived in an environment that less commonly involves switches between language contexts. Thus, they may have had less practice with such switches (see Green, published online May 19, 2011, on the importance of considering the community context in which bilinguals typically use their two languages). Second, their experiment examined visual language processing, while ours involved spoken language processing; thus, the degree and kind of adaptation available to bilinguals to cope with cross-language interference may vary depending on the modality. Third, and perhaps most importantly, the methods used to manipulate the language context prior to the key experiment involved different modalities and degrees of difficulty. While Elston-Güttler et al.'s (Reference Elston-Güttler, Gunter and Kotz2005) manipulation involved passive film viewing, ours involved active engagement in a spontaneous language production task. The requirement to produce a monologue and dialogue may have been a stronger inducement of a “monolingual” mode, and prompted participants to adopt a more active, global language inhibition strategy as a result. As reviewed before, global inhibition may be especially important in language production to ensure that production is not interfered with by the non-target language (e.g. Hanulová et al., Reference Hanulová, Davidson and Indefrey2011; Runnqvist et al., Reference Runnqvist, Strijkers, Sadat and Costa2011). Related to this is the fact that most evidence for global language inhibition comes from language switching studies relying on spoken production (Gerfen et al., 2015; Guo et al., Reference Guo, Liu, Misra and Kroll2011; Linck et al., Reference Linck, Kroll and Sunderman2009; Misra et al., Reference Misra, Guo, Bobb and Kroll2012).
Our results also seem at odds with those of Canseco-Gonzalez et al. (Reference Canseco-Gonzalez, Brehm, Brick, Brown-Schmidt, Fischer and Wagner2010), who found that bilinguals led to adopt a bilingual mode of processing prior to engaging in a visual world task showed increased cross-language competition during that task. This discrepancy in findings, however, may have arisen because their participants spoke both languages before they engaged in the visual world task. This situation likely created a context where both languages needed to remain active. As a consequence, this may have biased participants against the adoption of a proactive global language inhibition strategy during the visual world task, even if one of the languages was no longer needed, because the change in language mode was not obvious. In addition, because participants named all pictures to be presented in the visual world task in both languages prior to its performance, the presence of cross-language competitors may have been more obvious, which may have also contributed to larger cross-language competition effects. Our results thus extend those of Canseco-Gonzalez et al. (Reference Canseco-Gonzalez, Brehm, Brick, Brown-Schmidt, Fischer and Wagner2010) by showing that when the language context prior to the experiment is designed to induce a monolingual mode of processing, and perhaps especially if it involves spoken production (see difference from Elston-Güttler et al., Reference Elston-Güttler, Gunter and Kotz2005 above), bilinguals can proactively suppress the non-target language system to reduce cross-language interference.
We next discuss the results pertaining to within-language competition. Consistent with several studies of L1 spoken language processing (e.g. Allopenna et al., Reference Allopenna, Magnuson and Tanenhaus1998; Blumenfeld & Marian, Reference Blumenfeld and Marian2011; Dahan & Gaskell, Reference Dahan and Gaskell2007; Yee & Sedivy, Reference Yee and Sedivy2006) and L2 spoken language processing (e.g. Canseco-Gonzalez et al., Reference Canseco-Gonzalez, Brehm, Brick, Brown-Schmidt, Fischer and Wagner2010; Marian & Spivey, Reference Marian and Spivey2003a, Reference Marian and Spiveyb; Mercier et al., Reference Mercier, Pivneva and Titone2014), we found within-language lexical competition in all participants, regardless of whether the task language (English) was their L1 or L2, or whether the language they spoke prior to the task was the target or non-target language. However, we found differences in the size of within-language competition over the course of the experiment, which suggested that global language inhibition was associated with a processing cost in the target language.
Specifically, we found greater within-language competition at the beginning of the experiment relative to the end of the experiment in bilinguals who had switched language prior to the spoken word recognition task, albeit at different points in the time course of lexical access depending on whether the task was performed in L1 or L2. In bilinguals who had switched from speaking their L2 to listening to words in their L1, more within-language competition was observed early in the time course of lexical access (i.e. in terms of fixations to competitor pictures 200 ms to 600 ms following the onset of spoken words), while for bilinguals who had switched from speaking their L1 to listening to words in their L2, within-language competition was observed at its end point (i.e. in terms of latency clicking on target pictures). These results suggest that proficient bilinguals can globally suppress a non-target language, whether it is the L1 or L2, but that doing so diverts cognitive resources away from the task at hand so that a certain cost to the target language is incurred.
The notion of a finite amount of cognitive resources that need to be allocated wisely to ensure adequate performance is consistent with ideas advanced by De Groot and Christoffels (Reference De Groot and Christoffels2006), and generally with those of Braver (Reference Braver2012). Indeed, De Groot and Christoffels (Reference De Groot and Christoffels2006) suggest that language control mechanisms require that sufficient resources are available to act effectively and lead to optimal performance. Similarly, Braver (Reference Braver2012) proposes that even if both reactive and proactive control can be engaged in parallel, there is likely a heavier reliance on one mechanism at any point in time. Furthermore, he proposes that proactive control requires more resources and is thus only recruited if the tradeoff is favorable. This may be the case when a language is spoken as opposed to more passively received. Our results, however, suggest that a spoken task, when it precedes a more passive reception task, may induce proactive global inhibition strategies. This strategy, while reducing cross-language interference, may mean that more limited cognitive resources are available to cope with within-language competition.
This discussion suggests that bilinguals, through their experience of coping with cross-language interference, may learn to recruit flexible proportions of both reactive and proactive inhibitory control mechanisms based on the situation. This hypothesis is consistent with a growing body of work showing that bilinguals, as a group, outperform monolinguals on tasks requiring domain-general cognitive control (e.g. Bialystok et al., Reference Bialystok, Craik, Green and Gollan2009; Bialystok, Craik & Luk, Reference Bialystok, Craik and Luk2008; Bialystok, Craik & Ryan, Reference Bialystok, Craik and Ryan2006; Costa, Hernández & Sebastián-Gallés, Reference Costa, Hernández and Sebastián-Gallés2008; for a review, see Bialystok, Reference Bialystok2009). While the routine need to suppress cross-language competitors has been proposed to contribute to the development of inhibitory control advantages in bilinguals relative to monolinguals (Green, Reference Green1998; Kroll, Reference Kroll2008), the notion of a more general cognitive control advantage is increasingly favored over that of a more specific inhibitory control advantage (e.g. Bialystok et al., Reference Bialystok, Craik and Luk2012, for a view extending beyond inhibitory control; Costa, Hernández, Costa-Faidella & Sebastián-Gallés, Reference Costa, Hernández, Costa-Faidella and Sebastián-Gallés2009; Hilchey & Klein, Reference Hilchey and Klein2011; Mishra, Hilchey, Singh & Klein, Reference Mishra, Hilchey, Singh and Klein2012; but see Paap & Greenberg, Reference Paap and Greenberg2013). For instance, Costa et al. (Reference Costa, Hernández, Costa-Faidella and Sebastián-Gallés2009) proposed that the cognitive benefits bilinguals gain from their experience are due to a more fine-tuned conflict-monitoring system that allows them to better allocate cognitive resources.
In this work, we operationally defined within- and cross-language competition in three distinct ways, some of which departed from the approach typically adopted in the literature to date (e.g. Marian & Spivey, Reference Marian and Spivey2003a; Spivey & Marian, Reference Spivey and Marian1999). That is, instead of directly comparing the fixations to competitor pictures to those made to control pictures within any given trial, we chose to examine the effect of lexical competition on fixations to target and competitor pictures across trials as a function of the presence or absence of the competitor based on the display presented or word heard.
We adopted this approach for two main reasons. First, we wished to obtain greater sensitivity in the detection of lexical competition, especially cross-language competition, which is known to be more elusive (e.g. Blumenfeld & Marian, Reference Blumenfeld and Marian2007; Canseco-Gonzalez et al., Reference Canseco-Gonzalez, Brehm, Brick, Brown-Schmidt, Fischer and Wagner2010; Ju & Luce, Reference Ju and Luce2004; Marian & Spivey, Reference Marian and Spivey2003a; Mercier et al., Reference Mercier, Pivneva and Titone2014). In the visual world task, people can easily process pictures that are repeated over the experiment parafoveally, and thus without the need for a direct fixation. Indeed, lower levels of candidate word activation can result in presaccadic shifts in covert attention (e.g. Henderson, Reference Henderson1992; Henderson, Pollatsek & Rayner, Reference Henderson, Pollatsek and Rayner1989), but no actual saccadic motor plans leading to fixations (Altmann, Reference Altmann2011). Thus, to the extent that parafoveal processing takes place (i.e., processing of visual information extending from the fovea, the central 2 degrees of the visual field, out to about 5 degrees of visual angle from fixation), competitor pictures may be identified by peripheral vision with limited direct saccades launched towards the competitors, resulting in an effect observable in terms of target fixations but not competitor fixations. While parafoveal processing likely takes place for both within- and cross-language competition, for within-language competitors, factors such as a higher resting level of activation of words from the language being used may contribute to more direct and frequent fixations to within-language competitors. In that context, measuring fixations to the target picture in the presence vs. absence of a cross-language competitor in the display allows the measurement of the subtler, though significant, encroachment of the cross-language competitor on the processing of target words.
Second, the approach of comparing fixations across instead of within trials has the added benefit of circumventing some of the problems associated with the direct comparison of fixations to competitors with fixations to control pictures. Indeed, because an eye fixation can only be recorded at one location at a time, observations collected within a given trial are not independent (Barr, Reference Barr2008). Consequently, a greater proportion of fixations to one picture is automatically associated with a reduction in the proportion of fixations to other pictures.
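The logic of the across-trial comparison, and of the within-trial dependency it avoids, can be illustrated with a minimal sketch. The data, field names, and the `competition_effect` function below are hypothetical and invented purely for illustration; they are not the analysis pipeline or the statistical models used in this study.

```python
from statistics import mean

# Hypothetical per-trial fixation proportions within one analysis window.
# In competitor-absent trials, the "competitor" slot is assumed to hold an
# unrelated control picture. All numbers are made up for illustration.
trials = [
    {"competitor_present": True,  "target": 0.50, "competitor": 0.20, "other": 0.30},
    {"competitor_present": True,  "target": 0.54, "competitor": 0.18, "other": 0.28},
    {"competitor_present": False, "target": 0.70, "competitor": 0.08, "other": 0.22},
    {"competitor_present": False, "target": 0.66, "competitor": 0.10, "other": 0.24},
]


def mean_target_fixation(data, competitor_present):
    """Mean proportion of target fixations over trials of one display type."""
    values = [t["target"] for t in data
              if t["competitor_present"] == competitor_present]
    return mean(values)


def competition_effect(data):
    """Across-trial index: drop in target fixations when a competitor is shown."""
    return mean_target_fixation(data, False) - mean_target_fixation(data, True)


def within_trial_total(trial):
    """Proportions within a trial sum to 1, so they cannot vary independently."""
    return trial["target"] + trial["competitor"] + trial["other"]


effect = competition_effect(trials)
print(f"Competition effect on target fixations: {effect:.2f}")
```

In this toy example, the competition index compares target fixations between separate sets of trials, whereas any comparison of competitor and control fixations within a single trial is constrained by the fact that the proportions in that trial must sum to one.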
While the results of this study improve our understanding of the impact that language context can have on the recruitment of a global inhibitory control strategy, there are several potential limitations that would be important to address in future work. One limitation is the use of a between-participants design. Although few significant differences were found between participants, the impact of a switch was always tested in one direction, from French to English, the target language of the visual world task. English was not the most proficient language of all participants, with the result that, for native French participants tested in the switch condition, the operative manipulation was confounded with task difficulty. The visual world task, when performed under the switch condition, was probably of uneven difficulty for native French bilinguals for an additional reason, which is that most of them experienced more than one switch prior to the visual world task, thus taxing their cognitive resources to a greater degree. Specifically, when they performed the monologue and dialogue in French, they probably needed to reactivate French following an attempt to globally suppress it when coming into our lab, an English environment. The language context manipulation, then, might have required an additional global inhibition effort on their part. Native French bilinguals in our study, however, may have also had more experience than native English bilinguals of switching between relatively monolingual language contexts and using a global language inhibition strategy, because most of them were McGill students and thus routinely needed to adopt an English monolingual mode. Future research ought to test the effects of such a language switch in both directions (L1 to L2, L2 to L1) in all participants to better control for such possibilities.
In conclusion, we examined the impact that a prior language manipulation can have on the extent of cross-language competition bilinguals experience when they then switch into a situation requiring another language, and found evidence that highly proficient bilinguals can globally suppress a non-target language, whether it is the L1 or L2, although doing so requires cognitive resources that may be diverted from other task demands. The influence of language context, and language mode, is increasingly recognized as a potential mitigating factor with respect to cross-language activation during bilingual language processing (Wu & Thierry, published online November 1, 2010), and more research is warranted to examine the interaction between switches in language mode and inhibitory control strategies (i.e. local vs. global or, similarly, reactive vs. proactive). Future work may also examine the impact of individual differences in the attitude towards, and use of, language switching in participants’ daily lives (Green, published online May 19, 2011), and also of both internal variables (e.g. general inhibitory control, reactive vs. proactive inhibition abilities, personal preference towards a certain strategy, proficiency) and external variables (processing load, fatigue, noise) on bilingual language processing (Braver, Reference Braver2012; De Groot & Christoffels, Reference De Groot and Christoffels2006).
Supplementary material
To view supplementary material for this article, please visit http://dx.doi.org/10.1017/S1366728914000340