Introduction
There is ample evidence that speaking two languages has collateral disadvantages in language processing. Compared to monolinguals, bilinguals appear to have slower and less reliable lexical access, as revealed by longer naming latencies, poorer verbal fluency performance and more frequent tip-of-the-tongue states, even in their dominant language (e.g., Bialystok, Craik, & Luk, 2008; Gollan, Fennema-Notestine, Montoya & Jernigan, 2007; Gollan, Montoya, Fennema-Notestine & Morris, 2005; Gollan, Montoya & Werner, 2002; Ivanova & Costa, 2008; Luo, Luk & Bialystok, 2010; Sandoval, Gollan, Ferreira & Salmon, 2010). These difficulties are more pronounced when speakers experience L1 attrition, the gradual deterioration of the mother tongue (e.g., Köpke & Schmid, 2004; Schmid, 2002; Seliger & Vago, 1991). Despite the various studies that have addressed both the origin of the bilingual disadvantage in lexical access and the origin of L1 attrition, the causes of these phenomena are still open to debate (e.g., Köpke, 2007; Sandoval et al., 2010; see also the special issue of Bilingualism: Language and Cognition on L1 attrition edited by Monika Schmid (Schmid, 2010)).
A study by Levy, McVeigh, Marful and Anderson (2007) provided evidence for a domain-general memory mechanism that could underlie both the bilingual disadvantage in lexical access and L1 attrition. This mechanism is called RIF (retrieval-induced forgetting), and it is supposed to reveal the effects of inhibitory processes that suppress interfering memory traces during memory recall. The authors argued that retrieving words from an L2 involves the activation of the corresponding memory traces of L1 translations, causing a strong interference effect. Given this interference, the selection of the desired word in L2 may entail the inhibition of the translation word in L1. As a consequence, and on the additional assumption that L1 suppression accumulates with the number of times a given word is retrieved in L2, the more a bilingual speaker uses the L2, the more suppression is inflicted on the memory traces of the L1 representations. Levy et al. (2007) argued that this decrease in the availability of L1 representations as a consequence of L2 use lies at the basis of L1 attrition. As they put it, “[n]ative-language words for ideas used most often in the foreign language are most vulnerable to forgetting” (Levy et al., 2007, p. 33). The goal of the present study was to assess: (i) the theoretical implications of the mechanism proposed by Levy et al. (2007), and (ii) the reliability of the observations themselves.
The mechanism proposed by Levy et al. (2007) to account for L1 attrition can also be used to explain the less severe bilingual disadvantages in speech production. For example, the finding that bilinguals are slower than monolinguals in lexical access could be explained as a reflection of the bilinguals’ need to resolve competition between the target word and its translation. In fact, several proposals along these lines have been put forward before (e.g., Bialystok et al., 2008). Furthermore, this explanation of L1 attrition is based on the same mechanism that has been adopted by various models of bilingual lexical access to explain how bilinguals avoid lexical intrusions from the non-target language during speech production (e.g., Green, 1986, 1998). In sum, the mechanism proposed by Levy et al. (2007) provides a unified account of L1 attrition, the bilingual disadvantage in lexical access and the dynamics of bilingual lexical selection, in which inhibition of competing representations would be at the core of all three phenomena.
The basic mechanism of retrieval-induced forgetting and its extension to L1 attrition
Retrieval-induced forgetting is an account of forgetting based on the observation that the recall of a given piece of information can impair subsequent retrieval of related knowledge (e.g., Anderson, Bjork & Bjork, 2000). In a typical RIF experiment, subjects study a series of category–exemplar pairs (FRUIT–ORANGE, FRUIT–BANANA; DRINK–BOURBON). Later on, half of the pairs of half of the categories are practiced in a retrieval phase where subjects have to recall the exemplar names (FRUIT–O____). In a final test, subjects’ memory of all exemplar items is tested (FRUIT–O____, FRUIT–B___; DRINK–B____). As expected, a robust effect of facilitation is observed for retrieval-practiced items in comparison to baseline items (items of unpracticed categories). More interestingly, subjects’ recall of unpracticed items of practiced categories is worse than their recall of baseline items. This is precisely what the RIF effect is, and it has been argued to arise in the following manner. When subjects retrieve items during the practice phase, other related exemplars also become activated, interfering with the target. This interference is resolved through the suppression of the competing items, which leads to impaired recall in the final test (e.g., Anderson, Bjork & Bjork, 2000; but see Camp, Pecher & Schmidt, 2007). RIF is found not only for members of taxonomic categories, but also in a variety of other situations such as memory of visuo-spatial objects and personality traits (e.g., Ciranni & Shimamura, 1999; MacLeod & Macrae, 2001; Macrae & MacLeod, 1999).
This fact has led some authors to propose that RIF is a domain-general mechanism that is operative whenever there is a need to resolve interference between competing memory traces (e.g., Levy & Anderson, 2002), a situation that bilinguals have to face whenever speaking in one of their languages.
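The scoring logic of the typical RIF paradigm described above can be sketched in a few lines. This is only an illustration: the function name `rif_scores`, the argument names and the recall outcomes are hypothetical and do not come from any of the cited studies.

```python
from statistics import mean


def rif_scores(practiced, unpracticed_same_category, baseline):
    """Return (facilitation, rif) from per-item recall outcomes (1 = recalled, 0 = not).

    facilitation: practiced items minus baseline items (expected to be positive).
    rif: baseline items minus unpracticed items of practiced categories
         (a positive value is the retrieval-induced forgetting effect).
    """
    practiced_rate = mean(practiced)
    unpracticed_rate = mean(unpracticed_same_category)
    baseline_rate = mean(baseline)
    return practiced_rate - baseline_rate, baseline_rate - unpracticed_rate


# Hypothetical outcomes for one participant (not data from any study).
facilitation, rif = rif_scores(
    practiced=[1, 1, 1, 0],                  # e.g., FRUIT-ORANGE after retrieval practice
    unpracticed_same_category=[0, 1, 0, 0],  # e.g., FRUIT-BANANA, never practiced
    baseline=[1, 0, 1, 0],                   # e.g., DRINK-BOURBON, unpracticed category
)
```

With these made-up outcomes both scores are positive, which is the pattern the inhibitory account predicts: practiced items are recalled better than baseline, and their unpracticed category mates are recalled worse.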
Levy et al. (2007) extended this view to the case of bilingualism, testing the hypothesis that the retrieval of L2 words will hinder the subsequent retrieval of the corresponding L1 translations. Three different phases were included in their experiment. First, subjects studied pictures along with their L2 labels. Second, they were asked to name the pictures ten, five, one or zero (baseline items) times in L1 or L2 depending on a color cue. Third, they were presented with prompt words and asked to retrieve the L1 labels that rhymed (Experiments 1 and 2a) or were semantically related (Experiment 2b) with the prompt word and that had been presented in the study phase.
The third phase, in which participants are asked to retrieve the words in L1, is the critical one. In particular, what is important is the extent to which the retrieval of L1 words is affected by the number of times the corresponding L2 translations have been produced. When the L1 words were elicited by semantically related word prompts, L2 repetition benefited the retrieval of the L1 words. That is, the more times a word was named in L2, the better the recall of the corresponding L1 translation. Thus, no RIF was reported in this task. However, the crucial result upon which Levy et al. (2007) based their conclusions comes from the rhyming task, where RIF across languages was present. In this task, the retrieval of L1 words was impaired as the number of L2 namings increased, but only for less fluent bilinguals (see general discussion for the effect of proficiency). This result was interpreted as revealing that the more times one uses an L2 word, the harder it becomes to retrieve its L1 translation.
However, a closer look at the experimental evidence reported by Levy et al. (2007) reveals that the RIF effect across languages is not very robust and is numerically rather small. First, against the authors’ arguments, the retrieval of words in L1 did not differ between baseline, one and five repetitions. If anything, naming in L2 once helped subsequent retrieval of L1 translations as compared to baseline. That is, contrary to the authors’ conclusions, retrieving L2 words once or five times does not hamper the subsequent retrieval of the corresponding L1 translations. The authors appear to neglect these two results and focus on just one observation, namely the significant difference in recall rates between those pictures that were named 10 times in L2 (34%) and baseline items (41%). Numerically this is a rather small effect. To appreciate this, it is worthwhile translating the percentages of recall into natural frequencies. The distribution of items in the different conditions was uneven. In the baseline set (pictures that were never named) there were ten words, while in the ten repetition set there were only five words. Hence, in natural frequencies, the 41% recall rate in the baseline condition corresponds to an average retrieval of two words, and the 34% retrieval in the ten repetition condition corresponds to 1.7 words. This seems like a very small effect to account for L1 attrition. Similar problems arise when assessing the robustness of the interference effect produced by L2 naming on L1 recall in the phonological test in Experiment 2a. In this experiment, there were no differences between naming the pictures in L2 once, five times, or not naming the pictures at all. However, again there was a small effect after naming the pictures in L2 ten times compared to not naming them at all: 66% of the words (3.3 words in natural frequencies) were recalled in the former condition and 72% (3.6 words) in the latter.
Nevertheless, this complex pattern of results might be used to argue that there is inhibition of the L1 translations during L2 production, but in doing so one should explain why neither one nor five repetitions are enough to reveal such an inhibitory component. In our view, the small magnitude of the inhibitory effect and its lack of systematic presence cast some doubt on the reliability and robustness of the phenomenon.
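The conversion from recall percentages to natural frequencies can be reproduced with a short calculation. The sketch below assumes that every percentage is scaled to a five-item set (the size of the ten repetition set), since that is the scaling that reproduces the figures quoted above; the constant `SET_SIZE` and the function name are illustrative only.

```python
SET_SIZE = 5  # assumption: all rates are scaled to a five-item set


def natural_frequency(recall_rate, n_items=SET_SIZE):
    """Convert a recall proportion into an expected number of recalled words."""
    return round(recall_rate * n_items, 2)


# Recall rates quoted in the text for Levy et al. (2007).
experiment_1 = {"baseline": 0.41, "ten_repetitions": 0.34}
experiment_2a = {"baseline": 0.72, "ten_repetitions": 0.66}

for label, rates in (("Experiment 1", experiment_1), ("Experiment 2a", experiment_2a)):
    base = natural_frequency(rates["baseline"])
    ten = natural_frequency(rates["ten_repetitions"])
    # The difference is a fraction of one word in both experiments.
    print(f"{label}: baseline {base} words, ten repetitions {ten} words, "
          f"difference {round(base - ten, 2)} words")
```

Under this scaling the reported RIF effect amounts to roughly a third of a word per participant in both experiments.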
Leaving aside the issues regarding the robustness and reliability of the study of Levy et al. (2007), their results are remarkable considering previous evidence from the RIF literature. RIF effects are not present across the board, but rather are eliminated or reversed under various conditions. For example, when subjects are encouraged to interrelate exemplars by focusing on the common properties of items, or when the practiced and unpracticed items belong to the same subcategory (e.g., “hoofed animals”), retrieval-induced forgetting can be reduced or eliminated (e.g., Anderson & McCulloch, 1999; Bauml & Hartinger, 2002). Similarly, it has been found that increasing the similarity between practiced and unpracticed items can lead to facilitated recall of the latter, and this facilitation can even generalize to related items that did not belong to the initially studied materials (i.e., retrieval-induced facilitation; e.g., Anderson, Green & McCulloch, 2000; Chan, 2009; Chan, McDermott & Roediger, 2006). That is, RIF does not seem to be found when there is a high level of semantic similarity between practiced and unpracticed items. Given that translation words are thought to have a very large semantic overlap (e.g., Zeelenberg & Pecher, 2003), one would expect to obtain retrieval-induced facilitation rather than interference. Hence, the boundary conditions for the presence of RIF are not obviously met in the context of bilingual language production.
Retrieval-induced forgetting as the basis of L1 attrition: Theoretical implications
The conclusions reached by Levy et al. (2007) also have important theoretical implications when one puts them in the context of the directionality of the speech production process. Recall that the RIF effect was present (if anything) when the task involved phonological cue retrieval, but not when it involved semantic cue retrieval. From this observation the authors conclude that the inhibition occurs at the level of phonological representations and not at the level of lexico-semantic representations. This is an interesting proposal, but in a way it compromises the interpretation of RIF as the origin of L1 attrition, since speech production is semantically and not phonologically driven. That is, we retrieve words because we want to convey a meaning, not because they rhyme. And it appears from Levy et al.'s results that when bilinguals have to retrieve words from the semantic system, the supposed inhibition suffered by the phonological representations in L1 is completely irrelevant for the outcome of the speech production system. Consequently, the effects on lexical search prompted by phonological cues may be irrelevant for understanding how lexical access from the semantic system works.
As seen above, the observations of Levy et al. (2007) might be relevant for constraining models not only of L1 attrition and the bilingual disadvantage in lexical access, but also of general bilingual language control. However, it is also apparent that the results are in part inconsistent with previous observations and theoretical proposals, and that the effect reported is small and not very robust. Thus, it is fundamental to assess the reliability and generalizability of these results. This was the goal of the present experiment, in which we closely followed the experimental paradigm used by Levy et al. (2007) and tested a total of 141 bilinguals. To preview the results, we failed to observe any RIF in L1 associated with naming in a weaker or non-dominant language. Instead, naming pictures in the non-dominant language facilitated the retrieval from memory of the corresponding translations in L1 as compared with no naming whatsoever.
Experiment: Is the retrieval-induced forgetting (RIF) effect across languages a reliable phenomenon?
In this experiment, we tested 141 participants belonging to three different groups: low-proficient L2 learners (group 1), medium-proficient L3 learners (group 2) and high-proficient bilinguals (group 3) (from now on, we will refer to L1 as the dominant language and L2/L3 as the non-dominant language). We decided to include these three different groups to cover a relatively wide range of bilinguals. This is important given the lack of information on the type of participants tested in the study of Levy et al. (2007). Participants were first shown drawings along with their non-dominant language labels. Afterwards, they had to name a set of these drawings in the dominant or the non-dominant language according to a color cue. While some pictures of the first phase were presented one, five or ten times, others were not named at all. The language in which each picture was named remained the same throughout this phase. Finally, subjects’ memory of the dominant language labels was tested through the presentation of a written rhyme-cue. The crucial conditions were (i) language of picture-naming (dominant and non-dominant), and (ii) number of repetitions (0, 1, 5, or 10). The dependent measure was the percentage of correctly recalled dominant language labels in the final test.
The presence of RIF would be indexed by a worse performance in retrieval of words in the dominant language associated with an increase of repetitions in the non-dominant language. According to Levy et al.'s results, we should expect to find RIF for the group that is low-proficient in the non-dominant language, and perhaps also for the group of medium-proficient speakers, but not for the group of high-proficient bilinguals. In other words, following the results of Levy et al., we expected an interaction between the RIF effect and non-dominant language proficiency.
Method
Participants
Three groups of participants were included in the experiment. Group 1 comprised 56 native speakers of Spanish with low proficiency in English from the universities of País Vasco and Murcia. The languages tested were Spanish (dominant) and English (non-dominant). Group 2 comprised 53 native Spanish speakers with high proficiency in Catalan and medium proficiency in English from two universities located in Barcelona (University of Barcelona and Pompeu Fabra University). The languages tested were Spanish (dominant) and English (non-dominant). Group 3 comprised 32 native Spanish speakers with high proficiency in Catalan, from the University of Barcelona. The languages tested were Spanish (dominant) and Catalan (non-dominant). Participants were asked to fill out a questionnaire regarding their language history and proficiency (see Appendix A for more details).
Material
Forty drawings with concrete and unambiguous names in all testing languages were selected (see Appendix B for a complete list of materials). All of them had non-cognate names in all testing languages and differed in their final syllable in Spanish (e.g., vela, espelma, candle “candle”; perro, gos, dog “dog”). Twelve filler drawings were included, all of which also had different final syllables in Spanish. For the final test, 40 Spanish nouns that rhymed with the 40 Spanish picture labels but not with those of the other testing language were selected (e.g., vela–tela, perro–cerro).
Procedure
There were three different phases in the experiment (see Figure 1). Each trial throughout the whole experiment started with a blank interval of 700 ms and a fixation point (an asterisk) presented for 500 ms. In phase 1 (study), all 40 pictures were presented along with the corresponding word in the non-dominant language of the participant. The 40 pictures with the words were presented one at a time for five seconds, and the task of the participants was to study the picture and its label. To control for effects of primacy and recency, twelve filler drawings were included (6 at the beginning and 6 at the end), so in total there were 52 trials in this phase of the experiment.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160921030336-69156-mediumThumb-S1366728911000034_fig1g.jpg?pub-status=live)
Figure 1. Schematic representation of the procedure of the cross-language RIF experiment. In the study phase, participants were instructed to study the L2 labels (groups 1 and 3) or L3 labels (group 2) of 40 pictures. In the retrieval practice phase, 15 of these pictures were presented with a green frame (striped in the figure) indicating that they should be named in L1, and 15 were presented with a red frame (plain in the figure) indicating that they should be named in L2 (groups 1 and 3) or L3 (group 2). In each set of 15 pictures 5 were presented once, 5 were presented 5 times and 5 were presented 10 times for naming. Ten of the pictures were not presented at all during this phase. In the final test, participants were presented with 40 rhyme-words as a cue to recall the L1 label of all 40 pictures.
In the second phase (retrieval practice), participants were asked to name 75% of the pictures that had appeared in the first phase using the language indicated by a colored frame (e.g., green pictures named in the dominant language and red in the non-dominant language). Of the full set of pictures, 25% were presented once (five in green and five in red), 25% five times (five in green and five in red), and 25% ten times (five in green and five in red). In this way, the number of trials in each of the two testing languages was equal. Each picture appeared with the same color frame throughout the experiment so that the language in which a given picture was named remained the same during this phase. The remaining 25% of pictures (ten pictures) did not appear in this phase and served as a baseline condition for the final test. To control for effects of primacy and recency, the same twelve filler drawings were included (six at the beginning and six at the end), resulting in a total of 172 trials: ten pictures repeated once, ten repeated five times, ten repeated ten times and twelve fillers. Pictures were displayed until a response was given or for a maximum of four seconds. If participants did not answer, the correct name appeared on the computer screen for 500 ms.
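The trial counts for the study and retrieval practice phases follow directly from the design and can be checked with a short calculation. The constant and function names below are illustrative and not part of the original materials.

```python
TOTAL_PICTURES = 40
FILLERS = 12
PICTURES_PER_CELL = 5             # five pictures per language at each repetition level
LANGUAGES = ("dominant", "non-dominant")
REPETITION_LEVELS = (1, 5, 10)    # baseline pictures (0 repetitions) are not shown


def study_phase_trials():
    """Each picture is shown once, plus the filler drawings."""
    return TOTAL_PICTURES + FILLERS


def naming_trials_per_language():
    """Number of experimental naming trials in one language."""
    return sum(PICTURES_PER_CELL * reps for reps in REPETITION_LEVELS)


def retrieval_practice_trials():
    """Experimental naming trials in both languages, plus the filler drawings."""
    return naming_trials_per_language() * len(LANGUAGES) + FILLERS


print(study_phase_trials())          # 52
print(naming_trials_per_language())  # 80
print(retrieval_practice_trials())   # 172
```

The calculation also confirms that both languages receive the same number of naming trials (80 each), as stated in the procedure.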
In the third phase (final test), 40 words in the dominant language were presented, one at a time, on the computer screen. The task of the subjects was to provide a Spanish word that rhymed with the one presented on the computer screen and matched a previously viewed picture (marco–barco, etc.). The rhyme-word was presented for a maximum of four seconds and disappeared as soon as a response was detected.
To control for effects of frequency, length and other possible confounding factors we decided to rotate the words through the experimental conditions. We thus created eight experimental lists because in the second phase of the experiment each picture could be named either in the dominant or in the non-dominant language and it could be named zero, one, five or ten times (2 languages × 4 repetitions).
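One way such a rotation could be implemented is a Latin-square style shift of item sets through the design cells. The sketch below is only an assumption about the mechanics, not the authors' actual procedure: it treats the 40 pictures as eight sets of five, and it represents the ten-picture baseline as two five-picture cells so that eight sets map onto eight cells. All names are hypothetical.

```python
# Eight design cells of five pictures each (assumption: the ten-picture
# baseline is represented as two five-picture cells).
CELLS = [
    ("dominant", 1), ("dominant", 5), ("dominant", 10),
    ("non-dominant", 1), ("non-dominant", 5), ("non-dominant", 10),
    ("baseline", 0), ("baseline", 0),
]


def build_lists(item_sets):
    """Rotate item sets through the cells, producing one list per shift."""
    n_cells = len(CELLS)
    if len(item_sets) != n_cells:
        raise ValueError("need exactly one item set per design cell")
    lists = []
    for shift in range(n_cells):
        assignment = {}
        for set_index, items in enumerate(item_sets):
            cell = CELLS[(set_index + shift) % n_cells]
            for item in items:
                assignment[item] = cell
        lists.append(assignment)
    return lists


# Placeholder item names; the real materials are listed in Appendix B.
item_sets = [[f"item{5 * s + i + 1:02d}" for i in range(5)] for s in range(8)]
experimental_lists = build_lists(item_sets)
```

Across the eight lists, every picture passes through every language-by-repetition condition, so item properties such as frequency and length are balanced over conditions.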
Data analyses
Errors and omissions from the picture naming phase were analyzed by language. In the final rhyme-cued test the dependent measure was the percentage of correctly retrieved dominant language words. The analyses of the three groups of participants in the final test phase were first conducted separately. Two variables were included in a 2 × 4 ANOVA: language of picture naming (dominant vs. non-dominant) and number of repetitions (0, 1, 5 and 10).
Subsequently and given that the predictions are rather different for the dominant (a positive influence of repetition in subsequent word-retrieval in the dominant language) and the non-dominant language (a negative influence of repetition in subsequent word-retrieval in the dominant language), we also analyzed the results for the two languages separately. Finally, we conducted an omnibus analysis including the three groups of participants.
Only items that had been correctly named 80% of the time during the picture-naming phase are reported in the analysis in order to ensure that any effects could be unequivocally attributed to previous naming. Levy et al. (2007) did not specify whether they proceeded similarly (e.g., whether or not they excluded the items that were not named or that were named erroneously). However, none of the effects reported below changed when all the responses were included in the analyses. Responses in which participants came up with a non-rhyming word or with a rhyming word that was not presented in the study phase were discarded from the analysis.
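For readers who want to see how the per-language repetition analyses are structured, the following is a minimal pure-Python sketch of a one-way repeated-measures ANOVA (repetition as the within-subject factor). It is not the authors' analysis script, and the toy scores are invented purely to exercise the function.

```python
def rm_anova_oneway(scores):
    """One-way repeated-measures ANOVA.

    scores: list of rows, one per participant, each with one value per condition.
    Returns (F, df_condition, df_error).
    """
    n = len(scores)        # participants
    k = len(scores[0])     # conditions (e.g., 0, 1, 5, 10 repetitions)
    grand_mean = sum(sum(row) for row in scores) / (n * k)
    condition_means = [sum(row[j] for row in scores) / n for j in range(k)]
    subject_means = [sum(row) / k for row in scores]

    ss_total = sum((x - grand_mean) ** 2 for row in scores for x in row)
    ss_condition = n * sum((m - grand_mean) ** 2 for m in condition_means)
    ss_subject = k * sum((m - grand_mean) ** 2 for m in subject_means)
    ss_error = ss_total - ss_condition - ss_subject  # condition-by-subject residual

    df_condition = k - 1
    df_error = (k - 1) * (n - 1)
    f_value = (ss_condition / df_condition) / (ss_error / df_error)
    return f_value, df_condition, df_error


# Invented toy data: three participants, three conditions.
toy_scores = [[1, 2, 3], [2, 3, 5], [3, 5, 6]]
f_value, df_condition, df_error = rm_anova_oneway(toy_scores)
print(round(f_value, 3), df_condition, df_error)
```

With four repetition levels and 56 participants, the same formulas give the degrees of freedom (3, 165) reported for group 1.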
Results
Group 1: Spanish-dominant–English-non-dominant
Picture naming
Participants committed significantly more omissions in L2 (7%) than in L1 (1%) (F(1,55) = 56.353, MSE = 16.607, p ≤ .001), but significantly more errors in L1 (8%) than in L2 (3%) (F(1,55) = 24.637, MSE = 29.695, p ≤ .001).
Final test
The main effect of repetition was significant (F(3,165) = 19.96, MSE = .047, p ≤ .001). Neither the main effect of language (p > .8) nor the interaction between language and repetition was significant (p > .9).
When analyzing the two languages separately, for the dominant language there was a main effect of repetition (F(3,165) = 12.315, MSE = .041, p ≤ .001), revealing a lower recall for the baseline condition (no repetitions) than for all the other conditions (all ps < .01). Thus, not surprisingly, repeating words in the dominant language helped subsequent access to those same words in the rhyming task. More importantly, for the non-dominant language, there was also a positive effect of repetition on recall performance (F(3,165) = 8.832, MSE = .050, p ≤ .001). This effect was significant for all repetition conditions against baseline (all ps < .01). That is, unlike in Levy et al.'s (2007) study, naming a picture in the non-dominant language facilitated later recall of the translation word in the dominant language. Thus, as is apparent in Figure 2, there was no RIF across languages.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160921030336-93277-mediumThumb-S1366728911000034_fig2g.jpg?pub-status=live)
Figure 2. Percentage of correctly recalled Spanish words (y-axis) as a function of amount of repetitions (0, 1, 5, or 10; x-axis) broken by naming language (L1 Spanish or L2 English) in the picture naming phase. The interlingual RIF effect would be indexed by a lower percentage of recall after naming in English than for the baseline condition (zero repetitions). Error bars represent standard error.
Group 2: Spanish-dominant–English-non-dominant
Picture naming
Participants committed significantly more omissions in L2 (3%) than in L1 (0.5%) (F(1,52) = 24.851, MSE = 5.581, p ≤ .001), but significantly more errors in L1 (8%) than in L2 (3%) (F(1,52) = 20.052, MSE = 34.936, p ≤ .001).
Final test
The main effects of repetition (F(3,156) = 13.208, MSE = .064, p ≤ .001) and language (F(1,52) = 5.236, MSE = .046, p = .026) were significant. The interaction between these variables was not significant (p > .5).
When analyzing the two languages separately, for the dominant language there was a main effect of repetition (F(3,156) = 11.218, MSE = .052, p ≤ .001), revealing a lower recall for the baseline condition (no repetitions) than for all the other conditions (all ps < .01). Thus, again repeating words in the dominant language helped subsequent access to those same words in the rhyming task. More important, for the non-dominant language, there was also a positive effect of repetition on recall performance (F(3,156) = 5.272, MSE = .055, p = .002), with a significant difference between all repetition conditions against baseline (all ps < .02). Thus, no RIF across languages was present (see Figure 3).
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160921030336-37200-mediumThumb-S1366728911000034_fig3g.jpg?pub-status=live)
Figure 3. Percentage of correctly recalled Spanish words (y-axis) as a function of amount of repetitions (0, 1, 5 or 10; x-axis) broken by naming language (L1 Spanish or L3 English) in the picture naming phase. The interlingual RIF effect would be indexed by a lower percentage of recall after naming in English than for the baseline condition (zero repetitions). Error bars represent standard error.
Group 3: Spanish-dominant–Catalan-non-dominant
Picture naming
Participants committed marginally more omissions in L1 (2%) than in L2 (1%) (F(1,31) = 4.030, MSE = 1.963, p = .053) and significantly more errors in L1 (7%) than in L2 (4%) (F(1,31) = 7.599, MSE = 23.762, p = .01).
Final test
The main effect of repetition was significant (F(3,93) = 9.891, MSE = .052, p ≤ .001). The main effect of language approached significance (F(1,31) = 2.888, MSE = .059, p = .099). No interaction between the two variables was observed (p > .4).
When analyzing the two languages separately, for the dominant language there was a main effect of repetition (F(3,93) = 8.045, MSE = .049, p ≤ .001), revealing a better recall in all repetition conditions as compared to baseline (all ps < .01). For the non-dominant language, there was also a positive effect of repetition on recall performance (F(3,93) = 3.523, MSE = .043, p = .025), an effect that was present for all repetition conditions when compared to baseline (all ps < .05). Thus, no RIF across languages was present (see Figure 4).
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160921030336-54862-mediumThumb-S1366728911000034_fig4g.jpg?pub-status=live)
Figure 4. Percentage of correctly recalled Spanish words (y-axis) as a function of amount of repetitions (0, 1, 5 or 10; x-axis) broken by naming language (L1 Spanish or L2 Catalan) in the picture naming phase. The interlingual RIF effect would be indexed by a lower percentage of recall after naming in Catalan than for the baseline condition (zero repetitions). Error bars represent standard error.
Omnibus analysis
A 4 × 2 × 3 ANOVA with “repetition” (0, 1, 5 and 10) and “language” (dominant/non-dominant) as independent variables and group (1, 2 and 3) as a between subjects variable showed a significant main effect of repetition (F(3,414) = 38.981, MSE = .055, p ≤ .001), language (F(1,138) = 6.188, MSE = .051, p = .014) and group of participants (F(2,138) = 9.604, MSE = .028, p < .001) but no significant interactions between any of these variables (all ps > .3).
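The degrees of freedom reported for the group-wise and omnibus analyses can be cross-checked against the group sizes given in the Participants section. The helper names below are illustrative; the formulas are the standard ones for within-subject and mixed designs.

```python
GROUP_SIZES = {1: 56, 2: 53, 3: 32}   # from the Participants section
REPETITION_LEVELS = 4                  # 0, 1, 5 and 10 repetitions
LANGUAGE_LEVELS = 2                    # dominant vs. non-dominant


def within_group_df(n_participants, levels):
    """df for a within-subject factor analyzed in a single group."""
    return levels - 1, (levels - 1) * (n_participants - 1)


def mixed_within_df(group_sizes, levels):
    """df for a within-subject factor when group is a between-subjects factor."""
    total = sum(group_sizes.values())
    return levels - 1, (levels - 1) * (total - len(group_sizes))


def between_groups_df(group_sizes):
    """df for the between-subjects main effect of group."""
    total = sum(group_sizes.values())
    return len(group_sizes) - 1, total - len(group_sizes)


print(sum(GROUP_SIZES.values()))                        # 141 participants
print(mixed_within_df(GROUP_SIZES, REPETITION_LEVELS))  # (3, 414)
print(mixed_within_df(GROUP_SIZES, LANGUAGE_LEVELS))    # (1, 138)
print(between_groups_df(GROUP_SIZES))                   # (2, 138)
```

All values match the F-test degrees of freedom reported in the Results.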
Discussion
The results of this experiment showed a consistent pattern across the three groups of participants. Naming pictures in the dominant language led to a better recall of those very same words in a subsequent rhyming task, replicating previous observations (Levy et al., 2007). More important for our purposes is the effect of naming in the non-dominant language on the subsequent retrieval of the corresponding L1 translations in the rhyming task. We failed to see any detrimental effect of non-dominant language naming on dominant language recall. Instead, retrieval of dominant language words was substantially better when the corresponding translations were named in the non-dominant language than when they were not named at all.
Furthermore, the performance in the rhyming task of the three groups of participants was rather similar. That is, the pattern of performance was independent of the participants’ fluency in the non-dominant language and of whether they performed the task in their L2 or L3. Note, however, that there was a main effect of group of participants in the omnibus analysis. Specifically, the group of medium-proficient participants performed slightly better in the recall phase, both in their dominant and non-dominant language, when compared to the other two groups. The origin of this difference is not clear to us, but importantly this effect did not interact with either the language of testing or the number of repetitions. Recall that, aside from naming pictures in the dominant language, participants in groups 1 and 2 also named pictures in English, in which they were low- and medium-proficient, respectively (probably the groups of participants that were most comparable to those of Levy et al., 2007), and participants in group 3 also named pictures in Catalan, in which they were high-proficient. Crucially, in none of the groups and in none of the three repetition conditions in the non-dominant language did participants perform worse than in the baseline condition (the signature effect of RIF).
In fact, when considering the difference between the 10 repetitions condition in the non-dominant language and the baseline, only 28 participants out of 141 showed lower recall in the former condition (Figure 5 reports the collapsed analyses). A closer look at the individual data supports the notion that proficiency in the non-dominant language did not affect the percentage of recall in the rhyming task. As a proxy of language proficiency we took the average of the four questions answered by the participants regarding their knowledge of the non-dominant language. As an index of RIF we took the difference between the 10 repetitions condition and the baseline. The self-rated language proficiency data of 18 participants were lost due to experimenter error, so the correlation only includes 123 participants. The correlation between these two indexes was near 0 (r = .06), and when the high-proficient group was excluded the correlation was even weaker (r = .04). Hence, it appears that proficiency does not predict recall performance in the dominant language after having named pictures in the non-dominant language.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160921030336-76543-mediumThumb-S1366728911000034_fig5g.jpg?pub-status=live)
Figure 5. Difference in recall between the ten-repetition condition in the non-dominant language and baseline (ten repetitions minus baseline; y-axis) as a function of participants’ self-rated non-dominant language proficiency (x-axis), plotted for individual participants. Negative values indicate worse recall after ten repetitions than at baseline (the RIF effect), while positive values indicate better recall.
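The individual-level analysis described above (a per-participant RIF index correlated with a self-rated proficiency index) can be illustrated with a minimal sketch. The function names and the four example participants below are hypothetical and invented purely for illustration; they are not the study's data, and the sketch is not the authors' analysis script. The sign convention follows the figure: recall after ten repetitions minus baseline recall, so negative values correspond to a RIF-like pattern.

```python
from math import sqrt
from statistics import mean


def proficiency_index(ratings):
    """Average of the four self-ratings (comprehension, production, reading, writing)."""
    if len(ratings) != 4:
        raise ValueError("expected four self-ratings")
    return mean(ratings)


def rif_index(recall_ten, recall_baseline):
    """Recall after ten repetitions minus baseline recall; negative = RIF-like."""
    return recall_ten - recall_baseline


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Invented example participants (NOT data from the study).
participants = [
    {"ratings": [1, 1, 2, 1], "ten": 0.70, "baseline": 0.60},
    {"ratings": [2, 2, 3, 2], "ten": 0.55, "baseline": 0.60},
    {"ratings": [3, 3, 3, 2], "ten": 0.80, "baseline": 0.65},
    {"ratings": [4, 4, 3, 4], "ten": 0.75, "baseline": 0.70},
]

proficiency = [proficiency_index(p["ratings"]) for p in participants]
rif = [rif_index(p["ten"], p["baseline"]) for p in participants]

r = pearson_r(proficiency, rif)          # proficiency vs. RIF index
n_rif = sum(1 for v in rif if v < 0)     # participants with lower recall than baseline
```

With real data, a value of `r` near zero would correspond to the pattern reported in the text, namely that proficiency does not predict the size or direction of the recall difference.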
In order to rule out the possibility that participants in our study employed a semantic strategy which yields facilitation as shown by Levy et al. (Reference Levy, Veigh, Marful and Anderson2007), we had a closer look at the errors elicited in the final test of the experiment (approximately 8% of the total number of responses). The errors were divided into four types: (i) phonologically related extra-experimental items (those errors that rhymed with the target but did not form part of the experiment such as raza “race” in response to plaza “square”); (ii) phonologically related experimental items (items that did form part of the experiment but only overlapped in the last two vowels with the rhyming cue such as cara “face” in response to plaza); (iii) phonologically unrelated extra-experimental items; and (iv) phonologically unrelated experimental items. We observed that the number of extra-experimental items was larger than that of experimental items. If participants had employed a semantic strategy or a memory-driven strategy to perform the task, one would have expected to find a larger number of experimental items than extra-experimental items in the errors. However, this was not the case. We also observed that the number of phonologically related errors was larger than that of unrelated errors, which clearly supports the view that participants were employing a phonological strategy. We further compared the error pattern of those participants who showed RIF with that of the rest of the participants and found no significant difference.
Before moving on, it is worth discussing a potential difference in the performance of the groups we tested. For two of the groups we observed the expected effect of better recall when the naming was performed in the same language as that of recollection, but for the low-proficient bilinguals this difference was not significant. Note, however, that in the omnibus analyses there was no interaction between language and group of participants. Still, one may argue that the low-proficient bilinguals made use of a translation strategy during the picture naming phase (e.g., Kroll & Stewart, Reference Kroll and Stewart1994). Independently of whether this strategy was used, the direction of the effect we obtained for all groups is the same. While strategy differences due to proficiency may play a role in the magnitude of the difference in recall between languages, they do not seem to have an influence on the effect in general. In sum, the lack of a difference in dominant language recall after naming in the dominant and the non-dominant language does not compromise at all the fact that non-dominant language naming helps the retrieval of dominant language representations. If anything, it reaffirms the notion that words that are frequently used in the non-dominant language do not cause long-lasting retrieval impairment of their translations in the dominant language.
Finally, we will dedicate some words to the apparently contradictory fact that participants in all three groups committed more errors in their dominant than in their non-dominant language. Recall that in the first phase of the experiment participants were familiarized with the picture labels in the non-dominant language, but not in the dominant language. Thus, participants knew which word they should use for each picture in the non-dominant language while they never received such an instruction for their dominant language, opening up the possibility of the use of alternative names for the pictures. Given the phonological nature of the final test of the experiment, such alternative names could not be considered as correct answers, thus increasing the error rate for the dominant but not for the non-dominant language.
General discussion
The goal of this study was to explore the presence of retrieval-induced forgetting (RIF) mechanisms in bilingual speech production. To do so, we assessed the reliability and generalizability of the RIF effect across languages reported by Levy et al. (Reference Levy, Veigh, Marful and Anderson2007). Recall that in their study there was an inhibitory effect of L2 speech production on L1 word recall. That is, those L1 words whose translations in L2 were produced several times were subsequently retrieved worse than those words that were not produced at all. To this end, we conducted a RIF experiment with 141 participants with differing proficiency levels in the non-dominant language. The following two main findings were obtained:
1. Increasing the number of times a word is named in the dominant language increases the successful retrieval of this word in the same language in a subsequent rhyming task.
2. Increasing the number of times a word is named in the non-dominant language also increases the successful retrieval of its translation in the dominant language in a subsequent rhyming task.
While the first result is in line with the results of Levy et al. (Reference Levy, Veigh, Marful and Anderson2007), the second one is in clear conflict. At present we do not have an explanation for these contrasting results, that is, the presence vs. absence of RIF across languages in Levy et al.'s (Reference Levy, Veigh, Marful and Anderson2007) and our study. However, we will now provide a general picture of the whole set of results that might be helpful in making sense of this discrepancy.
As mentioned in the introduction, the RIF effect across languages reported by Levy et al. was circumscribed. First, it was only present for those bilinguals with low fluency in their L2 (as indexed by differences in naming latencies between languages). For the other half of the participants no RIF whatsoever was observed. Second, even when assessing this sample, the RIF effect was present only between baseline and ten repetitions, but not between baseline and one and five repetitions. Thus, the RIF effect across languages was only observed for half of the participants, and only comparing ten repetitions vs. baseline. A closer look at our data reveals that only 28 participants showed a RIF effect, but none of the groups as a whole showed such an effect. All in all, this shows that, at best, the RIF effect across languages is a rather elusive phenomenon, and that caution needs to be exercised when deriving strong conclusions from Levy et al.'s observations.
Still, one may think of differences in non-dominant language proficiency between the participants tested in our study and those of Levy et al.'s study as a potential source of the discrepant results. That is, it might be argued that the knowledge of the non-dominant language of our participants was greater than that of the participants in the previous study, and as a consequence we failed to observe any RIF effect across languages. This is a difficult argument to put forward since we do not know the characteristics of the sample used by Levy et al. Unfortunately, little information is given about the language history of the participants in Levy et al.'s study. We only know that they were undergraduate students who had recently completed at least one year of college-level Spanish – the non-dominant language (we do not even know if English was the first language or whether they knew another language or not). Hence it is unclear how homogeneous the sample was. Some hint about the heterogeneity in the sample can be found in the post-hoc analyses conducted in their study, where the sample of 64 participants was split into two groups of 32 individuals each. The individuals were assigned to the two groups according to the difference in the speed with which they named the pictures in the dominant vs. the non-dominant language. This difference was taken as a proxy for differences in language proficiency. For one of the groups, the difference in naming latencies was quite large (more than 200 ms), with naming being faster in English. However, for the other group naming latencies were faster in Spanish than in English. Thus, it appears that the sample was quite heterogeneous in Levy et al.'s study. Acknowledging that it is always difficult to control for cultural differences in learning an L2 between Europe and the USA, we believe that the wide spectrum of bilinguals tested in our study presumably guarantees that we covered the proficiency range of Levy et al.'s sample.
Our data question the utility of RIF in language attrition. Even if one is tempted to conclude that RIF is present when the knowledge of the non-dominant language is very low, it is hard to imagine how this could be the cause of L1 attrition. This is because in real life, language attrition actually happens when people practice the non-dominant language enough. As the authors put it, “[t]his phonological RIF arises precisely because frequent use engages inhibitory control to achieve the fluency desired by foreign-language speakers” (Levy et al., Reference Levy, Veigh, Marful and Anderson2007, p. 33; emphasis added). That is, frequent use is needed to promote RIF and, presumably, first language attrition. Consequently, one should expect RIF to be present also in the individuals tested in our experiment after a high number of repetitions, or alternatively an interaction between language proficiency and RIF in the opposite direction to the one observed by Levy et al. (arguably, the more proficient speakers also use the non-dominant language more often than the less proficient ones). The fact that the vast majority of participants did not show the RIF effect and that there was no correlation between language proficiency and RIF seriously compromises the idea of this mechanism as the cause of first language attrition.
Note that the lack of a RIF effect across languages does not necessarily preclude the possibility that language control mechanisms make use of inhibitory processes (e.g., Kroll, Bobb, Misra & Guo, Reference Kroll, Bobb, Misra and Guo2008). Our results, however, do put constraints on whether one can use this inhibitory mechanism (at least as it is indexed by RIF) to account for the long-lasting consequences of using a second language on the first language. In fact, there is another result that compromises the conclusion reached by these authors: “Native-language words for ideas used most often in the foreign language are most vulnerable to forgetting” (Levy et al., Reference Levy, Veigh, Marful and Anderson2007, p. 33). This conclusion would predict that high frequency words should be more affected by L1 attrition than low frequency words. This is because high frequency words are by definition used more often than low frequency words. And, given that word frequencies tend to correlate across languages (at least in similar cultures), it is reasonable to expect that high-frequency words should suffer more RIF than low-frequency words. In this scenario, we would expect a reduced frequency effect in bilingual speakers when using their L1 as compared to monolingual speakers. However, results by Gollan, Montoya, Cera and Sandoval (Reference Gollan, Montoya, Cera and Sandoval2008) actually revealed the opposite, namely larger frequency effects for bilingual than for monolingual speakers. In fact, this effect was interpreted, against Levy et al.'s assumption, as revealing that low-frequency words are especially sensitive to bilingual disadvantages.
Before concluding, we should note that the observed facilitation between translation words fits well with some observations of RIF experiments. As advanced in the Introduction, when participants are presented with close semantic competitors (e.g., deer, moose) practicing one of them (moose) helps rather than hinders the subsequent retrieval of the other item (deer) (e.g., Anderson, Green & McCulloch, Reference Anderson, Bjork and Bjork2000; Bauml & Hartinger, Reference Bauml and Hartinger2002). In fact, crucially in our context, this facilitatory effect can generalize to other close semantically related items that have not been presented in the study phase (e.g., gazelle) (e.g., Chan et al., Reference Chan, McDermott and Roediger2006). Close semantic overlap and generalization to unpracticed items are two conditions met in our experiment, where translation words have a large semantic overlap, and practicing words in only one language exerts effects on the subsequent retrieval of their (unpracticed) translations in the other language. That is, retrieving “cow” facilitated the recall of the Spanish translation “vaca” which overlaps largely in features with “cow” even though “vaca” had never been presented in the experiment. Interestingly, the contrastive effects of practicing items on the subsequent retrieval of semantically related items (sometimes facilitation and sometimes interference) have led some authors to put forward explanations that might be useful to understand similar results in the bilingual field.
For the sake of simplicity we will only explain one of these models here, namely the feature suppression model by Anderson and Spellman (Reference Anderson and Spellman1995, see also Anderson, Green & McCulloch, Reference Anderson, Bjork and Bjork2000), but it should be noted that other similar proposals have been put forward in the literature (e.g., center-surround mechanisms, Barnhardt, Glisky, Polster & Elam, Reference Barnhardt, Glisky, Polster and Elam1996; Carr & Dagenbach, Reference Carr and Dagenbach1990; text processing account, Chan, Reference Chan2009). In the feature suppression model, inhibition and activation are processes that are thought to co-occur whenever a given representation is retrieved from memory: related semantic representations have some shared and non-shared semantic features. When retrieving a given item, semantically related representations are activated by virtue of the shared features. However, the non-shared semantic features of semantically related items are inhibited, hence reducing the availability of such items. Thus, the presence of facilitation or inhibition would result from a trade-off between the facilitation produced by shared features and the inhibition produced by non-shared features. For instance, retrieving “couch” would activate features such as “inanimate”, “furniture”, etc. shared with other representations such as “bed” and “table”. At the same time, those non-shared features of “bed” and “table” such as “to sleep”, “to put things on”, etc. would be inhibited. In this way, the extent to and manner in which retrieving “couch” would subsequently affect the retrieval of “bed” depends on the ratio of shared to non-shared features: the bigger the ratio, the more likely facilitation is to be observed. In this scenario, nothing but facilitation should be observed when retrieving words in language A that have been previously practiced in language B, since translation equivalents share nearly all of their semantic features.
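The trade-off described in the feature suppression model can be made concrete with a toy sketch. The scoring rule below (one unit of facilitation per shared feature, one unit of inhibition per non-shared feature of the related item) is an assumption introduced here for illustration only; the model as described above does not commit to these weights. The feature sets are likewise hypothetical, loosely based on the “couch”/“bed” and “cow”/“vaca” examples in the text.

```python
def net_retrieval_effect(practiced, related, gain=1.0, cost=1.0):
    """Toy net effect of retrieving `practiced` on later retrieval of `related`.

    Shared features are assumed to facilitate the related item; features of
    the related item that the practiced item lacks are assumed to be
    inhibited. A positive result stands for facilitation, a negative one
    for forgetting. `gain` and `cost` are illustrative weights only.
    """
    shared = practiced & related
    non_shared = related - practiced
    return gain * len(shared) - cost * len(non_shared)


# Hypothetical feature sets (illustrative, not taken from any dataset).
couch = {"inanimate", "furniture", "to sit on"}
bed = {"inanimate", "furniture", "to sleep"}

# Translation equivalents are assumed to share all semantic features.
cow = {"animate", "mammal", "gives milk"}
vaca = {"animate", "mammal", "gives milk"}

# A made-up distant competitor with few shared and many non-shared features.
distant = {"inanimate", "feature_a", "feature_b", "feature_c"}

effect_bed = net_retrieval_effect(couch, bed)          # 2 shared, 1 non-shared
effect_vaca = net_retrieval_effect(cow, vaca)          # 3 shared, 0 non-shared
effect_distant = net_retrieval_effect(couch, distant)  # 1 shared, 3 non-shared
```

Under these assumptions the translation pair can only yield a positive value, because there are no non-shared features to inhibit, whereas an item with few shared and many non-shared features yields a negative value.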
Interestingly, in the field of bilingual language production there is a similar literature about inhibitory and facilitatory effects of related words across languages. A closer look at all these phenomena reveals a striking parallelism to those observed in the memory literature. For example, several studies have reported facilitated or at least non-affected behaviour for translation-words (e.g., Costa & Caramazza, Reference Costa and Caramazza1999; Costa, Miozzo & Caramazza, Reference Costa, Miozzo and Caramazza1999; Francis, Augustini & Saenz, Reference Francis, Augustini and Saenz2003; Lee & Williams, Reference Lee and Williams2001). On the other hand, there are inhibitory effects in cross-language semantic competitor priming (Costa et al., Reference Costa, Miozzo and Caramazza1999; Costa & Caramazza, Reference Costa and Caramazza1999; Lee & Williams, Reference Lee and Williams2001). Just to mention one example, naming a picture that is presented simultaneously with a written distracter in another language is easier if that word is the translation of the target to be produced, but harder if it is a semantic competitor, always compared to an unrelated condition (e.g., Costa et al., Reference Costa, Miozzo and Caramazza1999; Costa & Caramazza, Reference Costa and Caramazza1999). Thus, this contrast in online effects mimics the one between retrieval-induced facilitation and retrieval-induced forgetting and could be elegantly accounted for by the feature suppression model (but see Mahon, Costa, Peterson, Vargas & Caramazza, Reference Mahon, Costa, Peterson, Vargas and Caramazza2007 for an alternative account). Of course, such an account remains silent about how the bilingual speaker manages to restrict language production to only one language. 
It would be interesting to examine to what extent language membership behaves just as any other semantic feature in which case bilingual language production would be essentially similar to monolingual language production, only that one distinctive feature – language membership – would always have to be suppressed in order to produce speech in the intended language.
Conclusion
To conclude, the results of our study show that repeated production of words in a non-dominant language enhances memory for the translation words in the dominant language. Thus, although the mechanism of retrieval-induced forgetting might still be one of the many mechanisms that contribute to the bilingual disadvantage in speech production and to the phenomenon of first language attrition, the present results clearly call into question such a unitary account.
Appendix A. Language history and the self-assessed proficiency for all participants
The language history and self-assessed proficiency scores of the participants of all the groups are presented in Table A1. The mean age and standard deviation are given in years. The “L2/L3 onset” refers to the mean age (in years) at which participants started learning Catalan or English. The proficiency scores were obtained through a questionnaire filled out by the participants after the experiment. The scores are on a four-point scale, where 4 = native-speaker level; 3 = advanced level; 2 = medium level; and 1 = low level of proficiency. The self-assessment index represents the average and standard deviation of the participants’ responses in four domains (speech comprehension, speech production, reading, and writing).
Table A1. Participants’ self-reported language history and proficiency (standard deviations in parentheses).
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160921030336-94786-mediumThumb-S1366728911000034_tab1.jpg?pub-status=live)
Appendix B. Materials used in the cross-language RIF experiment (the English translation in brackets)
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160921030336-88571-mediumThumb-S1366728911000034_tab2.jpg?pub-status=live)