THE INFLUENCE OF EMOTIONAL AND FOREIGN LANGUAGE CONTEXT IN CONTENT LEARNING
As study abroad programs become more common, it is imperative that we understand how foreign languages (FLs) affect our learning. For example, are we able to learn new content in an FL to the same extent as in our native language (NL)? There is a substantial amount of literature assessing this question in children, but there is little published research regarding adult learning. Furthermore, the current adult literature focuses mainly on memory for single words (e.g., Anooshian & Hertel, 1994; Ayçiçeği & Harris, 2004; Caldwell-Harris, 2009; Ferré, Garcia, Fraga, Sanchez-Casas, & Molero, 2010). One possible mechanism for improving content learning in an FL, drawing from the NL literature, is using emotionality to enhance memory. Emotional items are easier to remember in our NL than in our FL (see Caldwell-Harris, 2014, for a review). But can this strategy be used to improve performance in an FL? Importantly, prior single-word research has found reduced emotionality effects in an FL, but what happens if emotionality is conveyed throughout a longer text rather than in single words? The current study attempts to expand on these questions, testing memory for information embedded in an emotional context, to see whether this can boost content learning in an FL.
One of the most common types of programs that use an FL to teach new information is content and language integrated learning (CLIL). CLIL refers to a curriculum-based approach in which content courses are taught using a second language, to teach both content and language through immersion. Although research on the language learning aspects of CLIL quite conclusively shows an improvement in FL use and comprehension (Admiraal, Westhoff, & De Bot, 2006; Aguilar & Rodríguez, 2012; Bergroth, 2006; Dalton-Puffer, 2007; Jiménez Catalán & Ruiz de Zarobe, 2009; Ouazizi, 2016; Serra, 2007; Xanthou, 2011; although see Dallinger, Jonkmann, Hollm, & Fiege, 2016, for no improvement), the research on content learning is less clear-cut (Dalton-Puffer, 2011).
There are studies that find positive effects (Day & Shapson, 1996; Jäppinen, 2005; Ouazizi, 2016; Pérez Cañado, 2018; Surmont, Struys, Van Den Noort, & Van De Craen, 2016; Van de Craen, Ceuleers, & Mondt, 2007; Xanthou, 2011), while others find negative (Anghel, Cabrales, & Carro, 2016; Dallinger et al., 2016; Fernández-Sanjurjo, Fernández-Costales, & Arias Blanco, 2017) or null effects (Admiraal et al., 2006; Bergroth, 2006; Serra, 2007; Stohler, 2006). Consequently, these results paint a less than clear picture of how people learn new content in an FL.
The literature on adult FL-medium learning is more limited, with most of the reported benefits being associated with language (e.g., Yang, 2014) and not content. These studies often show no difference between the control and experimental group in overall performance at the end of the course (e.g., Hernandez-Nanclares & Jimenez-Munoz, 2015), but very few examine the immediate understanding and learning of new content in an FL. Those that do report a difference find that instruction in an FL is detrimental, particularly without FL support (Roussel, Joulia, Tricot, & Sweller, 2017). These results have been accounted for in the context of cognitive load theory, which suggests a working memory overload for individuals trying to learn content in a language in which they are not proficient (Roussel et al., 2017). Importantly, contributing to this literature would influence and possibly improve teaching methods for adults studying in an FL.
Given the difficulties in learning new content in an FL, we need to find ways of compensating for them or otherwise improving performance. One way of doing this is by applying what we know from NL studies. Considering this literature, one of the variables that aids learning is emotionality, as learning emotional words (see Caldwell-Harris, 2014, for a review) and seeing neutral words in emotional contexts (Erk et al., 2003; Erk, Martin, & Walter, 2005) improve memory performance. However, several studies show that speakers are less emotional in an FL than in an NL context (Dewaele, 2010; Harris, Gleason, & Ayçiçeǧi, 2006; Pavlenko, 2002). One might extrapolate from these studies that using emotionality as a tool to boost learning would not be as efficient in an FL. Indeed, Anooshian and Hertel (1994) found that participants remembered emotional words better than neutral words in their NL, but not in their FL. This is in line with foreign language effect (FLE) research supporting a reduction in emotionality in an FL (Costa, Foucart, Arnon, Aparici, & Apesteguia, 2014; Costa et al., 2014; Costa, Vives, & Corey, 2017; Hadjichristidis, Geipel, & Savadori, 2015; Keysar, Hayakawa, & An, 2012; but see Vives, Aparici, & Costa, 2018).
Conversely, other studies find the same effects of emotion on memory in both languages (Ayçiçeǧi & Harris, 2004; Caldwell-Harris, 2009; Ferré, Ventura, Comesaña, & Fraga, 2015; Ponari et al., 2015). Therefore, it is not clear how the effects of emotionality in an FL compare to those of the NL.
Nevertheless, these conflicting results may be explained by alternative accounts, such as a reduction in intuitive responses and depletion of cognitive resources (Geipel, Hadjichristidis, & Surian, 2015a, 2015b, 2016) or triggering of different cultural norms (Gawinkowska, Paradowski, & Bilewicz, 2013) in the FL. Gawinkowska et al. (2013) suggest that the FLE is due to a difference in social and cultural norms rather than a difference in emotional impact between languages. Regardless of the origin of the effect, it is not clear whether people respond similarly to emotional stimuli in their NL and FL, nor whether they benefit from the effects of emotionality on memory the same way in an FL as in an NL. Furthermore, the paradigms used thus far predominantly focus on emotionally charged words in isolation rather than in context (e.g., Anooshian & Hertel, 1994; Ayçiçeği & Harris, 2004; Caldwell-Harris, 2009; Ferré et al., 2010) and are limited to using single-word visual material. This is particularly relevant because, contrary to this approach, information taught in classrooms is most commonly conveyed in context.
The objective of this study is to investigate content learning and how it is affected both by an FL and an emotional context. There is little research directly comparing acquisition of new concepts and knowledge in a bilingual’s NL and FL. Likewise, there is no research looking into the effects of emotionality in this context, nor any using spoken texts in which the emotional context is manipulated semantically. Understanding how these variables interact can contribute to classrooms that use an FL as the medium of teaching, improving methods and efficacy. To address this, we had participants listen to two descriptions of countries (one positive and one neutral) in either their NL (Spanish) or an FL (English), followed by a multiple-choice test. Using longer texts than those used in prior research, we aimed to create a more realistic replication of information processing and acquisition. Thus, participants were required to learn interrelated facts that made a coherent whole, rather than independent pieces of information disconnected from each other (see Frances, de Bruin, & Duñabeitia, 2019, for a similar study using vocabulary learning and nonrelated information). This would allow them to create more complex networks of meaning, which in turn would allow us to understand how semantic context can affect memory for individual facts within these larger conceptual networks. We hypothesized that, although their overall performance was likely to be poorer in the FL than in the NL context, bilinguals would not show an FLE, but instead would present similar emotionality effects in both languages. The rationale for this is that, if the FL affects responding by reducing reliance on intuition or simply requires more cognitive resources, as suggested before, the effect of emotionality should remain the same.
METHODS
PARTICIPANTS
Participants were 76 native Spanish speakers (38 in each language group, 9 male, Mage = 33.86, SDage = 9.14), recruited through language schools and randomly assigned to either the NL or FL context. All participants completed a test of English vocabulary (LexTALE; Lemhöfer & Broersma, 2012) and had a minimum score of 60%. This is equivalent to a minimum of a B2 level according to the Common European Framework of Reference for Languages, with 50 participants at the B2 level range and 26 at the C1/C2 level (Lemhöfer & Broersma, 2012). Participants in the two language contexts were matched on age and education level (i.e., highest level of schooling achieved, in all cases at least high school) according to the sociodemographic information gathered, as well as multiple language variables. They were asked to rate their English level overall on a 1-to-10 scale as well as their listening, reading, speaking, and writing skills in that language. They also reported their estimated age of acquisition of English and the amount of time spent living in an English-speaking country (M = 3.08 months, SD = 4.65 months; all were living in Spain at the time of testing). Finally, they were matched on English and Spanish vocabulary knowledge as assessed by LexTALE (Lemhöfer & Broersma, 2012) and the LexTALE-Esp (Izura, Cuetos, & Brysbaert, 2014). For a summary of these variables, see Table 1 and online supplementary materials for means, distributions, and Bayes factors. The study and protocols were approved by the ethics committee at the BCBL.
TABLE 1. Matched means and standard deviations

Note: Numbers in parentheses refer to standard deviation for the FL and NL groups, except for in the final line (Bayes Factor) where they refer to error percentage. With BF01 a positive number above 1 supports no difference between the two groups, with 3 and above implying moderate evidence that the means are equal. Age and age of acquisition of English are in years, the self-ratings of level of English are on a scale from 1 to 10, and the LexTALEs are scored from 0 (chance) to 1 (perfect score).
INSTRUMENTS
We created the descriptions of two imaginary countries including 50 different items of information (e.g., national sport and population; see online supplementary materials for the list of test items). These two descriptions were then modified with filler sentences to include a more positive or neutral description of the country (e.g., neutral: “The population of Tecamer is defined politically as left wing, although they are considered generally quite moderate in their political, economic, and social opinions” and positive: “The population of Tecamer is defined politically as left wing and supports freedom, tolerance, and social inclusion as well as equal opportunity, leading many campaigns against discrimination”). The Spanish and English versions were created simultaneously and were matched on length. The texts were 50 to 56 sentences long and the average number of words in the English and Spanish versions was matched (1278.5 and 1317, respectively). The two emotional conditions were matched within languages on lemmatized word frequency of the content words (Spanish using the LEXESP database, Sebastián-Gallés, Martí, Carreiras, & Cuetos, 2000; English using the HAL database, Lund & Burgess, 1996; see Table 2). Importantly, the positive and neutral versions of the texts differed on the mean valence and arousal of the words used, according to the ANEW database (Bradley & Lang, 1999) (valence: BF10 = 2.42 × 10¹¹, error% = 5.22 × 10⁻¹⁸; arousal: BF10 = 3.07 × 10¹⁰, error% = 4.14 × 10⁻¹⁷). The number of high-arousal (arousal >5) and high-valence (valence >5) words also varied by condition (6% of the words in the neutral condition and 12% of those in the positive condition were high-valence words; see Appendix).
TABLE 2. Average word frequency by language and emotional condition

Note: Numbers in parentheses refer to standard deviation for the FL and NL groups, except for in the final row (Bayes Factor) where they refer to error percentage. With BF01 a positive number above 1 supports no difference between the two groups, with 3 and above implying moderate evidence that the means are equal.
These four texts (two countries, each with a neutral and a positive version) were read aloud and recorded by four female native Spanish speakers and four female native English speakers. Each recording lasted between 6.85 and 8.07 minutes (Mduration = 7.51 minutes, SDduration = .333 minutes).
PROCEDURE
Participants accessed the experiment through LimeSurvey (Schmitz, 2019). First, they filled out a demographics and language questionnaire and then listened to two audio files, one per country, each in a different emotional condition and read by a different speaker (out of the four possible speakers in that language). Each participant heard recordings in only one language and carried out the rest of the study in that same language. The order of the countries, emotional condition, and emotional condition/country matching were all randomized across participants to avoid any strategic or order effects. Once participants finished listening to the audio files, they proceeded to answer 50 multiple-choice questions about the stimuli content. These questions had four answer choices and participants were asked to pick one for each of the countries.
ANALYSIS
The size of the sample was determined using G*Power (Faul, Erdfelder, Lang, & Buchner, 2007), assuming a small to medium size interaction (ηp² = .05) and 95% power.
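As an illustration of the effect size assumed above, the following sketch converts partial eta squared into Cohen's f, the effect size metric that G*Power requests for ANOVA designs. The function name is hypothetical, and the sketch does not reproduce the power calculation itself, which also depends on settings (such as the assumed correlation among repeated measures) that are not reported here.

```python
import math


def cohens_f_from_partial_eta_squared(eta_p2: float) -> float:
    """Convert partial eta squared to Cohen's f via f = sqrt(eta / (1 - eta))."""
    if not 0.0 <= eta_p2 < 1.0:
        raise ValueError("partial eta squared must lie in [0, 1)")
    return math.sqrt(eta_p2 / (1.0 - eta_p2))


# Effect size assumed for the interaction in the sample size calculation.
effect_f = cohens_f_from_partial_eta_squared(0.05)
print(round(effect_f, 3))
```

Under this conversion, the assumed ηp² of .05 corresponds to an f of roughly .23, which sits between the conventional benchmarks for small (.10) and medium (.25) effects.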
We carried out a two-way mixed ANOVA exploring the effects of emotionality and language on performance in the test to address whether performance was better in the NL or FL, whether emotional semantic context affects performance, and whether there was an interaction between the two. A main effect of language would indicate whether participants perform better in one of their languages, whilst a main effect of emotionality would reveal whether the emotional manipulation affected performance. Finally, any interaction between language and emotionality would show whether the effect of emotionality is modulated by language—meaning, emotionality affects people differently in the FL than the NL. In all cases, assumptions of statistical tests were met.
We followed these tests up with Bayes factors (Jeffreys, 1961), which represent the likelihood of one model (in this case, the null hypothesis) over another (in this case, the alternative hypothesis). For example, a BF01 of 5 means that the null hypothesis is five times more likely to be true than the alternative one and a BF01 of .2 means that the alternative hypothesis is five times more likely to be true than the null. These Bayes factors have become increasingly common as an alternative to frequentist models (Poirier, 2006), in particular for ANOVAs (Rouder, Morey, Speckman, & Province, 2012).
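The reading of BF01 described above can be made concrete with a small sketch. The helper below is hypothetical (it is not part of any statistics package) and simply restates the rule that values above 1 favor the null hypothesis, values below 1 favor the alternative, and the two directions are reciprocals of each other.

```python
def describe_bf01(bf01: float) -> str:
    """Say which hypothesis a BF01 value favors, and by what factor."""
    if bf01 <= 0:
        raise ValueError("a Bayes factor must be positive")
    if bf01 >= 1:
        return f"null favored by a factor of {bf01:.2f}"
    # BF10 is the reciprocal of BF01.
    return f"alternative favored by a factor of {1 / bf01:.2f}"


print(describe_bf01(5))    # the first example given in the text
print(describe_bf01(0.2))  # the second example given in the text
```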
RESULTS
First, we calculated the internal consistency between the questions of each country and found that the tests had good internal consistency (Mufelo α = .84; Tecamer α = .86).
We removed participants who were outliers, meaning 1.5 IQR away from the median in either condition (positive or neutral) for each language group. Using this procedure, we removed one participant from the English group and three from the Spanish group. The same tests were carried out with and without the outliers and the results were consistent between the two.
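A minimal sketch of the exclusion rule as described, applied to one condition within one language group. The quartile method (the default of Python's `statistics.quantiles`) and the example scores are assumptions made for illustration; they are not taken from the study's data or analysis scripts.

```python
import statistics


def flag_outliers(scores, k=1.5):
    """Flag scores lying more than k * IQR away from the median."""
    q1, _, q3 = statistics.quantiles(scores, n=4)  # quartiles, default method
    iqr = q3 - q1
    median = statistics.median(scores)
    return [abs(score - median) > k * iqr for score in scores]


# Hypothetical accuracy scores (percent correct) for one group and condition.
example_scores = [60, 62, 64, 66, 68, 70, 72, 20]
flags = flag_outliers(example_scores)
kept = [s for s, is_outlier in zip(example_scores, flags) if not is_outlier]
print(kept)
```

A participant flagged in either the positive or the neutral condition would then be dropped from both, since emotionality is a within-participant factor.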
We carried out a two-way mixed ANOVA with emotionality and language on performance on the test (see Table 3 for means, standard deviations, and 95% confidence intervals). There was a significant main effect of emotionality, such that participants performed better in the positive (M = 69.00%, SD = 13.95%) than the neutral condition (M = 65.97%, SD = 14.71%), F(1,70) = 8.54, p = .005, ηp² = .109, BF01 = .146, error% = 1.26 × 10⁻⁶ (see Figure 1 and online supplementary materials). There was also a main effect of language, such that participants performed better in their NL (Spanish: M = 74.6%, SD = 11.2%) than in their FL (English: M = 60.3%, SD = 11.6%), F(1,70) = 26.83, p < .001, ηp² = .277, BF01 = 1.40 × 10⁻⁴, error% = 1.29 × 10⁻⁷. There was no interaction between the two factors, F(1,70) = .104, p = .748, ηp² = .001. A Bayesian repeated measures ANOVA comparing the models with and without the interaction term (emotionality * language) found moderate evidence against adding the interaction, BF01 = 4.12, error% = 3.15; that is, the model without the interaction was more than four times more likely than the model with it. We also ran a Bayesian independent samples t-test on the emotionality effect (the score on the positive condition minus the score on the neutral one), comparing the two language conditions, and again found moderate evidence in support of the null hypothesis, BF01 = 3.93, error% = .012.
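The effect sizes reported above can be checked against the F statistics using the standard identity ηp² = (F × df_effect) / (F × df_effect + df_error). The sketch below is only a consistency check on the reported values; the function name is illustrative.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Recover partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F values and degrees of freedom as reported for the mixed ANOVA.
reported = {
    "emotionality": 8.54,
    "language": 26.83,
    "interaction": 0.104,
}
for effect, f_value in reported.items():
    print(effect, round(partial_eta_squared(f_value, 1, 70), 3))
```

The recovered values (.109, .277, and .001) match those reported for the two main effects and the interaction.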
TABLE 3. Average accuracy in percent correct by condition

Note: Participants showed no effect of order, t(75) = .019, p = .891, BF01 = 7.85, error% = 7.39 × 10⁻⁶, showing moderate evidence that participants performed similarly regardless of order. Furthermore, there was moderate evidence that the two country descriptions were equally easy to remember, t(75) = 1.23, p = .270, BF01 = 4.35, error% = 5.15 × 10⁻⁶.

FIGURE 1. Violin plot showing the distribution of accuracy values by language and condition. Participants in the native language condition (Spanish) did better on the task than those who carried out the task in their foreign language (English). In addition, participants did better when the information was presented in a positive rather than a neutral context. Nevertheless, the effect was the same in both languages.
DISCUSSION
In the current study, we addressed the questions of whether learning new information in an FL could be improved using an emotional semantic context and whether this effect would be the same in the NL and FL. The main task of the study required participants to listen to descriptions of countries and answer questions about them. Although participants performed better in their NL, results suggested that they benefited equally from the positive emotional context in both languages.
Preceding studies on the effects of emotionality on memory have mainly used visual stimuli. In contrast, the current study emulates information transfer in classroom settings by focusing on aural stimuli. Results showed statistically reliable emotionality effects with auditory information in both the NL and the FL. The partial eta squared of this effect is considered to be of medium effect size within the context of educational research (Richardson, 2011). This corresponds to 10.9% of the variance explained and a practical difference of 3% on the current test. Although relatively modest, this effect could be the difference between passing and failing an exam for a student who is struggling in a class. In more general terms, this study suggests that emotionally loaded semantic contexts, and not just emotional content, conveying new pieces of information can improve memory.
Given that there are no studies addressing the particular questions of the current study (namely, the effects of emotional context on content learning), the results need to be understood within the wider literature. The effects found here (NL: 2.7%, FL: 3.3%) were smaller than those of single-word studies with known words. In particular, these studies show effects between 7 and 26% in the NL and between 9.5 and 18% in the FL (Anooshian & Hertel, 1994; Ayçiçeǧi & Harris, 2004; Caldwell-Harris, 2009; Ferré et al., 2010), with one exception showing a nonsignificant effect in the FL (Anooshian & Hertel, 1994). Studies manipulating emotional context rather than emotional content have found larger effects than the current one in recall (12%) but not in recognition, where the difference appeared only in response time and not in accuracy (Erk et al., 2003, 2005). However, studies on new word learning show smaller effects (2–3.5%), more similar to the ones in the current study (Ferré et al., 2015). Overall, these results suggest that the effects of emotionality are reduced when only the context is manipulated and when there is learning of new content, rather than repeating information that is already known. Therefore, our results are in accordance with those reported by prior literature and are within the predictable effect size.
The key result in this study is that the effect of emotionality is the same in the FL and the NL. This result is consistent with many recent studies using emotionality in single-word processing (Ayçiçeǧi & Harris, 2004; Caldwell-Harris, 2009; Ferré et al., 2015; Ponari et al., 2015), and suggests that this effect extends beyond individual word learning to content learning. But, perhaps more importantly, this result challenges the view that the FL, in general terms, leads to emotional distancing (see Costa, Duñabeitia, & Keysar, 2019).
These results relate to the FLE and the theoretical issue of its origin. Hayakawa et al. (2016) suggest that there are two main ways of explaining the FLE on moral decision making: a reduction in emotional processing and increasing psychological distance. Both of these accounts would predict a reduced emotional effect in the FL compared to the NL. If emotional processing is reduced, this account of the FLE would predict that emotionality and its effect on performance would be reduced or absent in the FL condition. With respect to psychological distance, the conclusion is the same: this would make the information seem more abstract, reducing the effect of emotionality. Therefore, neither of these ideas is consistent with our results, namely, an equal effect of emotionality in the NL and FL. However, if the FLE is circumscribed to only the manipulation of known information and its prior associations, it would explain why learning new information does not show the same effects. For example, learning the word “home” using neutral language would lead to more difficulty in learning it and a reduced emotional response for that word, whereas if it is presented using emotional language, perhaps it would be remembered better, showing an emotionality effect.
Looking at the results from this perspective, the current findings do not necessarily have to contradict the existence of the FLE. Instead, they suggest a possible mechanism for how it arises. Gawinkowska et al.’s (2013) idea that the effect is due to social and cultural norm differences would suggest that emotionality should affect both language conditions equally in this case. This is consistent with our results because, if the FLE is circumscribed to differences in norms, it should not be present. Importantly, Geipel et al.’s (2015a, 2015b, 2016) suggestion that the origin of this effect is a reduction of intuitive responses and a depletion of cognitive resources would imply a decrease in performance overall in the FL, but not necessarily any difference in emotionality. This reduction of cognitive resource availability explains our data better, predicting our decrease in performance in the FL, as well as the consistency of emotionality effects between languages.
In other words, the results of the current study could suggest that, rather than emotionality being reduced overall in an FL context, learners’ cognitive resources are taxed, affecting emotionality differently according to the task. Furthermore, if the reduction in emotionality is observed in cases in which only already-known information is concerned, perhaps it is because they are lacking emotional associations within that language. These results suggest that providing FL learners with more emotional materials—as in this case—could help them learn these associations.
It is worth noting that, although we did not intend to manipulate interest (and effectively the content was the same between conditions), perhaps the positive condition could have also presented the information in a more interesting way than the neutral one, contributing to the effect we found (see Hidi, 1990, for a review on the effect of interest on learning). In future studies, the effect of emotionality could be contrasted with that of “interest” or engagement. In addition, the effect we observe here might be increased further by engaging the participants in an activity where they have to use this new content or by making the information to be remembered self-relevant. For example, with the current materials, engagement could be increased by asking participants not only to listen passively but also to actively decide if they would want to move to the described country. Nevertheless, the current results open the way to a new approach to both emotionality effects and learning in an FL that, with further replications, could provide a useful tool for teaching in a nonnative language.
CONCLUSION
The current study reports a well-controlled experiment in line with CLIL approaches, as participants learned the same content in either their NL or an FL and were then tested using exactly the same task and materials. Learning in an FL may sometimes hinder memory of new content as a consequence of the difference in language knowledge and use with the NL. However, the use of emotional semantic contexts can be a short-term tool in the classroom, particularly during aural exercises or verbal transmission of new information to boost memory. Considering the emotional distancing or detachment that has been typically associated with FL contexts (see Costa et al., 2019), the use of emotionally loaded materials or activities in classroom settings could be useful to partially counteract existing FLEs.
SUPPLEMENTARY MATERIAL
To view supplementary material for this article, please visit https://doi.org/10.1017/S027226311900072X
APPENDIX
NUMBER OF EMOTIONAL WORDS AND THE AVERAGE RATING OVERALL BY LANGUAGE AND CONDITION

Note: N stands for the number of words with values >5. The means and standard deviations are overall on a scale from 1 to 9.