Introduction
A salient issue in bilingualism research pertains to whether bilinguals process information differently in each language, and which cognitive domains are either enhanced or constrained by the manipulation of two languages. An additional matter has emerged recently: the question of whether bilinguals experience emotions differently depending on the language in which they express them or in which emotional stimuli are received (Ayçiçeği-Dinn & Caldwell-Harris, 2009; Altarriba & Basnight-Brown, 2011; Rosselli, Vélez-Uribe & Ardila, 2017). If such differences exist at the cognitive and neuropsychological level, they might reflect a delay in the acquisition of emotional connotations, as opposed to lexical elements of the second language, which could indicate differences in emotion-cognition coupling (Conrad, Recio & Jacobs, 2011) between languages.
Differences in emotion processing in bilinguals
Emotion-cognition coupling implies a dissociation between the automatic physiological responses elicited by emotional stimuli and their cognitive appraisal. In theory, during the acquisition of the native language (L1), emotional connotations become inherent to words through emotional socialization, and all the aspects of language, including sensory and visceral representations as well as autobiographical memories, become a whole in a language embodiment process (Pavlenko, 2007). Conversely, emotional content could take time to be integrated with the lexical form of the words in a second language (L2), creating a dissociation between them. Learning a second language typically occurs in an instructional setting (Altarriba, 2008), which lacks the experiential factors that facilitate emotion-cognition coupling; bilingual and multilingual participants have consistently reported a dissociation in how emotion words are felt in their languages (Dewaele, 2004, 2008; Dewaele & Pavlenko, 2001). For example, swear and taboo words seem less emotional (Dewaele, 2004) and taboo words feel less negative in L2 (Vélez-Uribe & Rosselli, 2017), while positive words seem to have stronger connotations in L1 (Dewaele, 2008).
Altogether, these findings have given rise to the idea that L1 is associated with deeper emotional connotations and L2 with greater emotional distance. In selective code-switching, some bilinguals will switch to the language that provides greater emotional distance when discussing difficult topics (Bond & Lai, 1986).
Findings from memory studies have shown that emotional words result in higher recall rates than neutral words, with an overall recall advantage for the L1 in late bilinguals (Anooshian & Hertel, 1994). However, Ayçiçeği and Harris (2004) found that words in L2 were more easily recalled in all categories, including childhood reprimands. Moreover, Ayçiçeği-Dinn and Caldwell-Harris (2009) found a similar effect in both languages; but in an emotion intensity-rating task, this effect was only found in L1. Additional evidence supporting similar recall in both languages was found by Ponari, Rodríguez-Cuadrado, Vinson, Fox, Costa and Vigliocco (2015), regardless of the age of acquisition (AoA), and by Ferré, García, Fraga, Sánchez-Casas, and Molero (2010) in balanced and unbalanced bilinguals.
Studies using an Emotional Stroop task in bilinguals, which results in higher levels of interference in L1 than in L2 for emotional content (Altarriba, 2008), have reported an emotional Stroop effect in both languages (Eilola et al., 2007), as well as a larger effect in L2 (Sutton, Altarriba, Gianico & Basnight-Brown, 2007). Similarly, using a rapid serial visual presentation (RSVP) task, Colbeck and Bowers (2012) found lower error rates for taboo words in bilinguals and ascribed it to a reduction of emotionality in L2. Lastly, Altarriba and Basnight-Brown (2011) found higher interference in L2 on an Affective Simon Task.
Few studies have investigated the physiological evidence of differential processing of emotional content in bilinguals. Harris, Ayçiçeği and Gleason (2003) used skin conductance responses (SCRs) and reported that taboo words presented the strongest effect in both languages, but higher overall in L1. Childhood reprimands elicited high responses in L1 but not in L2 (Harris et al., 2003). Furthermore, Harris (2004) found that taboo words showed the highest reactivity in both languages, and childhood reprimands resulted in some differences between languages where late learners showed higher SCRs in L1, but they were similar in early learners. Similarly, Caldwell-Harris and Ayçiçeği-Dinn (2009) found higher autonomic reactivity for items presented in L1. These discrepancies might reflect sample heterogeneity as well as the operational definition of variables, including bilingualism.
Emotion words: emotion-label and emotion-laden words
In the study of emotion processing, it might be appropriate to refer to emotion words as emotion-label words and classify them along with emotion-laden words under the broad category of emotion words (Zhang, Wu, Meng & Yuan, 2017). Emotion-label words refer directly to affective states (e.g., happy, sad), and emotion-laden words elicit emotions indirectly (e.g., war, home) (Pavlenko, 2008, 2012).
Altarriba and Basnight-Brown (2011) argue that emotion-laden words provide a more sensitive measure of the differences between languages than emotion-label words. Some studies report discrepancies between the two types of words in repetition blindness (Knickerbocker & Altarriba, 2013) and priming tasks (Kazanas & Altarriba, 2015, 2016). In bilinguals, the distinction between the two types of words might be more pronounced in the dominant language (Kazanas & Altarriba, 2016). However, differences in processing emotion-label and emotion-laden words are not always found (Martin & Altarriba, 2017; Vinson, Ponari & Vigliocco, 2014).
ERPs in the study of emotion-word processing in bilinguals
Some previous event-related potential (ERP) studies in bilinguals did not distinguish between emotion-label and emotion-laden words (Chen, Lin, Chen, Lu & Guo, 2015; Opitz & Degner, 2012), while other researchers used only emotion-laden words (Conrad et al., 2011). Recent ERP studies that analyzed the differences between the two types of words in monolingual participants suggest that this distinction is only evident in some ERP components. There might be some interhemispheric differences, with right hemisphere dominance for emotion-label words reflected on the N170 and negative emotion-label words eliciting larger amplitudes on the LPC (Zhang et al., 2017). This study was conducted with Chinese monolinguals, and the lateralization effects on the LPC have not been replicated in other languages. Furthermore, Wang, Shangguan, and Lu (2019) failed to find any disparities between the two types of words in the LPC in Chinese and suggested that the differential effects of the two word types occur early in processing and are evident only in the P1 and P2 components.
Moreover, Wang et al. (2019) found that emotion-label words were predominantly reflected on the P2, but only negative emotion-laden words had the same effect. The frontal N200 component is larger for negative emotion-label than negative emotion-laden words over the left hemisphere, and for positive emotion-label words than for positive emotion-laden words over the right hemisphere (Zhang, Wu, Yuan & Meng, 2019). Some components, such as the P100 (Zhang et al., 2017), the early posterior negativity (EPN), and the late positive complex (LPC), do not reflect the distinction between the two types of words (Wang et al., 2019).
The EPN, which is prominent at occipitotemporal electrode sites and peaks at about 370 ms after stimulus onset (Hajcak, Weinberg, MacNamara & Foti, 2012), increases in amplitude with emotional valence (Citron, 2012), more so for positive words when compared to neutral words (Conrad et al., 2011), and it seems to reflect automatic processing of emotion content (Citron, 2012; Hajcak et al., 2012). The EPN reflects early lexical access (Conrad et al., 2011) and an attention shift toward words with emotional relevance at early processing stages, and appears as a negative deflection (Luck, 2014) or as a reduction in positivity (Weinberg & Hajcak, 2010), especially with prolonged exposure to stimuli. The LPC presents as an increased positivity at centroparietal electrodes and peaks between 500 and 800 ms. The LPC seems to respond to valence and presents larger amplitudes for emotion than for neutral words (Citron, 2012). The EPN and the LPC appear unaffected by the distinction between emotion-label and emotion-laden words (Wang et al., 2019), except for valence-restricted differences in lateralization, with larger LPC amplitudes over the right hemisphere for negative emotion-label words (Zhang et al., 2017).
Kim (1993) examined ERPs in a sample of 20 English monolinguals and 40 Korean–English bilinguals (Becoming Bilinguals and Stable Bilinguals). The valence decision task (VDT) required participants to classify English nouns and adjectives into positive, neutral, or negative categories. All groups presented higher amplitudes at Pz and T4 electrode sites, but there were no differences in P300 amplitude between word categories. The Becoming Bilingual group showed longer latencies in both P300 and N200 waves, possibly due to interference between languages. N200 amplitudes differed between the Becoming Bilingual and Monolingual groups only, perhaps reflecting more effortful processing in the less proficient language. Group similarities could be a result of a shallow processing task hindering the effect in a processing-extensive component like the P300. Also, the experiment only included words in English, which prevented the examination of language differences.
In a lexical decision task (LDT), Conrad et al. (2011) included positive, neutral, and negative words and nonwords. Participants were late bilinguals (AoA > 12) who differed in order of acquisition (OoA; 40 German–Spanish and 26 Spanish–German). There was a clear onset difference, with peaks in L2 delayed by approximately 50 ms when compared with L1. Additionally, L2 data for native Spanish-speakers reflected a delay consistent with the results for the native German-speakers. The differential effect of valence between languages was present for negative words only in German native speakers (Conrad et al., 2011). L2 and L1 effects on the EPN and LPC were similar. The EPN reflected a 50–100 ms processing delay (Conrad et al., 2011), suggesting that the effects of emotional content are present in both languages and reflect quantitative differences in processing emotion words between languages. However, both groups included late learners of L2 with different L1 (German or Spanish). Since both bilingual groups reported similar L2 proficiency, it remains unclear whether controlling for proficiency in two otherwise similar groups of bilinguals would yield different results. Additionally, the LDT requires shallow analysis of words, which is more likely to be reflected on the EPN than on the LPC. It is unknown whether the same results would be obtained using a task requiring deeper semantic processing.
Furthermore, Opitz and Degner (2012) employed a lexical monitoring task (LMT) with 16 French–German and 17 German–French bilingual participants living in Germany. The experiment included negative, positive, and neutral nouns, with no distinctions between emotion-label or emotion-laden, as well as 30 pseudowords. Both groups showed greater EPNs for positive and negative words in both languages compared to neutral words, suggesting that there was no effect of language on amplitude, but there was a delay in L2. The EPN was consistently enhanced for emotion words when compared to neutral words. Differences in latency for emotion words were observed for all participants in L2 when compared to neutral words, possibly corresponding to similar attentive processes and to differences in the time course of the conceptual identification of words in L2. These results suggest that the lexical access to emotional words in highly proficient L2 users is delayed due to higher interference between the lexical representations of the languages and requires more cognitive resources, attenuating the perception of emotional valence in words (Opitz & Degner, 2012). The longer latency in the EPN is consistent with Conrad et al.'s (2011) findings; both studies used similar bilingual groups, with a different AoA but similar proficiency. Proficiency was controlled to ensure that both groups were comparable in their L2 proficiency, but not as a between-groups factor. Both studies analyzed recordings obtained while performing LDTs or LMTs, which only require distinguishing between words and nonwords. Further exploration into deeper levels of semantic processing, particularly in the LPC component, might detect effects beyond mere attentional processes.
Moreover, Chen et al. (2015) tested 24 Chinese–English bilinguals who had never been in an English-speaking environment. The LDT included positive, neutral, and negative words and pseudowords. “Emotional words” included a combination of emotion-label and emotion-laden words obtained from several sources (ANEW; Bradley & Lang, 1999; Eilola et al., 2007; Harris, 2004; Sutton et al., 2007). All conditions evoked similar early ERP components (P1 and N2). Emotion words elicited larger negative deflections than neutral words in L1, consistent with the EPN. Emotion words generated smaller positive waves than neutral words starting at the 500–800 ms window at centroparietal electrode sites. In L2, neutral words presented lower positivity than emotion words at the parietal sites during the 400–500 ms window (Chen et al., 2015). The effects of emotion words in L2 were significantly delayed compared to previous studies and resembled the scalp distribution of the N400 rather than the LPC. The sample included late unbalanced bilinguals, with low L2 proficiency and no immersion in the L2 environment, but there was no comparison among bilinguals with different linguistic profiles. Amplitude variations might be due to differences in processing between languages, as indicated by the detection of a component resembling the distribution of the N400, which is associated with semantic processing.
Conrad et al. (2011) and Opitz and Degner (2012) proposed that emotion word processing differences could be only quantitative, as reflected in delayed latencies in L2 compared to L1. On the other hand, Chen et al. (2015) suggested that the differences might be qualitative, as indicated by dissimilarities between languages in ERP amplitudes. Since previous studies only included unbalanced bilinguals (with lower levels of proficiency in L2), these discrepancies could be addressed by comparing two bilingual groups with the same OoA of languages but with varying proficiency. Previous studies report a delay in the onset of ERP components in the less proficient language (Houlihan, Stelmack & Campbell, 1998); therefore, comparing balanced and unbalanced bilinguals would be useful. The disparities in proficiency might be reflected not only in the latencies, but also in amplitude (Yang, Perfetti, Tan & Jiang, 2018). Additionally, a deeper processing task could capture differences in the processing of emotion words between languages, a finding that would be more evident in the LPC, which is more sensitive to the effect of emotional valence during semantic processing (Fischler & Bradley, 2006) and is attenuated in shallow processing tasks (Palazova, Mantwill, Sommer & Schacht, 2011).
Aims and hypotheses
The current study aimed to investigate emotions in bilinguals with two levels of language proficiency (balanced and unbalanced) through ERP analysis. The balanced bilingual group had comparable levels of proficiency in both languages, while the unbalanced group had dissimilar levels of proficiency between languages, as indicated by the Bilingualism Index (BI; see Method section). The present study analyzed the electrophysiological correlates of emotion words by comparing ERPs evoked by exposing participants to negative, neutral, and positive words in the visual modality and requiring them to rate their valence in three categories (negative, neutral, and positive) in both languages (English and Spanish). Based on previous findings, it was expected that emotion content processing differences would be more evident for the unbalanced than for the balanced group, considering that those with more similar levels of proficiency were expected to have comparable processing and emotional reactivity across their languages. Four hypotheses were tested:
1) EPN Latency: The latency of the EPN, associated with early processing and automatic activation of emotional connotation in emotionally valenced words, was explored. Consistent with previous findings (Conrad et al., 2011; Opitz & Degner, 2012), we expected to find significant differences in EPN latency between languages, with the latency for emotion words in this component greater in L2 in the unbalanced group but not in the balanced group. An interaction was expected between language and valence, with balanced bilinguals presenting similar EPN latencies in both languages for emotion but not neutral words, and unbalanced bilinguals presenting longer EPN latencies for emotion words presented in L2 but not in L1. The 2 x 2 x 3 x 3 General Linear Model (GLM) analysis included two levels of proficiency (balanced and unbalanced), two languages (English and Spanish) and three levels of stimulus valence (negative, neutral, and positive) and was performed with the latency of the EPN peak extracted from the 200–400 ms time window in three occipital electrodes (Ch16-O1, Ch17-Oz, and Ch18-O2) as dependent variables.
2) EPN Amplitude: Previous studies have not found differences in the amplitude of this component when comparing a bilingual's languages. The effect of valence was expected to be evident in larger negative deflections for emotion words (negative and positive) than for neutral words, as supported by findings from monolingual studies (Citron, 2012). In addition, since ERPs were elicited by verbal content, the EPN amplitude was expected to present left hemisphere dominance across all word categories (Herbert, Junghöfer & Kissler, 2008). The 2 x 2 x 3 x 3 GLM analysis included two levels of proficiency (balanced and unbalanced), two languages (English and Spanish) and three levels of stimulus valence (negative, neutral, and positive), and was performed with the amplitude of the EPN peak extracted from the 200–400 ms time window in three occipital electrodes (Ch16-O1, Ch17-Oz, and Ch18-O2) as dependent variables.
3) LPC Latency: Since the LPC is considered to reflect more elaborative processing, the effect of emotional valence in L2 compared to L1 might become more evident in this component when the task requires a conscious appraisal of valence. ERPs were expected to present longer latencies in L2 for emotion words but not for neutral words. Greater latencies were expected for negative compared to neutral, and neutral compared to positive words when presented in L1, which was expected to be attenuated by the proficiency factor, with the unbalanced group presenting the largest differences in latency. The 2 x 2 x 3 x 3 GLM analysis included two levels of proficiency (balanced and unbalanced), two languages (English and Spanish), and three levels of stimulus valence (negative, neutral, and positive) and was performed with the latency of the LPC peak extracted from the 400–650 ms time window from three parietal electrodes (Ch14-P3, Ch13-Pz, and Ch19-P4) as dependent variables.
4) LPC Amplitude: The effects of proficiency, language, and valence were expected to be evident in differences in amplitude of this component in L2 compared to L1. ERPs were expected to present greater amplitudes in L2 for emotion words but not for neutral words, with greater amplitudes observed for negative compared to neutral, and for neutral compared to positive words when presented in L1, which would be expected to be attenuated by the proficiency factor, with the unbalanced group presenting the largest differences in amplitude. The 2 x 2 x 3 x 3 GLM analysis included two levels of proficiency (balanced and unbalanced), two languages (English and Spanish), and three levels of stimulus valence (negative, neutral, and positive) and was performed with the amplitude of the LPC peak extracted from the 400–650 ms time window from three parietal electrodes (Ch14-P3, Ch13-Pz, and Ch19-P4) as dependent variables.
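The peak measures named in the four hypotheses can be illustrated with a minimal sketch. It assumes a single averaged waveform for one electrode, sampled at 500 Hz, with an epoch that starts 200 ms before word onset (both values are reported in the Method section). The function name `peak_in_window` and the toy waveform are hypothetical, and this is not the routine implemented in the analysis software used in the study.

```python
def peak_in_window(wave, start_ms, end_ms, polarity, srate_hz=500, epoch_start_ms=-200):
    """Return (latency_ms, amplitude) of the extreme point inside a time window.

    polarity = -1 picks the most negative point (e.g., EPN, 200-400 ms);
    polarity = +1 picks the most positive point (e.g., LPC, 400-650 ms).
    """
    ms_per_sample = 1000 / srate_hz
    # Convert window limits (relative to word onset) into sample indices.
    first = round((start_ms - epoch_start_ms) / ms_per_sample)
    last = round((end_ms - epoch_start_ms) / ms_per_sample)
    window = wave[first:last + 1]
    best = max(range(len(window)), key=lambda i: polarity * window[i])
    latency_ms = epoch_start_ms + (first + best) * ms_per_sample
    return latency_ms, window[best]


# Toy waveform (assumption): 501 samples covering -200 to 800 ms,
# flat except for one negative and one positive deflection.
wave = [0.0] * 501
wave[250] = -4.0   # sample 250 corresponds to 300 ms
wave[375] = 6.0    # sample 375 corresponds to 550 ms

epn = peak_in_window(wave, 200, 400, polarity=-1)
lpc = peak_in_window(wave, 400, 650, polarity=+1)
```

In this sketch, each participant would contribute one latency and one amplitude value per electrode, language, and valence category, which is the form of data the GLM analyses described above take as dependent variables.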
Method
Participants
Participants were recruited from the student body at Florida Atlantic University from different courses and received extra credit for participating. The initial sample consisted of 78 Spanish–English bilinguals. During the selection process, two English–Spanish bilinguals whose first language was English were excluded because they differed from the rest of the bilingual sample that reported Spanish as L1. Six participants were excluded because they were more proficient in Spanish than in English, which made their profiles different from the rest of the participants whose most proficient language was English. Two trilingual participants, one with L1 Italian and one with L1 Portuguese, were also removed. One participant diagnosed with epilepsy and under anticonvulsant treatment was also excluded. Four participants were excluded because of problems associated with EEG recording errors and 25 because of low-quality EEG data. The remaining 20 participants were excluded during artifact rejection because they exceeded the 20% maximum threshold for rejected trials (see EEG data analysis section). These artifacts were related to movement, perspiration, and a faulty USB cable.
The final sample in this Institutional Review Board (IRB) approved study (Florida Atlantic University-IRB) included 42 Spanish–English bilinguals (83.3% female), divided into two subgroups based on Bilingualism Index (BI) scores (Gollan, Salmon, Montoya & Galasko, 2011; Rosselli et al., 2019). The BI was created from the Spanish and English proficiency variables from the Language Experience and Proficiency Questionnaire (LEAP-Q; Marian, Blumenfeld & Kaushanskaya, 2007) and was calculated by dividing the lowest by the highest proficiency score for each participant. A score closer to zero indicates unequal levels of proficiency between languages, while a score of one indicates equal levels of proficiency in both languages. The bilingual sample was divided based on the median (Med = .8), aiming for a similar sample size for the balanced (BI ≥ .8) and unbalanced (BI < .8) groups. This method is sample-specific, and the high median cut-off point reflects the high proficiency in both languages.
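The BI computation and the median split described above can be sketched as follows. The participant identifiers and proficiency scores are invented for illustration only, and the function name `bilingualism_index` is hypothetical.

```python
from statistics import median


def bilingualism_index(spanish, english):
    """Lowest proficiency divided by highest; 1.0 means equal proficiency.

    Assumes the higher of the two scores is greater than zero.
    """
    low, high = sorted((spanish, english))
    return low / high


# Hypothetical (Spanish, English) proficiency scores on the 0-10 LEAP-Q scale.
scores = {
    "p01": (9.0, 9.0),
    "p02": (6.0, 10.0),
    "p03": (8.0, 10.0),
    "p04": (5.0, 10.0),
}

bi = {pid: bilingualism_index(sp, en) for pid, (sp, en) in scores.items()}

# Median split: at or above the median counts as balanced, below as unbalanced.
cutoff = median(bi.values())
groups = {pid: "balanced" if value >= cutoff else "unbalanced"
          for pid, value in bi.items()}
```

Because the cut-off is the sample median, the same participant could fall into a different group in a sample with a different proficiency distribution, which is why the text describes the method as sample-specific.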
The bilingual subsamples included 22 balanced and 20 unbalanced participants. Balanced and unbalanced bilinguals were significantly different in Spanish proficiency, F(1,40) = 148.17, p < .001, ηp² = .79, and BI, F(1,40) = 165.43, p < .001, ηp² = .81, but not in English proficiency, F(1,40) = .54, p = .47, ηp² = .01, age, F(1,40) = .54, p = .47, ηp² = .01, or education, F(1,40) = .001, p = .97, ηp² = .00 (see Table 1 for additional demographic information).
Note. (*p < .001) Table 1 depicts the Means and Standard Deviations for the variables included in Univariate analyses for sample characterization purposes in the Participants section.
Participants were exposed to Spanish since birth, so the age of exposure to Spanish was equivalent. There were no significant differences in the age of exposure to English, F(1,40) = 1.51, p = .23, ηp² = .04. The groups differed significantly in Spanish exposure, F(1,40) = 7.72, p = .014, ηp² = .16, but not in English exposure, F(1,40) = 3.53, p = .07, ηp² = .08 (see Table 1). Information about the scale, the BI, proficiency, and exposure can be found in the LEAP-Q section.
Materials and procedure
Word rating task
The word rating task (WRT) was administered in both languages (English and Spanish) and required participants to rate words on a 1–9 valence scale (1 = most negative and 9 = most positive), and to enter zero for unknown words (Vélez-Uribe & Rosselli, 2017). Participants entered their ratings with their right hand using the numeric keypad. Trials with ratings of zero were excluded. Valence ratings were provided using the Self-Assessment Manikin (SAM; see Figure 1) rating system (Bradley & Lang, 1994), which has high validity when assessing emotional stimuli (Morris, 1995). The task was divided into two blocks, counterbalanced by language (English and Spanish), and included three categories of stimuli (negative, neutral, and positive), presented randomly within language blocks. The experimenters communicated and gave instructions to participants in the language corresponding to each block. Each condition (English and Spanish) included 330 words (110 per valence category), presented using E-Prime 2.0 software (Psychology Software Tools, Pittsburgh, PA, 2012) in 18-point Courier New font for 2000 ms, preceded by a 2000 ms fixation point and followed by a 2000 ms static white noise visual mask. This type of mask disrupts the memorization of words (Andrade, Kemps, Werniers, May & Szmalec, 2002), preventing them from affecting the following word. Refer to Figure 2 for an illustration of the procedure.
English words were selected from the Affective Norms for English Words (ANEW; Bradley & Lang, 1999). Spanish words were selected from the Spanish adaptation of ANEW, which includes 1,034 of the words from the English version of ANEW (Redondo, Fraga, Padrón & Comesaña, 2007). The Spanish and English ANEW words are strongly correlated, and the Spanish words have high validity and are useful for researchers comparing emotion reactivity in English–Spanish bilinguals (Redondo et al., 2007).
The words were selected to fit the three established categories aiming for 100 words per category, with an additional 10% to compensate for trials lost to EEG artifacts. Words with no available data in Spanish and cognates were excluded. Next, 200 words within each valence range were randomly selected and screened in terms of accuracy of translation and international applicability by four raters selected from lab personnel, resulting in the final set of 110 words per category. The words had the following characteristics: a) negative (valence M = 2; Range: 1–3), b) neutral (valence M = 5; Range: 4–6), and c) positive (valence M = 8; Range: 7–9). All selected words in both languages from the normative data (ANEW) and the current study are included in the Appendix, while descriptive statistics for the WRT are included in Table 2. We did not collect arousal data, but analyses of variance (ANOVAs) were conducted to ensure equivalence between the selected words in English, F(1,128) = .156, p = .693, and in Spanish, F(1,128) = .518, p = .692.
Note. Table includes Means and Standard Deviations for the Word Rating task (WRT).
Measures of language experience
Language Experience and Proficiency Questionnaire (LEAP-Q): The LEAP-Q is a self-report questionnaire for assessing proficiency, with high levels of reliability and validity (Marian et al., 2007). Proficiency scores are divided into three subscales (speaking, understanding spoken language, and reading) on a 0 to 10 scale as follows: 0 = none, 1 = very low, 2 = low, 3 = fair, 4 = slightly less than adequate, 5 = adequate, 6 = slightly more than adequate, 7 = good, 8 = very good, 9 = excellent, 10 = perfect. Proficiency was calculated by averaging the scores from the three subscales for each language. The BI (see Participants section) was calculated using the Spanish and English proficiency scores.
For sample characterization purposes, exposure scores were calculated for each language by averaging scores from six LEAP-Q questions that provide information about the amount of exposure when speaking to friends, family, when watching TV, listening to music, reading, and time spent learning in a self-instruction setting. Scores are provided on a scale from 0–10, with zero indicating no exposure and 10 indicating complete exposure. Detailed information about the linguistic profile of the sample can be found in Table 3.
Procedure
The experiment was conducted in a quiet, dimly illuminated room. After providing informed consent, participants were prepared, beginning with head measurements for cap fitting and electrode positioning purposes, followed by electrode gelling in preparation for EEG recording. Participants first completed the questionnaires providing demographic, linguistic, and educational background information, followed by the Word Rating Task (WRT), which they performed while EEG was being recorded. Bilingual participants completed both counterbalanced English and Spanish blocks of the WRT. The session lasted about three hours.
EEG data were recorded reference-free using an ActiChamp (Brainproducts GmbH, Germany) at a 500 Hz sampling rate with a bandpass filter of .01–100 Hz, while all electrode impedances were kept below 10 kΩ. The 32 active electrodes, positioned according to the 10–20 system (Sharbrough, 1991), were attached to the actiCAP (Brainproducts GmbH, Germany) with the ground electrode placed on the forehead. A conductive gel was applied between the electrodes and the scalp to decrease impedance. EEG data were recorded using Pycorder software (Brainproducts GmbH, Germany). Words were presented using E-Prime 2.0 software (Psychology Software Tools, Pittsburgh, PA, 2012). Event markers for ERP averaging were recorded into the EEG data at word onset.
EEG data analysis
As mentioned above, participants rated words on a 1–9 valence scale and entered zero for unknown words. Valid trials in the ERP analysis included items with nonzero responses; therefore, the first step during data analysis was to recode trials with “zero” responses to be excluded from data analyses in English (M = 1.09) and in Spanish (M = 2.03). EEG data analysis was conducted using Analyzer 2.0 Software (Brain Products, Germany). The data were re-referenced offline to the linked mastoids reference. Filtering was performed with a Butterworth Zero Phase Filter with a low cutoff of 0.1 Hz and a high cutoff of 30 Hz, with a notch filter of 60 Hz. Segmentation was conducted based on the word onset marker in epochs beginning 200 ms before stimulus presentation and ending 800 ms after.
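The trial selection and segmentation steps can be illustrated with a short Python sketch. It assumes the 500 Hz sampling rate reported in the Procedure and treats a rating of zero as an unknown word; the function names are hypothetical, the filtering stage is not reproduced, and this is not the Analyzer 2.0 implementation.

```python
def valid_trials(ratings):
    """Indices of trials with a nonzero rating (zero marks an unknown word)."""
    return [i for i, r in enumerate(ratings) if r != 0]

def epoch_bounds(onset_sample, sfreq=500.0, tmin=-0.2, tmax=0.8):
    """Return (start, stop) sample indices for one epoch; stop is exclusive.

    tmin and tmax are given in seconds relative to the word onset marker.
    """
    start = onset_sample + round(tmin * sfreq)
    stop = onset_sample + round(tmax * sfreq)
    return start, stop

# Hypothetical ratings and a hypothetical onset marker at sample 1000.
keep = valid_trials([5, 0, 9, 1, 0])
start, stop = epoch_bounds(1000)
print(keep, start, stop, stop - start)
```

At 500 Hz, the 200 ms pre-stimulus baseline corresponds to 100 samples and the 800 ms post-stimulus interval to 400 samples, so each epoch spans 500 samples under this convention.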
Epochs containing excessive artifacts were eliminated. Artifact rejection was performed in semi-automatic mode and verified by visual inspection. The maximal voltage step allowed between adjacent points in all channels was 75 μV, and the maximal allowed difference of values within an interval was 150 μV (interval length: 200 ms). Only amplitudes ranging from -100 μV to 100 μV were included. Participants with artifacts in more than 20% of segments were excluded. The recommended criterion for artifact rejection is a consistent threshold, with a maximum of 25% of trials rejected for included participants (Luck, 2014).
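The three rejection criteria and the participant-level exclusion rule can be sketched in Python as follows. This is only an illustration for a single channel: the function names are hypothetical, a sliding window is assumed for the 200 ms interval criterion, and the exact way the analysis software applies these checks across channels is not specified in the text.

```python
def is_artifact(epoch_uv, sfreq=500.0, max_step=75.0, max_diff=150.0,
                window_ms=200.0, amp_limit=100.0):
    """Flag one single-channel epoch (in microvolts) that violates a criterion."""
    if not epoch_uv:
        return False
    # Amplitude criterion: every sample must lie within +/- amp_limit.
    if any(abs(v) > amp_limit for v in epoch_uv):
        return True
    # Gradient criterion: voltage step between adjacent samples.
    if any(abs(b - a) > max_step for a, b in zip(epoch_uv, epoch_uv[1:])):
        return True
    # Max-min criterion inside each sliding window of window_ms (assumed).
    win = max(2, round(window_ms * sfreq / 1000.0))
    for i in range(max(1, len(epoch_uv) - win + 1)):
        segment = epoch_uv[i:i + win]
        if max(segment) - min(segment) > max_diff:
            return True
    return False

def exclude_participant(flags, max_fraction=0.20):
    """True when more than max_fraction of a participant's epochs are flagged."""
    return sum(flags) / len(flags) > max_fraction

# Hypothetical epochs: a flat trace, a sudden 80 uV step, and a slow drift.
clean = [0.0] * 500
step = [0.0] * 250 + [80.0] * 250
drift = [-90.0 + 1.8 * i for i in range(101)]
print(is_artifact(clean), is_artifact(step), is_artifact(drift))
```

The drift example stays within ±100 μV and has small sample-to-sample steps, so it is caught only by the max-min criterion, which is why the three checks are kept separate.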
Averaging was conducted for each participant in each language and within each word category. Peak detection was then conducted on averaged data for the EPN component in sub-segments from 200–400 ms in the electrodes of interest (Occipital electrodes: O1, O2, and Oz), and for the LPC component in sub-segments from 400–650 ms post-stimulus onset for the electrodes of interest (Parietal electrodes: Pz, P3, and P4). Latency and amplitude for each component's peak for each participant in each language, word category, and electrode were extracted to be used as dependent variables.
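Peak extraction within a component window can be sketched in Python as below. The sketch assumes an averaged waveform sampled at 500 Hz that begins 200 ms before word onset, and it simply takes the most extreme sample in the window; the function name, the `polarity` parameter, and the example waveforms are hypothetical, and the peak-detection algorithm of the analysis software may differ (for example, by requiring a local maximum).

```python
def peak_in_window(erp_uv, sfreq=500.0, baseline_ms=200.0,
                   window_ms=(200.0, 400.0), polarity=1):
    """Return (latency_ms, amplitude_uv) of the extreme sample in a window.

    erp_uv starts baseline_ms before stimulus onset. polarity=1 searches for
    the most positive sample, polarity=-1 for the most negative one.
    """
    def to_index(t_ms):
        return round((t_ms + baseline_ms) * sfreq / 1000.0)

    first, last = to_index(window_ms[0]), to_index(window_ms[1])
    idx = max(range(first, last + 1), key=lambda i: polarity * erp_uv[i])
    latency_ms = idx * 1000.0 / sfreq - baseline_ms
    return latency_ms, erp_uv[idx]

# Hypothetical averaged waveforms (500 samples = -200 to 800 ms at 500 Hz).
lpc_wave = [0.0] * 500
lpc_wave[365] = 5.0            # positive deflection 530 ms after onset
epn_wave = [0.0] * 500
epn_wave[255] = -3.0           # negative-going deflection 310 ms after onset

lpc_peak = peak_in_window(lpc_wave, window_ms=(400.0, 650.0), polarity=1)
epn_peak = peak_in_window(epn_wave, window_ms=(200.0, 400.0), polarity=-1)
print(lpc_peak, epn_peak)
```

Applying the function once per participant, language, word category, and electrode would yield the latency and amplitude values used as dependent variables.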
Results
Word rating task
To determine if the WRT was successful in detecting differences between word categories, a 2 (language) x 3 (valence) GLM analysis was conducted using the valence ratings obtained (see Table 2). The analysis resulted in a nonsignificant main effect of language, F(1, 39) = .09, p = .77, ηp2 = .002. For valence, Mauchly's test of sphericity was significant, χ2(2) = 29.71, p < .001; therefore, the degrees of freedom were adjusted according to the Greenhouse-Geisser correction (ε = .65), and there was a significant main effect of valence, F(1.3, 50.57) = 456.74, p < .001, ηp2 = .92. Results from the post-hoc analyses indicated significant differences between positive and neutral (p < .001), neutral and negative (p < .001), and positive and negative words (p < .001). The interaction between language and valence was not significant (p > .05). Valence ratings were equivalent in both languages, and the ratings for all word categories fell within the expected valence values for the selected stimulus set, while the significant differences between word categories confirm that the emotion effect was present.
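The relation between the reported statistics can be checked with a brief Python sketch. It scales the degrees of freedom by the Greenhouse-Geisser estimate and recovers partial eta squared from F and its degrees of freedom, ηp2 = (F × df1) / (F × df1 + df2). The unadjusted degrees of freedom (2, 78) are an assumption inferred from the three valence levels and the error df of 39 reported for language; the function names are hypothetical.

```python
def greenhouse_geisser_df(epsilon, df_effect, df_error):
    """Scale both degrees of freedom by the Greenhouse-Geisser epsilon."""
    return epsilon * df_effect, epsilon * df_error

def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta squared from an F statistic and its df."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# Valence effect: assumed unadjusted df of (2, 78), epsilon rounded to .65.
df1, df2 = greenhouse_geisser_df(0.65, 2, 78)
print(round(df1, 2), round(df2, 2))

# Effect sizes recomputed from the reported F values and df.
eta_valence = partial_eta_squared(456.74, 1.3, 50.57)
eta_language = partial_eta_squared(0.09, 1, 39)
print(round(eta_valence, 2), round(eta_language, 3))  # 0.92 0.002
```

Because the epsilon is reported to two decimals, the recomputed error df (about 50.7) only approximates the reported 50.57; the recomputed effect sizes agree with the reported values of .92 and .002.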
Control for Priming Effects: The order of administration of the tasks was counterbalanced, but the analysis was redone, including it as a covariate to exclude the possibility of it having an effect. The significance of the results remained the same and the effect of order of administration was not significant, F(1,38) = .05, p = .82, ηp2= .001.
EPN: Event-related potentials
The EPN was observed as a reduction of positivity on the ERP wave, with a mean peak onset at 310.24 ms and an amplitude of 1.08 μV. Two (balanced and unbalanced bilinguals) x 2 (language) x 3 (valence) x 3 (electrodes: O1, O2, and Oz) GLM analyses were conducted separately for two dependent variables: latency and amplitude (Tables 4 and 5). The ERP waves for each bilingual group, in each language for all valence categories, are depicted in Figures 3–6.
EPN latency
The analysis of EPN latency did not result in significant differences between bilingual groups, F(1,38) = .07, p = .79, ηp2 = .002, and the interactions of bilingual group with language, valence, and electrode were not significant (p > .05). The main effect of language, F(1,38) = 4.98, p = .03, ηp2 = .12 was significant, with greater latencies in English than Spanish. The main effect of electrode, F(2,37) = .98, p = .70, ηp2 = .02, and valence, F(2,37) = 1.54, p = .22, ηp2 = .06, were not significant. The interaction between language and valence, F(2,27) = .35, p = .71, ηp2= .02, and between valence and electrode, F(4,35) = .81, p = .53, ηp2= .08, were not significant. However, the interaction between language and electrode, F(2,37) = .30, p = .048, ηp2= .15, was significant. The three-way interactions and the four-way interaction were not significant (p > .05). The means and standard deviations for the ERPs latencies are depicted in Table 4.
EPN amplitude
The analysis of EPN amplitude resulted in a nonsignificant main effect of bilingual group, F(1,38) = 1.34, p = .25, ηp2 = .03, and the interactions between bilingual group and language, valence, and electrode were not significant (p > .05). The main effect of language was significant, F(1, 38) = 4.71, p = .04, ηp2 = .11. The amplitude was larger for words in Spanish than in English. The main effect of valence was also significant, F(2, 37) = 4.10, p = .03, ηp2 = .18. The results of the post-hoc analyses indicated significant differences between negative and positive (p < .01) and between neutral and positive (p < .05), but not between neutral and negative words (p = .18). The amplitudes were larger for positive than for neutral words, and for neutral than for negative words. The main effect of electrode was not significant, F(2,37) = 1.71, p = .20, ηp2 = .08. The interaction between language and valence was not significant, F(2,37) = .57, p = .57, ηp2= .03. The interactions between language and electrode and between valence and electrode were also not significant, F(2,37) = .33, p = .72, ηp2 = .02, and F(4,35) = 1.18, p = .34, ηp2= .12, respectively. The three-way interactions and the four-way interaction were not significant (ps > .05; Table 5).
LPC
The LPC had an overall amplitude of 5.18 μV and presented as an increased positivity with a mean peak onset of 531.47 ms, consistent with the characteristics of the LPC (Luck, 2014). Two (bilingual group) x 2 (language) x 3 (word category) x 3 (electrode: Pz, P3, and P4) GLM analyses were conducted separately on two dependent variables: latency and amplitude.
LPC latency
The analysis of LPC latency resulted in a nonsignificant main effect of bilingual group, F(1,38) = .07, p = .79, ηp2 = .002, and the interactions of bilingual group with language, valence, and electrode were not significant (p > .05). The main effect of language, F(1,38) = 3.23, p = .08, ηp2 = .08, and the main effect of valence, F(2,37) = .33, p = .72, ηp2 = .02, were not significant. Mauchly's test of sphericity was significant for the main effect of electrode, χ2(2) = 18.55, p < .001; therefore, the degrees of freedom were adjusted according to the Greenhouse-Geisser correction (ε = .72). The effect of electrode was significant, F(1.43, 54.51) = 3.77, p = .04, ηp2 = .09. Post-hoc analyses indicated significant differences between the midparietal electrode, Pz, and the right electrode, P4 (p < .01). The interaction between language and electrode, F(2,37) = 3.46, p = .04, ηp2 = .16, was significant. The interactions between language and valence, F(2,37) = 1.31, p = .28, ηp2 = .07, and valence and electrode, F(4,35) = 1.62, p = .19, ηp2= .16, were not significant. In addition, the three-way interactions and the four-way interaction were not significant (p > .05).
LPC amplitude
Analysis of LPC amplitude resulted in a nonsignificant main effect of bilingual group, F(1,38) = .63, p = .43, ηp2 = .02. The interactions of bilingual group with language, valence, and electrode were not significant (p > .05).
There was a significant main effect of language, F(1,38) = 11.33, p = .002, ηp2 = .23, with larger overall amplitudes in English than in Spanish. In addition, there was a significant main effect of valence, F(2, 37) = 6.75, p = .003, ηp2 = .27. Post-hoc analyses indicated significant differences in amplitude between negative and neutral (p = .008) and between neutral and positive (p = .001), but not between positive and negative words (p = .46). The mean amplitude was larger for positive than for negative words, which, in turn, was larger than that of neutral words. Mauchly's test of sphericity was significant for the main effect of electrode, χ2(2) = 7.15, p < .05; therefore, the degrees of freedom were adjusted according to the Greenhouse-Geisser correction (ε = .85). The effect of electrode was significant, F(1.70, 64.65) = 52.32, p < .001, ηp2 = .58. The post-hoc analyses indicated significant differences between the midparietal electrode, Pz, and the right parietal electrode, P4 (p < .001), and between the two lateral electrodes, P3 and P4 (p < .001), but not between the midparietal electrode, Pz, and the left parietal electrode, P3 (p = .24). The amplitude over the midparietal electrode, Pz, was larger than over the left electrode, P3, which in turn presented larger amplitude than the right electrode, P4.
In addition, there was a significant interaction between language and valence, F(2,37) = 9.06, p = .001, ηp2 = .33, where negative words resulted in larger amplitudes than positive words in English, but in smaller amplitudes than positive words in Spanish. The interaction between language and electrode F(2,37) = 1.62, p = .21, ηp2 = .08 was not significant; however, the interaction between valence and electrode, F(4,35) = 3.05, p = .03, ηp2 = .26, was significant. All valence categories presented the largest amplitudes over the midparietal electrode, Pz, and greater amplitudes over the left, P3, than the right electrode, P4. However, the amplitudes for positive words were larger than for negative, and for negative than neutral words over the left, P3, and midparietal, Pz electrode, but over the right electrode, P4, the amplitude for negative words was larger than for positive words, and for positive than neutral words. The Language x Valence x Bilingual group interaction was significant, F(2,37) = 3.87, p = .03, ηp2= .17. The balanced group presented greater amplitudes for emotion words than for neutral words in both languages, as did the unbalanced group in English. However, the unbalanced group presented a different pattern in Spanish, where positive words elicited greater amplitudes than neutral words and neutral than negative. All the remaining three-way interactions and the four-way interaction were not significant (p > .05). For a comparison of ERPs between the balanced and unbalanced groups per language and valence, refer to Figure 7.
Control for priming effects
The effect of order of administration as a covariate was not significant in the EPN latency model, F(1,37) = 2.27, p = .13, ηp2= .06, or in the EPN amplitude model, F(1,37) = .12, p = .73, ηp2= .003. The effect was also nonsignificant in the LPC latency model, F(1,37) = .06, p = .76, ηp2 = .003, and in the LPC amplitude model, F(1,37) = .65, p = .43, ηp2= .02.
The study data can be made available upon request.
Discussion
The present study aimed to analyze electrophysiological correlates of emotion word processing in Spanish–English bilinguals with different levels of language proficiency. The bilingual sample was divided into two groups: balanced (similar proficiency levels) and unbalanced (different proficiency levels). ERPs were obtained while participants performed a valence word rating task (WRT; Vélez-Uribe & Rosselli, 2017).
Four hypotheses were formulated based on two ERP components highlighted in the previous literature about the processing of emotion words in monolinguals (Citron, 2012; Hajcak et al., 2012; Luck, 2014) and bilinguals (Chen et al., 2015; Conrad et al., 2011; Opitz & Degner, 2012): the EPN and the LPC. The hypotheses aimed to test the differences between the bilingual groups in both languages, English and Spanish, and three valence categories, negative, neutral, and positive, analyzing latency and amplitude for each component. Since similar levels of language proficiency would be reflected in comparable emotional reactivity between languages, the overall expectations included greater differences between languages in the valence effect (processing advantage of emotion words, positive and negative, over neutral words) in the unbalanced group compared to the balanced group.
Longer EPN latencies were expected for emotion words when compared to neutral words for the unbalanced but not for the balanced group. The unbalanced group was expected to show differences between languages, with longer latencies for emotion words in English (L2) than in Spanish (L1). Results indicated no differences between balanced and unbalanced bilinguals. The latencies differed, however, between languages, with longer latencies in English than in Spanish for both groups.
There were no significant differences in latencies across valence categories, but the mean latencies presented the expected pattern, with shorter latencies for emotion words than neutral words across conditions. The latencies of ERP components act as an index of processing speed and indicate the time course of cognitive processes (Luck, 2014). The current findings are consistent with the idea that emotional stimuli are inherently salient (Hajcak et al., 2012) and recruit attentional resources earlier than neutral stimuli (Citron, 2012; Luck, 2014), which facilitates their processing by attributing greater perceptual relevance to them (Palazova et al., 2011). These results suggest that, like the results reflected in the WRT scores, the task elicited a valence effect (a processing advantage of emotional over neutral stimuli), but it was similar in both languages for both bilingual groups. The language effect is consistent with previous studies in bilinguals that found significant differences in EPN latencies for emotion words between languages, showing a delay in processing reflected in longer latencies in L2 (Conrad et al., 2011; Opitz & Degner, 2012). In our sample, latencies were shorter in English (L2) than in Spanish (L1), which could be an effect of higher English proficiency. Even though the current sample is highly bilingual and lives immersed in a highly bicultural environment, South Florida, US, most of their education was in English, which could be reflected in more efficient processing in English than in Spanish.
The second hypothesis regarding the EPN amplitude expected emotion words to elicit larger amplitudes than neutral words across the bilingual groups, and significant differences between languages. The bilingual groups did not differ in the EPN amplitude. However, the amplitude differed between emotion and neutral words. The overall amplitude of the EPN was larger for positive than neutral, and for neutral than negative words. These results are partially inconsistent with the expected configuration of the EPN: positive words are expected to elicit larger amplitudes, as we observed (Luck, 2014; Weinberg & Hajcak, 2010), but negative words showed the unusual pattern of smaller amplitudes than neutral words. However, a similar pattern has been observed in the P1, a component that occurs before the EPN, with similar scalp distribution (Scott, O'Donnell, Leuthold & Sereno, 2009).
The effect of language was significant, and the overall EPN amplitude was significantly larger for words in Spanish than in English, in both bilingual groups across valence categories. This difference might reflect the higher English proficiency of both groups, resulting in larger amplitudes in the less proficient language (Yang et al., 2018). The results were similar for both groups, suggesting that proficiency differences might not have been sufficient to detect valence differences between languages. These findings are not entirely consistent with previous literature on the EPN in bilinguals, which has successfully elicited a valence effect, but has not found differences in the amplitude of the EPN between languages (Chen et al., 2015; Conrad et al., 2011; Kim, 1993; Opitz & Degner, 2012). It is possible that task-related demands increased the amount of processing required, and intensified the differences between languages, favoring the most proficient language. Altogether, results suggested that the effects of valence on automatic processing can be elicited in both languages in bilinguals and that the differences are subtle and difficult to detect at the level of the EPN amplitude. The current sample reported high levels of proficiency in both languages: they were mainly educated in English, and most participants reported English as their dominant language, a characteristic that can modify the emotion-cognition coupling differences between languages by favoring the dominant language (Harris et al., 2003). The differences between languages combined with the lack of interaction between language and valence might indicate a high level of emotion-cognition coupling in both languages, at least at the level of processing reflected by the EPN.
It could also mean that the EPN is so sensitive to the effect of valence that it responds to emotion words as long as the meaning is known, regardless of other linguistic factors, such as order of acquisition (Conrad et al., 2011; Opitz & Degner, 2012), L2 proficiency (Kim, 1993), or lack of immersion in the L2 environment (Chen et al., 2015). The second hypothesis also predicted an overall left hemisphere dominance for the amplitude of the EPN, corresponding with linguistic processing, but the effect of electrode and its interactions with other factors were not significant.
The third hypothesis predicted overall longer latencies for emotion words than for neutral words, and that the unbalanced group would present longer latencies than the balanced group across word categories in the LPC. Even though the pattern was as predicted, with the unbalanced group presenting longer latencies than the balanced group, the differences were not significant. The LPC latency differed between electrodes, with longer latencies over the midparietal than the left parietal electrode, and over the left parietal than the right parietal electrode. The obtained latency distribution is consistent with the expected scalp distribution of the LPC (Citron, 2012; Luck, 2014). The latency of ERP components reflects the onset of the effects elicited by the stimulus (Palazova et al., 2011); therefore, it might reflect the earlier allocation of cognitive resources to language processing by the left hemisphere.
The fourth hypothesis predicted that the LPC amplitude would reflect the effects of bilingual group, language, and valence. It predicted larger amplitudes for emotion than for neutral words, with proficiency acting as an attenuating factor influencing greater differences between emotion and neutral words for the unbalanced than for the balanced group. There were no significant differences between bilingual groups, but there was a significant effect of language, where both groups presented larger amplitudes in English than in Spanish.
The significant effect of valence revealed that the LPC amplitude was larger for positive than negative and negative than neutral words. The interaction between language and valence was significant, which indicated differences in processing valence between languages. Negative words resulted in larger amplitudes than positive words in English, but the opposite occurred in Spanish. We found a significant interaction between bilingual group, language, and valence, where the LPC amplitude was larger for negative than for positive, and for positive than for neutral words for both groups in English, but this differed regarding the Spanish words. These effects are consistent with the advantage of emotional stimuli over neutral stimuli reflected in the amplitude of the LPC (Citron, 2012). Results were different for the unbalanced group: the amplitude was larger for positive than for neutral, and for neutral than for negative words. A similar valence effect has been reported in monolinguals in a word identification task (Hinojosa, Carretié, Valcárcel, Méndez-Bértolo & Pozo, 2009) and bilinguals in an LDT (Chen et al., 2015); this has been attributed to the early allocation of resources to processing emotion words, making it necessary to dedicate more resources to neutral words at later stages (Citron, 2012). These valence differences between bilingual groups across languages suggest that the balanced and unbalanced groups might process emotion content similarly in English, the most proficient language in the present sample, but differently in Spanish.
The valence effects reflected in the ERPs were consistent across languages for the balanced group, but not for the unbalanced group, suggesting similar emotional reactivity in their two languages for balanced bilinguals, which could also indicate that emotion-cognition coupling as reflected in the LPC is similar between languages for the balanced but not for the unbalanced group. The lower level of proficiency in Spanish for the unbalanced group, compared with the balanced group, might reflect attenuation of the valence effect for negative words in Spanish for this group. Most of the previous studies in bilinguals have not analyzed this component (Conrad et al., 2011; Opitz & Degner, 2012). However, Chen et al. (2015) did not detect any differences between languages, even though their participants had acquired their L2 in an instructional setting and had never been immersed in the L2 environment. However, since the LPC amplitude is selectively responsive to semantic processing and not evident in superficial tasks (Fischler & Bradley, 2006; Palazova et al., 2011), it could relate to the application of an LDT. Conversely, the differences detected in the present study might reflect the deeper processing elicited by the valence rating task (VRT), which might have been more sensitive to the attenuation of emotion effects in the less proficient language for the unbalanced group. Also, the attenuated amplitude for negative words in Spanish for the unbalanced group could indicate weaker emotional reactivity to negative words in the less proficient language.
However, it could also reflect that negative stimuli had greater relevance at earlier processing stages, as indicated by the EPN amplitudes for this group in Spanish, allowing for the allocation of resources to the processing of neutral words (Citron, 2012).
Furthermore, the LPC amplitude also differed between electrodes, with greater amplitudes over the midparietal than the left parietal electrode, and over the left parietal than the right parietal electrode. This distribution is related to the findings for the LPC latency, which showed longer latencies over the midparietal than the left parietal and the left parietal than the right parietal electrodes. The effect, however, varied across the valence dimension, as reflected by a significant interaction involving the two lateral electrodes. The amplitude for emotion words was greater than for neutral words on both sides. On the left side, the amplitude for positive words was higher than for negative words. On the right side, however, it was larger for negative than for positive words, consistent with previous findings (Zhang et al., 2017) for negative emotion-label words. The latency and amplitude distributions are consistent with the scalp distribution of the LPC, larger over the midparietal region (Citron, 2012; Luck, 2014), and might reflect a hemispheric lateralization, with more extended and effortful processing in the language-dominant left hemisphere (Opitz & Degner, 2012).
The results of the current study are not entirely comparable to previous ERP studies in the field of bilingualism and emotion, but some factors are worth highlighting. These comparisons should be interpreted with caution due to the different nature of the tasks, the languages involved, and the sample characteristics. Firstly, most studies have used LDT or LMT (Chen et al., 2015; Conrad et al., 2011; Opitz & Degner, 2012), which require the participants to judge the lexical characteristics by determining if the stimuli are words. These studies found longer EPN latencies in L2 and concluded that the differences between languages were only quantitative. We detected similar differences in the latency of the EPN. However, the valence effects were similar across bilingual groups and languages, indicating that this aspect of the EPN might not be sensitive to the differences in processing emotion content between languages in bilinguals, even with a VRT.
Secondly, previous studies have focused on the order of acquisition as a factor (Chen et al., 2015; Conrad et al., 2011; Opitz & Degner, 2012), controlling for proficiency as an inclusion criterion to ensure participants were sufficiently competent in L2. Only one study divided groups by proficiency levels. Kim (1993) included Becoming Bilinguals and Stable Bilinguals, similar to the balanced and unbalanced groups in the current study. However, the task was only applied in English (L2). Like the present study, Kim (1993) found no group differences and no differences in P300 amplitude. Considering that the LPC is included in the P300 family (Citron, 2012), the current results are comparable to those from Kim's (1993) study, but they differed regarding LPC amplitude in L2 for the unbalanced group. Since the LPC reflects deeper levels of processing, this might reflect the differences in processing demands between the VDT and our VRT.
Lastly, the bilinguals’ languages and the degree of immersion differed across studies, including Korean–English bilinguals living in Korea (Kim, 1993), German–Spanish and Spanish–German bilinguals living in Germany (Conrad et al., 2011), German–French and French–German bilinguals living in Germany (Opitz & Degner, 2012) and Chinese–English bilinguals living in China (Chen et al., 2015). The bilingual samples included in the current study were highly bilingual and were immersed in a highly bicultural environment. The differences in the results may be related to sample differences, the inherent qualities of the languages involved, and immersion factors.
The current study did not find significant differences between bilinguals, which could be attributed to the relatively high level of proficiency between languages in both groups. Replication of the current findings should include a less proficient group of bilinguals. The valence effects were consistent in English for latencies and amplitudes of both components, and in Spanish for the balanced group but not for the unbalanced group, suggesting that the valence effects, with emotion words showing different amplitudes than neutral words, can be elicited in both languages in bilinguals (Conrad et al., 2011; Opitz & Degner, 2012). However, this might be modulated by proficiency, as evidenced by the differences in LPC amplitude in Spanish for the unbalanced group.
Nevertheless, the current study presents limitations such as the small sample size. Increasing the size of these subsamples and enhancing diversity in proficiency levels in both languages might yield more conclusive results. The inclusion of an objective measure of language proficiency as opposed to self-report might be a more methodologically sound way to classify participants by levels of proficiency. The length of the experiment could have caused fatigue in the participants, potentially impacting their attention and other neural processes. A potential methodological limitation of the current study is the inclusion of both emotion-label and emotion-laden words, which may involve differential electrophysiological processing (Zhang et al., 2017; Zhang, Teo & Wu, 2018).
Future directions include expanding on the bilingual data with a more diverse sample in terms of proficiency; also, testing emotion-label and emotion-laden words separately in bilinguals in an experiment including semantic processing, such as in contextual relation to sentences, to attempt to determine if there are differences in semantic processing of emotion words between a bilingual's languages. Analyses of ERP components such as the N400, involved in semantic processing and expanding the analyses to frontal and temporal electrodes, would be suggested. Furthermore, experiments with neuroimaging techniques aimed at detecting differences in neuroanatomical correlates of emotion processing between bilinguals might help better elucidate the relationship between language and emotions in bilinguals.
Acknowledgements
Our most sincere gratitude to Valeria Torres for her editorial support.