1. Introduction
Intonation plays a vital role in speech communication, from basic functions such as grouping words into phrases to mark prosodic junctures (delimitative function) and marking relative prominence among prosodic constituents (culminative function) to more complex functions such as signaling the structure of discourse information and communicating attitudes and emotions. The processes that underlie speech perception and comprehension and the resulting outcomes are indeed modulated by intonation in native listeners (e.g., Brown, Salverda, Dilley & Tanenhaus, 2011, 2015a; Brown, Salverda, Gunlogson & Tanenhaus, 2015b; Christophe, Peperkamp, Pallier, Block & Mehler, 2004; Ito, Jincho, Minai, Yamane & Mazuka, 2012; Ito & Speer, 2008; Kim & Cho, 2009; Kim, Mitterer & Cho, 2018; Salverda, Dahan & McQueen, 2003; Salverda, Dahan, Tanenhaus, Crosswhite, Masharov & McDonough, 2007; Spinelli, Grimault, Meunier & Welby, 2010; Steffman, 2019).
Yet, the learning of intonational cues in second/foreign-language (L2) speech perception and comprehension has received little attention in research, at least compared to the learning of segmental (i.e., consonant, vowel) information (for examples of L2 studies on the perception and/or comprehension of intonation, see Mok, Yin, Setter & Nayan, 2016; Ortega-Llebaria & Colantoni, 2014; Ortega-Llebaria, Nemogá & Presson, 2015; Ortega-Llebaria, Olson & Tuninetti, 2018; Puga, Fuchs, Setter & Mok, 2017; Tremblay, Broersma & Coughlin, 2018; Tremblay et al., 2016; Tremblay, Coughlin, Bahler & Gaillard, 2012). Accordingly, the theories and models that have been proposed to explain L2 speech perception and comprehension have often focused on the learning of segmental categories (e.g., Best & Tyler, 2007; Flege, 1995; van Leussen & Escudero, 2015) and in general have little to say about the learning of intonation.
To illustrate, at the segmental level, theories of L2 speech learning indicate that the perceptual learning of L2 sounds and the use of L2 sounds in speech comprehension are affected by how L2 sounds relate acoustically and perceptually to existing sounds in the native (i.e., first) language (L1) and whether L2 and corresponding L1 sounds form segmental and lexical contrasts (e.g., Best & Tyler, 2007; Flege, 1995; van Leussen & Escudero, 2015). Specifically, L2 listeners have been proposed to perceptually assimilate L2 sounds to L1 sounds when the L2 and L1 sounds are realized similarly (from an acoustic and/or articulatory perspective), leading L2 listeners to have perceptual difficulties when two contrastive L2 sounds assimilate to the same L1 sound. However, since intonation can serve different non-lexical functions (e.g., as briefly mentioned above, intonation serves delimitative and culminative functions that are largely independent from segmental and lexical information), and given that its acoustic realization is much more variable than that of segmental information, L2 listeners’ learning and use of intonation may differ in important ways from their learning and use of segmental information. It is thus unclear whether the mechanisms proposed to influence the perceptual learning of segmental categories similarly explain the perceptual learning of intonation in L2 listeners.
Another important difference between segmental and intonational information relates to how “phonological” differences among languages are operationalized. At the segmental level, the term “phonological” refers to segmental changes that create lexical contrasts among words composed of contrastive sounds in a given language. At the intonational level, on the other hand, the term “phonological” usually refers to how different tones (e.g., high and low) are combined to generate a particular tune in a given language (e.g., Pierrehumbert, 1980). Crucially, like segmental information, the intonational systems of languages also differ at the phonetic level (e.g., how phonologically similar tonal make-ups may differ in terms of their precise timing in a prosodic domain; cf. Ladd, 2012). In light of the differences between segmental and intonational information, several issues have yet to be resolved – for instance, how phonological and phonetic aspects of the L1 and L2 intonations modulate the learning of intonational cues in L2 speech perception and comprehension.
With the goal of shedding some light on these questions, the present study investigates how phonological and phonetic aspects of the L1 intonation modulate the use of tonal cues in L2 speech segmentation. Tremblay et al. (2016) have proposed that if the L1 and L2 intonations are phonologically similar yet phonetically different, listeners may perceive the L2 intonation as identical to the L1 intonation and not learn the fine-grained differences between them, a learnability problem they referred to as the Prosodic-Learning Interference Hypothesis (Tremblay et al., 2016). The authors suggest that listeners may perceptually assimilate L2 tonal categories to L1 tonal categories, as has been proposed for the learning of segmental information (e.g., Best & Tyler, 2007; Flege, 1995; van Leussen & Escudero, 2015), and that L2 tonal categories should be sufficiently distant from L1 tonal categories for L2 listeners to be able to learn these new categories. Tremblay et al.'s (2016) proposal thus entails that the mechanisms underlying the learning of segmental and intonational information do not differ in significant ways.
The present study elaborates on these possibilities by examining whether the learning of intonational cues to word boundaries in L2 speech segmentation is indeed more difficult if the L1 and L2 have phonologically similar but phonetically different intonations than if they have phonologically different intonations. More precisely, this study re-examines Tremblay et al.'s (2016) Prosodic-Learning Interference Hypothesis by investigating the use of tonal information in the segmentation of Seoul Korean (hereafter, Korean) in three listening groups: native Korean listeners, French-speaking L2 learners of Korean, and English-speaking L2 learners of Korean. Korean and French (discussed below) have phonologically similar intonational patterns, but the exact phonetic realization of these patterns differs between the two languages. Investigating the use of tonal cues to word boundaries in French-speaking L2 learners of Korean will illuminate the degree to which intonational cues for detecting word boundaries can be learned when the intonations of the L1 and L2 are phonologically similar but phonetically different. Furthermore, since the intonation of English (discussed below) differs phonologically from that of Korean, comparing the performance of English-speaking L2 learners of Korean with that of proficiency-matched French-speaking L2 learners of Korean will elucidate the extent to which phonological (dis)similarity influences the learning of intonational cues to word boundaries in L2 speech segmentation, thus providing an ideal test of Tremblay et al.'s (2016) hypothesis.
The intonation of Korean has been analyzed as having an Accentual Phrase (AP) with the default tonal pattern of L(HL)H when the AP-initial segment is lenis (Jun, 1998, 2000). In Korean, the AP-final H is loosely anchored to the AP-final syllable and the (subsequent) AP-initial L is closely anchored to the (following) AP-initial syllable; this means that the peak of the F0 rise in the AP-final syllable can vary, but it is sufficiently early to allow the F0 to be low and reach its target L tone early on in the (subsequent) AP-initial syllable (Jun, 2000, example (9)). French is similar to Korean in that it also has an AP with the default tonal pattern of L(HL)H (Jun & Fougeron, 2000, 2002). In French, however, the AP-final H tone is closely anchored to the AP-final syllable and the (subsequent) AP-initial L tone is loosely anchored to the (following) AP-initial syllable; consequently, the last (non-reduced) syllable of AP-final words has an F0 rise that peaks towards the end of the AP, and the F0 lowers in the (subsequent) AP-initial syllable and reaches its target L tone somewhat later (e.g., Jun & Fougeron, 2002, Figure 8a; Welby, 2006). Hence, Korean and French are phonologically similar but differ in the phonetic realization of the tonal categories that signal the beginning and end of phrases (and thus words). Tremblay et al.'s (2016) Prosodic-Learning Interference Hypothesis therefore predicts that French-speaking L2 learners of Korean and Korean-speaking L2 learners of French should have difficulty learning the fine-grained phonetic differences between, respectively, the Korean and the French intonations.
In contrast to both Korean and French, the intonational realization of words in English is not dependent on the position of the word in the phrase, but rather on the distribution and types of pitch accents and on the types of boundary tones (i.e., tones that occur at the end of a large phrase) in relation to discourse information (e.g., Beckman & Pierrehumbert, 1986; Ladd, 2012; Pierrehumbert, 1980; Pierrehumbert & Hirschberg, 1990). Moreover, unlike Korean and French, English has lexical stress, with words being statistically more likely to be stressed on the first syllable than elsewhere in the word (e.g., Clopper, 2002; Cutler & Carter, 1987). When English words receive a neutral (i.e., H) pitch accent, the F0 rise associated with the pitch accent occurs on the stressed (often word-initial) syllable (Beckman & Elam, 1997). English is thus phonologically (and phonetically) different from both Korean and French. Tremblay et al.'s (2016) Prosodic-Learning Interference Hypothesis therefore predicts that English-speaking L2 learners of Korean and English-speaking L2 learners of French should be successful in learning the phonological (and phonetic) differences between the English and, respectively, the Korean and the French intonations.
In a visual-world eye-tracking study, Tremblay et al. (2016) investigated the use of fundamental frequency (F0) rise as a cue to the end of phrase-final French words by native French listeners, Korean-speaking L2 learners of French, and English-speaking L2 learners of French. The two groups of L2 learners of French did not differ in their proficiency in or experience with French. Participants heard stimuli where the monosyllabic target word and the first syllable of the following adjective (e.g., chat lépreux ‘leprous cat’) were temporarily ambiguous with a disyllabic competitor word (e.g., chalet ‘cabin’). The monosyllabic target word was manipulated so that it would contain or not contain an H tone, realized as an F0 rise that peaked at the offset of the monosyllabic word. Native French listeners showed a greater target-over-competitor fixation advantage when the F0 signaled the end of words in phrase-final position than when it did not, with this effect emerging early on in the word recognition process. English-speaking L2 learners of French also showed this effect but later in the word recognition process, indicating that they needed time to integrate this cue. By contrast, Korean-speaking L2 learners of French did not show a facilitative effect of the F0 rise cue, indicating that they were unable to use the phrase-final F0 rise to locate word boundaries in French (e.g., Kim & Cho, 2009; Tremblay, Cho, Kim & Shin, 2019, discussed below). It was on the basis of these results that Tremblay et al. (2016) proposed the Prosodic-Learning Interference Hypothesis.
The authors suggested that Korean listeners could not use the F0 rise to locate the end of phrase-final words in French because the F0 of the AP-final H tone peaked too late in the AP-final syllable and did not reach its subsequent L target sufficiently early in the following AP. Importantly, Tremblay et al. (2016) proposed that the similarity between the French and Korean intonations posed a learnability problem for Korean-speaking L2 learners of French, who did not show evidence of having learned the fine-grained differences between the two systems.
The present study provides a further test of Tremblay et al.'s (2016) Prosodic-Learning Interference Hypothesis by investigating Korean, French, and English listeners’ use of F0 cues to word boundaries in the segmentation of Korean speech. The prosodic structure of Korean proposed by Jun (1998, 2000) is such that the AP-final H tone should provide an intonational cue to the end of words in phrase-final position, and the AP-initial L tone should provide an intonational cue to the beginning of words in phrase-initial position. Native Korean listeners have indeed been found to use these two intonational cues to segment Korean speech into words. Using word-spotting experiments, Kim and Cho (2009) showed that Korean listeners’ speech segmentation was enhanced by the AP-final H tone in the presence of a subsequent AP-initial L tone and by the AP-initial L tone in the presence of a preceding AP-final H tone, suggesting that the contrast between the H and L tones helped Korean listeners break the speech signal down into individual words. Similarly, using an artificial-language learning paradigm, Kim, Broersma, and Cho (2012) showed that Korean listeners’ learning of statistical dependencies among syllables in a continuous artificial speech stream was enhanced by the presence of both an H tone on the word-final syllable and an L tone on the word-initial syllable (compared to when tonal information did not signal word boundaries). Finally, also in an artificial-language learning paradigm, Tremblay et al. (2019) found that Korean listeners’ learning of statistical dependencies among syllables benefited from the presence of a word-final H tone if the scaling of the word-initial L tone was sufficiently low.
These results suggest that Korean listeners may pay closer attention to the phonetic realization of the AP-initial L tone than to that of the AP-final H tone, possibly explaining Korean listeners’ inability to use a late-peaking F0 rise to locate the end of phrase-final words in French (in Tremblay et al., 2016, this rise was not followed by an AP-initial L tone closely aligned with the beginning of the AP).
The findings reported for Korean listeners’ use of intonational cues to word boundaries differ from those reported for native French listeners. While French listeners’ speech segmentation has been found to benefit from an AP-final H tone (e.g., Christophe et al., 2004; Michelas & D'Imperio, 2010; Tremblay et al., 2016; Tremblay et al., 2012), it does not appear to be contingent on the presence of a subsequent AP-initial L tone closely aligned with the beginning of the AP boundary. In fact, Spinelli et al. (2010) and Welby (2007) have shown that French listeners are more likely to perceive a word-initial boundary as the F0 at the onset of a segmentally ambiguous string increases (e.g., they are more likely to perceive /lafiʃ/ as l'affiche ‘the poster’ and less likely to perceive it as la fiche ‘the index card’ as the F0 on /la/ increases; Spinelli et al., 2010). The lack of relationship between the AP-initial L tone and the beginning of phrase-initial words in French may be explained in two ways: (i) the AP-initial L tone is not closely aligned with the AP-initial syllable in French due to the late alignment of the AP-final H tone (unlike Korean); and (ii) APs are more likely to begin with a determiner in French than in Korean, thus making the AP-initial L tone less useful for locating the beginning of phrase-initial words in French.
Given these similarities and differences between Korean and French, we might expect French-speaking L2 learners of Korean to be more successful in their use of the AP-final H tone as a cue to the end of phrase-final words than in their use of the AP-initial L tone as a cue to the beginning of phrase-initial words in Korean. Importantly, Tremblay et al.'s (2016) Prosodic-Learning Interference Hypothesis predicts that the phonological similarities between the Korean and French intonations should prevent French listeners from learning the fine-grained phonetic differences between French and Korean. While this is true for both the AP-final H tone and the AP-initial L tone, French listeners’ predicted non-target-like phonetic representation of the AP-final H tone is not expected to impair their segmentation of Korean, as this H tone would still occur in the final syllable of Korean APs and thus would still signal the end of phrase-final words in Korean (as in French). By contrast, French listeners’ predicted non-target-like phonetic representation of the AP-initial L tone would prevent them from learning that the AP-initial L tone can signal the beginning of phrase-initial words in Korean (unlike in French), thus impairing their ability to use the AP-initial L tone to locate the beginning of phrase-initial words in Korean.
As briefly mentioned above, tonal cues have also been found to enhance native English listeners’ speech segmentation when an F0 rise corresponding to an H tone is heard in word-initial position (e.g., Tyler & Cutler, 2009). This finding was attributed to the statistical tendency for words to be stressed on the initial syllable in English (e.g., Clopper, 2002; Cutler & Carter, 1987), with the H tone being aligned with the stressed syllable (e.g., Beckman & Elam, 1997). Thus, for English listeners, learning to segment Korean speech into words involves learning a markedly different intonation and associating the AP-final H tone with the end of phrase-final words and the AP-initial L tone with the beginning of phrase-initial words. Although both tones need to be learned, one might expect that it will be more difficult for English listeners to use the AP-initial L tone to locate the beginning of phrase-initial words in Korean given that word-initial boundaries are often signaled by an H tone in English. Crucially, the striking phonological differences between the Korean and the English intonations should enable English-speaking L2 learners of Korean to learn the differences between the two systems and use both the AP-final H tone and the AP-initial L tone in the segmentation of Korean speech.
In order to test these predictions, we conducted a visual-world eye-tracking experiment that orthogonally manipulated the AP-final and AP-initial tones in Korean.
2. Method
2.1. Participants
A total of 34 adult native Korean listeners (mean age: 25.5, standard deviation (SD): 2.5, 11 women), 24 French-speaking L2 learners of Korean (mean age: 26.6, SD: 4.2, 18 women), and 19 English-speaking L2 learners of Korean (mean age: 29.1, SD: 6.4, 9 women) participated in this study.Footnote 1 All participants were tested at a university in Seoul, South Korea, and none reported having speech impairments.
All Korean listeners identified Seoul Korean as their native dialect; all had parents who spoke Korean as their L1, and all spent their childhood and adolescence in Korea. All French listeners had parents who spoke French as their L1, and all lived in a French-speaking country during their childhood and adolescence (20 spent their childhood and adolescence in France and 2 in Belgium; 2 spent some of their childhood in Côte d'Ivoire and their adolescence in France). All English listeners had parents who spoke English as their L1, and all lived in an English-speaking country during their childhood and adolescence (13 spent their childhood and adolescence in the US, 2 in Canada, 1 in the UK, 1 in Ireland, 1 in Australia, and 1 in Australia and Singapore).
The L2 learners completed a mock reading test of the Test of Proficiency in Korean II (60th TOPIK II – Reading – Mock Test; https://www.topikguide.com/topik-ii-reading-mock-test-1/). TOPIK II targets Levels 3–6 (intermediate to advanced) in Korean. The mock test is out of 50, and the passing score for Level 3 (lower intermediate) is 20.Footnote 2 The L2 learners’ proficiency scores are presented in Table 1, together with their age of first listening exposure to Korean, number of years of Korean instruction, and months spent in Korea. As can be seen from the proficiency scores in Table 1, English listeners reached the passing score for Level 3 (lower intermediate), but French listeners did not quite do so. However, an independent-samples t-test assuming unequal variances revealed that the two L2 groups did not differ significantly from each other in their proficiency in Korean (t[32] = –1.45, p = .079). An independent-samples t-test assuming unequal variances also found the two L2 groups not to differ from each other in their number of years of Korean instruction (t[32] = –1.43, p = .081). An independent-samples t-test assuming equal variances further showed that the two L2 groups did not differ from each other in their age of first listening exposure to Korean (|t| < 1). However, an independent-samples t-test assuming unequal variances revealed that the two L2 groups differed from each other in the number of months they had spent in Korea (t[21] = –2.7, p < .007), with English listeners having spent more time in Korea than French listeners. Thus, other than for the amount of time they had spent in Korea, the two L2 groups were comparable in their proficiency in and experience with Korean.
Although the observed numerical trends (proficiency in Korean, years of Korean instruction) and the significant difference (months spent in Korea) could have been problematic given the predictions formulated in the preceding section, they are not problematic given the results that were ultimately found (for details, see the Results and Discussion sections).Footnote 3
Note (Table 1). Values are means (SD).
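The unequal-variance tests above report 32 and 21 degrees of freedom, both below the 41 (24 + 19 − 2) that a pooled-variance test would use. The sketch below shows the standard Welch statistic and the Welch–Satterthwaite approximation that produces such reduced degrees of freedom. The group sizes come from the text; the means and standard deviations passed in are purely hypothetical placeholders, since Table 1 is not reproduced here, and the function names are illustrative.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic for two independent samples with unequal variances."""
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / standard_error


def welch_df(sd1, n1, sd2, n2):
    """Welch-Satterthwaite approximation to the degrees of freedom."""
    v1 = sd1 ** 2 / n1
    v2 = sd2 ** 2 / n2
    return (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))


# Group sizes from the text: 24 French-speaking and 19 English-speaking learners.
N_FRENCH, N_ENGLISH = 24, 19

# HYPOTHETICAL summary statistics, used only to exercise the functions.
t_value = welch_t(18.0, 6.0, N_FRENCH, 21.0, 10.0, N_ENGLISH)
df_value = welch_df(6.0, N_FRENCH, 10.0, N_ENGLISH)
```

The approximated degrees of freedom always fall between the smaller group size minus one and the pooled value, and they shrink as the two variances become more unequal.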
2.2. Materials
Experimental sentences were created that contained a temporary lexical ambiguity between a disyllabic target word in post-boundary AP-initial position and a disyllabic competitor word spanning the AP boundary. The experimental sentences had the structure [Adverb]AP [Subject + case-marker]AP [Object + case-marker]AP [Verb]AP. In these sentences, the subject always ended with the case marker –ka, –i, or –to. The target word was the following disyllabic object (e.g., masul ‘magic’), and the competitor word was a disyllabic word that began with the same syllable as the case marker and ended with the first syllable of the target word (e.g., kama ‘palanquin’). The segmental competitor was thus heard before the target word was heard. All sentential contexts preceding the target word were semantically compatible with both the target and competitor words (e.g., paŋkɨm sɛsinpu-ka masul-ɨl… ‘just-now the-new-bride-subj magic-obj …’ vs. paŋkɨm sɛsinpu-ka kama… ‘just now the new-bride's palanquin…’). The experiment contained 36 experimental sentences, with the above three case markers each appearing in 12 sentences.
The 36 experimental sentences were interspersed with 108 filler sentences, 8 of which were used in the practice session. Of these filler sentences, 36 had the same sentence structure and the target word in the same position as the experimental sentences, but the subject did not contain a case marker, and the target word instead began with the syllables ka–, i–, or to–, each in 12 sentences. In these filler sentences, the competitor word began with the second syllable of the target word (e.g., target: kaʧi ‘eggplant’; competitor: ʧiʧin ‘earthquake’). The remaining filler sentences had a similar sentence structure but the location of the adverb varied across sentences. Of these sentences, 40 had the target word in subject position and 32 sentences had the target word in the adverb position. For these sentences, the competitor word overlapped with the target word in its first syllable (e.g., target: namu ‘tree’; competitor: napi ‘butterfly’).
The visual display contained orthographic representations of the target and competitor words and of two distracter words (for a validation of the use of orthography in visual-world eye-tracking experiments, see McQueen & Viebahn, 2007). The distracters were phonologically and semantically unrelated to the target and competitor, and showed the same type of segmental overlap with each other as did the target and competitor (e.g., for the experimental items, the first syllable of one distracter was the second syllable of the other distracter). All the auditory sentences and visual words used in the experiment can be found in Appendix A of the Supplementary Materials.
Three repetitions of the sentences were recorded by a female native speaker of Korean. The experimental sentences were then resynthesized in order to create four tonal boundary conditions: H#L, H#H, L#L, and L#H, where # is the AP boundary. In the natural productions, the subject ended with the AP-final H tone and the object began with the AP-initial L tone. For each sentence, the stimulus in the H#L condition was created by mimicking the H#L tones from a different recording of the same sentence; the stimulus in the H#H condition was created by extending the AP-final H tone of the resynthesized H#L stimulus to the following AP-initial syllable, with a slight decline over the AP-initial H tone so that the rest of the contour would sound natural; the stimulus in the L#L condition was created by extending the AP-initial L tone of the resynthesized H#L stimulus to the previous AP-final syllable; and the stimulus in the L#H condition was created by reversing the AP-final H tone and the AP-initial L tone of the resynthesized H#L stimulus, with a slight decline over the AP-initial H tone so that the rest of the contour would sound natural. The filler sentences were similarly resynthesized so that the experimental sentences would not stand out. The resynthesis was done manually using the PSOLA function in Praat (Boersma & Weenink, 2017). Figure 1 shows the four pitch contours created for an example experimental sentence.
The experimental items were distributed in four lists, with participants hearing each sentence in only one tonal condition. The four tonal conditions were counterbalanced across lists and interspersed with the 108 filler sentences.
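The counterbalancing described above can be illustrated with a Latin-square rotation. This is a minimal sketch under assumptions: the text does not say how items were assigned to conditions within a list, so the rotation rule and all names below are hypothetical. Only the four condition labels, the 36 items, and the four lists come from the text.

```python
# Tonal boundary conditions from the Materials section ('#' marks the AP boundary).
CONDITIONS = ["H#L", "H#H", "L#L", "L#H"]
N_ITEMS = 36  # number of experimental sentences


def build_lists(n_items=N_ITEMS, conditions=CONDITIONS):
    """Return one dict per list, mapping item index -> tonal condition.

    Assumed rotation: item i appears in condition (i + list index) mod 4,
    so each item is heard in exactly one condition per list and in every
    condition across the four lists.
    """
    n_lists = len(conditions)
    lists = []
    for list_idx in range(n_lists):
        assignment = {
            item: conditions[(item + list_idx) % n_lists]
            for item in range(n_items)
        }
        lists.append(assignment)
    return lists


lists = build_lists()
```

With 36 items and four conditions, this rotation places nine items in each condition within every list.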
2.3. Procedures
Experiment Builder software (SR Research) was used to create and administer the eye-tracking experiment, and EyeLink software (SR Research) was used to monitor participants’ eye movements. Eye movements were recorded at a sampling rate of 250 Hz using the head-mounted EyeLink II eye-tracker. The stimuli were played through an ASIO-compatible sound card, ensuring accurate audio timing in relation to the recording of eye movements.
In each trial, participants saw four orthographic words in a (non-displayed) 2 x 2 grid for 3,000 milliseconds before a fixation cross appeared in the middle of the screen for 500 milliseconds; as the fixation cross disappeared, the four words reappeared on the screen in their original position and the auditory stimulus was heard (synchronously) over headphones. Participants were instructed to click on the target word with the mouse as soon as they heard the target in the stimulus. Their eye movements were recorded from the onset of the case marker (e.g., –ka ma…). The trial ended with the participants’ response, followed by an interval of 1,000 milliseconds.
The eye-tracking experiment began with the practice session (8 trials) followed by the main experiment (136 trials). The 36 experimental trials and 100 filler trials from the main experiment were presented in four blocks (34 trials per block), with each block containing 9 experimental trials. The order of the experimental and filler trials within a block and the order of blocks were randomized across participants. The participants were offered a break after the second block. The experiment lasted approximately 20 minutes.
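The block structure above (4 blocks × 34 trials, 9 experimental trials per block, randomized trial and block order) can be sketched as follows. This is only an illustration: the text does not state whether trials were assigned to blocks at random or by a fixed scheme, so the random assignment here is an assumption, and the function and variable names are hypothetical.

```python
import random


def build_blocks(experimental, fillers, n_blocks=4, seed=None):
    """Split trials into equally sized blocks and randomize the order.

    Assumption: trials are assigned to blocks at random. Each block receives
    an equal share of experimental and filler trials, trial order is shuffled
    within each block, and block order is shuffled as well.
    """
    rng = random.Random(seed)
    exp = list(experimental)
    fil = list(fillers)
    rng.shuffle(exp)
    rng.shuffle(fil)
    exp_per_block = len(exp) // n_blocks
    fil_per_block = len(fil) // n_blocks
    blocks = []
    for b in range(n_blocks):
        block = (exp[b * exp_per_block:(b + 1) * exp_per_block]
                 + fil[b * fil_per_block:(b + 1) * fil_per_block])
        rng.shuffle(block)  # randomize trial order within the block
        blocks.append(block)
    rng.shuffle(blocks)  # randomize block order
    return blocks


# Trial counts from the text: 36 experimental and 100 filler trials.
experimental = [("exp", i) for i in range(36)]
fillers = [("filler", i) for i in range(100)]
blocks = build_blocks(experimental, fillers, seed=1)
```

Passing a different seed per participant would give each participant their own randomization.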
After completing the experiment, the L2 learners filled out a word familiarity questionnaire that contained the target and competitor words used in the experiment. They rated each word on a scale from 0 (“I have never seen/heard this word”) to 4 (“I have frequently seen/heard this word, I know what it means, and I can provide a definition for it”). The two L2 groups did not differ in their ratings of the words (French: mean = 2.18, SD = 0.6; English: mean = 2.21, SD = 0.6; |t| < 1).Footnote 4
2.4. Data analysis
Experimental trials that received distracter responses or no response, or for which eye movements could not reliably be tracked, were excluded from the analyses. This resulted in the exclusion of 5.7% of all the trials (1.2% of the Korean listeners’ data, 8.9% of the French listeners’ data, and 9.5% of the English listeners’ data).
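As a consistency check on the percentages above, the overall exclusion rate can be recomputed as a trial-weighted average of the group rates. The sketch assumes that every participant contributed 36 experimental trials and that the percentages refer to experimental trials only; the variable names are illustrative.

```python
# Group sizes (Participants section) and reported exclusion rates in percent.
group_sizes = {"Korean": 34, "French": 24, "English": 19}
excluded_pct = {"Korean": 1.2, "French": 8.9, "English": 9.5}
TRIALS_PER_PARTICIPANT = 36  # assumption: experimental trials only

total_trials = sum(n * TRIALS_PER_PARTICIPANT for n in group_sizes.values())
excluded_trials = sum(
    group_sizes[g] * TRIALS_PER_PARTICIPANT * excluded_pct[g] / 100
    for g in group_sizes
)
overall_pct = 100 * excluded_trials / total_trials
```

Because the group rates are themselves rounded, the recomputed value can only be expected to match the reported 5.7% to within rounding error.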
For the remaining trials, we analyzed participants’ eye movements in the four regions of interest, corresponding to the four orthographic words on the screen. Eye movements were analyzed from the onset of the case-marker (e.g., –ka ma…) to examine the effect of the AP-final tone, and from the onset of the target word (e.g., masul) to examine the effect of the subsequent AP-initial tone. Proportions of fixations to the target, competitor, and distracter words were extracted in 8-ms time windows from the specified onset to 1,400 ms post-onset for the purpose of initial data visualization. Statistical analyses were conducted on the difference between the empirical-logit-transformed proportions of target and competitor fixations over smoothed 50-ms time windows (for a similar analysis of visual-world eye-tracking data, see Creel, 2014). We will refer to this dependent variable as listeners’ target-over-competitor fixation advantage.
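The dependent variable can be sketched as follows. The text does not spell out the transformation, so the sketch assumes the commonly used empirical logit, ln((y + 0.5) / (n − y + 0.5)), where y is the number of samples in a time window with a fixation on a given word and n is the total number of samples in that window. The function names are hypothetical, and the smoothing step is omitted.

```python
import math


def empirical_logit(y, n):
    """Empirical logit of y fixated samples out of n samples in a window.

    The 0.5 correction keeps the value finite when y == 0 or y == n.
    """
    return math.log((y + 0.5) / (n - y + 0.5))


def target_advantage(target_counts, competitor_counts, n_samples):
    """Per-window difference between target and competitor empirical logits."""
    return [
        empirical_logit(t, n_samples) - empirical_logit(c, n_samples)
        for t, c in zip(target_counts, competitor_counts)
    ]
```

Positive values indicate more looks to the target than to the competitor in a window, negative values the reverse, and zero indicates equal fixation counts.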
Listeners’ target-over-competitor fixation advantage was modeled using growth curve analysis (Mirman, 2014; Mirman, Dixon & Magnuson, 2008). The growth curve analyses were run using the lme4 package in R (Bates, Maechler, Bolker & Walker, 2015) for fitting linear mixed-effects models. Initial analyses compared Korean listeners’ fixations to French listeners’ and English listeners’ fixations, and additional analyses compared French listeners’ fixations to English listeners’ fixations. These analyses included the AP-final or AP-initial tone (high, low), listeners’ L1 (Korean [if applicable], French, English), orthogonally derived time polynomials (linear, quadratic, cubic), and their interactions as fixed effects, with Korean listeners’ performance in the high tone condition as baseline for the initial analyses and with French listeners’ performance in the high tone condition as baseline for the additional analyses. A backward-fitting function from the package LMERConvenienceFunctions (Tremblay & Ransijn, 2015) was then used to identify the model that accounted for significantly more of the variance than all simpler models, as determined by log-likelihood ratio tests. Only the results of the model with the best fit are presented (for a discussion of this approach, see Mirman, 2014), with p values being calculated using the lmerTest package in R (Kuznetsova, Brockhoff & Christensen, 2016). All analyses included participant as random intercept and the time polynomials as random slopes, thus modeling a different curve for each participant. The first set of analyses focuses on the effect of the AP-final tone (across AP-initial tones) and the second set of analyses focuses on the effect of the AP-initial tone (across AP-final tones).
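To make the notion of orthogonally derived time polynomials concrete, the following sketch builds linear, quadratic, and cubic time terms that are uncorrelated with one another. It is an illustration in Python rather than the R code used for the analyses: the Gram-Schmidt procedure, the function name, and the choice of 28 bins (1,400 ms divided into 50-ms windows) are assumptions, and sign or scaling conventions may differ from those of the software actually used.

```python
def orthogonal_polynomials(n_points, degree):
    """Return `degree` orthonormal polynomial vectors over equally spaced time bins.

    Applies Gram-Schmidt orthogonalisation to 1, t, t^2, ...; the constant
    term is used for orthogonalisation but is not returned.
    """
    # Centre the time index to keep the powers numerically well behaved.
    times = [i - (n_points - 1) / 2 for i in range(n_points)]
    basis = []  # orthonormal vectors found so far, starting with the constant
    for power in range(degree + 1):
        vec = [t ** power for t in times]
        for b in basis:
            proj = sum(v * x for v, x in zip(vec, b))
            vec = [v - proj * x for v, x in zip(vec, b)]
        norm = sum(v * v for v in vec) ** 0.5
        basis.append([v / norm for v in vec])
    return basis[1:]

# Assumed binning: 28 windows of 50 ms spanning the 1,400-ms analysis window.
linear, quadratic, cubic = orthogonal_polynomials(28, 3)
```

Because the three terms are mutually orthogonal, the estimate for one term (e.g., the overall slope) does not change when higher-order terms are added to or removed from the model, which is what makes the backward-fitting comparison of nested models straightforward to interpret.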
If tonal information modulates lexical access, in addition to showing a significant effect of tonal information, listeners’ fixation curves should show a significant interaction between the AP-final and/or AP-initial tone and at least one of the time polynomials. More specifically, we would expect listeners’ fixation curves to be less ascending (linear time polynomial), more ‘U’-shaped (quadratic time polynomial), and/or less ‘S’-shaped (cubic time polynomial) in the non-enhancing tone condition (L for the AP-final tone and H for the AP-initial tone) than in the enhancing tone condition (H for the AP-final tone and L for the AP-initial tone). The shallower and/or more ‘U’-shaped characteristics of the fixation curve would be due to greater interference from the competitor word, and the less ‘S’-shaped characteristic of the fixation curve would be due to listeners’ target fixations reaching an asymptote in the enhancing tone condition but not in the non-enhancing tone condition.
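As a reading aid for these predictions, the structure of a growth curve model for a single listener group can be sketched as follows. The notation is an illustrative assumption and is not taken from the reported models: $Y_i(t)$ is the target-over-competitor fixation advantage of participant $i$ at time $t$, $P_k(t)$ are the orthogonal time polynomials (linear, quadratic, cubic), $T$ codes the tone (0 for the baseline high tone, 1 for the low tone), $u_{0i}$ and $u_{ki}$ are the by-participant random intercept and slopes, and $\varepsilon_i(t)$ is the residual.

```latex
\begin{equation*}
Y_{i}(t) = (\beta_0 + u_{0i}) + \gamma_0 T
         + \sum_{k=1}^{3} \left(\beta_k + u_{ki} + \gamma_k T\right) P_k(t)
         + \varepsilon_{i}(t)
\end{equation*}
```

In this sketch, $\gamma_0$ is the simple effect of tone and $\gamma_k$ is the interaction between tone and the $k$-th time polynomial. For the AP-final tone, for example, a negative $\gamma_1$ would correspond to a less ascending curve in the low tone condition, and a positive $\gamma_2$ to a more ‘U’-shaped curve.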
We expect Korean listeners’ fixation curves to show an inhibiting effect of the AP-final L tone compared to the AP-final H tone and an enhancing effect of the AP-initial L tone compared to the AP-initial H tone. English listeners’ fixation curves are predicted to show the same pattern of results, though it may take these listeners more time to integrate the use of these tonal cues given the phonological (and phonetic) differences between the Korean and English intonations (for such results, see Tremblay et al., 2016). This may be especially true of the AP-initial tone, since word beginnings are often signaled by an H tone in English. Importantly, following Tremblay et al.’s (2016) Prosodic-Learning Interference Hypothesis, we predict that French listeners will show the same pattern of results for the AP-final tone but not for the AP-initial tone. As indicated earlier, French listeners’ predicted non-target-like phonetic representation of the AP-initial L tone is expected to inhibit their learning that the AP-initial L tone can signal the beginning of phrase-initial words in Korean (unlike French), thus impairing their ability to use the AP-initial L tone to locate the beginning of phrase-initial words in Korean. In other words, the phonological similarities between the Korean and French intonations are predicted to pose a learnability problem for French listeners’ use of the AP-initial L tone in Korean speech segmentation.
3. Results
Korean, French, and English listeners’ raw proportions of target, competitor, and distracter fixations over 8-ms time windows from the onset of the manipulated tonal conditions (i.e., from the onset of the case marker) to 1,400 ms post onset can be found in Figure B1 of Appendix B of the Supplementary Materials.
3.1. AP-final tone
Korean, French, and English listeners’ empirical-logit-transformed target-over-competitor fixation advantage from the onset of the AP-final tone (i.e., from the onset of the case marker, which has an average duration of 114 ms) is shown in Figure 2. Values above 0 mean that listeners fixated the target word more than the competitor word. The solid line represents listeners’ actual data, whereas the dashed line represents listeners’ predicted target-over-competitor fixation advantage according to the growth curve analysis with the best fit (presented next, see Table 2). The shading represents one standard error above and below the mean. Recall that the phonologically canonical AP-final tone is H, so the directionality of the predicted tonal effect for the AP-final tone is H > L.
The growth-curve analysis with the best fit on all listeners’ target-over-competitor fixation advantage from the onset of the AP-final tone to 1,400 ms post onset is presented in Table 2. The model included the following fixed effects: the simple effects of time (linear, quadratic, cubic), AP-final tone, and L1; the two-way interactions between time (linear, quadratic) and AP-final tone, between time (linear, quadratic, cubic) and L1, and between AP-final tone and L1; and the three-way interactions between time (linear, quadratic), AP-final tone, and L1.
The relevant significant effects from Table 2 can be summarized as follows. The negative estimate for the significant simple effect of AP-final tone indicates that Korean listeners showed lower proportions of fixations when the AP-final tone was L than when it was H. The negative and positive estimates for the significant interactions between, respectively, the linear and quadratic time polynomials and the AP-final tone mean that Korean listeners’ fixation curve became less ascending and more ‘U’-shaped when the AP-final tone was L than when it was H. Additionally, the significant three-way interactions between the linear and quadratic time polynomials, the AP-final tone, and L1 (both French and English) indicate that both French and English listeners differed from Korean listeners in the effect of AP-final tone as a function of time. The negative estimates for the interactions with the linear time polynomial suggest that French and English listeners showed a stronger interaction (in the same direction) between the linear time polynomial and the AP-final tone compared to Korean listeners (whose corresponding interaction also had a negative estimate), and the negative estimates for the interactions with the quadratic time polynomial suggest that French and English listeners showed a weaker or reverse interaction between the quadratic time polynomial and the AP-final tone compared to Korean listeners (whose corresponding interaction had a positive estimate).
These results are as predicted for Korean listeners, whose target-over-competitor fixation advantage was inhibited when the AP-final tone changed from a phonologically canonical H tone to a non-canonical L tone (across AP-initial tones). This confirms that Korean listeners’ speech segmentation benefits from an AP-final H tone. Importantly, French and English listeners’ fixation curves differed from Korean listeners’ fixation curves in the strength and possibly directionality of the effect of AP-final tone as a function of time. Note that the lack of significant interaction between the AP-final tone and L1 for either L2 group suggests that French and English listeners did not differ from Korean listeners in the overall effect of AP-final tone, with the L tone inhibiting the three groups’ target-over-competitor fixation advantage across time, as predicted; the L2 learners differed from native listeners only in the precise shape of their fixation curves.
To understand how L2 learners’ fixation curves differed from those of Korean listeners, we ran an additional analysis that directly compared French and English listeners’ fixation curves; this analysis allowed us to examine the strength and directionality of the interactions between time and AP-final tone for the two L2 groups while also comparing the L2 groups to each other. The growth-curve analysis with the best fit on L2 listeners’ target-over-competitor fixation advantage from the onset of the AP-final tone to 1,400 ms post onset is presented in Table 3. The fixed effects included in the model were the simple effects of time (linear, quadratic, cubic) and AP-final tone, and the two-way interaction between time (linear) and AP-final tone. For this analysis, the alpha level was adjusted to .025.
The important significant effects from Table 3 can be described as follows. The negative estimate for the significant simple effect of AP-final tone indicates that French listeners showed lower proportions of fixations when the AP-final tone was L than when it was H. The negative estimate for the significant interaction between the linear time polynomial and the AP-final tone indicates that French listeners’ fixation curve was less ascending when the AP-final tone was L than when it was H, with the estimate for this effect being larger in size than the corresponding effect observed for Korean listeners (see Table 2). Crucially, the model with the best fit did not include any effect of L1 or any interaction between L1 and AP-final tone or between L1, AP-final tone, and the time polynomials. This means that the simple effect of AP-final tone and the two-way interaction between the linear time polynomial and the AP-final tone were true of both French and English listeners.
These results confirm the stronger interaction between the linear time polynomial and the AP-final tone for French and English listeners compared to Korean listeners (hypothesized from Table 2), suggesting that L2 learners had more difficulty recovering from the misleading AP-final L tone. It appears that, because of this difficulty, French and English listeners’ fixation curves in the AP-final L tone condition did not have more of a ‘U’-shape than their fixation curves in the AP-final H tone condition, unlike Korean listeners’ fixation curves. Thus, although the AP-final tone had different effects on the precise shape of native and L2 listeners’ fixation curves, the consequence of these effects was similar – to inhibit listeners’ target-over-competitor fixation advantage when the AP-final tone deviated from the phonologically canonical H tone, as predicted. Importantly, the L2 groups did not differ from each other in their ability to use the AP-final tone as a cue to the end of phrase-final words in Korean.
3.2. AP-initial tone
Korean, French, and English listeners’ empirical-logit-transformed target-over-competitor fixation advantage from the onset of the AP-initial tone (i.e., from the offset of the case marker and onset of the target word) is shown in Figure 3. Again, values above 0 mean that listeners fixated the target word more than the competitor word. The solid line represents listeners’ actual data, and the dashed line represents listeners’ predicted target-over-competitor fixation advantage according to the growth curve analysis with the best fit (presented next, see Table 4). The shading represents one standard error above and below the mean. Recall that the phonologically canonical AP-initial tone is L in Korean, so the directionality of the predicted tonal effect for the AP-initial tone is L > H.
The growth-curve analysis with the best fit on all listeners’ target-over-competitor fixation advantage from the onset of the AP-initial tone to 1,400 ms post onset is presented in Table 4. The model included the following fixed effects: the simple effects of time (linear, quadratic, cubic), AP-initial tone, and L1; the two-way interactions between time (linear, quadratic) and AP-initial tone, between time (linear, quadratic) and L1, and between AP-initial tone and L1; and the three-way interactions between time (linear, quadratic), AP-initial tone, and L1.
The relevant significant effects from Table 4 are the following. The positive estimate for the significant effect of AP-initial tone means that Korean listeners’ overall proportion of fixations was higher when the AP-initial tone was L than when it was H. The negative estimate for the significant interaction between the quadratic time polynomial and the AP-initial tone indicates that Korean listeners’ fixation curve had less of a ‘U’ shape when the AP-initial tone was L than when it was H. The negative estimate for the significant interaction between AP-initial tone and L1 for English listeners means that English listeners’ overall fixations showed a weaker or reverse effect of AP-initial tone compared to Korean listeners’ fixations (whose corresponding effect had a positive estimate). Moreover, the significant three-way interactions between the linear and quadratic time polynomials, the AP-initial tone, and L1 (both French and English) indicate that both French and English listeners differed from Korean listeners in the effect of AP-initial tone they showed as a function of time. The positive estimates for the interactions with the linear time polynomial suggest that French and English listeners showed a stronger interaction (in the same direction) between the linear time polynomial and the AP-initial tone compared to Korean listeners (whose corresponding interaction also had a positive estimate), and the positive estimates for the interactions with the quadratic time polynomial suggest that French and English listeners showed a weaker or reverse interaction between the quadratic time polynomial and the AP-initial tone compared to Korean listeners (whose corresponding interaction had a negative estimate).
These results are again as predicted for Korean listeners, whose target-over-competitor fixation advantage was enhanced when the AP-initial tone changed from a phonologically non-canonical H to a canonical L (across AP-final tones). This confirms that an AP-initial L tone helps Korean listeners locate the beginning of phrase-initial words in continuous speech. The English listeners’ weaker effect of AP-initial tone compared to Korean listeners is likely due to the reversal of the AP-initial tone effect from 200 to 700 ms in English listeners’ fixations. As with the AP-final tone, French and English listeners’ fixation curves differed from Korean listeners’ fixation curve in the strength and possibly directionality of the effect of the AP-initial tone as a function of time.
To understand these differences, we again ran an additional analysis that directly compared French and English listeners’ fixation curves; as with our analysis for the AP-final tone, this analysis allowed us to examine the strength and directionality of the interactions between time and the AP-initial tone for the two L2 groups while also comparing the L2 groups to each other. The growth-curve analysis with the best fit on L2 listeners’ target-over-competitor fixation advantage from the onset of the AP-initial tone to 1,400 ms post onset is presented in Table 5. The fixed effects included in the model were the simple effects of time (linear, cubic), AP-initial tone, and L1, and the two-way interactions between time (linear, cubic) and AP-initial tone, and between AP-initial tone and L1. The alpha level for this analysis was adjusted to .025.
The important significant effects in Table 5 can be summarized as follows. The positive estimate for the significant simple effect of AP-initial tone indicates that French listeners had higher proportions of fixations when the AP-initial tone was L than when it was H. The positive estimate for the significant interaction between the linear time polynomial and the AP-initial tone indicates that French listeners’ fixation curve was more ascending when the AP-initial tone was L than when it was H, with the estimate for this effect being much larger in size than the corresponding effect observed for Korean listeners (see Table 4). The negative estimate for the significant interaction between the cubic time polynomial and the AP-initial tone indicates that French listeners’ fixation curve was more ‘S’ shaped when the AP-initial tone was L than when it was H. (Note, however, that the AP-initial tone was not found to interact with both the cubic time polynomial and L1 in the previous analysis; see Table 4.) Importantly, the negative estimate for the two-way interaction between the AP-initial tone and L1 suggests that English listeners’ overall fixations show a weaker or reverse effect of AP-initial tone compared to French listeners. To further examine the effect of AP-initial tone in English listeners’ fixations, the model in Table 5 was releveled with English listeners as baseline. The releveled model revealed a significant effect of AP-initial tone with a positive estimate (β = 0.080, SE = 0.031, t = 2.585, p < .009), indicating that English listeners’ overall target-over-competitor fixation advantage is indeed greater when the beginning of phrase-initial words is signaled by an AP-initial L tone. The lack of a three-way interaction between the time polynomials, AP-initial tone, and L1 in the analysis of L2 learners’ data suggests that the two L2 groups showed similar fixation curves despite the advantage for the AP-initial H tone early on in English listeners’ fixations.
These results confirm the stronger interaction between the linear time polynomial and the AP-initial tone for French and English listeners compared to Korean listeners (hypothesized from Table 4), suggesting that L2 learners had more difficulty recovering from the misleading (i.e., non-canonical) AP-initial H tone. This may have caused French and English listeners’ fixation curves in the AP-initial L tone condition not to have less of a ‘U’-shape than their fixation curves in the AP-initial H tone condition, unlike Korean listeners’ fixation curves. Hence, although the AP-initial L tone had different effects on the exact shape of native and L2 listeners’ fixation curves, the consequence of these effects was similar – to enhance listeners’ target-over-competitor fixation advantage. Crucially, English listeners differed from both Korean listeners and French listeners in the size of the effect of the AP-initial tone due to a reversal of the effect early on in the trial. These L2 results were not expected: French listeners did not experience the predicted difficulty in using the AP-initial tone to locate the beginning of phrase-initial words in Korean, and English listeners’ weaker effect of AP-initial tone, caused by the early advantage for words beginning with an AP-initial H tone, suggests that they experienced more difficulty in using this tone than French listeners.
4. Discussion
The present study investigated whether the learning of intonational cues to word boundaries in speech segmentation would be more difficult if the L1 and L2 had phonologically similar but phonetically different intonations than if they had phonologically different intonations, thus providing another testbed for evaluating Tremblay et al.’s (2016) Prosodic-Learning Interference Hypothesis. It examined how native Korean listeners, French-speaking L2 learners of Korean, and English-speaking L2 learners of Korean use tonal information in the segmentation of Korean speech. Participants completed a visual-world eye-tracking experiment that tested whether listeners’ speech segmentation would be inhibited by a phonologically non-canonical AP-final L tone (relative to a canonical AP-final H tone) and enhanced by a phonologically canonical AP-initial L tone (relative to a non-canonical AP-initial H tone). Both Korean and English listeners were predicted to show such effects, with English listeners integrating the use of tonal cues later on in the word recognition process due to the great phonological differences between the L1 and L2 intonation systems. French listeners were expected to show an inhibitory effect of the phonologically non-canonical AP-final L tone despite their predicted non-target-like phonetic representation of the AP-final H tone, because this H tone occurs in the final syllable of Korean APs and thus still signals the end of phrase-final words in Korean (like French). However, French listeners were not expected to show an enhancing effect of the canonical AP-initial L tone because their predicted non-target-like phonetic representation of the AP-initial L tone would inhibit their learning that the AP-initial L tone can signal the beginning of phrase-initial words in Korean (unlike French). This predicted difficulty was hypothesized to come from the phonological similarity but fine-grained phonetic differences between the Korean and French intonations.
As expected, the results first showed that Korean listeners’ speech segmentation was modulated by both the AP-final and the AP-initial tones, with listeners’ target-over-competitor fixation advantage being greater in the phonologically canonical tone conditions (AP-final H tone, AP-initial L tone) than in the phonologically non-canonical tone conditions (AP-final L tone, AP-initial H tone). These tonal effects manifested themselves early on in the word-recognition process, suggesting that listeners make rapid use of this tonal information to distinguish between the target and competitor words. These results replicate the finding of earlier research that the AP-final H tone and the AP-initial L tone enhance Korean listeners’ speech segmentation (Kim et al., 2012; Kim & Cho, 2009; Tremblay et al., 2019). These findings suggest that listeners compute the prosodic structure of a given utterance by exploiting tonal patterns before and after a hypothesized lexical boundary, with the AP-final and AP-initial tonal cues affecting listeners’ target-over-competitor fixation and modulating lexical access.
Contrary to the predictions made on the basis of Tremblay et al.’s (2016) Prosodic-Learning Interference Hypothesis, however, the results also showed that French listeners’ speech segmentation displayed a target-like effect of tonal information for both the AP-final tone and the AP-initial tone, with listeners’ target-over-competitor fixation advantage being greater in the phonologically canonical tone conditions (AP-final H tone, AP-initial L tone) than in the phonologically non-canonical tone conditions (AP-final L tone, AP-initial H tone). These results were not predicted by the Prosodic-Learning Interference Hypothesis, as French employs an AP-initial L tone that is phonologically similar to, yet phonetically different from, the corresponding AP-initial L tone in Korean, with the French tone not being closely anchored to the AP-initial syllable, unlike that in Korean. Therefore, it is not necessarily the case that the learning of intonational cues to word boundaries in speech segmentation is difficult to achieve if the L1 and L2 have phonologically similar but phonetically different intonations, contrary to Tremblay et al.’s (2016) proposed Prosodic-Learning Interference Hypothesis. These results are particularly striking given that the French participants did not even reach the lower intermediate proficiency threshold in the Korean reading proficiency test that they completed.
One important difference between the present study and that of Tremblay et al. (2016), however, is that the current L2 learners were tested in an environment where the target language was spoken and had spent more time in an L2-speaking environment. In other words, although the participants did not have advanced proficiency in Korean, they had daily exposure to Korean. If the interference posited by the Prosodic-Learning Interference Hypothesis is more likely to affect L2 learners who have less exposure to the L2, it may have made it difficult for the Korean-speaking L2 learners of French in Tremblay et al. (2016) to learn the properties of French intonation (relative to English-speaking L2 learners of French). Crucially, perceptual assimilation difficulties may not be persistent for the learning of intonational information, with immersed L2 learners being able to perceive subtle L1-L2 intonational differences, as shown in the present study. This prediction, if correct, would suggest that the mechanisms that underlie the learning of intonational and segmental information are similar, but the nature of the information to be learned differs such that learning difficulties are less persistent for intonational information than for segmental information. More precisely, it may be the case that non-proficient L2 learners assimilate phonologically similar L2 tones to existing L1 tones as they do with phonologically similar L2 and L1 segments (e.g., Best & Tyler, 2007; Flege, 1995; van Leussen & Escudero, 2015), but that, with sufficient exposure to the L2, they become more sensitive to the fine-grained phonetic details of intonation compared to those of segments. This difference may be due to the less categorical and non-lexical nature of intonational information compared to that of segmental information, and possibly to the greater ease with which intonational information can be extracted from the speech signal compared to segmental information (after all, newborn infants already show a preference for the prosody of the birth mother's language; e.g., Mehler & Dupoux, 1994). Future research should determine whether learning difficulties are indeed less persistent for intonational information than for segmental information.
The results also revealed that English listeners did not differ from French listeners in their use of the AP-final tone in speech segmentation but showed a weaker effect of AP-initial tone due to a reversal of the tonal effect early on in the word recognition process. This reversal, with English listeners’ target-over-competitor fixation advantage being greater when the AP-initial tone was H than when it was L, is likely a transfer effect from an L1-based speech segmentation routine, with English listeners’ early word activation benefiting from a word-initial H tone before showing the effect in the correct direction for Korean. One straightforward explanation of these results is that the learning of intonational cues may be more difficult if it requires listeners to suppress an L1-based relationship between a cue and a word edge than if it requires them to learn a new, L2-based relationship between a cue and a word edge. Functionally monolingual English listeners have been shown to use F0 rise as a cue to word-initial boundaries but not as a cue to word-final boundaries (e.g., Tremblay, Namjoshi, Spinelli, Broersma, Cho, Kim, Martínez-García & Connell, 2017; Tyler & Cutler, 2009). For English listeners, therefore, suppressing the association between an H tone and word-initial boundaries is more difficult to accomplish than learning a new association between an H tone and the last syllable of a phrase-final word. Previous studies on the use of segmental cues in speech segmentation have shown that L2 learners can learn new, L2-based speech segmentation routines but have difficulty suppressing L1-based segmentation routines (e.g., Tremblay & Spinelli, 2014; Weber & Cutler, 2006). The current results suggest that this may also be the case for the learning of intonational information.
5. Conclusion
The present study investigated how French-speaking and English-speaking L2 learners of Korean use intonational cues to locate word boundaries in Korean, with focus on whether the learning of intonational cues to word boundaries in speech segmentation is more difficult if the L1 and L2 have phonologically similar but phonetically different intonations than if they have phonologically different intonations (Tremblay et al., 2016). The results of an eye-tracking speech segmentation experiment with native Korean listeners and French-speaking and English-speaking L2 learners of Korean yielded the following findings: (i) both native and L2 listeners exploited intonational cues to word boundaries when segmenting Korean speech into words; (ii) the learning of L2 intonational cues that differ in subtle ways from L1 intonational cues is not necessarily difficult if L2 learners have had sufficient exposure to the L2 (French-speaking L2 learners of Korean); and (iii) it is more difficult to suppress segmentation routines that are based on L1 intonational cues than to learn segmentation routines that are based on L2 intonational cues (English-speaking L2 learners of Korean) (see also Tremblay & Spinelli, 2014; Weber & Cutler, 2006). More broadly, the current study provides a strong incentive for future research to examine the effect of linguistic experience on L2 learners’ use of intonational cues in parallel with their use of segmental information (cf. Cho, McQueen & Cox, 2007) in speech segmentation.
Acknowledgements
This research is based upon work supported in part by the National Science Foundation (BCS-1423905, awarded to the first author) and by the Global Research Network Program through the Ministry of Education of the Republic of Korea and the National Research Foundation of Korea (NRF-2016S1A2A2912410, awarded to the second and third authors).
Supplementary Material
Supplementary material can be found online at https://doi.org/10.1017/S136672892000053X