Acoustic characteristics and learner profiles of low-, mid- and high-level second language fluency

Published online by Cambridge University Press:  21 February 2018

KAZUYA SAITO*
Affiliation:
Birkbeck, University of London
MELTEM ILKAN
Affiliation:
Birkbeck, University of London
VIKTORIA MAGNE
Affiliation:
University of West London
MAI NGOC TRAN
Affiliation:
Birkbeck, University of London
SHUNGO SUZUKI
Affiliation:
Lancaster University
* ADDRESS FOR CORRESPONDENCE: Kazuya Saito, Department of Applied Linguistics and Communication, Room 334, Birkbeck, University of London, 25 Russell Square, London, United Kingdom WC1B 5DQ. E-mail: k.saito@bbk.ac.uk

Abstract

In the context of 90 adult Japanese learners of English with diverse second language experience and 10 native speakers, this study examined the linguistic characteristics and learner profiles of low-, mid- and high-level fluency performance. The participants’ spontaneous speech samples were initially rated by 10 native listeners for global fluency on a 9-point scale (1 = dysfluent, 9 = very fluent), and then divided into four proficiency groups via cluster analyses: low (n = 29), mid (n = 30), high (n = 31), and native (n = 10). Next, the data set was analyzed for the number of pauses within/between clauses, articulation rate, and the frequency of repetitions/self-corrections. According to the results of a series of analyses of variance, the frequency of final-clause pauses differentiated low- and mid-level fluency performance; the number of mid-clause pauses differentiated mid- and high-level performance; and articulation rate differentiated high-level and nativelike performance. The analyses also found that the participants’ second language fluency was significantly associated with their length of residence profiles (0–18 years), but not with their age of arrival profiles (19–40 years).

Type
Original Article
Copyright
Copyright © Cambridge University Press 2018 

In the field of second language acquisition (SLA), there is a growing consensus among a range of theoretical perspectives that adult second language (L2) learners’ speech can continue to develop as a function of increased practice and experience (i.e., experience effects), and that the extent to which these learners can eventually improve their L2 performance is strongly tied to their age of acquisition (i.e., age effects; e.g., Flege, 2016, for the speech learning model). While much discussion in this research area has concerned the acquisition of L2 segmentals (Piske, MacKay, & Flege, 2001), a growing number of L2 speech researchers have also examined the underlying mechanisms of L2 fluency development.

The existing literature has extensively worked to illustrate which subconstructs of L2 speech (speed, breakdown, or repair) determine native speakers’ perception of fluency, and what kinds of learner factors are crucial for efficient and effective fluency development (for a comprehensive review, see Segalowitz, 2016). Due to the relatively limited quantity and quality of samples used in previous studies, however, little is known about the acoustic correlates of perceived fluency at different proficiency levels, and the role of learner variables (experience and age) in the attainment of various levels of L2 fluency performance.

In the context of 90 adult Japanese learners of English with diverse L2 experience and 10 native speakers (N = 100), this study aimed to examine the specific linguistic characteristics and learner profiles of low-, mid-, and high-level fluency performance. We elucidated which aspects of temporal information (speed, breakdown, and repair) native speakers differentially relied on while assessing the overall fluency of the native and nonnative speech samples. Subsequently, we probed the extent to which these low-, mid- and high-level L2 fluent learners differed in terms of the length of residence (0–18 years) and the age of arrival (19–40 years) to an L2 speaking environment.

BACKGROUND

L2 perceived, utterance, and cognitive fluency

In its broadest sense, fluency, especially in practice, has been considered equivalent to general oral proficiency (Chambers, 1997). On a narrower scale, many L2 scholars have focused on which acoustic properties relate to the optimal, smooth, and fluid delivery of L2 speech (utterance fluency), and how these features interact to influence native speakers’ fluency judgments of L2 speech (perceived fluency; Skehan, 2003; Tavakoli & Skehan, 2005). In the existing literature, the components of utterance fluency have been analyzed through three groups of objective measures: (a) breakdown (e.g., the number of filled and unfilled pauses between and within clauses); (b) speed (e.g., the number of pruned syllables uttered per minute); and (c) repair (e.g., the number of repetitions and self-corrections; Bosker, Pinget, Quené, Sanders, & de Jong, 2013). As summarized in Table 1, it has been generally shown that native speakers’ fluency judgments are mainly associated with the breakdown and speed measures and, to a much lesser degree, with the repair measures (see Footnote 1).

Table 1. Summary of five major L2 fluency studies examining the relationship between perceived and utterance fluency

Furthermore, a growing number of SLA scholars have also examined the cognitive processes underlying fluent speech performance (cognitive fluency). For example, Kormos (2006) proposed that certain aspects of utterance fluency measures (breakdown, speed, and repair) can reflect three different stages of L2 speech production: conceptualizing the message, encoding and formulating linguistic information, and monitoring one's own output. Specifically, one breakdown fluency measure, the number of final-clause pauses, is argued to signal L2 learners’ engagement with conceptualization and content planning; another breakdown measure, the number of mid-clause pauses, is related to the present state of L2 learners’ timely phonological, lexical, and syntactic encoding; and repair fluency measures reflect the amount of attentional resources that L2 learners have for the purpose of monitoring their own speech. Comparatively, speed fluency measures (e.g., speech/articulation rate) are thought to involve every dimension of L2 speech production (conceptualization, formulation, and monitoring), and act as a crucial indication of automatization.

Whereas much attention has been given to examining the complex relationships between perceived, utterance, and cognitive fluency, the extent to which these different components of fluency actually develop as L2 learners become more proficient over time (beginner → intermediate → advanced) has remained surprisingly understudied. As observed in Table 1, previous studies have drawn on relatively small data sets focusing on particular groups of L2 learners with relatively homogeneous proficiency levels (N = 16–40), a common methodological problem in the field of SLA, as pointed out by Norris, Plonsky, Ross, and Schoonen (2015). According to a componential view of proficiency, L2 speech is a composite phenomenon constituting both global dimensions (e.g., perceived fluency) and subconstructs (e.g., breakdown, speed, and repair fluency; de Jong, Steinel, Florijn, Schoonen, & Hulstijn, 2012), and the associations between these global and subskill domains may vary in relation to different proficiency levels (Higgs & Clifford, 1982). In L2 assessment research (which is outside of the domain of L2 fluency research), for example, it has been shown that the relative weight of vocabulary appropriateness as measured by global oral proficiency judgments (e.g., comprehensibility and communicative adequacy) may be strong for low-level proficiency L2 learners, while grammar and pronunciation accuracy could be more distinguishing of high-level proficiency L2 learners (e.g., Isaacs & Trofimovich, 2012).

Though few in number, some researchers have examined which acoustic variables constitute beginner-, intermediate-, and advanced-level L2 fluency. Adopting a cross-sectional design, Cucchiarini, Strik, and Boves (2000) compared the acoustic characteristics of the speech of two different groups of L2 Dutch learners (beginner vs. intermediate), finding that their perceived fluency was predicted by different types of utterance fluency (breakdown fluency for the beginner group vs. speed fluency for the intermediate group). More recently, Derwing, Munro, Thomson, and Rossiter (2009) longitudinally tracked 32 L2 learners of English over the first 2 years of their immersion in Canada. The results showed that the participants steadily improved their perceived fluency by developing their articulation rate throughout the project, but dramatically decreased the number of pauses in their speech only within the first year of the research. Using a quasi-experimental, pretest/posttest design, Lambert, Kormos, and Minn (2017) examined how L2 learners enhanced diverse dimensions of utterance fluency while repeating the same communicative task six times over a 2-hr English conversation session. The results showed that the subconstructs of L2 utterance fluency developed at different learning rates, as predicted by Kormos's (2006) psycholinguistic model of speech production. The participants continued to increase their speech rate throughout the multiple task repetitions (i.e., automatizing all the relevant speech processing systems); their final-clause and mid-clause pauses significantly decreased over the first three or four repetitions; and their self-repairs substantially declined only between the fourth and fifth repetitions.

Taken together, the previous literature suggests that adult L2 learners’ improvement can be observed particularly (a) in the development of breakdown and speed fluency by enhancing their smooth and fluid access to the conceptualizer and formulator in the initial stage of SLA (beginner → intermediate); and (b) in the development of repair and speed fluency by optimizing the process of monitoring in the later stages of SLA (intermediate → advanced). Following this line of thought, the current study aimed to revisit the acoustic correlates of low-, mid-, and high-level fluent speech in the context of a relatively large-scale data set covering a wide range of proficiency levels (N = 100).

Experience and age effects on adult second language speech learning

In the field of L2 speech research, scholars have explored two essential questions regarding the mechanisms underlying successful L2 pronunciation learning: (a) how adult L2 learners can quickly improve the spectral and temporal dimensions of consonants and vowels in relation to increased experience (i.e., the role of experience in rate of learning); and (b) the extent to which they can eventually refine the nativelikeness of their pronunciation proficiency, especially in accordance with learners’ age of acquisition (i.e., the role of age in ultimate attainment; for a comprehensive review, see Saito, in press). With respect to the former (rate of learning), length of residence (LOR) has been considered as a rough proxy for L2 experience, as it does not always mirror how L2 learners actually use the target language. For example, certain learners could choose to use their native language (L1) rather than their L2 as the primary language of communication for the duration of their potentially extensive residence (Flege & Liu, 2001). However, there is ample evidence that adult L2 speech learning continues to take place as a function of increased LOR, as long as learners use the target language through interaction with other native and nonnative speakers on a daily basis (e.g., Derwing & Munro, 2013; Saito, 2015).

Within the first few years of immersion, many adult L2 learners’ phonological forms quickly become intelligible, especially in the context of frequently used words (Munro & Derwing, 2008, for vowels; Saito & Munro, 2014, for approximants). These L2 learners appear to continue to enhance their segmental (e.g., Baker, 2010, for stops; Saito & Brajot, 2013, for approximants; Flege, Bohn, & Jang, 1997, for vowels) and prosodic (e.g., Trofimovich & Baker, 2006, for word stress and intonation) accuracy over an extensive period of time (e.g., 5–10 years of LOR). This process of phonological reattunement is assumed to facilitate learners’ comprehension and production of a number of phonologically similar words (e.g., minimal pairs; Bundgaard-Nielsen, Best, & Tyler, 2011), and is used as empirical support for many theoretical accounts that claim that even adult L2 learners can learn new sounds in a manner similar to L1 acquisition (e.g., Flege, 2016, for the speech learning model; Best & Tyler, 2007, for the perceptual assimilation model-L2).

With respect to the ultimate attainment of L2 speech learning, a number of large-scale studies have demonstrated strong age effects on the final quality of adult L2 learners’ pronunciation proficiency after years of immersion in an L2 speaking environment. Whereas some L2 learners can achieve near-nativelike L2 pronunciation proficiency, especially when exposed to the target language from an early age (<6–7 years), other L2 learners with late age of acquisition (AOA) profiles (>12–14 years) tend to have detectable accents (e.g., Flege, Munro, & MacKay, 1995; Granena & Long, 2013; Saito, 2013). This could be because adult L2 learners have already lost their access to the innate language acquisition device by which to pick up the target language at a nativelike level based on mere exposure (Abrahamsson & Hyltenstam, 2008; Granena & Long, 2013), or because certain cognitive abilities (e.g., working memory, brain size, speech processing, and attentional/inhibitory control) relevant for successful language acquisition likely decline after the age of 18–20 years (Birdsong, 2006; Saito, 2013).

Although L2 pronunciation learning involves a wide range of acoustic phenomena (e.g., the accurate and fluent use of consonants, vowels, word stress, intonation, and rhythm), it is noteworthy that the aforementioned studies have been exclusively concerned with the effects of experience and age on the development of L2 segmental accuracy. What has remained understudied to date is the extent to which such findings could be generalized to other aspects of L2 pronunciation development (prosody, rhythm, and fluency). To our knowledge, very few studies have examined the role of LOR and AOA particularly in L2 fluency development and attainment, especially focusing on a range of L2 learners with varied experience, age, and proficiency profiles. With a total of 30 Korean learners of English (LOR = 0.1 to 15 years), Trofimovich and Baker (2006) showed that the participants’ breakdown fluency (the number of pauses) was associated with their LOR, especially among the inexperienced and moderately experienced learners (LOR < 3 years). In the context of 102 experienced German learners of English (LOR > 10 years), Lahmann, Steinkrauss, and Schmid (2017) did not find any significant relationship between participants’ age of arrival and their utterance fluency performance (breakdown, speed, and repair), suggesting that many L2 learners may be able to attain high-level fluency as a function of increased experience regardless of their starting AOA (different from L2 segmental learning, which is amenable to both experience and age effects throughout one's life).

To further examine this topic, the current study compared 90 Japanese learners of English with varied LOR and AOA profiles (see the Method section) with 10 native speaker baselines. Our data set departed in quantity/quality from the aforementioned studies (Trofimovich & Baker, 2006, for 30 inexperienced and experienced L2 learners; Lahmann et al., 2017, for 102 experienced learners but without any comparison with inexperienced learners or native speakers). We aimed to identify low-, mid-, and high-fluent speakers by way of a cluster analysis based on 10 native speakers’ subjective judgment scores (i.e., perceived fluency). Subsequently, we investigated which components of utterance fluency (breakdown, speed, and repair) could distinguish between the three different Japanese (low, mid, and high) and the English baseline (native) groups. Finally, we explored whether and to what degree the grouping category (low, mid, or high) could be related to the participants’ LOR and AOA profiles. The following research questions were thus formulated:

  1. How do breakdown, speed, and repair fluency correlate with native speakers’ intuitive judgments of fluency?

  2. Which utterance fluency measures distinguish between learners at low, mid, high, and native levels of perceived fluency?

  3. To what extent do experience and age factors influence the attainment of such different fluency levels?

METHOD

Participants in the current study included 100 speakers (90 nonnatives and 10 natives) who provided spontaneous speech samples, and 10 native listeners who rated all the speech samples for perceived fluency.

Speech samples

A total of 90 spontaneous speech samples were drawn from our unpublished corpus, which currently comprises 500+ Japanese learners of English with varied L2 learning experience in Japan, Canada, the United States and the United Kingdom (for details, see Saito, 2017; for the materials deposited in IRIS, see Marsden, Mackey, & Plonsky, 2016). All of them were native speakers of Japanese (both of their parents were L1 Japanese speakers) and started learning L2 English from Grade 7 in foreign language classroom settings in Japan.

Speakers

To cover a wide range of proficiency levels and learner profiles, the 90 participants were selected in accordance with the following categories, which were adapted from Trofimovich and Baker (2006): inexperienced learners (LOR = 0 years), experienced learners (LOR ≤ 5 years), and attainers (LOR ≥ 6 years). For the latter two groups, care was taken to choose only those who had reported using L2 English as their main language of communication at the time of the investigation. According to the analyses of individual interviews, their L2 use was considered highly frequent on a 6-point scale (1 = infrequent, 6 = very frequent; M = 5.3; range = 4–6). This was done to avoid including L2 learners who actually continued to use L1 Japanese despite their residence in Canada, and whose LOR factor did not correlate with the actual quantity/quality of their L2 experience (Flege, 2016). Finally, we selected from the same corpus a group of native English speakers who had completed the same task, in order to provide baseline data for comparison.

  1. Inexperienced Japanese learners (n = 10). A total of 10 inexperienced university-aged Japanese learners (at the time of the project) were chosen to provide the lower range of the baseline data (L2 learners without any experience abroad; M age = 20.4 years; range = 18–21 years). Since they had never stayed nor studied abroad (LOR = 0 months), their performance was considered to serve as a proxy for the initial stage of Japanese learners’ L2 fluency development (solely based on their 6 years of foreign language experience in Japan).

  2. Experienced Japanese learners (n = 40). A total of 40 Japanese learners were chosen for the “experienced” category (M age = 34.7 years; range = 22–48 years). These learners had a range of LOR profiles in Vancouver and Calgary, Canada (M = 1.4 years; range = 0.1 to 5 years) with widely different AOA points (M age = 28.3 years; range = 19–40 years). Given the cross-sectional (Trofimovich & Baker, 2006) and longitudinal (Munro, Derwing, & Saito, 2013) evidence that much L2 speech learning likely takes place over an extensive period of immersion (LOR = 0–5 years), their performance was assumed to represent the midstage of L2 fluency development.

  3. Japanese attainers (n = 40). In line with the standards in L2 ultimate attainment research (e.g., DeKeyser, 2013), a total of 40 Japanese attainers were also included (M age = 40.2 years; range = 28–63 years). They had been in Canada for at least 6 years (M = 11.3 years; range = 6–18 years), and had various AOA profiles (M age = 27.1 years; range = 21–36 years). Their performance was considered to indicate the final stage of L2 fluency development.

  4. Native English baselines (n = 10). To provide the upper range of the baseline data (targetlike forms without any foreign accents), this group comprised a total of 10 native speakers of English recruited in Vancouver, Canada (M age = 27.5 years; range = 18–37 years). At least one of their parents was an L1 English speaker. They reported that they had been using English as their L1 from birth onward and had limited knowledge/use of the other official language in Canada, French.

Task procedure

All the speakers engaged in a timed picture description task designed to elicit spontaneous language production, where the primary focus was on conveying meaning rather than form under communicative pressure (Spada & Tomita, 2010). The task was developed based on picture description tasks that have been widely used in previous L2 speech research where L2 learners explained a series of pictures in a sequence (e.g., Derwing et al., 2009) or a single picture (e.g., Munro & Mann, 2005; see Footnote 2). In this task, the participants described seven pictures with a limited amount of planning time (i.e., 5 s per photo). Whereas the first four pictures were used as practice for participants to get used to the task procedure (describing a photo with little planning), the remaining three pictures were used for the final analyses. Given that the current study included inexperienced learners who had noted much difficulty in producing free speech, especially due to the significant lack of their conversational experience inside/outside classrooms, the decision was made to provide three key words so that they could at least start producing language without too much silence at the beginning of each picture description (see Footnote 3).

The first 10 s of each picture description were cut, combined, and saved in a WAV file for each speaker (10 s × 3 pictures = 30 s in total). Efforts were made to ensure that each sample started from the beginning of the picture description without initial dysfluencies (e.g., false starts and/or hesitations) and ended at a phrase boundary. The total length of each speech sample (30 s per speaker) was comparable to previous fluency research (e.g., Bosker et al., 2013, for 20 s; Derwing et al., 2009, for 20 s; see Footnote 4). All the speech data were individually recorded in a quiet room in a community center, a university lab, or the participants’ residences using digital Roland-05 audio recorders (set to 44.1 kHz sampling rate with 16-bit quantization).
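The cut-and-combine step can be illustrated with a minimal sketch using Python's standard `wave` module. The function name `concat_first_seconds` and the file names are hypothetical, and the sketch assumes uncompressed PCM WAV files that share one format; it does not reproduce the manual trimming of initial dysfluencies or the alignment to phrase boundaries described above.

```python
import wave


def concat_first_seconds(paths, out_path, seconds=10.0):
    """Join the first `seconds` of each WAV file in `paths` into one file."""
    params = None
    chunks = []
    for path in paths:
        with wave.open(path, "rb") as src:
            fmt = (src.getnchannels(), src.getsampwidth(), src.getframerate())
            if params is None:
                params = fmt
            elif fmt != params:
                raise ValueError(f"{path} does not match the first file's format")
            # Never ask for more frames than the recording contains.
            n_frames = min(int(seconds * src.getframerate()), src.getnframes())
            chunks.append(src.readframes(n_frames))
    if params is None:
        raise ValueError("at least one input file is required")
    with wave.open(out_path, "wb") as dst:
        dst.setnchannels(params[0])
        dst.setsampwidth(params[1])
        dst.setframerate(params[2])
        for chunk in chunks:
            dst.writeframes(chunk)


# Hypothetical usage: three 10-s excerpts -> one 30-s sample per speaker.
# concat_first_seconds(["pic5.wav", "pic6.wav", "pic7.wav"], "speaker01.wav")
```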

Perceived fluency analyses

Listeners

A total of 10 native listeners were recruited in London, United Kingdom, to assess all the speakers’ global fluency (M age = 29.3 years; range = 18–51 years). They were born and raised in English-speaking families in London and had at least one native English-speaking parent. None of the participants had studied Japanese prior to the project. Their familiarity with Japanese-accented English was moderate (M = 2.9, range = 1–4) on a 6-point scale (1 = not at all, 6 = very much; see Footnote 5).

Rating procedure

Following the methodology by Derwing et al. (2009), the listeners received a brief explanation on the definition of perceived global fluency (i.e., the flow and smoothness of speech); notably, raters were not asked to pay attention to specific subconstructs of L2 speech (i.e., utterance fluency features), such as the number of pauses, repetitions, and self-corrections. Next, they proceeded with a practice session where they rated three speech samples (not included in the main data set) and explained their decisions for each sample. After we ensured that each listener focused on fluency (the raters’ comments mainly concerned “tempo” rather than overall proficiency, accuracy, or complexity of L2 speech), they proceeded to assess a total of 100 speech samples that were played in a randomized order through Praat (Boersma & Weenink, 2012) on a 9-point scale (1 = not fluent at all, 9 = very fluent).

As operationalized in the previous literature (e.g., Bosker et al., 2013) and in order to tap into their initial intuitions and impressions about the L2 speech, the listeners were permitted to listen to each sample only once. In addition, the listeners were explicitly told to use the entire 9-point scale as much as possible, and were informed that the data set represented a wide scope of adult L2 fluency proficiency ranging from inexperienced learners (without any experience abroad) to experienced learners (with extensive LOR in an L2 speaking environment) to native speakers. Since each listener session took approximately 2 hr (including explanation, training, and rating), all the listeners took a 10-min break halfway through.

Interrater agreement

According to the results of Cronbach's α analyses, the 10 listeners showed relatively high interrater agreement on their intuitive judgments of perceived L2 fluency (α = 0.98) in line with other fluency studies (e.g., Bosker et al., 2013, for α = 0.97).
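As an illustration of how such an agreement index can be computed, the following minimal sketch applies the standard Cronbach's α formula, treating each listener as an "item" and each speech sample as a case. The function name and the toy ratings are hypothetical and are not the study's data.

```python
from statistics import variance


def cronbach_alpha(ratings):
    """Cronbach's alpha for a samples-by-listeners table of ratings.

    `ratings` is a list of rows, one row per speech sample and one
    column per listener.
    """
    if len(ratings) < 2 or len(ratings[0]) < 2:
        raise ValueError("need at least two samples and two listeners")
    k = len(ratings[0])  # number of listeners
    # Sum of each listener's score variance across samples.
    listener_variance_sum = sum(variance(col) for col in zip(*ratings))
    # Variance of the per-sample totals (summed over listeners).
    total_variance = variance([sum(row) for row in ratings])
    return (k / (k - 1)) * (1 - listener_variance_sum / total_variance)


# Hypothetical toy data: 5 samples rated by 3 listeners on a 9-point scale.
toy_ratings = [
    [2, 3, 2],
    [4, 4, 5],
    [6, 5, 6],
    [7, 8, 7],
    [9, 9, 8],
]
toy_alpha = cronbach_alpha(toy_ratings)
```

Values close to 1 indicate that the listeners rank the samples in a very similar way, which is the sense in which the agreement reported above is described as relatively high.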

Utterance fluency analyses

All the speech samples were transcribed into analysis of speech units (Foster, Tonkyn, & Wigglesworth, 2000). Conforming to Kormos's (2006) utterance and cognitive fluency model, these samples were coded for three different dimensions of utterance fluency (breakdown, speed, and repair), which are assumed to correspond to four stages of L2 speech production (conceptualization, formulation, articulation, and monitoring). We purposefully selected these utterance measures as they have been found to demonstrate little intercollinearity (Bosker et al., 2013). For breakdown fluency, the number of filled (e.g., ah, oh, and eh) and unfilled (>250 ms; Bosker et al., 2013) pauses in the middle and end of clauses were manually calculated and divided by the total number of words (see Footnote 6). For speed fluency, the total phonation time (without all filled pauses) was divided by the total number of syllables (i.e., articulation rate). For repair fluency, the number of repetitions and self-corrections were divided by the total number of words.
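The five ratios described above can be sketched as follows. This is a minimal illustration that assumes the pause, repair, word, and syllable counts have already been coded by hand; the class and field names are hypothetical. Articulation rate is computed exactly as worded in the text (phonation time divided by syllables), which yields seconds per syllable, so smaller values correspond to faster speech.

```python
from dataclasses import dataclass


@dataclass
class SampleCounts:
    """Hand-coded counts for one 30-s speech sample (hypothetical schema)."""
    words: int
    syllables: int
    phonation_time_s: float  # speaking time with pauses excluded
    mid_clause_pauses: int
    final_clause_pauses: int
    repetitions: int
    self_corrections: int


def utterance_fluency(c):
    """Return the five utterance fluency measures for one sample."""
    if c.words <= 0 or c.syllables <= 0:
        raise ValueError("words and syllables must be positive")
    return {
        # Breakdown: pauses per word, split by clause position.
        "mid_clause_pause_ratio": c.mid_clause_pauses / c.words,
        "final_clause_pause_ratio": c.final_clause_pauses / c.words,
        # Speed: phonation time per syllable, as worded in the text.
        "articulation_rate": c.phonation_time_s / c.syllables,
        # Repair: repetitions and self-corrections per word.
        "repetition_ratio": c.repetitions / c.words,
        "self_correction_ratio": c.self_corrections / c.words,
    }


# Hypothetical sample, not taken from the study's data.
example = utterance_fluency(SampleCounts(
    words=50, syllables=70, phonation_time_s=21.0,
    mid_clause_pauses=5, final_clause_pauses=3,
    repetitions=2, self_corrections=1,
))
```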

Three trained researchers served as analysts for this portion of the study: the second, third, and fourth authors. In a 1-hr training session, they received explicit explanation on each category of breakdown, speed, and repair fluency. Next, they practiced and discussed the analytic procedure with five similar speech samples (not included in the main data set). After they confirmed their clear understanding of the concept of the utterance fluency categories, they then analyzed 10 samples randomly selected from the main data set in order to check intercoder reliability. The results of Cronbach's α analyses found relatively high α values for breakdown (α = 0.95 for filled pauses, α = 0.92 for unfilled pauses, α = 0.91 for final-clause pauses, and α = 0.91 for mid-clause pauses), speed (α = 0.93 for articulation rate), and repair (α = 0.96 for repetitions, α = 0.97 for self-corrections). Finally, each of the three researchers was randomly assigned a different subset of 30 speech samples to analyze.

RESULTS

The first objective of the statistical analyses was to identify three different levels of L2 fluency (low, mid, and high) based on the results of the 10 listeners’ rating scores of 90 nonnative speech samples. To this end, a hierarchical cluster analysis using Ward's method was adopted to categorize all the samples (n = 90) into smaller homogeneous groups. In accordance with a visual inspection of the dendrogram (see Figure 1), a three-cluster solution was adopted, dividing the 90 Japanese learners into three groups: low (n = 29), mid (n = 30), and high (n = 31).

Figure 1. Dendrogram tree of hierarchical clusters based on the participants’ perceived fluency scores.
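The grouping logic of Ward's method can be illustrated with a small pure-Python sketch. It assumes, for simplicity, that each speaker is represented by a single mean fluency score; the function name and the toy scores are hypothetical. The sketch stops merging at a requested number of clusters rather than drawing a dendrogram, so it does not replace the visual inspection described above.

```python
def ward_clusters(scores, n_clusters):
    """Agglomerative clustering of 1-D scores with Ward's criterion.

    Each score starts as its own cluster. At every step the pair of
    clusters whose merger adds the least within-cluster sum of squares
    is merged, until `n_clusters` remain.
    """
    if not 1 <= n_clusters <= len(scores):
        raise ValueError("n_clusters must be between 1 and len(scores)")
    clusters = [[s] for s in scores]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                a, b = clusters[i], clusters[j]
                mean_a = sum(a) / len(a)
                mean_b = sum(b) / len(b)
                # Increase in within-cluster sum of squares if a and b merge.
                cost = len(a) * len(b) / (len(a) + len(b)) * (mean_a - mean_b) ** 2
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    # Return clusters ordered from lowest to highest scores.
    return sorted((sorted(c) for c in clusters), key=lambda c: c[0])


# Hypothetical mean fluency scores on the 9-point scale.
toy_scores = [2.1, 2.4, 2.0, 4.8, 5.1, 5.0, 7.9, 8.2, 8.0]
low, mid, high = ward_clusters(toy_scores, 3)
```

For real data sets, library routines such as SciPy's hierarchical clustering functions are the more usual choice; the loop above is only meant to make the merging criterion explicit.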

The descriptive statistics of the participants’ perceived fluency scores are summarized in Table 2. According to the results of 95% confidence interval analyses, the confidence intervals of the groups’ mean scores did not overlap, indicating that the four groups (low, mid, high, and native) significantly differed in their perceived fluency performance at a p < .05 level.

Table 2. Descriptive statistics of perceived and utterance fluency scores

Note. Perceived fluency scores were based on a total of 10 native listeners’ intuitive judgments on a 9-point scale (1 = dysfluent, 9 = very fluent).

The second objective of the statistical analyses was to investigate the relationship between the five utterance fluency measures: final-clause pause ratio, mid-clause pause ratio, articulation rate, repetition ratio, and self-correction ratio (for the descriptive results, see Table 2). According to the results of Pearson correlation analyses (as summarized in Table 3), three significant correlations were found: between final-clause pauses (breakdown) and articulation rate (speed); between mid-clause pauses (breakdown) and articulation rate (speed); and between repetitions and self-corrections (repair; p < .005, Bonferroni corrected). In contrast, no significant correlation was found between the two breakdown measures (mid-clause vs. final-clause pauses). The repair measures were not significantly associated with either the breakdown or the speed measures (p > .005). In keeping with Kormos's (2006) proposal, the results suggest that the five utterance fluency measures included in the current study seem to tap into the participants’ abilities to perform three separate cognitive operations during L2 speech production: (a) conceptualization (final-clause pauses and articulation rate), (b) formulation (mid-clause pauses and articulation rate), and (c) monitoring (repetitions and self-corrections).

Table 3. Results of Pearson correlation analyses between five utterance fluency measures

Note. *p < .005 (Bonferroni corrected)
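
The corrected thresholds reported in this section are consistent with dividing a nominal alpha of .05 by the number of tests in each family. The following is a minimal sketch under that assumption; the function name is illustrative and the family size of 10 is inferred from the five measures rather than stated in the text.

```python
from math import comb


def bonferroni_alpha(n_tests: int, alpha: float = 0.05) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


# Five utterance fluency measures yield C(5, 2) = 10 pairwise correlations
# (assumed family size for the correlations in Table 3).
n_pairs = comb(5, 2)
pairwise_threshold = bonferroni_alpha(n_pairs)

print(n_pairs, pairwise_threshold)
```

Under the same logic, a family of two comparisons gives a threshold of .025, which matches the level reported for the LOR multiple comparisons.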

The third objective of the statistical analyses was to illustrate the acoustic correlates of the 10 native listeners’ intuitive fluency judgments of the 100 native and nonnative speakers. Given that the listeners demonstrated relatively high interrater agreement as to L2 fluency judgments (α > 0.95), the perceived fluency scores were averaged across raters to generate a single score for each speaker. To analyze the relationship between perceived and utterance fluency, the mean fluency scores (dependent variable) and all the breakdown, speed, and repair measures (independent variables) were analyzed via a set of Pearson correlation analyses. As shown in Table 4, the perceived fluency scores were significantly linked to mid- and final-clause pauses and articulation rate (p < .010, Bonferroni corrected). However, the role of the repair factor (repetition and self-correction) in perceived fluency remained unclear, as the correlation between the repetition ratio and perceived fluency reached only marginal significance (p = .014).

Table 4. Correlation coefficients between perceived fluency scores and five utterance fluency measures

Note. *p <.001 (Bonferroni corrected)
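
The agreement check and averaging step described above can be sketched as follows. The sketch assumes that the reported α is Cronbach's alpha computed with raters as items; the ratings are invented for illustration and are not the study's data.

```python
from statistics import mean, variance


def cronbach_alpha(ratings):
    """Cronbach's alpha for a speakers x raters table (list of rows)."""
    n_raters = len(ratings[0])
    # Sample variance of each rater's scores across speakers.
    rater_vars = [variance([row[j] for row in ratings]) for j in range(n_raters)]
    # Sample variance of each speaker's summed score.
    total_var = variance([sum(row) for row in ratings])
    return (n_raters / (n_raters - 1)) * (1 - sum(rater_vars) / total_var)


def mean_scores(ratings):
    """Average each speaker's ratings across raters."""
    return [mean(row) for row in ratings]


# Hypothetical ratings: 4 speakers x 3 raters on the 9-point scale.
toy_ratings = [[2, 3, 2], [4, 4, 5], [6, 7, 6], [8, 9, 9]]
alpha = cronbach_alpha(toy_ratings)
scores = mean_scores(toy_ratings)

print(round(alpha, 3), scores)
```

Averaging across raters is only defensible when agreement is high, which is why the alpha value is checked before a single score per speaker is derived.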

To further examine the relative weights of the five utterance measures in the perceived fluency scores, a stepwise multiple regression analysis was performed. As summarized in Table 5, the regression model, which included three utterance fluency variables (articulation rate, mid-clause pauses, and final-clause pauses), accounted for 45.0% of the variance in the perceived fluency scores, with no evidence of strong collinearity in the model (VIF = 1.85). According to this model, the native listeners used speed fluency (articulation rate) as a primary cue, and breakdown (mid- and final-clause pauses) as a secondary cue for the perceived fluency judgments.

Table 5. Results of multiple regression analysis using acoustic variables as predictors of perceived fluency

Note. The variables entered into the regression equation included mid-clause pause ratio, final-clause pause ratio, articulation rate, repetition ratio, and self-correction ratio.
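
To make the collinearity figure easier to interpret, the sketch below uses the standard definition of the variance inflation factor, VIF = 1 / (1 − R²), where R² is the proportion of variance in one predictor explained by the remaining predictors. The function names are illustrative, and the calculation assumes the reported VIF was obtained with this definition.

```python
def vif_from_r_squared(r_squared: float) -> float:
    """Variance inflation factor for a predictor with the given R^2."""
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("r_squared must be in [0, 1)")
    return 1.0 / (1.0 - r_squared)


def r_squared_from_vif(vif: float) -> float:
    """Invert the VIF definition to recover the predictor's R^2."""
    if vif < 1.0:
        raise ValueError("vif must be at least 1")
    return 1.0 - 1.0 / vif


# Reported value from the regression model.
shared_variance = r_squared_from_vif(1.85)

print(round(shared_variance, 3))
```

On this reading, a VIF of 1.85 corresponds to a predictor sharing somewhat less than half of its variance with the other predictors in the model.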

The fourth objective of the statistical analyses was to examine how the five utterance fluency scores (final-/mid-clause pauses, articulation rate, repetitions, and self-corrections) distinguished between the four different perceived fluency groups (low, mid, high, and native). A set of one-way analyses of variance (ANOVAs) was performed with perceived fluency level as the grouping factor and each of the utterance fluency scores as the dependent variable (Bonferroni corrected, p < .016).

As shown in Table 6, the results of ANOVAs found that whereas the final-clause pause factor distinguished between low and mid levels of perceived fluency (p = .015), the pause ratio of the other groups (mid, high, and native) appeared to be similar (p > .016). The mid-clause pause factor differentiated not only between low and mid levels of perceived fluency (p = .001), but also between mid and high levels of perceived fluency (p = .005). There was no statistically significant difference in the mid-clause pause ratio between the high and native fluency groups (p > .016). The articulation rate factor distinguished between all four different levels of perceived fluency (p = .006 for low and mid, p = .002 for mid and high, and p < .001 for high and native). Finally, the ANOVAs did not find any significant group effects for the repair factors (repetition and self-correction ratio) at a p < .016 level.

Table 6. Summary of group differences for low, mid, high and native levels of perceived fluency

Note. ᵃp < .016 (Bonferroni corrected)

The fifth and final objective of the statistical analyses was to illustrate what kinds of learner profiles, experience (LOR) and age (age of arrival), could identify those L2 participants who actually attained different levels of perceived fluency. With respect to the effect of L2 experience, we ran a one-way ANOVA to see whether the three groups of Japanese learners (29 low-fluent learners, 30 mid-fluent learners, and 31 high-fluent learners) significantly differed according to their LOR backgrounds (0–18 years). As shown in Table 7, the results yielded a significant effect of group, F(2, 87) = 49.264, p < .001, ηp² = 0.53, indicating that the experience factor (LOR) accounted for 53% of the variance in the participants’ perceived fluency performance. A set of multiple comparison analyses further revealed that the LOR factor significantly distinguished three different levels of perceived fluency (p < .001 for low and mid, and mid and high) at a p < .025 level (Bonferroni corrected).

Table 7. Descriptive statistics of learner length of residence profiles
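
The effect size reported for the LOR analysis can be recovered from the F statistic and its degrees of freedom through the standard identity ηp² = (F × df_effect) / (F × df_effect + df_error). The sketch below applies that identity to the reported values; the function name is illustrative.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared from an F statistic and its degrees of freedom."""
    if f_value < 0 or df_effect < 1 or df_error < 1:
        raise ValueError("F must be non-negative and both df must be positive")
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Reported LOR effect: F(2, 87) = 49.264.
lor_effect = partial_eta_squared(49.264, 2, 87)

print(round(lor_effect, 2))
```

In a one-way design with a single factor, partial eta squared coincides with eta squared, so the value can be read as the proportion of total variance associated with group membership.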

In terms of the influence of the participants’ age profiles (age of arrival), we eliminated from the data set a total of 10 inexperienced Japanese learners, all of whom belonged to the low-fluency group, as they had no AOA records because they had no experience abroad. With the remaining 80 Japanese learners (19 low-fluent learners, 30 mid-fluent learners, and 31 high-fluent learners), results of a one-way ANOVA did not find a significant effect of group, F(2, 77) = 0.441, p = .645, ηp² = 0.02 (summarized in Table 8). Thus, the results here hinted that AOA did not play a substantial role in the attainment of high-level fluent speech.

Table 8. Descriptive statistics of learner age of arrival profiles

To further examine the learner profiles of L2 learners who actually attained “nativelike” fluency performance, the following procedure in the nativelikeness literature was adopted (e.g., DeKeyser, 2013). We calculated the means and confidence intervals (CI) of the baseline group's perceived fluency scores (see Table 2) and then counted how many Japanese learners’ fluency performance fell within two CIs of the baseline mean values. Out of the 90 participants, only 7 learners’ fluency performance was identified as nativelike. As shown in Table 9, these participants’ LOR and AOA profiles ranged widely, suggesting that neither LOR nor AOA could be a reliable predictor for the incidence of attaining nativelike L2 fluency.

Table 9. Learner profiles of seven Japanese learners who attained nativelike fluency performance
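
The nativelikeness criterion described above can be sketched as follows. Two assumptions are made that the text does not spell out: the CI is taken to be a 95% t-based interval around the baseline mean, and "within two CIs" is read as lying no more than two CI half-widths from that mean in either direction. All scores are invented for illustration.

```python
from math import sqrt
from statistics import mean, stdev

# Two-tailed 95% critical t value for df = 9 (10 native baseline speakers).
T_CRIT_DF9 = 2.262


def nativelike_band(baseline, t_crit=T_CRIT_DF9, n_widths=2.0):
    """Return (low, high) bounds n_widths CI half-widths around the mean."""
    centre = mean(baseline)
    half_width = t_crit * stdev(baseline) / sqrt(len(baseline))
    return centre - n_widths * half_width, centre + n_widths * half_width


def count_nativelike(learner_scores, baseline):
    """Count learners whose score falls inside the baseline band."""
    low, high = nativelike_band(baseline)
    return sum(low <= score <= high for score in learner_scores)


# Hypothetical perceived fluency scores on the 9-point scale.
native_scores = [8.1, 8.4, 8.6, 8.2, 8.8, 8.5, 8.3, 8.7, 8.4, 8.0]
learner_scores = [3.2, 5.6, 8.1, 8.3, 6.4]

low, high = nativelike_band(native_scores)
n_nativelike = count_nativelike(learner_scores, native_scores)

print(round(low, 2), round(high, 2), n_nativelike)
```

A one-sided reading, in which any learner at or above the lower bound counts as nativelike, would only require dropping the upper bound in `count_nativelike`.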

DISCUSSION AND CONCLUSION

In the context of 100 spontaneous speech samples (produced by 90 inexperienced/experienced Japanese learners and attainers, and 10 native speakers) and 10 native listeners who judged the overall fluency of the speech data on a 9-point scale (1 = dysfluent, 9 = very fluent), the current study was designed to probe the complex mechanisms underlying perceived (overall impression), utterance (breakdown, speed, and repair) and cognitive (conceptualization, formulation, and monitoring) fluency at different proficiency levels. To examine the generalizability of the topic (L2 fluency development) to the overall framework of L2 speech learning (Flege, 2016), the study also aimed to identify whether and to what degree the different proficiency levels could be related to L2 learners’ individual differences in terms of overseas experience (operationalized as LOR) and AOA (the first intensive exposure to L2 English). A summary of the results is presented in Table 10.

Table 10. Summary of Acoustic Characteristics and Learner Profiles of Low, Mid, High and Native Fluency

Note. Dashed lines separate different fluency levels that are distinguished by a given acoustic and learner variable. CI = confidence interval.

With respect to the first research question (the relationship between perceived and utterance fluency), the results of the correlation and multiple regression analyses showed that the native listeners tended to use speed (articulation rate) as a primary acoustic cue (explaining 45% of variance) and breakdown (final- and mid-clause pauses) as a secondary acoustic cue (explaining 12% of variance) for their overall fluency judgments. Comparatively, the extent to which they relied on the repair-related information (repetitions and self-corrections) remained unclear. The relative importance of the acoustic information in perceived fluency (speed > breakdown > repair) here is in line with findings reported in existing studies (e.g., Bosker et al., 2013).

Turning to the second research question, the current study further examined whether and to what degree the listeners differentially used breakdown, speed, and repair information while assessing different levels of speech fluency. To this end, four proficiency categories, low (n = 29), mid (n = 30), high (n = 31), and native (n = 10), were determined based on cluster analyses of the 10 native listeners’ fluency ratings of the 100 speech samples. The results of the series of ANOVAs provided three unique findings. The native listeners used all of the breakdown/speed measures to differentiate low- and mid-level fluency; two measures (mid-clause pauses and articulation rate) to differentiate mid- and high-level fluency; and only one measure (articulation rate) to differentiate high- and native-level fluency. The results here lend some empirical support to Cucchiarini et al.’s (2000) claim that the acoustic correlates of perceived fluency may differ depending on the level of proficiency, with breakdown fluency being a relatively strong predictor for beginners’ L2 fluency, and speed fluency for more advanced learners’ fluency.

In terms of the third research question (the role of LOR and AOA in low-, mid-, and high-level fluency), the ANOVAs showed that the three proficiency groups significantly differed according to the participants’ LOR profiles, but not according to their AOA profiles. The results of the CI analyses (summarized in Tables 7 and 8) suggest that L2 learners may need a different amount of experience to achieve mid-level fluency proficiency (LOR > 3.7 years) and high-level fluency proficiency (LOR > 8.8 years) regardless of their age of arrival in an L2 speaking environment (18–40 years). The results here concur with previous findings on the presence of strong experience effects (Trofimovich & Baker, 2006), but a lack of any significant age effects (Lahmann et al., 2017) on L2 fluency development. Of note, this temporal aspect of L2 speech learning is different from the widely accepted view with regard to L2 segmental acquisition, where both experience and age effects are equally strong (Flege, 2016).

Although the cross-sectional nature of the data set in the current study does not directly relate to development per se, several scholars have provided theoretical (de Jong et al., 2012) and empirical (Derwing et al., 2009) evidence that L2 learning takes place on a continuum of perceived fluency (low → mid → high) as a function of increased experience. Given that the current study featured a large number of L2 learners with diverse proficiency (low to high) and experience (0–18 years of LOR) profiles, it can be argued that examining the acoustic characteristics of their speech can provide several tentative explanations for how L2 learners develop different aspects of utterance fluency (final-clause pauses, mid-clause pauses, and articulation rate) to reach low, mid, high, and native fluency-proficiency levels over an extensive period of time (>10 years) with a varied degree of learner awareness (explicitly and implicitly). In particular, their developmental patterns could be discussed in relation to Kormos's (2006) proposal of the different stages of cognitive operations during L2 speech production (conceptualization, formulation, articulation, and monitoring), and the amount of L2 experience required to reach each fluency level (operationalized as LOR).

In the initial stage of L2 fluency development (low → mid-level fluency), we would like to argue that much learning can be observed, particularly in the decreasing number of final-clause pauses; this claim stems from the finding that many L2 learners in the current study with adequate amounts of experience (LOR = 3.7–7.1 years) demonstrated nativelike pause frequency. As Kormos (2006) suggested, the frequency of final-clause pauses is hypothesized to capture the efficient and timely conceptualization during L2 speech production (see Götz, 2013; Lambert et al., 2017). Thus, the findings here indicate that inexperienced L2 learners (e.g., LOR < 0.8 years) may conceptualize what to say more slowly. Given that spontaneous production entails various levels of processing operations in parallel (Skehan, 2014), this delay in conceptualization could be due to the interaction of problems at both conceptualization and formulation. That is, inexperienced L2 learners’ relatively weak representational and processing systems in the target language require excessive amounts of cognitive resources for linguistic encoding and formulation, leaving considerably less cognitive capacity that they could use for conceptualization.

As their L2 experience and proficiency increase (e.g., approximately 5 years of immersion), these learners may continue to enhance and then attain more prompt and robust retrieval of the preverbal message even during spontaneous L2 speech production (as when speaking an L1). Although the mildly experienced learners’ conceptualization processes may reach nativelike efficiency in terms of the final-clause pause ratio, the other aspects of their fluency performance (mid-clause pause ratio and articulation rate) could still be substantially different from those of advanced-level L2 learners (e.g., LOR = 8.8–12.4 years) and native speakers.

In the later stages of L2 fluency development (mid → high-level fluency), the frequency of L2 learners’ mid-clause pauses appears to reach nativelike levels, suggesting that their linguistic encoding processes seem to be optimized in keeping with their gradually developing phonetic, lexical, and grammatical knowledge over approximately 10 years (8.8–12.4) of LOR. To reach native-level perceived L2 fluency, however, even such experienced L2 learners still need to enhance their articulation rate by automatizing both the conceptualization and the formulation processes at a faster speed (Trofimovich & Baker, 2006). Since we found neither LOR nor AOA to be a predictor of perceived nativelike fluency performance, it remains open to further investigation which factors (“beyond” LOR and AOA), such as cognitive abilities (e.g., Granena & Long, 2013, for aptitude; O'Brien, Segalowitz, Collentine, & Freed, 2006, for working memory), motivation (Saito, Dewaele, & Hanzawa, 2017, for integrativeness, instrumentality vs. metacognition) and personality (e.g., Dewaele & Furnham, 2000, for extroversion vs. introversion), could facilitate such attainment.

At the same time, it is crucial to point out that our findings confirmed a generally accepted view that L2 learners’ linguistic systems and behaviors are essentially different from those of native speakers (e.g., Abrahamsson & Hyltenstam, 2008). As shown in our data set, even many highly experienced L2 learners still significantly differed from the native baseline group, especially in terms of speed fluency (while they demonstrated nativelike breakdown and repair fluency). It is theoretically intriguing to further pursue the underlying mechanism for the attainment of native-level fluency proficiency (i.e., what kinds of L2 learners can ultimately achieve all dimensions of utterance fluency at nativelike levels?). From educational perspectives, however, the current study rather suggests that L2 learners should selectively work on certain temporal features directly related to their different learning goals, such as the attainment of mid-level fluency (final-/mid-clause pauses and articulation rate) and high-level fluency (mid-clause pauses and articulation rate). All the relevant features here have been identified as crucial for successful L2 comprehensibility (Isaacs & Trofimovich, 2012). Our arguments here concur with a growing number of scholars who have claimed that L2 learning should aim at increasing comprehensibility and intelligibility rather than nativelikeness as a realistic, prioritized, and attainable goal (e.g., Derwing & Munro, 2013; Isaacs & Trofimovich, 2012; Jenkins, 2014; Saito, 2015).

The repair factor (i.e., frequency of repetitions and self-corrections) did not significantly relate to the native listeners’ fluency ratings, nor did it distinguish between any proficiency groups in the current study (low vs. mid vs. high vs. native). In conjunction with the finding that both L2 learners and native baselines produced a similar number of repairs during their picture descriptions, our findings echo previous studies evidencing the weak role of the repair phenomenon in L2 fluency (e.g., Kormos & Dénes, 2004; Prefontaine, Kormos, & Johnson, 2016). At the same time, however, the lack of any significant associations related to repair fluency also casts doubt on the construct validity of the repair measures used in the current (and other existing) studies. Although we used separate categories to capture two types of repair (i.e., repetitions and self-corrections), different types of repair could be further analyzed at a fine-grained level, such as appropriateness repair (specifying ambiguous and/or incoherent content message more precisely) and error repair (modifying erroneously activated lexical, syntactical, morphological, and phonological forms at the stage of formulation; Kormos, 1999). Future studies are warranted to scrutinize precisely which types of repair could be uniquely tied to L1 and L2 fluency (for similar arguments, see Bosker et al., 2013).

Another issue that needs to be discussed is the lack of age effects on the final quality of L2 fluency attainment. One possible interpretation is that age effects may be relatively weak for those dimensions of L2 speech where much learning likely takes place as long as L2 learners regularly use and practice the target language for an extensive period of time. One such linguistic feature with much learning potential includes the approximation of nativelike fluency. As shown in the current study, Japanese learners with extended amounts of L2 experience (LOR > 8.8 years) seemed to attain similar results to the native speaker baseline in many dimensions of utterance fluency (e.g., the frequency of final-/mid-clause pauses, repetitions, and repairs); a significant difference between the high- and native-level fluency groups was observed only in articulation rate (for similar findings, see also Trofimovich & Baker, 2006). Previous literature has indicated that strong age effects can be clearly observed for acquisitionally difficult dimensions of L2 speech, such as prosodic and segmental accuracy (Flege et al., 1995; Saito, 2013, in press) and lexicogrammatical complexity (Lahmann, Steinkrauss, & Schmid, 2016). To summarize, the results here suggest that there are unique learning patterns generalizable to various dimensions of L2 speech learning (i.e., strong and extensive experience effects; Flege, 2016, for segmentals; Trofimovich & Baker, 2006, for suprasegmentals), and specific to L2 fluency attainment (i.e., weak age effects and high-level achievement; Lahmann et al., 2017).

To close, two primary limitations need to be acknowledged with an eye toward further replication and elaboration of this topic. First, all the fluency analyses were based on 30 s of spontaneous speech elicited by a single task: a timed picture description where speakers were not required to conceptualize a great deal of content message (which is argued to influence the frequency of final-clause pauses; Kormos, 2006). Following previous L2 speech literature, the findings need to be replicated with various task modalities/demands, such as pretask and online planning time (Yuan & Ellis, 2003), task repetition (Ahmadian & Tavakoli, 2011), and single versus dual task conditions (Révész, Michel, & Gilabert, 2016). Another crucial limitation of the study concerns the lack of instruments evaluating the influence of L1 fluency on L2 fluency. Whereas the L1–L2 fluency link is particularly strong among inexperienced learners (Derwing et al., 2009), speakers’ L1 speech rate seems to continue to be a strong predictor of L2 speed fluency (de Jong, Groenhout, Schoonen, & Hulstijn, 2015). To disentangle L1 speaking styles from any discussion related to L2 fluency proficiency, it is necessary for future studies to adopt both L1 and L2 fluency measures.

ACKNOWLEDGMENTS

We are grateful to the journal associate editor, Annie Tremblay, and two anonymous Applied Psycholinguistics reviewers for their constructive feedback on an earlier version of the manuscript. We also acknowledge Hui Sun, Keiko Hanzawa, Takumi Uchihara, and George Smith for their help with data collection and analyses. The project was funded by the Grant-in-Aid for Scientific Research in Japan (No. 26770202) and the Birkbeck College Additional Research Support.

Footnotes

1. Of note, the repair measures were differently operationalized among the primary studies, such as the number of repetitions, restarts, and self-corrections per minute (Kormos & Dénes, 2004); the number of self-repetitions per second (Derwing et al., 2004); and the number of self-repetitions, self-corrections, and false-starts per minute/second. The methodological inconsistency could explain the nonsignificant role of the repair information in L2 fluency assessments.

2. The picture description format was selected, as it has been found to induce L2 learners to pay more attention to linguistic formulation and production (resulting in more accurate and fluent language) compared to interview tasks, where L2 learners simply talk about familiar topics with a greater focus on the conceptualization of the intended message (resulting in more complex language; Foster & Skehan, 1996).

3. In a pilot data collection, we found that some inexperienced Japanese learners spent much time (>30 s) planning what to say before starting to describe a picture when they were not given any key words. To make sure that all participants would begin speaking right after 5 s of planning time, a decision was made to provide three simple key words that they were asked to use for each picture description. These key words included (a) rain, table, and driveway (to describe a table left out in a driveway), (b) three guys, guitar, and rock music (to describe three men playing rock music with guitars), and (c) blue sky, road, and cloud (to describe a long road under a cloudy blue sky).

4. Unlike other fluency research using relatively large speech samples (>1 min; e.g., Lambert et al., 2017), we decided to use relatively short speech samples (30 s) to avoid unwanted fatigue during the relatively long listening sessions (2 hr). Of course, it could have been possible to ask the listeners to engage in even longer sessions over several days to assess longer samples. It is noteworthy, however, that intensive and extensive exposure to particular accented speech can affect native listeners’ L2 speech assessment patterns (e.g., their judgments of Japanese accented English may become stricter; see Flege & Fletcher, 1992, for the relationship between the length of rating sessions and listeners’ accent judgments).

5. The listeners were from the United Kingdom, whereas the speech samples were collected mostly in Canada; hence, the speakers had been exposed to a Canadian variety of English as opposed to British English. To our knowledge, however, little attention has been given to the role of the listener factor in L2 fluency assessment. We are currently conducting a follow-up study by asking native and nonnative listeners with various backgrounds to rate the same data set. We plan to report the results in another venue.

6. The breakdown measures were operationalized as the “frequency” (but not “length”) of pauses at final- and mid-clause positions. This was done so as to avoid any conceptual overlap with the speed measures (Bosker et al., 2013). In addition, the length of pauses could be substantially influenced by word frequency (e.g., pauses become longer before infrequent words; de Jong, 2016), which is beyond the focus of the current study.

REFERENCES

Abrahamsson, N., & Hyltenstam, K. (2008). The robustness of aptitude effects in near-native second language acquisition. Studies in Second Language Acquisition, 30, 481–509. doi:10.1017/S027226310808073X
Ahmadian, M. J., & Tavakoli, M. (2011). The effects of simultaneous use of careful online planning and task repetition on accuracy, complexity, and fluency in EFL learners’ oral production. Language Teaching Research, 15, 35–59. doi:10.1177/1362168810383329
Baker, W. (2010). Effects of age and experience on the production of English word-final stops by Korean speakers. Bilingualism: Language and Cognition, 13, 263–278. doi:10.1017/S136672890999006X
Best, C., & Tyler, M. (2007). Nonnative and second-language speech perception. In Bohn, O.-S. & Munro, M. J. (Eds.), Language experience in second language speech learning: In honour of James Emil Flege (pp. 13–34). Amsterdam: Benjamins.
Birdsong, D. (2006). Age and second language acquisition and processing: A selective overview. Language Learning, 56, 9–49. doi:10.1111/j.1467-9922.2006.00353.x
Boersma, P., & Weenink, D. (2012). Praat: Doing phonetics by computer. Version 5.3.14. Retrieved from http://www.praat.org
Bosker, H. R., Pinget, A.-F., Quene, H., Sanders, T., & de Jong, N. H. (2013). What makes speech sound fluent? The contributions of pauses, speed and repairs. Language Testing, 30, 159175. doi:10.1177/0265532212455394CrossRefGoogle Scholar
Bundgaard-Nielsen, R., Best, C., & Tyler, M. (2011). Vocabulary size is associated with second-language vowel perception performance in adult learners. Studies in Second Language Acquisition, 33, 433461. doi:10.1017/S0272263111000040CrossRefGoogle Scholar
Chambers, F. (1997). What do we mean by fluency? System, 25, 535544. doi:10.1016/S0346-251X(97)00046-8CrossRefGoogle Scholar
Cucchiarini, C., Strik, H., & Boves, L. (2000). Quantitative assessment of second language learners’ fluency by means of automatic speech recognition technology. Journal of the Acoustical Society of America, 107, 989999. doi:10.1121/1.428279CrossRefGoogle ScholarPubMed
de Jong, N. H. (2016). Predicting pauses in L1 and L2 speech: The effects of utterance boundaries and word frequency. International Review of Applied Linguistics in Language Teaching, 54, 113132. doi:10.1515/iral-2016-9993CrossRefGoogle Scholar
de Jong, N. H., Groenhout, R., Schoonen, R., & Hulstijn, J. H. (2015). Second language fluency: Speaking style or proficiency? Correcting measures of second language fluency for first language behavior. Applied Psycholinguistics, 36, 223243. doi:10.1017/S0142716413000210CrossRefGoogle Scholar
de Jong, N. H., Steinel, M. P., Florijn, A. F., Schoonen, R., & Hulstijn, J. H. (2012). Facets of speaking proficiency. Studies in Second Language Acquisition, 34, 534. doi:10.1017/S0272263111000489CrossRefGoogle Scholar
DeKeyser, R. M. (2013). Age effects in second language learning: Stepping stones toward better understanding. Language Learning, 63, 5267. doi:10.1111/j.1467-9922.2012.00737.xCrossRefGoogle Scholar
Derwing, T. M., & Munro, M. J. (2013). The development of L2 oral language skills in two L1 groups: A 7-year study. Language Learning, 63, 163185. doi:10.1111/lang.12000CrossRefGoogle Scholar
Derwing, T. M., Munro, M. J., Thomson, R. I., & Rossiter, M. J. (2009). The relationship between L1 fluency and L2 fluency development. Studies in Second Language Acquisition, 31, 533557. doi:10.1017/S0272263109990015CrossRefGoogle Scholar
Derwing, T. M., Rossiter, M. J., Munro, M. J., & Thomson, R. I. (2004). L2 fluency: Judgments on different tasks. Language Learning, 54, 655679.CrossRefGoogle Scholar
Dewaele, J.-M., & Furnham, A. (2000). Personality and speech production: A pilot study of second language learners. Personality and Individual Differences, 28, 355365. doi:10.1016/S0191-8869(99)00106-3CrossRefGoogle Scholar
Flege, J. E. (2016). The role of phonetic category formation in second language speech acquisition. Paper presented at the 8th International Conference on Second Language Speech, Aarhus University, Denmark.Google Scholar
Flege, J. E., Bohn, O.-S., & Jang, S. (1997). Effects of experience on non-native speakers’ production and perception of English vowels. Journal of Phonetics, 25, 437470. doi:10.1006/jpho.1997.0052CrossRefGoogle Scholar
Flege, J. E., & Fletcher, K. L. (1992). Speaker and listener effects on degree of perceived foreign accent. Journal of the Acoustical Society of America, 91, 370389.CrossRefGoogle ScholarPubMed
Flege, J. E., & Liu, S. (2001). The effect of experience on adults’ acquisition of a second language. Studies in Second Language Acquisition, 23, 527552.CrossRefGoogle Scholar
Flege, J. E., Munro, M. J., & MacKay, I. R. A. (1995). Factors affecting strength of perceived foreign accent in a second language. Journal of the Acoustical Society of America, 97, 31253134. doi:10.1121/1.413041CrossRefGoogle Scholar
Foster, P., Tonkyn, A., & Wigglesworth, G. (2000). Measuring spoken language: A unit for all reasons. Applied Linguistics, 21, 354375. doi:10.1093/applin/21.3.354CrossRefGoogle Scholar
Foster, P., & Skehan, P. (1996). The influence of planning and task type on second language performance. Studies in Second Language Acquisition, 18, 299323.CrossRefGoogle Scholar
Götz, S. (2013). Fluency in native and nonnative English speech. Amsterdam: Benjamins.CrossRefGoogle Scholar
Granena, G., & Long, M. H. (2013). Age of onset, length of residence, language aptitude, and ultimate L2 attainment in three linguistic domains. Second Language Research, 29, 311–343. doi:10.1177/0267658312461497
Higgs, T., & Clifford, R. (1982). The push towards communication. In Higgs, T. (Ed.), Curriculum, competence, and the foreign language teacher (pp. 57–79). Skokie, IL: National Textbook Company.
Isaacs, T., & Trofimovich, P. (2012). Deconstructing comprehensibility. Studies in Second Language Acquisition, 34, 475–505. doi:10.1017/S0272263112000150
Jenkins, J. (2014). English as a lingua franca in the international university: The politics of academic English language policy. Abingdon, UK: Routledge.
Kormos, J. (1999). The effect of speaker variables on the self-correction behaviour of L2 learners. System, 27, 207–221. doi:10.1016/S0346-251X(99)00017-2
Kormos, J. (2006). Speech production and second language acquisition. Mahwah, NJ: Erlbaum.
Kormos, J., & Dénes, M. (2004). Exploring measures and perceptions of fluency in the speech of second language learners. System, 32, 145–164. doi:10.1016/j.system.2004.01.001
Lahmann, C., Steinkrauss, R., & Schmid, M. S. (2016). Factors affecting grammatical and lexical complexity of long-term L2 speakers’ oral proficiency. Language Learning, 66, 354–385. doi:10.1111/lang.12151
Lahmann, C., Steinkrauss, R., & Schmid, M. S. (2017). Speed, breakdown, and repair: An investigation of fluency in long-term second-language speakers of English. International Journal of Bilingualism, 21, 228–242. doi:10.1177/1367006915613162
Lambert, C., Kormos, J., & Minn, D. (2017). Task repetition and second language speech processing. Studies in Second Language Acquisition, 39, 167–196. doi:10.1017/S0272263116000085
Marsden, E., Mackey, A., & Plonsky, L. (2016). The IRIS Repository: Advancing research practice and methodology. In Mackey, A. & Marsden, E. (Eds.), Advancing methodology and practice: The IRIS Repository of Instruments for Research into Second Languages (pp. 1–21). New York: Routledge.
Munro, M., & Mann, V. (2005). Age of immersion as a predictor of foreign accent. Applied Psycholinguistics, 26, 311–341. doi:10.1017/S0142716405050198
Munro, M. J., & Derwing, T. M. (2008). Segmental acquisition in adult ESL learners: A longitudinal study of vowel production. Language Learning, 58, 479–502. doi:10.1111/j.1467-9922.2008.00448.x
Munro, M. J., Derwing, T. M., & Saito, K. (2013). English L2 vowel acquisition over seven years. In Levis, J. & LeVelle, K. (Eds.), Proceedings of the 4th Pronunciation in Second Language Learning and Teaching Conference (pp. 112–119). Ames, IA: Iowa State University.
Norris, J. M., Plonsky, L., Ross, S. J., & Schoonen, R. (2015). Guidelines for reporting quantitative methods and results in primary research. Language Learning, 65, 470–476. doi:10.1111/lang.12104
O'Brien, I., Segalowitz, N., Collentine, J., & Freed, B. (2006). Phonological memory and lexical, narrative, and grammatical skills in second language oral production by adult learners. Applied Psycholinguistics, 27, 377–402. doi:10.1017/S0142716406060322
Piske, T., MacKay, I. R. A., & Flege, J. E. (2001). Factors affecting degree of foreign accent in an L2: A review. Journal of Phonetics, 29, 191–215. doi:10.1006/jpho.2001.0134
Prefontaine, Y., Kormos, J., & Johnson, D. E. (2016). How do utterance measures predict raters’ perceptions of fluency in French as a second language? Language Testing, 33, 53–73. doi:10.1177/0265532215579530
Révész, A., Michel, M., & Gilabert, R. (2016). Measuring cognitive task demands using dual-task methodology, subjective self-ratings, and expert judgements: A validation study. Studies in Second Language Acquisition, 38, 703–737. doi:10.1017/S0272263115000339
Rossiter, M. J. (2009). Perceptions of L2 fluency by native and non-native speakers of English. Canadian Modern Language Review, 65, 395–412. doi:10.3138/cmlr.65.3.395
Saito, K. (2013). Age effects on late bilingualism: The production development of /r/ by high-proficiency Japanese learners of English. Journal of Memory and Language, 69, 546–562. doi:10.1016/j.jml.2013.07.003
Saito, K. (2015). Experience effects on the development of late second language learners’ oral proficiency. Language Learning, 65, 563–595. doi:10.1111/lang.12120
Saito, K. (2017). Beginner, intermediate and advanced Japanese learners of English in Japan, Canada, the US and the UK. Unpublished corpus of second language speech. Retrieved October 2017 from http://kazuyasaito.net/
Saito, K. (in press). Advanced segmental and suprasegmental acquisition. In Malovrh, P. & Benati, A. (Eds.), The handbook of advanced proficiency in second language acquisition. Oxford: Blackwell.
Saito, K., & Brajot, F. (2013). Scrutinizing the role of length of residence and age of acquisition in the interlanguage pronunciation development of English /r/ by late Japanese bilinguals. Bilingualism: Language and Cognition, 16, 847–863. doi:10.1017/S1366728912000703
Saito, K., Dewaele, J.-M., & Hanzawa, K. (2017). A longitudinal investigation of the relationship between motivation and late second language speech learning in classroom settings. Language and Speech, 60, 614–632. doi:10.1177/0023830916687793
Saito, K., & Munro, M. (2014). The early phase of /r/ production development in adult Japanese learners of English. Language and Speech, 57, 451–469. doi:10.1177/0023830913513206
Segalowitz, N. (2016). Second language fluency and its underlying cognitive and social determinants. International Review of Applied Linguistics in Language Teaching, 54, 79–95. doi:10.1515/iral-2016-9991
Skehan, P. (2003). Task-based instruction. Language Teaching, 36, 1–14. doi:10.1017/S026144480200188X
Skehan, P. (2014). Processing perspectives on task performance. Amsterdam: Benjamins.
Spada, N., & Tomita, Y. (2010). Interactions between type of instruction and type of language feature: A meta-analysis. Language Learning, 60, 263–308. doi:10.1111/j.1467-9922
Tavakoli, P., & Skehan, P. (2005). Strategic planning, task structure and performance testing. In Ellis, R. (Ed.), Planning and task performance in a second language (pp. 239–277). Amsterdam: Benjamins.
Trofimovich, P., & Baker, W. (2006). Learning second language suprasegmentals: Effect of L2 experience on prosody and fluency characteristics of L2 speech. Studies in Second Language Acquisition, 28, 1–30. doi:10.1017/S0272263106060013
Yuan, F., & Ellis, R. (2003). The effects of pre-task planning and on-line planning on fluency, complexity and accuracy in L2 monologic oral production. Applied Linguistics, 24, 1–27. doi:10.1093/applin/24.1.1

Table 1. Summary of five major L2 fluency studies examining the relationship between perceived and utterance fluency

Figure 1. Dendrogram tree of hierarchical clusters based on the participants’ perceived fluency scores.

Table 2. Descriptive statistics of perceived and utterance fluency scores

Table 3. Results of Pearson correlation analyses between five utterance fluency measures

Table 4. Correlation coefficients between perceived fluency scores and five utterance fluency measures

Table 5. Results of multiple regression analysis using acoustic variables as predictors of perceived fluency

Table 6. Summary of group differences for low, mid, high and native levels of perceived fluency

Table 7. Descriptive statistics of learner length of residence profiles

Table 8. Descriptive statistics of learner age of acquisition profiles

Table 9. Learner profiles of seven Japanese learners who attained nativelike fluency performance

Table 10. Summary of acoustic characteristics and learner profiles of low, mid, high and native fluency