
Testing learner reliance on caption supports in second language listening comprehension multimedia environments

Published online by Cambridge University Press:  27 March 2013

Aubrey Neil Leveridge
Affiliation:
National Central University, Taiwan (email: neill@lst.ncu.edu.tw)
Jie Chi Yang
Affiliation:
National Central University, Taiwan (email: yang@cl.ncu.edu.tw)

Abstract

Listening comprehension in a second language (L2) is a complex and particularly challenging task for learners. Because of this, L2 learners and instructors alike employ different learning supports as assistance. Captions in multimedia instruction readily provide support and thus have been an ever-increasing focus of many studies. However, captions must eventually be removed, as the goal of language learning is participation in the target language where captions are not typically available. This creates a dilemma, particularly for language instructors, as to the usage of captioning supports: early removal may cause frustration, while late removal may create learning interference. Accordingly, the goal of the current study was to propose and employ a testing instrument, the Caption Reliance Test (CRT), which evaluates individual learners’ reliance on captioning in second language learning environments, giving a clear indication of that reliance and mirroring the learners’ support needs. The CRT comprises an auditory track accompanied by congruent textual captions, as well as particular incongruent textual words, to provide a means for testing. It was subsequently employed in an empirical study involving English as a Foreign Language (EFL) high school students. The results exhibited individual variances in the degree of reliance and, more importantly, exposed a negative correlation between caption reliance and L2 achievement. In other words, learners’ reliance on captions varies individually and lower-level achievers rely on captions for listening comprehension more than their high-level counterparts, indicating that learners at various comprehension levels require different degrees of caption support. Thus, through employment of the CRT, instructors are able to evaluate the degree to which learners rely on caption supports and make informed decisions regarding learners’ requirements and utilization of captions as a multimedia learning support.

Type
Research Article
Copyright
Copyright © European Association for Computer Assisted Language Learning 2013 

1 Introduction

Multimedia computers have been used extensively to teach target languages in non-target language speaking regions (Amaral & Meurers, 2011; Liaw, 2007; Lim & Shen, 2006). For more than a decade, research on the integration of technology-based multimedia into traditional English as a Foreign Language (EFL) classes has made it clear that the use of multimedia is beneficial for learners, particularly in situations that involve listening comprehension with the addition of captions (visual text in the target language). Once multimedia lessons are successfully integrated into the classroom learning experience, learners are presented with a richer, more in-depth learning environment.

The most drastic change pertains to learners who typically do not fare well on listening tests; they find the added support of captions in multimedia listening lessons to be a great advantage. Once incapable of making sense of the audio text, these same learners utilize the visual support of captions to create a strong connection between the text and what they hear (Lwo & Lin, 2012; Stewart & Pertusa, 2004). In turn, they are better able to break down the continual stream of indistinct words spoken by native speakers into punctuated sentences consisting of individual, separate words (Diao, Chandler, & Sweller, 2007; Lwo & Lin, 2012). Furthermore, for added comprehension, learners are able to quickly reread sentences in their entirety if given additional time: a provision previously unavailable with audio-only instruction.

In a personal observation of current high school EFL classes, when presented with a caption-supported listening comprehension test, learners who previously found listening comprehension difficult showed improvements, similar to findings by Danan (2004), Garza (1991), and Markham and Peter (2003). The overall improvements were so immediate and remarkable that the results did not seem plausible. During subsequent classes presented with the same test, it was noted that many learners were answering the questions before the audio track had finished, indicating that they were relying solely on the captions to answer the comprehension questions, similar to findings by Pujolá (2002). It was then clear that the added support of captions for listening comprehension had become a double-edged sword. While more advanced learners tend to ignore the on-screen captions, judging the text as interference, less advanced learners tend to rely on the captions quite heavily. In other words, as learners’ L2 comprehension advances, the supporting captions become interference, thus raising the question: “At what point do learners no longer rely on the captions as support, warranting their removal?”

Learners’ self-reporting on their reliance may be one avenue to addressing this problem; however, learners often self-report in a fashion they deem pleasing to their instructors (Lepper, Corpus & Iyengar, 2005), which does not mirror their reliance accurately. Consequently, instructors are left with their intuition as to when best to remove the captions, opening opportunities for untimely removal: too early causing frustration, or too late creating interference. Following in this vein, the aim of the current study was to create and employ a testing instrument that measures the degree to which learners rely on the support of captions during L2 listening comprehension instruction. The creation of such an instrument would assist instructors to make informed choices on the employment, inclusion, and/or exclusion of captioning in support of L2 listening comprehension exercises and, as Markham, Peter and McCarthy (2001: 444) state, “…language teachers need all the assistance they can possibly get as they attempt to help learners meet the challenge of improving their target-language reading and listening comprehension abilities.”

Accordingly, this paper outlines the current status of research in relevant areas, then describes and discusses the findings of an empirical study that introduces a testing instrument, the Caption Reliance Test (CRT), proposed by the authors and employed to assess learners’ reliance on captioning. A discussion of implications related to the use of this test follows. Finally, conclusions are drawn, highlighting the pedagogical implications of the usage of the CRT, as well as outlining the limitations of the current study.

2 Review of literature

2.1 The listening process

The very nature of listening, an invisible, complex (Meinardi, 2009), cognitive process (Goh & Taib, 2006), makes it difficult to describe. Wipf (1984) described listening comprehension as a “…complex problem-solving skill.” Furthermore, listeners are required to discriminate between sounds, understand vocabulary and grammatical structures, consider stress and intonation, recognize intention and retain and interpret all this within the immediate, as well as the larger socio-cultural context of the utterance (op. cit.: 1984). Thus, listening is a multifaceted, active process of interpretation where listeners match what is heard with what is already known. Due to the complex nature of listening, and the skills required, learners often attempt to increase comprehension by using supplementary sources such as captions, which is deemed easier than listening alone (see Diao et al., 2007; Smidt & Hegelheimer, 2004; Stewart & Pertusa, 2004) as captions provide various supports, i.e., visual representations of the auditory signal (Danan, 2004; Richards & Gordon, 2004; Stewart & Pertusa, 2004), exposing new and unfamiliar text (Krashen, 1981), and a greater depth of processing (Danan, 2004). However, learners who rely on captions may not achieve the creation of required schema to successfully comprehend L2 auditory messages where they are able to rely only on the acoustic signal (Vandergrift, 2004; 2007).

2.2 Captions and language learning

Captions may be defined as redundant text that matches spoken audio signals and appears in the same language as the audio. Captions are not to be confused with subtitles, which are textual versions of dialogue, but may not necessarily be in the same language as the audio. In other words, a Spanish movie with captions will display the text in Spanish, while the subtitles of the same movie may be displayed in English, Japanese or another language preferred by the viewer (Markham & Peter, 2003). Multimedia readily facilitates the use of either captions or subtitles and thus has become an important area in language learning research (Goodwin-Jones, 2007; Leveridge & Yang, 2012; Vandergrift, 2007). Moreover, research has shown captions to be beneficial: in the facilitation of immediate understanding of L2 content (Garza, 1991; Hwang, 2004; Markham, 2000–2001; Robin, 2007; Stewart & Pertusa, 2004); enhancing vocabulary acquisition (Chai & Erlam, 2008); and assisting L2 beginners when the audio is too fast (Robin, 2007). However, thus far, no test has been created that is able to assess the degree to which individual learners rely on captioning.

2.3 Multimedia L2 instruction and captions as a learning support

According to Mayer's (2001) Cognitive Theory of Multimedia Learning, two modes of matching representations of information may be processed via two separate channels: the auditory and visual channels. In addition, three key premises make up the theoretical basis: (a) dual channel; (b) limited capacity; and most importantly (c) active processing. The active processing premise signifies that one will select the most relevant information as input during information processing, and subsequently integrate the information with prior knowledge (Al-Shehri & Gitsaki, 2010). In other words, this implies that if L2 learners are challenged by the listening content they are exposed to, they may choose to attend to available visual content, as it may be more easily understood, and consequently more relevant. In sum, learning may take place simultaneously using both auditory and visual stimuli; nevertheless, it may also be negatively impacted by excessive information due to the limited capacity of each channel. As such, learners who gather information via the auditory channel, supported by captions through the visual channel, may reach a point where the audio stimuli cannot be processed quickly enough, resulting in the captions becoming the most understood and relevant stimuli, and thus the preferred stimuli (Diao et al., 2007).

As previously mentioned, two modes of congruent auditory and visual text may be processed via distinct channels. This is particularly important for L2 listening comprehension, as multimedia instruction may exploit these processes, in turn benefitting the learner (Leveridge & Yang, 2012; Sun & Dong, 2004). An auditory stimulus disappears once it has been uttered, making listening comprehension more demanding to master than the comprehension of visual stimuli. Conversely, multimedia instructional support, through the use of captions, allows learners to visualize what they hear (Danan, 2004), and remains available for a prolonged period of time, allowing additional time for processing. However, questions posed by Winke, Gass, and Sydorenko (2010) have gone unanswered, as relatively little is known regarding how learners process the captioning support or what learners are attending to when they look at captions.

3 Hypotheses

One central issue regarding the use of captioning support is determining learners’ reliance on the support of captions in multimedia instruction. Miscalculation on the instructor's part could create undesired results as premature removal may cause some learners, especially those more dependent on the captions, to become frustrated with the learning content, as they require additional instructional supports. At the same time, providing captions for too long may prevent these learners from creating schema designed to comprehend listening in real-life situations, i.e., situations void of captioning support. Furthermore, captions may be distracting for higher-level learners who have previously created L2 listening schemas and now find that the captions directly interfere with the schema. In sum, both early removal and extended employment of caption supports may inhibit the effectiveness of multimedia instruction. For instructors to successfully utilize captions when most beneficial, they require a testing instrument that indicates the degree to which learners are reliant on captioning support; this will assist instructors as they make informed decisions regarding support usage. The current study provides just such a test, the CRT, which is further discussed in section 4.

To further illustrate the aforementioned problems, a simple trial consisting of ten multiple-choice questions was designed prior to the current study. The first five questions and answers were completely auditory, delivered via multimedia to three EFL classes with a total of 157 participants. Of these participants, 53 (33.7%) were assessed as low proficiency, 52 (33.1%) as intermediate, and 52 (33.1%) as high proficiency learners according to academic records. A projector showed nothing but a blank screen during the first five questions and answers whereas the last five included captions displayed on the screen. The test scores indicated that for the first five questions higher-level learners had little trouble, scoring 4 or 5 out of a possible 5, while lower-level learners scored very low, either 0 or 1 out of a possible 5 (see Table 1). Yet, these same low-level learners scored 4 or 5 out of a possible 5 on the last five questions that included captions, an increase of approximately 80%.

Table 1 Results from the trial (n = 157)

Additionally, it was observed that some learners were answering captioned questions before the auditory track had completed. Other learners occasionally paid attention to the captions, looking up at the screen once or twice, while yet others ignored the captions altogether; even going as far as blocking the captions out with their hands. This led to three hypotheses regarding the use of captions:

  • H1: All learners do not rely equally on the additional support of captions during listening comprehension instruction.

  • H2: Lower-level achievers are more reliant on captions to gain listening comprehension than higher-level learners.

  • H3: Higher-level achievers are less reliant on captions to gain listening comprehension than lower-level learners.

Furthermore, because the ultimate goal of L2 listening comprehension instruction is that learners will eventually be able to participate in conversations with native target language speakers without the need of learning supports, such as captions, the removal of these supports is also an instructional aim. Thus, to provide instruction support that matches individual learner needs, instructors must be able to assess the degree to which learners are reliant on captions and gradually remove them, in turn reducing learners’ reliance on the support that the captions provide.

4 The proposed CRT

The proposed CRT is a multimedia listening test that exploits learners’ attention to determine the degree of learner reliance on captioning by utilizing both audio text and accompanying congruent/incongruent visual text, in the form of captions. As learners answer, they do so according to either the audio or visual message, dependent upon which they are attending to. The basic assumption behind the test is twofold. Firstly, as Mayer's (2001) Cognitive Theory of Multimedia Learning implies, if L2 learners are challenged by the listening content, they may choose to attend to available visual content, as it may be more easily understood, and consequently more relevant. Thus, learners who rely heavily on captioning support will tend to answer questions based on what they read, rather than what they hear. Secondly, as previously mentioned, captions may be distracting for higher-level learners who have previously created L2 listening schemas. They may now find that the captions directly interfere with the schema and try to ignore or even block the captions provided, so as to answer questions according to what they hear (Taylor, 2005). As the L2 learner develops and creates listening comprehension schema, they will decreasingly find the need to rely on captioning support. Thus, the CRT can provide information on the point at which the learner has progressed towards the place at which captions are no longer necessary and may be removed.

The CRT in its current state is comprised of multiple-choice questions of which 75% are congruent, meaning that the captions mirror the auditory track, while 25% are incongruent, meaning that one particular word in the captions for each question does not match the audio. Thus, if a learner's answers to the incongruent questions match the captioned text more than the audio, the learner is considered more reliant on captions. The incongruent questions are placed randomly throughout the test to ensure that they do not appear obvious to the learner. The CRT content may be adapted from the audio of the learners’ current listening comprehension instructional materials to ensure that learners are familiar with the vocabulary. In addition, the questions must be at an appropriate academic level and must accurately reflect the learners’ listening comprehension ability. In other words, the CRT should return scores indicative of individual learner proficiency supported by prior listening comprehension testing.
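The random placement of incongruent items described above can be illustrated with a short sketch. This is not the authors' procedure: the function name `build_item_plan`, the `seed` parameter, and the use of Python's `random` module are assumptions made for illustration. The defaults of 40 items with 10 incongruent reflect the 75%/25% split.

```python
import random


def build_item_plan(n_items=40, n_incongruent=10, seed=None):
    """Return one label per question: 'congruent' or 'incongruent'.

    Hypothetical helper: incongruent positions are drawn at random
    without replacement so they do not follow a visible pattern.
    """
    rng = random.Random(seed)
    incongruent_positions = set(rng.sample(range(n_items), n_incongruent))
    return [
        "incongruent" if i in incongruent_positions else "congruent"
        for i in range(n_items)
    ]


plan = build_item_plan(seed=1)
print(plan.count("incongruent"), "of", len(plan), "items are incongruent")
```

Fixing the seed keeps the same item order across classes, which may matter if results from several groups are to be compared.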

The proposed CRT consists of an introduction, examples, and a testing stage. The introduction outlining the test should be delivered aurally in both the target and native languages to ensure the learners’ understanding. It advises the learners to listen for the correct answers, and explains that if there is any discrepancy (incongruence) between the audio track and the captions then the audio is to be considered correct. This is done to ensure that learners who are able to discover discrepancies will rely on the audio portion of the test, and not be confused as to which item is correct. A framework of the CRT is provided in Table 2.

Table 2 CRT framework

The CRT is comprised of multiple-choice questions in order to: facilitate multiple answers for one question; guide the learner to choose either an audio or textual answer; allow the option of incorrect answers, thus providing a testing baseline; and provide specific answers while removing the ambiguity that other question/answer styles, e.g., open-ended questions, introduce. Furthermore, only 25% of the questions are incongruent so that the learners do not readily notice any incongruence. During trial stages, the test was comprised of half congruent and half randomly placed incongruent questions. It was observed that after completing a few of the questions, the participants in the trial became aware of the incongruent words and were more alert to them, thus confounding the test findings. This indicated the possibility of additional factors at play regarding an appropriate balance of congruent and incongruent questions. The CRT design was changed accordingly by reducing the number of incongruent questions, finally settling on 25% of the questions as incongruent.

In addition, the CRT was designed for classroom implementation, allowing the test to be run with minimal equipment, i.e., a laptop, projector, and speakers. The small amount of required equipment does not incur high implementation costs. The rationale behind these design decisions reflects the researchers’ intent to preserve the learners’ typical testing procedures for listening comprehension; i.e., the learners recorded their answers for the multiple-choice questions using pen and paper, a method they were familiar with. In contrast, the introduction of electronic devices to record answers would require extensive classroom modifications as well as the introduction of new devices, possibly refocusing the learners’ attention on the device, away from the task at hand, in turn affecting the outcome of the study.

Finally, it is suggested that any future employment of the CRT would require a similar framework; however, the content, both auditory and visual text, would have to be altered to suit the learners. An example of the CRT content used in the current study can be seen in Figures 1 and 2, while screenshots of the CRT questions and answers appear in Figure 3. Figure 2 presents the audio script of an incongruent question including the word “mad”, while Figure 3 presents a screenshot of the caption for the same question, in which the word “glad” replaces the word “mad” to create incongruence.

Fig. 1 CRT audio script congruent question & answer.

Fig. 2 CRT audio script incongruent question & answer.

Fig. 3 Screenshots: Typical incongruent question slides (left & center), Typical answer slide (right).

5 Methods

5.1 Participants

A total of 141 grade 12 students from a senior high school in northern Taiwan participated in this study. All participants were enrolled in one of three EFL classes. In addition, all participants had studied English for a minimum of five years. The average age of all participants at the time of the study was 17. All participants indicated normal or corrected-to-normal eyesight. No hearing impairments were reported. Although the participants were all approximately the same age and in the same grade, their English abilities varied from person to person. All participants had prior experience of being taught in multimedia environments where captioning support was a typical medium.

5.2 Instruments

The instruments employed in this study were as follows: a) English Listening Comprehension Test (ELCT); b) scores on the General English Proficiency Test (GEPT); and c) the Caption Reliance Test (CRT).

  a) ELCT: Participants’ prior academic achievement related to English language learning was collected from the ELCT to provide evidence of successful English language learning, as well as for ranking purposes. This included the participants’ most recent English exam scores from the semester concurrent with this study.

  b) GEPT: The GEPT, developed in 1999 in Taiwan, provides individual assessment of English language proficiency (Roever & Pan, 2008). The study aimed to determine the participants’ level of EFL proficiency before the experiment by using the results of the GEPT combined with their prior academic achievements. This was done to eliminate from the experiment those participants whose GEPT and academic scores did not match, thus allowing the creation of more robust categories. All participants’ test scores matched; thus, all of them participated in the experiment. Finally, the participants were categorized into three groups according to their listening comprehension skill level: low, medium, and high proficiency. However, all three groups contained many borderline cases. These borderline cases neither added to nor detracted from the overall results. Therefore, for presentation purposes and to remove any ambiguity between the groups, the GEPT scores were divided into five percentile bands at 20% intervals, resulting in the following groups: 1 = low, 2 = medium low, 3 = medium, 4 = medium high, 5 = high. The second and fourth bands were removed after analysis.

  c) CRT: The CRT, as previously introduced, was comprised of a total of 40 multiple-choice questions, including 30 congruent and 10 randomly placed incongruent questions. The CRT content was adapted from the audio of the participants’ current listening comprehension book to ensure that participants were familiar with the vocabulary. In addition, the participants’ listening comprehension instructor created the test questions, further ensuring that the questions were at an appropriate academic level and accurately depicted the participants’ listening comprehension ability. Reliability analysis was employed to check the dependability, consistency and homogeneity of each item in the CRT. Cronbach's α for the multiple-choice questions in the CRT was 0.71, which indicates that the reliability of the CRT is acceptable. The CRT used in the current study consisted of an introduction, examples, and a testing stage, as previously mentioned. The introduction outlining the test was delivered in both English and Mandarin, the participants’ native language, to ensure understanding. To ensure that no individuals pointed out the incongruence to the instructor during the test, consequently making the entire class aware of the incongruence, the researcher reiterated that the raising of hands and asking of questions was prohibited during the entire procedure.
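For readers unfamiliar with the reliability coefficient reported for the CRT, the following sketch shows the standard Cronbach's α formula, α = k/(k − 1) × (1 − Σ item variances / variance of total scores), where k is the number of items. The function name `cronbach_alpha` and the tiny response matrix are hypothetical; the study's actual item-level data are not reproduced here, and the software the authors used is not stated.

```python
from statistics import pvariance


def cronbach_alpha(responses):
    """Cronbach's alpha for a participants-by-items score matrix.

    responses: list of rows, one per participant; each row holds that
    participant's item scores (e.g. 1 = correct, 0 = incorrect).
    """
    k = len(responses[0])                      # number of items
    item_columns = list(zip(*responses))       # transpose to items
    item_var_sum = sum(pvariance(col) for col in item_columns)
    total_var = pvariance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Hypothetical toy data: 5 participants, 4 items.
toy = [
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
print(round(cronbach_alpha(toy), 2))
```

Population variance is used for both the items and the totals; using sample variance throughout gives the same ratio and therefore the same α.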

5.3 Procedure

The experiment was conducted at a senior high school in northern Taiwan over a two-week period. The two-step procedure was as follows:

Step 1: Following the collection of the original 141 participants’ ELCT scores administered by their high school, the GEPT was administered. Participants’ GEPT scores were then compared with the ELCT scores.

Step 2: In this step, participants were asked to complete the CRT. An Adobe Flash™ presentation was created by the researchers to display the multimedia listening comprehension test. The audio was broadcast over a portable public address system connected to a laptop computer audio port, and the text was presented as black text captions on a white screen using a digital projector connected to the same laptop. The captions appeared with the audio track simultaneously, exactly like captions on a DVD movie or CD. Prior to test commencement, participants were instructed, in both their native language (Mandarin) and the target language (English) to listen to the audio and then choose the correct answer. Participants were advised that if they noticed any discrepancies between the audio and the visual text, only the audio was to be considered as correct. This was followed by three example questions and answers. As previously mentioned, the multiple-choice CRT questions were divided into two types, congruent and incongruent with the incongruent questions randomly placed throughout.

5.4 Data analysis

To ensure the robustness of the GEPT grouping, and because the data were on an ordinal scale, Spearman's Correlation was employed to compare the participants’ high school ELCT scores with their GEPT results. Spearman's Correlation was also employed to ensure that the results of the GEPT were positively correlated with the scores on the congruent CRT questions.
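As an illustration of the rank correlation used here, the sketch below computes Spearman's ρ as the Pearson correlation of average ranks, which handles tied scores. The helper names and the example scores are hypothetical and are not taken from the study's data.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical scores for six learners on two tests.
elct = [55, 62, 62, 70, 81, 90]
gept = [40, 48, 51, 60, 58, 77]
print(round(spearman_rho(elct, gept), 3))
```

A significance test for ρ is omitted; in practice a statistics package would supply the P values reported in a table such as Table 4.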

As previously stated, the congruent questions comprised 30 of the 40 total questions, while 10 questions were incongruent. Each question was followed by four multiple-choice answers. Both the congruent and incongruent questions were scored using the methods discussed below; examples of these scoring methods are shown in Table 3.

Table 3 Example CRT scores

  • Congruent questions: Out of four possible answers, only one was correct. Therefore, one point was awarded for each correct answer for a possible total of 30 points.

  • Incongruent questions: These questions were identical to the congruent questions in that there were four answers to choose from, with only one correct answer. However, for answers that matched the audio, one point was awarded to the ‘non-reliant’ category (Listening), while answers that matched the visual text (captions) scored one point for the ‘reliant’ category (Reading). If either of the two incorrect answers was chosen, the answer was categorized as incorrect, and a point was awarded to the incorrect category.
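The scoring rule for incongruent questions can be sketched as follows. The function name `score_incongruent`, the answer letters, and the two answer keys are hypothetical; only the three categories (Listening, Reading, incorrect) come from the description above.

```python
def score_incongruent(answers, audio_key, caption_key):
    """Tally one learner's answers to the incongruent questions.

    answers:     the learner's chosen options, one per question
    audio_key:   the option matching the audio (counts as Listening)
    caption_key: the option matching the caption (counts as Reading)
    Any other choice is counted as incorrect.
    """
    tally = {"listening": 0, "reading": 0, "incorrect": 0}
    for chosen, audio, caption in zip(answers, audio_key, caption_key):
        if chosen == audio:
            tally["listening"] += 1    # 'non-reliant' category
        elif chosen == caption:
            tally["reading"] += 1      # 'reliant' category
        else:
            tally["incorrect"] += 1
    return tally


# Hypothetical learner answering four incongruent questions.
result = score_incongruent(
    answers=["A", "B", "C", "D"],
    audio_key=["A", "A", "C", "B"],
    caption_key=["B", "B", "D", "A"],
)
print(result)  # {'listening': 2, 'reading': 1, 'incorrect': 1}
```

A higher Reading tally relative to the Listening tally would mark a learner as more caption-reliant under this scheme.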

The participants were divided into five groups based on their GEPT scores, three of which were selected for further analysis. Because the scores were discrete and could not be assumed to follow a particular probability distribution, the Jonckheere-Terpstra (J-T) test, a non-parametric statistical test, was employed. Furthermore, trends in the data were expected because the removal of borderline cases left ordered groups (ordered alternatives). The J-T test suits this case because it is a distribution-free test for ordered alternatives in a linear layout that exposes trends.
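To make the J-T test more concrete, the sketch below computes the J-T statistic in its usual form: for every pair of groups taken in the hypothesized order, count the pairs of observations in which the score from the later group is larger, counting ties as one half. It also computes the statistic's expected value under the null hypothesis of no trend, (N² − Σnᵢ²)/4. The function names and the example scores are hypothetical, and the P value calculation that a statistics package would perform is omitted.

```python
def jt_statistic(groups):
    """Jonckheere-Terpstra statistic for groups given in trend order."""
    j = 0.0
    for a in range(len(groups)):
        for b in range(a + 1, len(groups)):
            for x in groups[a]:
                for y in groups[b]:
                    if x < y:
                        j += 1.0       # pair agrees with the trend
                    elif x == y:
                        j += 0.5       # ties count as one half
    return j


def jt_null_mean(groups):
    """Expected J-T statistic when there is no trend across groups."""
    n_total = sum(len(g) for g in groups)
    return (n_total ** 2 - sum(len(g) ** 2 for g in groups)) / 4


# Hypothetical 'Listening' scores for low, medium and high groups.
low, medium, high = [1, 2, 2, 3], [3, 4, 5, 5], [6, 7, 7, 9]
groups = [low, medium, high]
print(jt_statistic(groups), jt_null_mean(groups))
```

An observed statistic well above the null mean points to an increasing trend across the ordered groups, and one well below it to a decreasing trend.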

6 Results and discussion

According to the Spearman's Correlation results, the learners’ prior ELCT scores administered by the high school were significantly correlated with the results of the GEPT, as shown in Table 4, suggesting that the GEPT results are accurate indicators of the learners’ English listening comprehension. In addition, the GEPT scores were also significantly correlated with the results of the 30 congruent questions from the CRT, signifying that the CRT congruent questions are positively associated with learners’ GEPT scores. Moreover, the ELCT and the CRT results were positively correlated, indicating that the CRT congruent questions are positively related to learners’ academic listening comprehension achievement level. Accordingly, the scores of the GEPT may be used to separate the learners into categories (see Table 5), removing any borderline cases (Groups 2 & 4), resulting in the creation of better-defined groups.

Table 4 Spearman Correlations of the ELCT, GEPT, and CRT congruent questions

*P < 0.05, **P < 0.01.

Table 5 GEPT categories and scores

Because two percentile groups were removed from the scoring, there were gaps in the data, possibly disrupting the test findings. Therefore, a Jonckheere-Terpstra (J-T) test was employed (N = 89), as the researchers assumed that the data would have a particular order. The J-T test was performed to look for relationships between participants’ GEPT scores and both the incongruent ‘non-reliant’ category (Listening) and the ‘reliant’ category (Reading) scores.

The J-T test showed a significant trend for the incongruent ‘non-reliant’ (Listening) scores (J-T M = 1307.50, P < 0.001) and for the incongruent ‘reliant’ (Reading) scores (J-T M = 1307.50, P < 0.05), as seen in Table 6.

Table 6 J-T test results (N = 89)

The above-mentioned J-T test findings indicate a significant association between academic achievement and reliance on captions as measured by the CRT. The test scores were triangulated using learners’ current academic achievement (as measured by the ELCT) and their GEPT results. The results confirm the first hypothesis, which states that:

  • H1: All learners do not rely equally on the additional support of captions during listening comprehension instruction.

Furthermore, a correlation was found between academic achievement and reliance on captions. In other words, the more a learner progresses, the less they are reliant on captions as a learning support and the more the captions may interfere with listening comprehension, thus supporting the second and third hypotheses:

  • H2: Lower-level achievers are more reliant on captions to gain listening comprehension than higher-level learners.

  • H3: Higher-level achievers are less reliant on captions to gain listening comprehension than lower-level learners.

Several related studies take a blanket approach by suggesting that captions benefit all learners (see Chang, Tseng & Tseng, 2011; Danan, 2004; Vanderplank, 1988; Winke et al., 2010). However, the results obtained in this study suggest otherwise and imply that reliance on captions is an individual matter that cannot be generalized across learners. Consequently, before considering the addition and/or removal of captions, instructors should have evidence of the degree to which individual learners rely on captioning support for comprehension.

7 Conclusions

The primary purpose of the current study was to assess learners’ individual reliance on captions. To achieve this goal, a novel testing instrument, the CRT, was created and employed during an empirical study. The results show promise in that the CRT reflects the degree to which learners rely on the support of captions. Furthermore, the empirical study demonstrated that not all learners rely equally on the additional support of captions in listening comprehension instruction. The results also indicate that individuals rely on captions to varying degrees, and that captions are not equally beneficial to all learners, as previous studies have implied. While more beneficial to those at lower L2 levels, captions may cause interference for those at higher L2 levels, thereby possibly creating frustration or inhibiting comprehension. In addition, the results gathered from the CRT in the empirical study appear to have given accurate accounts of the learners’ reliance on captioning support. Although the initial results demonstrate potential, more testing is required with various levels of L2 learners in divergent situations, gathering results that may be applied to more general populations.

7.1 Pedagogical implications

In regard to second language comprehension, previous studies have suggested that captioning support is beneficial to beginner (Buck, 2001; Hulstijn, 2003; Rost, 2002; Doughty & Long, 2003) or intermediate (Hayati & Mohmedi, 2011) L2 learners; these same studies propose that advanced L2 learners rely less on captioning support. Hence, reliance on captions is an individual learner characteristic, which varies considerably even within classes or small groups of learners. Therefore, to provide appropriate listening comprehension support for learners, instructors require an understanding of the degree to which individuals rely on the support of captions for comprehension. The current study creates and employs just such an assessment: one which instructors can utilize to gain insight into the reliance of individual learners on captions in support of L2 listening comprehension.

7.2 Limitations of the study

The study results are encouraging and supported by prior research. However, this study is not without limitations, as the sample was relatively homogeneous, coming from similar cultural and educational backgrounds as well as consisting entirely of subjects at the same level of L2 instruction. Thus, the results may only be applied to similar EFL high school students. Additionally, none of the participants in the current study reported any auditory or visual impairment, nor were any learning disabilities, such as dyslexia, indicated. As such impairments or disabilities are often present in general classroom populations and may profoundly affect a learner's ability to read a communal screen or hear the auditory track, considerations must be made before implementing the CRT.

Finally, the current investigation was a cross-sectional study illustrating that reliance on captions is an individual learner characteristic. Moreover, variables such as the effects of time or the optimal number of incongruent questions have yet to be considered. Thus, while the CRT has proven useful in reflecting the participants’ reliance on captions, further, more rigorous testing in various settings and with diverse populations is required before its true effectiveness can be realized.

Acknowledgements

The authors would like to thank Ms Irene Chen-Yi Shih, from National Central University, Taiwan and Mr Christian Venhuizen, from Simon Fraser University, Canada for their help in improving the readability of this article. The authors would also like to thank all the subjects who participated in the study. This study was partially supported by grant (NSC 100-2628-S-008-002-MY3) from the National Science Council of Taiwan.

References

Al-Shehri, S. & Gitsaki, C. (2010) Online reading: a preliminary study of the impact of integrated and split-attention formats on L2 students’ cognitive load. ReCALL, 22(3): 356–375.
Amaral, L. A. & Meurers, D. (2011) On using intelligent computer-assisted language learning in real-life foreign language teaching and learning. ReCALL, 23(1): 4–24.
Buck, G. (2001) Assessing Listening. Cambridge: Cambridge University Press.
Chang, C. C., Tseng, K. H. & Tseng, J. S. (2011) Is single or dual channel with different English proficiencies better for English listening comprehension cognitive load and attitude in ubiquitous learning environment? Computers & Education, 57(4): 2313–2321.
Danan, M. (2004) Captioning and subtitling: Undervalued language learning strategies. Meta, 49(1): 67–77.
Diao, Y., Chandler, P. & Sweller, J. (2007) The effect of written text on comprehension of spoken English as a foreign language. American Journal of Psychology, 120(2): 237–261.
Doughty, C. & Long, M. (2003) The scope of inquiry and goals of SLA. In: Doughty, C. and Long, M. (eds.), The Handbook of Second Language Acquisition. Malden, MA: Blackwell, 3–15.
Garza, T. (1991) Evaluating the use of captioned video materials in advanced foreign language learning. Foreign Language Annals, 24(3): 239–258.
Goh, C. & Taib, Y. (2006) Metacognitive instruction in listening for young learners. ELT Journal, 60(3): 222–232.
Goodwin-Jones, R. (2007) Digital video update: YouTube, Flash, High-Definition. Language Learning & Technology, 11(1): 16–21.
Hayati, A. & Mohmedi, F. (2011) The effect of films with and without subtitles on listening comprehension of EFL learners. British Journal of Educational Technology, 42(1): 181–192.
Hulstijn, J. H. (2003) Connectionist models of language processing and the training of listening skills with the aid of multimedia software. Computer Assisted Language Learning, 16(5): 413–425.
Hwang, Y. L. (2004) The effect of the use of videos captioning on English as a foreign language on college students’ language learning in Taiwan (Unpublished doctoral dissertation). Ann Arbor, MI: UMI.
Krashen, S. D. (1981) Second language acquisition and second language learning. Oxford: Oxford University Press.
Lepper, M. R., Corpus, J. H. & Iyengar, S. S. (2005) Intrinsic and extrinsic motivational orientations in the classroom: Age differences and academic correlates. Journal of Educational Psychology, 97(2): 184–196.
Leveridge, A. N. & Yang, J. C. (2012) Effect of Medium: A Conceptual Framework for the removal of Supporting Captions for EFL Listening Comprehension in Multimedia Instructional Delivery. In: 15th International CALL Research Conference: Proceedings. Taiwan: Taichung.
Liaw, M. L. (2007) Constructing a ‘third space’ for EFL learners: where language and cultures meet. ReCALL, 19(2): 224–241.
Lim, K. M. & Shen, H. Z. (2006) Integration of computers into an EFL reading classroom. ReCALL, 18(2): 212–229.
Lwo, L. & Lin, M. C. T. (2012) The effects of captions in teenagers’ multimedia L2 learning. ReCALL, 24(2): 188–208.
Markham, P. L. (2000–2001) The influence of culture-specific background knowledge and captions on second language comprehension. Journal of Educational Technology Systems, 29(4): 331–343.
Markham, P. L. & Peter, L. (2003) The influence of English language and Spanish language captions on foreign language listening/reading comprehension. Journal of Educational Technology Systems, 31(3): 331–341.
Markham, P. L., Peter, L. & McCarthy, T. (2001) The effects of native language vs. target language captions on students’ DVD video comprehension. Foreign Language Annals, 34(5): 439–445.
Mayer, R. E. (2001) Multimedia learning. New York: Cambridge University Press.
Meinardi, M. (2009) Speed bumps for authentic listening material. ReCALL, 21(3): 302–318.
Pujolá, J. T. (2002) CALLing for help: Researching language learning strategies using help facilities in a web-based multimedia program. ReCALL, 14(2): 235–262.
Richards, J. C. & Gordon, D. B. (2004) New interchange intro: Video teacher's guide. New York: Cambridge University Press.
Robin, R. (2007) Commentary: learner-based listening and technological authenticity. Language Learning & Technology, 11(1): 109–115.
Roever, C. & Pan, Y. C. (2008) Test review: GEPT: General English Proficiency Test. Language Testing, 25(3): 403–418.
Rost, M. (2002) Teaching and researching listening. London: Longman.
Smidt, E. & Hegelheimer, V. (2004) Effects of online academic lectures on ESL listening comprehension, incidental vocabulary acquisition and strategy use. Computer Assisted Language Learning, 17(5): 517–556.
Stewart, M. A. & Pertusa, I. (2004) Gains to foreign language while viewing target language closed-caption films. Foreign Language Annals, 37(3): 438–443.
Sun, Y. & Dong, Q. (2004) An experiment on supporting children's English vocabulary learning in multimedia context. Computer Assisted Language Learning, 17(2): 131–147.
Taylor, G. (2005) Perceived processing strategies of students watching captioned video. Foreign Language Annals, 38(3): 422–427.
Vandergrift, L. (2004) Listening to learn or learning to listen? Annual Review of Applied Linguistics, 24: 3–25.
Vandergrift, L. (2007) Recent developments in second and foreign language listening comprehension research. Language Teaching, 40(3): 191–210.
Vanderplank, R. (1988) The value of teletext subtitles in language learning. ELT Journal, 42(4): 272–281.
Winke, P., Gass, S. & Sydorenko, T. (2010) The effects of captioning videos used for foreign language listening activities. Language Learning & Technology, 14(1): 65–86.
Wipf, J. (1984) Strategies for Teaching School Language Listening Comprehension. Foreign Language Annals, 17(4): 345–348.