
Effects of gloss type on text recall and incidental vocabulary learning in mobile-assisted L2 listening

Published online by Cambridge University Press:  17 July 2017

Fidel Çakmak
Affiliation:
Alanya Alaadin Keykubat University, Turkey (email: fidelcakmak@gmail.com)
Gülcan Erçetin
Affiliation:
Boğaziçi University, Turkey (email: gulcan.ercetin@gmail.com)

Abstract

This study investigates the effects of multimedia glosses on text recall and incidental vocabulary learning in a mobile-assisted L2 listening task. A total of 88 participants with a low level of proficiency in English were randomly assigned to one of four conditions that involved single channel (textual-only, pictorial-only) and dual-channel (textual-plus-pictorial) glosses as well as a control condition where no glosses were provided. The participants listened to a story through their mobile phones and were engaged in an immediate free recall task and unannounced vocabulary tests after listening. The findings indicated that access to glosses facilitated recognition and production of vocabulary with the type of gloss having no effect. On the other hand, glosses had no effect on text recall.

Type
Regular papers
Copyright
Copyright © European Association for Computer Assisted Language Learning 2017 

1 Introduction

Mobile learning (m-learning) is considered an extension of e-learning through mobile devices and refers to “learning across multiple contexts, through social and content interactions, using personal electronic devices” (Crompton, 2013: 4). Providing learning contexts that are available at almost any location and time (Kukulska-Hulme & Traxler, 2005), m-learning involves any kind of handheld mobile device (e.g., PDAs, iPads, mobile phones) that is utilized for self-directed and spontaneous learning. Conventional e-learning technologies through computers are limited in scope to one time and place, and Crompton noted that for these technologies the social and environmental context for learning is incidental (Crompton, 2013: 49). M-learning, on the other hand, can take place without spatial and temporal constraints. Learning on a mobile platform occurs through ubiquitous access (Melhuish & Falloon, 2010; Sharples, Sánchez, Milrad & Vavoula, 2009) and context can vary (Pachler, Bachmair & Cook, 2013; Sharples et al., 2009). The ubiquity and flexibility in time and context, and the access to knowledge provided by m-learning, open up meaningful, real-world contexts for the interactive learning habitus while allowing learners to “exploit the spontaneous and opportunistic nature of learning on the move” (Kukulska-Hulme & Traxler, 2005: 31). These attributes could make handheld mobile learning environments more effective than a controlled computer-based environment.

A recent meta-analysis (Sung, Chang & Liu, 2016), taking into account hardware, software, and intervention durations for mobile devices applied to different users, settings, teaching methods, and domain subjects, revealed that using mobile devices in education had a larger overall effect than using desktop computers or not using mobile devices as an intervention. Thanks to advances in technology, m-learning is available through a wide spectrum of mobile devices, generally handheld ones, including PDAs, mobile phones, small tablets, MP3/MP4 players, e-book readers, games consoles, digital dictionaries, and voice recorders. Being always on and serving as a primary means of both social communication and connectivity, “high-end” mobile phones are more often preferred and are more popular than other mobile devices such as tablets or laptops (Lindquist, Denning, Kelly, Malani, Griswold & Simon, 2007). The portability of mobile phones has made access to information easier and faster (Bradley & Holley, 2011), thus encouraging learners to take part in learning while communicating with others.

The use of mobile technologies for second language (L2) learning and teaching has also received great interest due to their potential for providing “authentic” and “contextual” language learning experiences (Chinnery, 2006: 9; Kukulska-Hulme, 2006: 123, respectively) as well as immediate and flexible ways of acquiring a new language (Kukulska-Hulme, 2010). Research on the effectiveness of mobile-assisted language learning (MALL) has generally focused on vocabulary learning (e.g., Chen & Chung, 2008; Stockwell, 2007), with particular attention to the use of short message service (SMS; Çavuş & İbrahim, 2009; Kennedy & Levy, 2008; Kim, 2011; Lu, 2008), e-dictionaries (Song & Fox, 2008), or flashcards (Başoğlu & Akdemir, 2010) through mobile phones.

Current mobile technologies can deliver enriched multimedia environments to learners where visual and verbal information can be provided simultaneously. The issue of how learning takes place in multimedia environments has been of particular interest to researchers in technology-integrated learning. For instance, Mayer (2009) has provided a comprehensive account of learning in multimedia environments with his generative theory of multimedia learning (GTML). The theory is built upon three main assumptions: that two separate channels (auditory and visual) are used to process incoming information (i.e., the dual channels assumption); that information must be attended to selectively through these channels due to the limited capacity of working memory (WM); and that learning occurs in a series of active processes such as selecting the relevant verbal and visual information, organizing it, and integrating the two representations with each other and with already existing knowledge. A number of design principles have been derived from the GTML, one of which is the multimedia principle, which suggests “people learn more deeply from words and pictures than from words alone” (Mayer, 2009: 47). When relevant visuals are added to words, learners can easily make connections between the two representations and create meaning. This fosters “generative processing” or “deep cognitive processing” (Mayer, 2009: 57) and thus reduces the cognitive load (CL), which might occur if verbal and visual information were provided separately. Although the design principles offered by the GTML have been widely tested with L1 populations, their validity for L2 populations is still under scrutiny.

In the field of L2 learning, much attention has been given to multimedia input enhancement with regard to vocabulary learning and language skill improvement. The provision of verbal and visual input as a means for delivering comprehensible input is considered to facilitate L2 learners’ meaning-making processes. Much of what we know about the effects of multimedia input on L2 learning comes from studies that focus on the effects of multimedia glosses on reading comprehension and vocabulary learning. These studies generally provide evidence for the facilitative effects of dual presentation of multimedia glosses (see Abraham, 2008 and Yun, 2011 for meta-analyses).

With the widespread use of mobile technologies in education in general and in language learning in particular, the question that needs to be addressed is whether the effects observed in computer-based multimedia environments would be applicable to mobile environments as well. Since there is ample computer-based research on the effects of multimedia glosses on vocabulary learning and text comprehension, the current study does not compare mobile learning with computer-based learning. Instead, it aims to explore whether the findings obtained from computer-based research are transferable to the mobile environment. As such, the purpose of the current study is to test the applicability of the multimedia principle of the GTML in a mobile learning environment where L2 learners were exposed to multimedia glosses as they completed a listening task through mobile phones. The study incorporated a listening task for two reasons. First, the majority of available studies in computer-based environments focused on reading comprehension (e.g., Chun & Plass, 1996b; Plass, Chun, Mayer & Leutner, 1998; Şakar & Erçetin, 2005; Türk & Erçetin, 2014; Yeh & Wang, 2003; Yun, 2011; Zarei & Mahmoodzadeh, 2014), and there was a call for more in-depth research into listening comprehension (Brett, 1995, 1997; Hoven, 1999; Jones & Plass, 2002; Meskill, 1996). Second, listening is a more suitable skill for a mobile learning environment than reading, which poses a special challenge on small screens.

1.1 Research on L2 vocabulary learning and listening in mobile learning environments

Relatively few studies on L2 listening comprehension through mobile devices have been conducted (Kim, 2013) compared to studies focusing on vocabulary learning. The studies on vocabulary learning have focused on the comparison of e-dictionaries and paper dictionaries in relation to reading comprehension and vocabulary retention (Kobayashi, 2008; Koyama & Takeuchi, 2004, 2009), spelling exercises and teaching pronunciation through mobile phones (Butgereit & Botha, 2009; Saran, Seferoğlu & Çağıltay, 2009; Zhang, 2012), the effect of using SMS versus a printed dictionary on academic vocabulary retention (Alemi, Sarab & Lari, 2012), learning idioms through mobile phones (Amer, 2010), the effect of SMS on learning collocations (Motallebzadeh, Beh-Afarin & Daliry Rad, 2011), and vocabulary learning through SMS versus traditional flashcards (Azabdaftari & Mozaheb, 2012; Başoğlu & Akdemir, 2010). For instance, Kim (2011) examined the effects of text messaging and interactivity on vocabulary learning with 62 undergraduate Korean students learning English as an L2. While the control group received only classroom instruction, one of the experimental groups received SMS with no interactivity; the other received SMS with interactivity. The experimental groups outperformed the control group in terms of vocabulary scores, with the group that received interactive SMS performing better than the one receiving noninteractive SMS.

Despite positive effects of text messaging on vocabulary learning, the studies conducted by Lu (2008) and Zhang, Song and Burston (2011) suggest that learners’ vocabulary gains could be short term. Zhang et al. (2011) exposed one group of students to a printed version of the words (available to the students any time they wished) and another group to text messages twice a day for a month. The SMS group outperformed the group exposed to the printed words on the immediate vocabulary retention test. However, the delayed test scores of the groups were not significantly different. Lu (2008) also observed immediate gains in favor of the SMS group but no significant difference in delayed scores.

As for listening, one of the early studies was conducted by Nah, White and Sussex (2008), who designed pre-listening, while-listening, and post-listening activities through a wireless application protocol (WAP) site. A total of 30 undergraduate Korean students accessed the materials through their mobile phones for 12 weeks. The data collected through questionnaires and interviews indicated that the participants had positive attitudes toward the use of the WAP site since the activities were interactive; the system provided opportunities for anytime and anywhere studying outside the classroom and for collaborative learning. Demouy and Kukulska-Hulme (2010), based on online questionnaire data, also reported that undergraduate students learning French as L2 quickly adopted the use of iPods and MP3 players for additional listening and speaking practice within a six-week program. In another study, Reinders and Cho (2010) reported the enthusiasm that undergraduate students had for engaging in L2 extensive listening practice outside the class through podcasts downloaded on their mobile phones. The students stated they were motivated because time and place were flexible; they liked the opportunity to access English on-the-go while travelling on the bus or spending time with friends. These studies, however, focused only on learners’ attitudes toward the use of mobile phones for listening practice.

A study by Hwang, Huang, Shadiev, Wu and Chen (2014) investigated how students perceive learning activities and a mobile English listening and speaking system in terms of ease of use, perceived usefulness of the system and learning activities, perceived engagement value of the activities, and intention to use the system. The participants were fifth-grade learners of English as a foreign language who were exposed to a number of learning activities supported by a mobile listening and speaking system on PDAs for one semester. The activities, sequenced from easy to difficult, involved practicing new words with sentences, dialogues or short stories, role-playing, or matching words with pictures. The participants could work either individually or collaboratively; they could record their speech while reading aloud words, sentences or dialogues, share their recordings with other students, and listen to recordings shared by their peers. The data were collected through a questionnaire in which the participants were asked to evaluate the system’s ease of use, the usefulness of the system, and the activities on a 5-point Likert scale, as well as through field notes by the researchers and semi-structured interviews. In addition, the students’ engagement with the learning activities was tracked by the system. The questionnaire data indicated that students had positive perceptions of the system and the activities. The correlational analyses revealed that the students’ perceptions regarding the system’s ease of use and usefulness were significantly related to their intentions to use the system. However, the students’ actual usage of the system after class was not related to their perceptions of the ease and usefulness of the system.

Although the comparison of the pre-test and post-test means in terms of listening and speaking performance indicated a significant improvement, this gain cannot be attributed to the mobile learning system because there was no control group. The technology may have had no discernible effect at all; without a comparison group, one cannot tell.

A quasi-experimental study was conducted by Kim (2013), who examined whether students of different majors enrolled in a TOEIC course focusing on L2 listening and reading benefitted from additional mobile-assisted listening practice through smartphone applications. A total of 44 students were assigned to control and experimental groups. The former was only exposed to classroom-based listening practice while the latter was provided with additional listening practice through their mobile phone applications twice per week. An analysis of covariance controlling for pre-test differences revealed a significant difference between the groups in favor of the experimental group.

Chang, Tseng and Tseng (2011) exposed L2 learners to content-based, situated language learning materials through PDAs and examined the effects of content presentation mode and proficiency level on CL and listening comprehension. In this study, 162 Taiwanese university students were assigned to one of two presentation modes: a single mode with auditory-only input and a dual mode with audio-plus-textual input. The participants took a field trip with their PDAs and observed four animals while listening to an audio guide in English in either the single-mode or the dual-mode condition. After listening, they took a listening comprehension test involving five multiple-choice questions and completed a CL rating scale. The results indicated that the students in the dual-mode condition performed better than those in the single-mode condition and that the effect of presentation mode was not mediated by proficiency level. On the other hand, a moderating role of English proficiency on CL was observed. That is, lower proficiency students in the single-mode condition reported higher CL than those in the dual-mode condition, while CL ratings of the higher proficiency students in the two conditions did not differ.

As can be seen from the review above, MALL studies typically lack a control group, which is an essential component of experimental designs. Since the defining feature of mobile learning is its anytime-anywhere aspect, its very nature precludes setting up a control group that is comparable in terms of activities. Lacking an actual control group, these studies tend to rely on learners’ perceptions of the effectiveness of MALL (e.g., Demouy & Kukulska-Hulme, 2010; Nah, White & Sussex, 2008; Reinders & Cho, 2010). However, learners’ perceptions rarely match their actual performance. Therefore, system-collected data regarding learners’ interactions with MALL materials may provide more meaningful insights into the learning process than user perceptions (e.g., Hwang et al., 2014). In the same vein, the comparison of MALL with other types of learning such as classroom-based learning does not provide convincing evidence as to the effectiveness of MALL due to weak experimental control (e.g., Kim, 2013). Instead, there is a need for more controlled experiments comparing different learning conditions within MALL (e.g., Chang et al., 2011).

1.2 Effects of multimedia glosses on L2 listening and incidental vocabulary learning through listening

Glossing as an input enhancement technique for vocabulary learning is commonly discussed in studies of incidental vocabulary learning through L2 reading or listening. A gloss is “a brief definition or synonym, either in L1 or L2, which is provided with the text” (Nation, 2001: 272). It has certain benefits, one of which is to make difficult reading easier without simplifying or adapting the text. It also provides an accurate definition of words whose meaning could be guessed incorrectly from the context. Moreover, glossing avoids major interruptions during reading, especially when the gloss is provided next to the text. Last but not least, it directs the reader’s attention to specific words, which might motivate learning (Nation, 2001). With regard to attention, Schmitt (2008) proposes the term engagement to refer to anything that leads to more attention to lexical items, thus creating more involvement with them, improving vocabulary learning, and increasing the chances of recalling the items. Providing explanations, for example an L1 translation for a target word, and then using that word in context might enhance learners’ vocabulary knowledge through the extensive exposure generated by the meaning-focused and enriched input. L2 listening and incidental vocabulary learning through listening have received little research attention (Meier, 2015; van Zeeland & Schmitt, 2013). Similarly, far fewer studies have investigated the effects of multimedia glosses on L2 listening and incidental vocabulary learning through listening (e.g., Cottam, 2010; Jones, 2003, 2004, 2006; Jones & Plass, 2002) than have focused on reading.

Jones and Plass (2002) randomly assigned 171 English-speaking university students taking French as L2 to one of four listening conditions: (1) no glosses, (2) textual-only glosses, (3) pictorial-only glosses, and (4) textual-plus-pictorial glosses. The results pointed to the superior performance of gloss groups over the no-gloss group in terms of immediate and delayed vocabulary tests as well as text recall. Among the gloss groups, the combined gloss group performed significantly better than those who had access to either pictorial or textual glosses on the immediate vocabulary test, but the combined group did not have a significantly better performance than the pictorial group on the delayed test. Additionally, there was not a significant difference between the two single gloss groups on the delayed test either. As for listening comprehension, the combined gloss group outperformed the single gloss groups and no-gloss group on the immediate comprehension test. On the delayed comprehension test, the superiority of the combined gloss group over the others remained the same while the textual-only and no-gloss groups did not differ significantly from each other in their performance. The researchers concluded that the effect of textual-plus-pictorial glosses was “stronger and longer-lasting” than textual-only glosses both in vocabulary learning and listening comprehension (Jones & Plass, 2002: 557).

Cottam (2010) investigated the effects of textual and visual glosses on listening comprehension and vocabulary learning of learners of Spanish as L2. The results pointed to positive effects of textual glosses on text recall, but no effects of visual glosses were observed. Neither textual nor visual glosses had an effect on vocabulary learning.

Although Jones (2003) provided qualitative evidence based on participant interviews that the choice of visual and verbal glosses enhances students’ abilities to comprehend the material presented and to acquire vocabulary, Jones (2004) did not have consistent observations regarding the superiority of one type of gloss over another, except for the finding that gloss groups consistently outperformed the control group on vocabulary recognition and recall. Similarly, Cottam’s (2010) findings raise doubts as to the effectiveness of both single-mode pictorial glosses and dual-mode glosses. As such, the relative advantage of single-mode and dual-mode glosses in terms of text recall and incidental vocabulary learning in L2 listening is still an unresolved issue (Xu, 2010).

2 The present study

The aim of the current study is to examine the effects of multimedia glosses on L2 listening comprehension and vocabulary learning in a mobile learning environment. Since few studies have examined the effects of multimedia glosses on L2 listening comprehension and the majority of the available studies on multimedia glosses were conducted in computer-based environments rather than mobile learning environments, the current study contributes to the field with its focus on listening through mobile phones. The study addresses the following research questions:

  1. Does access to glosses facilitate text recall and incidental vocabulary learning when learners are engaged in listening to a story through mobile phones? If yes, are there differences between single-mode (i.e., textual-only, pictorial-only) and dual-mode (i.e., textual-plus-pictorial) glosses?

  2. Are there differences between single-mode (i.e., textual-only, pictorial-only) and dual-mode (i.e., textual-plus-pictorial) glosses in terms of frequency of access to glosses and time spent on task?

  3. Are frequency of access to glosses and time spent on task related to text recall and incidental vocabulary learning in single-mode (i.e., textual-only, pictorial-only) and dual-mode (i.e., textual-plus-pictorial) glosses?

Two hypotheses regarding the first research question were formed. Since the other questions were exploratory in nature, no predictions were made regarding Questions 2 and 3. Given the broad consensus in research on the role of glosses in language learning (see Abraham, 2008; Taylor, 2006; Yun, 2011 for meta-analyses), the participants in the gloss conditions were expected to outperform those in the no-gloss condition in terms of both text recall and incidental vocabulary learning (Hypothesis 1). In addition, based on the multimedia principle of the GTML and research showing the facilitative effects of computerized glosses providing both textual and pictorial information on listening (e.g., Jones, 2006; Jones & Plass, 2002), the dual-mode condition was predicted to be superior to single-mode conditions (Hypothesis 2).

2.1 Participants

The research took place at a state university in Turkey in 2014. The participants were recruited from among registered freshman students studying Public Administration, Management, and Economics in the Faculty of Management and Economics. A total of 88 students with an elementary level of proficiency voluntarily participated in the study. The participants had completed a one-year English Language Preparatory Program prior to the treatment; one of the researchers was the participants’ English instructor.

2.2 Development of the mobile application

A mobile-assisted listening application was developed and optimized for Samsung Galaxy Mini devices. The application connected to a web service, which was developed with PHP, a MySQL database, and the JSON data interchange format, and downloaded the experimental materials from it. Each participant’s keystrokes and interface usage were sent synchronously to the web service and recorded in the database as they used the application. With the help of a web control panel written in HTML and PHP, the researcher could change the system settings, activate the conditions before the experiment started and terminate them when the experiment was over, and download the participants’ data as spreadsheets including their names, the treatment condition, total time spent on task, and glosses referenced.
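
As an illustration of how logged interface events could be reduced to the spreadsheet fields listed above (treatment condition, total time spent on task, glosses referenced), the following minimal sketch aggregates JSON records per participant. It is written in Python rather than the PHP used for the actual web service, and the record layout, field names, and values are hypothetical; the paper does not describe the real schema.

```python
import json
from collections import Counter

# Hypothetical log records: one JSON object per interface action.
# All field names and values are illustrative, not the authors' schema or data.
RAW_LOG = """[
  {"participant": "P01", "condition": "textual_only", "event": "start", "t": 0.0},
  {"participant": "P01", "condition": "textual_only", "event": "gloss", "word": "fur", "t": 42.5},
  {"participant": "P01", "condition": "textual_only", "event": "gloss", "word": "land", "t": 90.0},
  {"participant": "P01", "condition": "textual_only", "event": "gloss", "word": "fur", "t": 120.0},
  {"participant": "P01", "condition": "textual_only", "event": "end", "t": 813.5},
  {"participant": "P02", "condition": "no_gloss", "event": "start", "t": 10.0},
  {"participant": "P02", "condition": "no_gloss", "event": "end", "t": 700.0}
]"""


def summarize(events):
    """Aggregate raw events into one summary row per participant."""
    rows = {}
    for e in events:
        row = rows.setdefault(e["participant"], {
            "condition": e["condition"],
            "first_t": e["t"],
            "last_t": e["t"],
            "glosses": Counter(),
        })
        # Time on task is taken as the span between first and last event (seconds).
        row["first_t"] = min(row["first_t"], e["t"])
        row["last_t"] = max(row["last_t"], e["t"])
        if e["event"] == "gloss":
            row["glosses"][e["word"]] += 1
    return {
        pid: {
            "condition": r["condition"],
            "time_on_task": r["last_t"] - r["first_t"],
            "gloss_accesses": sum(r["glosses"].values()),
            "glosses": dict(r["glosses"]),
        }
        for pid, r in rows.items()
    }


summary = summarize(json.loads(RAW_LOG))
```

Per-participant totals of this kind correspond to the frequency-of-access and time-on-task variables named in Research Questions 2 and 3.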

2.3 Experimental conditions

The participants were randomly assigned to one of the following four groups: (1) control group with no access to glosses, (2) textual-only gloss group, (3) pictorial-only gloss group, and (4) textual-plus-pictorial gloss group. The sample size in each group was 22. The participants in each group were asked to listen to a narrative text with the choice of listening twice. They could regulate the listening task through audio control buttons by going back and forth during listening. The listening text involved an audio file that was a 13.56-minute-long story taken from the website of Voice of America, an official American broadcast for non-native speakers of English. The story was chosen not only for its pace, which was well adjusted for foreign language learners to grasp what was being said but also for its suitability to provide word-level glosses (i.e., key words in the story were easy to gloss both verbally and visually).
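
The balanced assignment described above (88 participants, four conditions, 22 per group) can be sketched as a shuffle followed by an even split. The paper does not report the randomization procedure, so the function below, its name, the participant identifiers, and the seed are assumptions made only for illustration.

```python
import random

CONDITIONS = [
    "no_gloss",
    "textual_only",
    "pictorial_only",
    "textual_plus_pictorial",
]


def assign_balanced(participants, conditions=CONDITIONS, seed=2014):
    """Shuffle participants and split them into equally sized groups."""
    if len(participants) % len(conditions) != 0:
        raise ValueError("participants cannot be split into equal groups")
    rng = random.Random(seed)  # fixed seed only so the sketch is reproducible
    shuffled = list(participants)
    rng.shuffle(shuffled)
    size = len(shuffled) // len(conditions)
    return {
        cond: shuffled[i * size:(i + 1) * size]
        for i, cond in enumerate(conditions)
    }


# Hypothetical participant IDs P01..P88.
groups = assign_balanced([f"P{i:02d}" for i in range(1, 89)])
```

Shuffling once and slicing guarantees equal group sizes, which independent per-participant coin flips would not.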

To determine the words to be glossed, a pilot study was conducted with a group of learners whose proficiency was similar to that of the participants in the actual study. They listened to the text and completed a recall task. They were then given the script of the story and asked to underline the words that were difficult for them. Forty-three words from the list of words underlined by the students were selected. Of these forty-three words, target words were identified based on their frequency in the Corpus of Contemporary American English (COCA). Twenty-five words whose frequency ranged between 5031 and 113656 occurrences in the corpus were selected as target words to be assessed. The majority of the target words occurred in the text only once (see Appendix 1). Ten of the target words were nouns, twelve were verbs, and three were adjectives. Most of these words were easy to gloss visually. However, the adjectives were about feelings (i.e., glad, frightened, embarrassed) and relatively difficult to represent visually. In seeking the best representations of these target words, the researchers listed the three best representations for each word by using Google Images and the Red House Bilingual Dictionary (Turkish-English) and consulted six senior non-native English language instructors and a native speaker of English for their expert opinions. The experts were asked to indicate to what extent the verbal definition and the image for each word represented the word’s meaning. As such, three types of glosses (textual-only, pictorial-only, and textual-plus-pictorial glosses) were created and integrated into the application.
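
The frequency-based step can be pictured as a simple range filter over the candidate words. In the sketch below, only the two bounds (5031 and 113656) come from the text; the function name is hypothetical and the counts are placeholders, not actual COCA frequencies. The text reports the range of the selected words, so treating it as a selection rule is itself an assumption.

```python
# Frequency bounds reported for the 25 target words.
LOW, HIGH = 5031, 113656


def select_targets(candidate_counts, low=LOW, high=HIGH):
    """Return candidate words whose corpus frequency lies within [low, high]."""
    return sorted(
        word for word, count in candidate_counts.items()
        if low <= count <= high
    )


# Placeholder counts for illustration only; these are NOT real COCA figures.
placeholder_counts = {
    "fur": 9000,
    "land": 100000,
    "glad": 20000,
    "the": 50000000,   # far too frequent to be a useful target
    "rareword": 120,   # hypothetical item below the lower bound
}

targets = select_targets(placeholder_counts)
```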

The participants in the no-gloss condition received only the text without any glosses (Figure 1). The participants in this condition could see only the audio control buttons and audio bar. The instructions on the screen indicated that they could control the audio file by using the bar and the buttons. In the gloss conditions, the words appeared on the screen as a list. The participants could scroll down to see the words listed and click on them to see their dictionary definition (Figure 2), a visual representation of the word’s meaning (Figure 3) or both (Figure 4). In each condition, after the participants logged into the system, the story was downloaded and the instructions specific to the condition were provided.

Figure 1 A screenshot from the no-gloss condition

Figure 2 A screenshot from the textual-only condition

Figure 3 A screenshot from the pictorial-only condition

Figure 4 A screenshot from the textual-plus-pictorial condition

2.4 Data collection instruments

2.4.1 Free recall task

Free recall has been suggested as a method of assessing students’ comprehension in reading (Bernhardt, 1991). In the current study, free recall was used as an explicit measure of L2 listening comprehension in which the participants were asked to write down as much as they could recall from the text in their native language, Turkish. Free recall after the listening task was preferred over other comprehension measures that could be implemented during listening because completing a comprehension task during listening would interfere with the use of glosses in the gloss conditions.

Performance on the recall task was evaluated based on phonetic parcel units of the listening text, following the procedures of Johnson (1970). Specifically, the text was divided into linguistically coherent phonetic parcels according to the natural pause loci at which a reader or narrator takes a breath, often to enhance meaning or to add emphasis. A total of 365 phonetic parcel units based on the narrator’s pause loci were identified. The participants’ written recall protocols were segmented to match these naturally occurring phonetic parcels. Each parcel unit was worth one point. The segmenting of the protocols into phonetic parcel units was carried out independently by two raters; the parcel units were then compared and checked to ensure reliability. The inter-rater reliability of the written recall protocol was 0.93. Discrepancies were resolved in meetings between the raters.
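
The scoring rule above (one point per recalled parcel unit, out of 365) can be sketched as follows. The paper does not state which statistic yielded the reliability of 0.93, so the unit-level percent agreement shown here is only one possible index, included as an assumption; the function names and rater data are hypothetical.

```python
TOTAL_UNITS = 365  # phonetic parcel units identified in the listening text


def recall_score(credited_units, total=TOTAL_UNITS):
    """Return (points, proportion) for the set of unit indices credited as recalled."""
    valid = {u for u in credited_units if 1 <= u <= total}
    return len(valid), len(valid) / total


def percent_agreement(rater_a, rater_b, total=TOTAL_UNITS):
    """Share of units on which two raters made the same credited/not-credited decision.

    Assumption: this is an illustrative index, not necessarily the one the
    authors used for the reported inter-rater reliability.
    """
    agree = sum((u in rater_a) == (u in rater_b) for u in range(1, total + 1))
    return agree / total


# Hypothetical decisions by two raters for one participant's protocol.
rater_a = {1, 2, 3, 10}
rater_b = {1, 2, 3, 11}

points, proportion = recall_score(rater_a)
agreement = percent_agreement(rater_a, rater_b)
```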

2.4.2 Vocabulary tests

Three vocabulary measures were used in the current study in order to assess the form and meaning aspects of the target words: form recognition, L2 meaning production and L1 meaning production. These tests are regarded as direct tests that require learners to demonstrate their understanding of target words or to produce the target forms to convey meaning in either L1 or L2 (Laufer & Goldstein, 2004; Waring & Takaki, 2003).

The form recognition test (Cronbach’s alpha=.871), a test of target word form knowledge, was a checklist that included 25 target words in L2 and 29 distractors. For each target word, participants were exposed to a set of three words aurally and asked to select the words they heard in the story as in (1).

(1)  Pistol          Gun          Hunt

The L2 meaning production test (Cronbach’s alpha=.789) required the participants to write down the L2 equivalent of the target words provided in L1, assessing productive knowledge of meaning and form (Webb, 2005). In (2), the Turkish word “kürk” is given and the learner is expected to write “fur.”

(2)  Kürk  ____________________

The L1 meaning production test (Cronbach’s alpha=.807) required the participants to write the L1 translation of the target words, assessing the receptive knowledge of meaning and form (Webb, 2005). In (3), the learner is expected to recall the meaning of the target word and write “toprak”, which is the Turkish equivalent of “land”.

(3)  Land (n)  ____________________

Although both the L2 and L1 meaning production tests assessed the knowledge of form and meaning, the L2 meaning production test involved moving from word meaning to word form, while the L1 meaning production test required moving from word form to word meaning (Nation, 2001). Though receptive recall is easier than productive recall (Nation, 2001), in the current study the L2 meaning production test was administered prior to the receptive test so as to minimize the possibility of getting clues from the previous test. In all the tests, each correct answer was scored 1 and incomplete or wrong answers 0.
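The reliability coefficients reported for the three tests above are Cronbach’s alpha values. As an illustration only, the following Python sketch shows how alpha can be computed from a participants-by-items matrix of dichotomous scores (1 = correct, 0 = incomplete or wrong), which is the scoring scheme described above. The function name `cronbach_alpha` and the small score matrix are hypothetical; they are not the study’s data, and the study does not state which software was used.

```python
from statistics import variance

def cronbach_alpha(scores):
    """Cronbach's alpha for a participants x items matrix of item scores."""
    n_items = len(scores[0])
    # Sample variance of each item (column) across participants.
    item_vars = [variance([row[i] for row in scores]) for i in range(n_items)]
    # Sample variance of each participant's total score.
    total_var = variance([sum(row) for row in scores])
    return (n_items / (n_items - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical data: 5 participants x 4 items, scored 1 (correct) or 0.
demo_scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
alpha = cronbach_alpha(demo_scores)  # 0.8 for this toy matrix
```

With real test data, each row would hold one participant’s 0/1 scores on the items of a single test.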

2.5 Procedures

The data collection instruments were piloted with fifteen students who had similar language proficiency to that of the test group. There were three students for each condition. The entire process of treatment was tested. Revisions regarding the instructions, test items, and the time allowed for the tests were made based on the pilot data.

For the actual study, two classrooms were set up with a router in each of them so that the participants could connect to the internet and run the listening application. The conditions were administered at different times so as to avoid any communication between the participants before they were exposed to the treatment. Both online and offline data were collected. The online data were collected with a Samsung Galaxy Mini GT-S5570 mobile device and earphones. The students were requested to use the earphones to minimize distraction. Before the treatment, screencasts with a different listening text and glosses were shown to the participants in each condition so that they could have a better understanding of what to expect from the application. All taps were recorded in the online database as the participants listened to the text. The listening span was also recorded online to examine how long it took the participants to complete listening to the text through the application. The offline data collection was conducted through paper-based activities following the listening text. Before the learning session started, the students were given no specific instructions about the vocabulary tests; they were only told that they were going to listen to a story on a mobile phone using the application and would then be asked to write down what they remembered from the text. When the students finished listening, they first completed the free recall task and then the vocabulary tests, which were administered in a fixed order (form recognition, L2 meaning production, and L1 meaning production, respectively) in order to minimize the possibility of learning the meaning of a given word from the previous test. As the tests aimed to measure incidental vocabulary learning, they were administered unannounced. The time allotted for data collection was approximately 80 minutes per condition.

3 Results

3.1 Effects of glosses on text recall

Descriptive statistics for the recall scores are provided in Table 1. The participants in the textual-plus-pictorial condition scored highest while those exposed to the pictorial-only gloss condition scored lowest. However, a one-way between-subjects ANOVA revealed that the differences among the group means were not statistically significant, F(3, 84)=1.50, p>.05.

Table 1 Descriptive statistics for recall scores across the experimental conditions

3.2 Effects of glosses on incidental vocabulary learning

Table 2 provides the descriptive statistics for the vocabulary tests. The lowest means observed in all tests belonged to the condition where no glosses were provided. In other words, the gloss conditions had higher means than the no-gloss condition. Additionally, the means of the three gloss conditions on all tests were only slightly different.

Table 2 Descriptive statistics for vocabulary tests across the experimental conditions

A one-way MANOVA revealed a significant multivariate effect of condition on the dependent variables, Wilks’ Λ=.522, F(9, 199)=6.798, p<.001, η²=.195. A univariate ANOVA was conducted on each dependent variable as a follow-up to the MANOVA. The differences among the conditions were significant for all the dependent variables, namely form recognition, F(3, 84)=20.78, p<.001, partial η²=0.426; L1 meaning production, F(3, 84)=5.04, p<.01, partial η²=0.153; and L2 meaning production, F(3, 84)=3.70, p<.05, partial η²=0.117. Pairwise comparisons with the Bonferroni procedure (Table 3) indicated that the mean of the no-gloss condition was significantly lower than that of each gloss condition in relation to form recognition, and lower than those of the textual-only and textual-plus-pictorial gloss conditions in relation to L1 meaning production. Similarly, the no-gloss condition had a significantly lower mean than the textual-only gloss condition in relation to L2 meaning production. The comparisons among the gloss conditions did not reveal any significant differences.
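The effect sizes reported above can be checked against the test statistics. The Python sketch below assumes two standard conventions that the paper does not itself spell out: for a between-subjects effect, partial η² can be recovered from F and its degrees of freedom as F·df_effect / (F·df_effect + df_error); and the multivariate effect size for Wilks’ Λ is taken as 1 − Λ^(1/s), with s the smaller of the number of dependent variables and the effect degrees of freedom. The function names are illustrative.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

def multivariate_eta_squared(wilks_lambda, n_dvs, df_effect):
    """Multivariate partial eta squared from Wilks' lambda: 1 - lambda**(1/s)."""
    s = min(n_dvs, df_effect)
    return 1 - wilks_lambda ** (1 / s)

# F values reported in the text; each follow-up test is F(3, 84).
reported_f = {
    "form recognition": 20.78,
    "L1 meaning production": 5.04,
    "L2 meaning production": 3.70,
}
univariate = {
    name: round(partial_eta_squared(f, 3, 84), 3) for name, f in reported_f.items()
}
# Three dependent variables and three degrees of freedom for condition give s = 3.
multivariate = round(multivariate_eta_squared(0.522, 3, 3), 3)
```

Under these assumptions the sketch returns .426, .153 and .117 for the univariate tests and .195 for the multivariate test, matching the reported values.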

Table 3 Pairwise comparisons among the conditions in relation to vocabulary measures

Note. The table presents mean differences. *p<.05, ** p<.01, *** p<.001.

3.3 Frequency of access to glosses and time on task

Table 4 shows the frequency of glosses accessed across the three gloss conditions. It should be noted that there were 43 glossed words in total. If the participants clicked on a particular gloss more than once, each click was counted.

Table 4 Frequency of access to glosses

Table 4 indicates that the participants used glosses more frequently during the first listening than the second listening. In addition, the participants in the pictorial-only condition used the glosses more frequently than those in the other conditions during both the first and the second listening. The fewest glosses were used by the participants in the textual-only condition during both the first and the second listening. In all the conditions, there were some participants who chose not to look up glosses during the second listening.

A 2 (time: first listening vs. second listening) × 3 (gloss condition: textual-only, pictorial-only, textual-plus-pictorial) mixed ANOVA was conducted to determine whether there were significant differences among the groups in terms of access to glosses. The results revealed that the main effect of time was significant, F(1,63)=23.52, p<.001, partial η²=.27. However, the main effect of gloss condition, F(2,63)=1.81, p>.05, and the interaction between time and gloss condition, F(2,63)=.02, p>.05, were non-significant. These findings suggest that the participants accessed significantly more glosses during the first listening than during the second listening. Gloss condition, however, did not have an effect on the frequency of access to glosses.

“Time on task” refers to the total amount of time the participants spent to complete the listening task, including the amount of time they spent accessing the annotations. In other words, time on task started when the participant tapped on “begin” after reading the instructions and ended when they tapped “quit”. Descriptive statistics of listening duration across the conditions are displayed in Table 5.

Table 5 Listening duration across the experimental conditions

As seen in Table 5, the participants in the pictorial-only condition spent the longest time on task while those in the control condition spent the least in the first listening. On the other hand, the latter group spent the longest time on task in the second listening. It should be noted that the second listening was optional in all conditions in order to support learners’ self-regulation and autonomous learning when improving listening skills on mobile devices.

A 2 (time: first listening, second listening) × 4 (condition: no-gloss, textual-only, pictorial-only, textual-plus-pictorial) mixed ANOVA was conducted to determine whether there were significant differences among the groups in terms of the amount of time spent on the listening task. The results revealed significant main effects for time and condition, but a non-significant interaction between the two factors, F(3,84)=.24, p>.05. The main effect of time, F(1,84)=40.62, p<.001, partial η²=.33, suggests that significantly less time was spent on the task during the second listening. The main effect of condition, F(3,84)=4.51, p<.01, partial η²=.14, was probed through Tukey post-hoc comparisons (Table 6). These comparisons revealed that the participants in the textual-only gloss condition spent significantly less time on the task compared to both the no-gloss condition and the other gloss conditions.

Table 6 Pairwise comparisons of average duration of listening across the conditions

Pearson product-moment correlations were examined in each gloss condition to determine whether the frequency of access to glosses and time on task were significantly related to text recall and vocabulary learning. Results indicated that the number of glosses accessed during first listening was significantly related to the number of target words recognized in the textual-only condition (r=.496, p<.05). As for time on task, there was a substantial negative correlation of duration of first listening with the number of target words recognized (r=−.51, p=.028) and L1 meaning production (r=−.566, p<.01) in the pictorial-only gloss condition. In addition, total time on task was negatively correlated with both form recognition (r=−.577, p=.01) and L1 meaning production (r=−.431, p<.05) in the no-gloss condition.
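For readers unfamiliar with the statistic, the following Python sketch shows how a Pearson product-moment correlation of the kind reported above is computed from paired observations. The function name `pearson_r` and the two short lists are hypothetical illustrations, not data from the study.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sum of cross-products and sums of squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical values: gloss look-ups and form recognition scores for 6 learners.
lookups = [5, 12, 8, 20, 15, 3]
recognized = [10, 14, 11, 19, 15, 12]
r = pearson_r(lookups, recognized)
```

A positive `r` corresponds to the pattern reported for the textual-only condition, where more look-ups went with more words recognized; a negative `r` corresponds to the pattern reported for time on task in the pictorial-only and no-gloss conditions.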

4 Discussion

The results indicate that the participants in the no-gloss condition did not have a significantly lower performance in terms of text recall compared to those in the gloss conditions, suggesting that the facilitative effect of access to glosses was not confirmed in terms of text recall. However, the hypothesis was confirmed in terms of vocabulary learning since the participants in the gloss conditions had significantly higher means than those in the control group.

The non-significant effect of glosses on text recall contrasts with the findings of previous studies which have demonstrated the beneficial effects of multimedia glosses both on reading comprehension (e.g., Chun, 2001; Chun & Plass, 1996a, 1996b; Leffa, 1992; Lomicka, 1998; Plass et al., 1998; Taylor, 2006; Yanguas, 2009) and listening comprehension (e.g., Jones, 2004, 2006; Jones & Plass, 2002; Plass & Jones, 2005). On the other hand, the findings concur with a number of studies that found no effects of glosses on text recall after reading (Ariew & Erçetin, 2004; Gasigijtamrong, 2013; Jacobs, Dufon & Fong, 1994; Joyce, 1997; Zarei & Mahmoodzadeh, 2014) or listening (Cottam, 2010).

The inconsistency regarding the effects of multimedia glosses on text comprehension can be attributed to factors such as the type of task used to measure text comprehension or the proficiency level of the learners. The task used in the current study was text recall, which elicits main ideas or salient information in the text and heavily relies on the reader’s memory. Such a task may fail to discriminate different levels of text processing such as making inferences and general recognition (Chang, 2006: 537). For instance, Türk and Erçetin (2014) have demonstrated significant positive effects of simultaneous presentation of verbal and visual glosses with a multiple-choice test of reading comprehension but not with a recall task.

Another explanation may be related to the mediating effects of learners’ proficiency level. For instance, Ariew and Erçetin (2004) found that advanced learners of English did not make extensive use of multimedia glosses during reading and their comprehension was not affected by access to glosses, whereas the intermediate learners used glosses extensively, with visual glosses (videos or static images) having a negative effect on their text comprehension. Since successful comprehension requires interaction between top-down and bottom-up processes, learners need to use word-level definitional glosses along with top-down strategies, such as drawing on background knowledge and contextual information, to fully comprehend a text. Easy access to definitional glosses may hamper the effective use of top-down reading strategies.

Finally, the mobile environment itself may be a factor in failing to observe facilitative effects of multimedia glosses on text recall. Given the relative novelty of MALL, the students might not have acclimated to this new environment of L2 listening with gloss lookups available simultaneously (Zarei & Mahmoodzadeh, 2014), though the interface was easy to use and perceived to be user-friendly by the students. When the students were asked how they found the application after completing the tasks, the majority of them stated that it was “toy like” and “practical to check the meaning of the words”. The physical limitations related to screen size might still have been problematic for the effective implementation of MALL (Stockwell, 2008; Stockwell & Hubbard, 2013).

The findings regarding incidental vocabulary learning are congruent with a number of studies which showed the significant positive effects of glosses on incidental vocabulary learning either as a by-product of reading (e.g., Chun & Plass, 1996a, 1996b; Plass et al., 1998; Yanguas, 2009; Yoshii & Flaitz, 2002) or listening (Jones, 2004; Jones & Plass, 2002). The majority of the aforementioned studies (e.g., Al-Seghayer, 2001; Chun & Plass, 1996a, 1996b; Kost, Foss & Lenzini, 1999; Plass et al., 1998) found positive correlations in favor of dual-modality glosses (i.e., visual-plus-textual) over the use of single-modality glosses (i.e., textual-only or pictorial-only). Findings of the current study do not support the multimedia principle of the CTML and suggest that the type of gloss does not matter in terms of facilitating incidental vocabulary learning in a mobile learning environment, disconfirming the second hypothesis of the study that predicted superiority of dual-modality glosses over those of single modality. A similar finding was obtained by Yanguas (2009) who, based on qualitative data collected through think-aloud protocols, found that learners exposed to all three gloss conditions (textual-only, pictorial-only, textual-plus-pictorial) noticed and recognized significantly more of the target words than the control group, which was exposed to only the reading text without glosses. Bowles (2004) also found that both multimedia glosses and traditional glosses helped learners to notice target words.
Yanguas (2009) and Bowles (2004) explain their findings through the noticing hypothesis proposed by Schmidt (1990). The noticing hypothesis posits that attention and noticing are necessary correlates for input to become intake (Schmidt, 2001). Considering that glosses draw learners’ attention to the meaning of unknown words during the reading or listening process, the learner’s focus is not solely on text comprehension, although the primary focus of the task is comprehension. In a similar vein, Plass and Jones (2005), in their theoretical account of second language acquisition with multimedia, argue that interaction with the text by means of information links that provide simplification, elaboration, or definitional support can facilitate apperception of input, that is, the selection of verbal and pictorial information to be represented in a text base or image base. Thus, both verbal and visual glosses draw attention to linguistic information, which is essential for noticing.

The participants’ interaction with the gloss conditions provided further insights. For instance, the frequency of access to glosses did not differ significantly across the gloss conditions although time on task did. It should be noted that participants in the pictorial-only and textual-plus-pictorial gloss conditions spent significantly more time on task compared to the textual-only condition. In addition, there was a substantial positive relationship between frequency of access to textual glosses and the number of target words recognized. This finding may be explained by Schmitt’s (2008: 338) notion of engagement, which suggests that “the more a learner engages with a new word, the more likely they are to learn it”. That is, higher frequency of exposure to words facilitates vocabulary learning. In a similar vein, Kida (2010) highlighted a positive effect of increased exposure to the target words (three over one) on incidental vocabulary learning while reading. All in all, these findings suggest that the direct relationship between the definition provided by textual glosses and the target word might have facilitated the use of more efficient strategies, which in turn facilitated recognition of target words.

The substantial negative relationship between time on task during the first listening and the number of target words recognized in the pictorial-only gloss condition is noteworthy. Jones (2004) also observed that the group exposed to pictorial glosses could not produce vocabulary from memory as well as the groups that had access to written glosses. A similar finding was observed by Acha (2009), who demonstrated that 3rd and 4th Grade children exposed to verbal glosses could recall word translations better than those exposed to simultaneous visual and verbal glosses or visual glosses only. It is possible that pictorial glosses have a distracting effect unless they directly convey the meaning of words as in concrete nouns. Thus, they may lead to processing problems. As discussed earlier, the negative effect of visual glosses on low-proficiency learners was demonstrated by Ariew and Erçetin (2004) as well as Şakar and Erçetin (2005) in terms of reading comprehension. It is possible that the conceptual link between glosses and unknown target words in multimedia mode is difficult for students with low-level proficiency to grasp (Hu, Vongpumivitch, Chang & Liou, 2014). As Plass, Chun, Mayer and Leutner (2003) demonstrated, low-ability learners learned fewer words from text compared to high-ability learners when they had to process visual glosses. Such a difference was not observed between the two groups in the case of verbal glosses. As such, the authors argue that visual glosses induce higher levels of CL since their meanings have to be interpreted unlike verbal glosses, which provide “clear and unambiguous meanings of the annotated words” (Plass et al., 2003: 237).
In the case of listening, it might be even more challenging for low-proficiency learners to “pick up” meaning incidentally when the given text lacks a continuous flow due to glossing.

5 Conclusion

The current study presents an empirical investigation of the effectiveness of multimedia glosses on L2 listening comprehension and incidental vocabulary learning in a mobile learning environment. The study aimed to address whether the effects reported in the literature for computer-based environments are transferable to an environment which incorporates mobile learning devices. The findings indicate robust effects of access to glosses on incidental vocabulary learning, with no differences among the types of glosses. On the other hand, the relationship of vocabulary measures with frequency of access to glosses and time on task suggests implications for the provision and design of multimedia presentations for L2 listening and incidental vocabulary learning. The positive relationship between frequency of access to glosses and target words recognized in the textual-only gloss condition suggests that these glosses are potentially more effective than pictorial glosses. Considering the negative relationships observed between time on task and vocabulary measures in the pictorial-only gloss condition, it can be concluded that pictorial glosses may not be as effective as textual glosses due to weak referential connections between verbal and visual information (Plass et al., 2003). As a pedagogical implication of the current study, it can be stated that such glosses may be incorporated into multimedia presentations only with concrete nouns. The current study has also shown that dual-mode glosses are not necessarily more effective than single-mode glosses. Considering research that has demonstrated that simultaneous presentation of verbal and visual glosses may increase CL for low-ability learners, caution is warranted in incorporating both visual-only and textual-plus-visual glosses into multimedia presentations. As Taylor (2009) suggests, glosses should be appropriately tailored for the proficiency level of students.
Low-proficiency learners should be trained in when and how to use glosses during reading or listening. These learners have a tendency to use glosses more frequently than high-proficiency learners (Ariew & Erçetin, 2004; Yun, 2011). As for technological implications, utilizing software-embedded user-behavior tracking to monitor a learner’s look-up behavior in a MALL environment could help illuminate what students are actually doing for their own learning. Such data could help identify problems and strategies for more effective follow-up learning activities (Colpaert, 2004; Chun, 1994). Furthermore, displaying the learners’ interaction patterns with the software components would make for a more in-depth analysis of skill learning (Chapelle, 2001; Pujolà, 2002). Last but not least, for better pictorial gloss representation on mobile devices, technical features of the specific device, such as its screen resolution and size, must be considered.

The present study suffers from a couple of limitations. One of the fundamental limitations was the method of controlled experiment in a mobile environment. Mobile learning is characterized by the flexibility to access the material without any time and space boundaries; however, in this research the ideal MALL setting was not possible due to the experimental study design. Another limitation was the glosses for words that were abstract and related to feelings. It was challenging to ensure that the students could easily predict the meaning from the pictures. Additionally, the data were collected through mobile phones, which had reduced screen size and low screen resolution. This was especially important for the gloss groups as they were exposed to images and text in small size and resolution, which might have added to the CL induced by pictorial information. The assessment type was another limitation. There are different ways of scoring recall protocols. In the present study, each idea unit received one point, which resulted in treating main ideas and details equally. As such, the mean idea units recalled by the participants were much fewer compared to the idea units identified based on phonetic parcel units. A weighted scoring procedure where main ideas received more weight than details could have provided a better indicator of the participants’ recall performance. Additionally, delayed tests for listening comprehension and vocabulary learning were not administered in the study. Lastly, the use of convenience sampling in this study limits the generalizability of the results.

Further research should take into account learners’ verbal and visual ability levels as previous research (Chun & Plass, 1996a, 1996b; Plass et al., 2003) has shown that such learner characteristics may be directly related to performance in a multimedia environment. In addition, WM capacity can play a significant role both in terms of affecting listening performance and mediating the effects of multimedia glosses. As such, individual differences, for example learner preferences, strategy use, and WM, should be incorporated into future research. Although controlled experiments such as the one reported here are important, the study should be replicated through a less controlled design where learners use the mobile devices to promote anytime-anywhere learning. Last but not least, future research could also look at the condition where glosses are present in the first listening but not in the second so that the distraction caused by local-level glosses at the connected discourse level is eliminated in the second listening. This might be investigated in terms of recall capacity, incidental vocabulary learning and comprehension processes.

Acknowledgements

The research underlying the study was supported by a Boğaziçi University Research Fund grant (Project No. 6961).

Appendix 1

Frequency of Occurrence of Target Words in the Text

References

Abraham, L. B. (2008) Computer-mediated glosses in second language reading comprehension and vocabulary learning: A meta-analysis. Computer Assisted Language Learning, 21(3): 199226.Google Scholar
Acha, J. (2009) The effectiveness of multimedia programmes in children’s vocabulary learning. British Journal of Educational Technology, 40(1): 2331.Google Scholar
Alemi, M., Sarab, M. R. and Lari, Z. (2012) Successful learning of academic word list via MALL: Mobile-assisted language learning. International Education Studies, 5(6): 99109.Google Scholar
Al-Seghayer, K. (2001) The effect of multimedia annotation modes on L2 vocabulary acquisition: A comparative study. Language Learning and Technology, 5(1): 202232.Google Scholar
Amer, M. (2010) Idiomobile for learners of English: A study of learners’ usage of a mobile learning application for learning idioms and collocations (Unpublished doctoral dissertation), Indiana University of Pennsylvania: Indiana, PA.Google Scholar
Ariew, R. and Erçetin, G. (2004) Exploring the potential of hypermedia annotations for second language reading. Computer Assisted Language Learning, 17(2): 237259.CrossRefGoogle Scholar
Azabdaftari, B. and Mozaheb, M. (2012) Comparing vocabulary learning of EFL learners by using two different strategies: Mobile learning vs. flashcards. The Eurocall Review, 20(2): 4759.CrossRefGoogle Scholar
Başoğlu, E. and Akdemir, O. (2010) A comparison of undergraduate students’ English vocabulary learning: Using mobile phones and flash cards. TOJET, 9(3): 17.Google Scholar
Bernhardt, E. B. (1991) Reading development in a second-language. Norwood, NJ: Ablex.Google Scholar
Bowles, M. A. (2004) L2 glossing: To CALL or not to CALL. Hispania, 87(3): 541552.Google Scholar
Bradley, C. and Holley, D. (2011) Empirical research into students’ mobile phones and their use for learning. International Journal of Mobile and Blended Learning, 3(4): 3853.Google Scholar
Brett, P. (1995) Multimedia for listening comprehension: The design of a multimedia-based resource for developing listening skills. System, 23(1): 7785.Google Scholar
Brett, P. (1997) A comparative study of the effect of the use of multimedia on listening comprehension. System, 25(1): 3951.Google Scholar
Butgereit, L. and Botha, A. (2009) Hadeda: The noisy way to practice spelling vocabulary using a cell phone. In: Cunningham, P. and Cunningham, M. (eds.), IST-Africa 2009 Conference Proceedings. Kampala: Uganda: IIMC, 1–7.Google Scholar
Çavuş, N. and İbrahim, D. (2009) m-Learning: An experiment in using SMS to support learning new English language words. British Journal of Educational Technology, 40(1): 78–91.
Colpaert, J. (2004) From courseware to coursewear? Computer Assisted Language Learning, 17(3–4): 261–266.
Chang, Y. (2006) On the use of the immediate recall task as a measure of second language reading comprehension. Language Testing, 23(4): 520–543.
Chang, C. C., Tseng, K. H. and Tseng, J. S. (2011) Is single or dual channel with different English proficiencies better for English listening comprehension, cognitive load and attitude in ubiquitous learning environment? Computers & Education, 57(4): 2313–2321.
Chapelle, C. (2001) Computer applications in second language acquisition: Foundations for teaching, testing, and research. Cambridge: Cambridge University Press.
Chen, C. and Chung, C. (2008) Personalized mobile English vocabulary learning system based on item response theory and learning memory cycle. Computers & Education, 51(2): 624–645.
Chinnery, G. M. (2006) Going to the MALL: Mobile-assisted language learning. Language Learning & Technology, 10(1): 9–16.
Chun, D. M. (1994) Using computer networking to facilitate the acquisition of interactive competence. System, 22(1): 17–31.
Chun, D. M. (2001) L2 reading on the web: strategies for accessing information in hypermedia. Computer Assisted Language Learning, 14(5): 367–403.
Chun, D. M. and Plass, J. L. (1996a) Effects of multimedia annotations on vocabulary acquisition. Modern Language Journal, 80(2): 183–198.
Chun, D. M. and Plass, J. L. (1996b) Facilitating reading comprehension with multimedia. System, 24(4): 503–519.
Cottam, M. (2010) The effects of visual and textual annotations on Spanish listening comprehension, vocabulary acquisition and cognitive load (Unpublished doctoral dissertation). Arizona State University, AZ.
Crompton, H. (2013) A historical overview of mobile learning: Toward learner-centered education. In: Berge, Z. and Muilenburg, L. Y. (eds.), Handbook of mobile learning. New York: Routledge, 3–14.
Demouy, V. and Kukulska-Hulme, A. (2010) On the spot: Using mobile devices for listening and speaking practice on a French language programme. The Journal of Open, Distance and e-Learning, 25(3): 217–232.
Gasigijtamrong, J. (2013) Effects of multimedia annotations on Thai EFL readers’ words and text recall. English Language Teaching, 6(12): 48–57.
Hoven, D. (1999) A model for listening and viewing comprehension in multimedia environments. Language Learning and Technology, 3(1): 88–103.
Hu, S. M., Vongpumivitch, V., Chang, J. S. and Liou, H. C. (2014) The effects of L1 and L2 e-glosses on incidental vocabulary learning of junior high-school English students. ReCALL, 26(1): 80–99.
Hwang, W. Y., Huang, Y. M., Shadiev, R., Wu, S. Y. and Chen, S. L. (2014) Effects of using mobile devices on English listening diversity and speaking for EFL elementary students. Australasian Journal of Educational Technology, 30(5): 503–516.
Jacobs, G. M., Dufon, P. and Fong, C. H. (1994) L1 and L2 vocabulary glosses in L2 reading passages: Their effectiveness for increasing comprehension and vocabulary knowledge. Journal of Research in Reading, 17(1): 19–28.
Johnson, R. E. (1970) Recall of prose as a function of the structural importance of the linguistic units. Journal of Verbal Learning and Verbal Behavior, 9: 12–20.
Jones, L. C. (2003) Supporting listening comprehension and vocabulary acquisition with multimedia annotations: The students’ voice. CALICO Journal, 21(1): 41–65.
Jones, L. C. (2004) Testing L2 vocabulary recognition and recall using pictorial and written test items. Language Learning & Technology, 8(3): 122–143.
Jones, L. C. (2006) Effects of collaboration and multimedia annotations on vocabulary learning and listening comprehension. CALICO Journal, 24(1): 33–58.
Jones, L. C. and Plass, J. L. (2002) Supporting listening comprehension and vocabulary acquisition in French with multimedia annotations. The Modern Language Journal, 86(4): 546–561.
Joyce, E. E. (1997) Which words should be glossed in L2 reading materials? A study of first, second and third semester French students’ recall. Pennsylvania Language Forum, 58–64.
Kennedy, C. and Levy, M. (2008) L’italiano al telefonino: Using SMS to support beginners’ language learning. ReCALL, 20(3): 315–330.
Kida, S. (2010) The role of quality and quantity of vocabulary processing in incidental L2 vocabulary acquisition through reading. Paper presented on 7 March 2010 at the Annual Conference of the American Association for Applied Linguistics in Atlanta, GA.
Kim, H. S. (2011) Effects of SMS text messaging on vocabulary learning. Multimedia-Assisted Language Learning, 14(2): 159–180.
Kim, H. S. (2013) Emerging mobile apps to improve English listening skills. Multimedia-Assisted Language Learning, 16(2): 11–30.
Kobayashi, C. (2008) The use of pocket electronic and printed dictionaries: A mixed-method study. In: Bradford-Watts, K., Muller, T. and Swanson, M. (eds.), JALT 2007 Conference Proceedings. Tokyo: JALT, 769–783.
Kost, C. R., Foss, P. and Lenzini, J. J. (1999) Textual and pictorial glosses: Effectiveness on incidental vocabulary growth when reading in a foreign language. Foreign Language Annals, 32(1): 89–97.
Koyama, T. and Takeuchi, O. (2004) Comparing electronic and printed dictionaries: How the difference affected EFL learning. JACET Bulletin, 38: 33–46.
Koyama, T. and Takeuchi, O. (2009) How effectively do good language learners use handheld electronic dictionaries: A qualitative approach. Language Education & Technology, 46: 131–150.
Kukulska-Hulme, A. (2006) Mobile language learning now and in the future. In: Svensson, P. (ed.), Från vision till praktik: Språkutbildning och Informationsteknik (From vision to practice: language learning and IT). Sweden: Swedish Net University (Nätuniversitetet), 295–310.
Kukulska-Hulme, A. (2010) Charting unknown territory: Models of participation in mobile language learning. International Journal of Mobile Learning and Organisation, 4(2): 116–129.
Kukulska-Hulme, A. and Traxler, J. (eds.) (2005) Mobile Learning: A handbook for educators and trainers. Abingdon: Routledge.
Laufer, B. and Goldstein, Z. (2004) Testing vocabulary knowledge: Size, strength, and computer adaptiveness. Language Learning, 54(3): 399–436.
Leffa, V. J. (1992) Making foreign language texts comprehensible for beginners: An experiment with an electronic glossary. System, 20(1): 63–73.
Lindquist, D., Denning, T., Kelly, M., Malani, R., Griswold, W. G. and Simon, B. (2007) Exploring the potential of mobile phones for active learning in the classroom. ACM SIGCSE Bulletin, 39(1): 384–388.
Lomicka, L. (1998) “To gloss or not to gloss”: An investigation of reading comprehension online. Language Learning and Technology, 1(2): 41–50.
Lu, M. (2008) Effectiveness of vocabulary learning via mobile phone. Journal of Computer Assisted Learning, 24(6): 515–525.
Mayer, R. E. (2009) Multimedia learning (2nd edn.). New York: Cambridge University Press.
Meier, A. (2015) L2 incidental vocabulary acquisition through extensive listening to podcasts. Working Papers in TESOL & Applied Linguistics, 15(2): 72–84.
Melhuish, K. and Falloon, G. (2010) Looking to the future: M-learning with the iPad. Computers in New Zealand Schools: Learning, Leading, Technology, 22(3): 1–16.
Meskill, C. (1996) Listening skills through multimedia. Journal of Educational Multimedia and Hypermedia, 5(2): 179–201.
Motallebzadeh, K., Beh-Afarin, R. and Daliry Rad, S. (2011) The effect of short message service on the retention of collocations among Iranian lower intermediate EFL learners. Theory and Practice in Language Studies, 1(11): 1514–1520.
Nah, K. C., White, P. and Sussex, R. (2008) The potential of using a mobile phone to access the Internet for learning EFL listening skills within a Korean context. ReCALL, 20(3): 331–347.
Nation, I. S. P. (2001) Learning vocabulary in another language. Cambridge: Cambridge University Press.
Pachler, N., Bachmair, B. and Cook, J. (2013) A sociocultural ecological frame for mobile learning. In: Berge, Z. and Muilenburg, L. Y. (eds.), Handbook of mobile learning. New York: Routledge, 35–46.
Plass, J. L. and Jones, L. C. (2005) Multimedia learning in second language acquisition. In: Mayer, R. E. (ed.), The Cambridge handbook of multimedia learning. New York: Cambridge University Press, 467–488.
Plass, J. L., Chun, D. M., Mayer, R. E. and Leutner, D. (1998) Supporting visual and verbal learning preferences in a second-language multimedia learning environment. Journal of Educational Psychology, 90(1): 25–36.
Plass, J. L., Chun, D. M., Mayer, R. E. and Leutner, D. (2003) Cognitive load in reading a foreign language text with multimedia aids and the influence of verbal and spatial abilities. Computers in Human Behavior, 19(2): 221–243.
Pujolà, J. T. (2002) CALLing for help: Researching language learning strategies using help facilities in a web-based multimedia program. ReCALL, 14(2): 235–262.
Reinders, H. and Cho, M. Y. (2010) Extensive listening practice and input enhancement using mobile phones: Encouraging out-of-class learning with mobile phones. TESL-EJ, 14(2): 1–7.
Saran, M., Seferoğlu, G. and Çağıltay, K. (2009) Mobile-assisted language learning: English pronunciation at learners’ fingertips. Eurasian Journal of Educational Research, 34: 97–114.
Schmidt, R. (1990) The role of consciousness in second language learning. Applied Linguistics, 11(2): 129–158.
Schmidt, R. (2001) Attention. In: Robinson, P. (ed.), Cognition and second language instruction. Cambridge: Cambridge University Press, 3–32.
Schmitt, N. (2008) Instructed second language vocabulary learning. Language Teaching Research, 12(3): 329–363.
Sharples, M., Sánchez, I. A., Milrad, M. and Vavoula, G. (2009) Mobile learning: Small devices, big issues. In: Balacheff, N., Ludvigsen, S., de Jong, T., Lazonder, A. and Barnes, S. (eds.), Technology-enhanced learning: Principles and products. Berlin: Springer-Verlag, 223–251.
Song, Y. and Fox, R. (2008) Using PDA for undergraduate student incidental vocabulary testing. ReCALL, 20(3): 290–314.
Sung, Y.-T., Chang, K.-E. and Liu, T.-C. (2016) The effects of integrating mobile devices with teaching and learning on students’ learning performance: A meta-analysis and research synthesis. Computers & Education, 94: 252–275.
Stockwell, G. (2007) Vocabulary on the move: Investigating an intelligent mobile phone-based vocabulary tutor. Computer Assisted Language Learning, 20(4): 365–383.
Stockwell, G. (2008) Investigating learner preparedness for and usage patterns of mobile learning. ReCALL, 20(3): 253–270.
Stockwell, G. and Hubbard, P. (2013) Some emerging principles for mobile-assisted language learning. Monterey, CA: The International Research Foundation for English Language Education.
Şakar, A. and Erçetin, G. (2005) Effectiveness of hypermedia annotations for foreign language reading. Journal of Computer-Assisted Learning, 21(1): 28–38.
Taylor, A. M. (2006) The effects of CALL versus traditional L1 glosses on L2 reading comprehension. CALICO Journal, 23(2): 309–318.
Taylor, A. M. (2009) CALL-based versus paper-based glosses: Is there a difference in reading comprehension? CALICO Journal, 27(1): 147–160.
Türk, E. and Erçetin, G. (2014) Effects of interactive versus simultaneous display of multimedia glosses on L2 reading comprehension and incidental vocabulary learning. Computer Assisted Language Learning, 27(1): 1–25.
van Zeeland, H. and Schmitt, N. (2013) Incidental vocabulary acquisition through L2 listening: A dimensions approach. System, 41(3): 609–624.
Waring, R. and Takaki, M. (2003) At what rate do learners learn and retain new vocabulary from reading a graded reader? Reading in a Foreign Language, 15(2): 130–163.
Webb, S. (2005) Receptive and productive vocabulary learning: The effects of reading and writing on word knowledge. Studies in Second Language Acquisition, 27(1): 33–52.
Xu, J. (2010) Using multimedia vocabulary annotations in L2 reading and listening activities. CALICO Journal, 27(2): 311–327.
Yanguas, I. (2009) Multimedia glosses and their effect on L2 text comprehension and vocabulary learning. Language Learning and Technology, 13(2): 48–67.
Yeh, Y. and Wang, C. (2003) Effects of multimedia vocabulary annotations and learning styles on vocabulary learning. CALICO Journal, 21(1): 131–144.
Yoshii, M. and Flaitz, J. (2002) Second language incidental vocabulary retention: The effect of text and picture annotation types. CALICO Journal, 20(1): 33–58.
Yun, J. (2011) The effects of hypertext glosses on L2 vocabulary acquisition: A meta-analysis. Computer Assisted Language Learning, 24(1): 39–58.
Zarei, A. A. and Mahmoodzadeh, P. (2014) The effect of multimedia glosses on L2 reading comprehension and vocabulary production. Journal of English Language and Literature, 1(1): 1–7.
Zhang, F. (2012) Combining the body and mobile technology to teach English pronunciation. In: Zhang, F. (ed.), Computer-enhanced and mobile-assisted language learning: Emerging issues and trends. Hershey, PA: IGI Global, 202–219.
Zhang, H., Song, W. and Burston, J. (2011) Reexamining the effectiveness of vocabulary learning via mobile phones. Turkish Online Journal on Educational Technology, 10(3): 203–214.
Figure 1 A screenshot from the no-gloss condition

Figure 2 A screenshot from the textual-only condition

Figure 3 A screenshot from the pictorial-only condition

Figure 4 A screenshot from the text-plus-picture condition

Table 1 Descriptive statistics for recall scores across the experimental conditions

Table 2 Descriptive statistics for vocabulary tests across the experimental conditions

Table 3 Pairwise comparisons among the conditions in relation to vocabulary measures

Table 4 Frequency of access to glosses

Table 5 Listening duration across the experimental conditions

Table 6 Pairwise comparisons of average duration of listening across the conditions