
Testing the deficient processing account of the spacing effect in second language vocabulary learning: Evidence from eye tracking

Published online by Cambridge University Press:  30 July 2019

Natalie G. Koval*
Affiliation:
Michigan State University
*Corresponding author. E-mail: kovalnat@msu.edu

Abstract

The spacing effect refers to the learning benefit that comes from separating repeated study of target items by time or by other items. A prominent proposed explanation for this effect states that repeated exposures that occur closely together may not engage full attentional processing due to residual activation of the previous exposure and also, in an intentional learning context, due to a sense of familiarity that may result in strategic allocation of less study time to an item in massed repetitions. The present study used eye-tracking methodology to investigate the effects of temporal distribution of repeated exposures to novel second language words on attentional processing and learning of these words under intentional learning instructions. Adult native speakers of English read Finnish words embedded in English sentence contexts under massed and spaced conditions. The results showed that (a) massed repeated exposures received less attentional processing than spaced repeated exposures; (b) target words were better remembered in the spaced condition; and (c) attention was a significant mediator of the obtained spacing effect, in line with the predictions of the deficient processing account of the spacing effect. Implications for vocabulary learning are discussed.

Type
Original Article
Copyright
© Cambridge University Press 2019 

It is widely believed that attention plays an important role in second language (L2) learning (Gass, 1988; Robinson, 2003; Schmidt, 1990, 2001). This includes learning L2 vocabulary. Studies employing eye-tracking methodology have consistently found a positive relationship between the time a reader spends looking at a novel word and learning of the word (Godfroid et al., 2017; Godfroid, Boers, & Housen, 2013; Godfroid & Schmidtke, 2013; Mohamed, 2017; Pellicer-Sánchez, 2016). Such studies further show that amount of attentional processing benefits learning above and beyond the number of exposures to a target word (Godfroid et al., 2017; Mohamed, 2017), the latter variable being one of the best known and most intuitive predictors of vocabulary learning success (Webb, 2007). Thus, we know that vocabulary learning is more successful if the to-be-learned words are encountered repeatedly and also if these encounters are processed more attentively.

A number of studies have used eye tracking to investigate how repeated exposures to novel vocabulary are processed in terms of the amount of attention they receive and how this affects learning. A variable that has not yet been considered in this line of inquiry is the temporal distribution of repeated exposures. However, this variable is important because how closely together or widely apart repeated encounters with a target word occur may have an effect on both attention and learning. Research in psychology has consistently shown that learning from repeated study is more effective when repetitions are distributed over time rather than massed together (Cepeda, Pashler, Vul, Wixted, & Rohrer, 2006). This is known as the spacing effect. A prominent explanation for the effect, known as the deficient processing theory, states that repeated encounters with to-be-learned material receive more attentional processing if they are spaced more widely (Callan & Schweighofer, 2010; Cuddy & Jacoby, 1982). Because attention is known to be important for L2 learning success, it is important to understand how it is affected by the temporal distribution of repeated exposures to target forms. The present study extends previous research into attentional processing and learning during repeated exposures to novel L2 vocabulary by including a manipulation of the temporal distribution of exposures. If, as predicted by the deficient processing hypothesis, the more widely spaced exposures are found to engage more attentional processing, this may have important implications for efforts to induce more attention to target forms in L2 pedagogy. Further, with the help of a mediation analysis, I investigate the contribution of deficient processing to the spacing effect in L2 vocabulary learning from different sentence contexts.

The Spacing Effect

The spacing effect (or distributed practice effect) refers to the widely obtained finding in psychology that memory for repeatedly studied material is better when repetitions are separated by time or intervening material than when the same number of repetitions follow consecutively, or in a massed fashion (Cepeda et al., 2006; Cepeda, Vul, Rohrer, Wixted, & Pashler, 2008; Delaney, Verkoeijen, & Spirgel, 2010; Dempster, 1988; Gerbier & Toppino, 2015; Pavlik & Anderson, 2005; Rohrer & Pashler, 2007). A related finding, termed the lag effect, is that longer interstudy intervals are more beneficial for long-term retention of knowledge than shorter interstudy intervals (Melton, 1967). Probably because the two effects are so closely related and also because in some situations “massed” practice in its strictest sense may not be as relevant as practice that is separated by intervals of varying lengths (Seabrook, Brown, & Solity, 2005), the terminological distinction is not always made in the literature, where the term spacing effect is often used to refer to lag effects (see, e.g., Bird, 2010; Nakata & Webb, 2016; Zhao et al., 2015). For simplicity, the term spacing effect will be used henceforth to refer to both phenomena.

The spacing effect is one of the most robust findings in memory research (Cepeda et al., 2006; Gerbier & Toppino, 2015), where it has been the focus of much interest since at least the late 1800s (Ebbinghaus, 1885). The benefits of spacing practice have been consistently obtained in a wide range of experimental paradigms, with vastly diverse tasks, and with vastly diverse populations that are not limited to our species (Donovan & Radosevich, 1999; Yin, Del Vecchio, Zhou, & Tully, 1995). The spacing effect has also consistently been obtained with foreign language vocabulary as learning targets (see, e.g., Bahrick, Bahrick, Bahrick, & Bahrick, 1993; Callan & Schweighofer, 2010; Kang, Lindsey, Mozer, & Pashler, 2014; Pashler, Zarow, & Triplett, 2003; Pavlik & Anderson, 2005). The robustness and generality of the effect suggest that it potentially holds great promise for any learning situation. In order to make the best use of this powerful learning tool, we need a good understanding of the variables that may mediate and moderate the relationship between the spacing of repeated study and learning outcomes in our specific learning contexts. However, second language acquisition (SLA) research has thus far focused only on whether or not spacing is beneficial for the acquisition of various aspects of an L2, such as syntax (Bird, 2010; Rogers, 2015) and vocabulary (Bloom & Shuell, 1981; Nakata, 2015; Nakata & Webb, 2016; Schuetze, 2015). No SLA studies, to my knowledge, have yet investigated the associated underlying mechanisms. The present study begins to address this gap by exploring the contribution of deficient processing as a potential mediator for the spacing effect in contextual learning of novel L2 vocabulary. It is recognized today that likely more than one cognitive mechanism underlies the spacing effect and that the engagement of these different processes or combinations of processes may depend on a number of variables, such as the target task (Gerbier & Toppino, 2015; Glenberg & Smith, 1981; Greene, 1989; Kornell & Bjork, 2008; Russo & Mammarella, 2002). A good understanding of what processes are relevant for SLA contexts is crucial for determining how and when the spacing effect may be useful for L2 teaching and learning as well as how to use it most effectively in any particular language learning situation (Rogers, 2015, p. 864). The focus of this study on deficient processing as a potential mediator for the spacing effect in L2 vocabulary acquisition is motivated by the widely accepted theory that attention facilitates L2 learning (Schmidt, 1990, 2001). The next section briefly discusses the deficient processing theory as well as other prominent efforts in psychology research to specify the mechanisms underlying the spacing effect.

Explaining the Spacing Effect

A large number of theories have been proposed to explain the spacing effect (see, e.g., Benjamin & Tullis, 2010; Bjork & Allen, 1970; Challis, 1993; Estes, 1955; Glenberg, 1979; Hintzman, Block, & Summers, 1973; Jacoby, 1978; Landauer, 1969; Maddox, 2016; Melton, 1970; Pavlik & Anderson, 2005; Raaijmakers, 2003). Theories that have received the most empirical attention can be broadly classified into those that explain the effect in terms of encoding variability and those that explain it in terms of deficient processing. According to the encoding variability family of accounts (Bower, 1972; Glenberg, 1976; Greene, 1989; Maddox, 2016; Raaijmakers, 2003), spaced repetitions are more likely to be experienced in different contextual states, resulting in the encoding of more diverse contextual elements that serve as important retrieval routes at test. According to the deficient processing family of accounts (Callan & Schweighofer, 2010; Challis, 1993; Cuddy & Jacoby, 1982; Dellarosa & Bourne, 1985; Gerbier & Toppino, 2015; Greene, 1989; Hintzman et al., 1973; Jacoby, 1978; Johnston & Uhl, 1976; Krug, Davis, & Glover, 1990; Pavlik & Anderson, 2005; Zechmeister & Shaughnessy, 1980), when an item is repeated immediately or very shortly after its initial presentation, the repetition is processed less fully because its prior presentation may still be activated in short-term memory or readily accessible (Cuddy & Jacoby, 1982; Greeno, 1967; Whitten & Bjork, 1977). The difference in the amount of processing that an item receives is said to be responsible for the observed differences in learning outcomes.

Deficient processing of massed repetitions may be due to voluntary mechanisms, where it is the result of a conscious decision to devote less attention or study time to an immediately repeated item due to an increased sense of familiarity (Greene, 1989; Rundus, 1971; Shaughnessy, Zimmerman, & Underwood, 1972; Verkoeijen & Delaney, 2008; Zechmeister & Shaughnessy, 1980; Zimmerman, 1975). Alternatively, a failure to process a stimulus as extensively when it is repeated immediately may be due to more automatic mechanisms of a less strategic nature (Callan & Schweighofer, 2010; Challis, 1993; Mammarella, Avons, & Russo, 2004; Toppino, 1991; Van Strien, Verkoeijen, Van der Meer, & Franken, 2007; Xue et al., 2011). Processing may further be deficient in terms of quantity or quality. In the former case, the same underlying operations are engaged to a lesser extent or simply operate more quickly. In the latter case, the system relies on a different set of operations or some of the encoding operations are dropped and are not repeated during the second processing event. An example of a qualitative change in processing is retrieving a solution to a problem from memory instead of repeating the computational process that produced the solution (Jacoby, 1978; Rose, 1984).

The generality of the spacing effect suggests that it likely reflects some fundamental property of the memory system (Gerbier & Toppino, 2015, p. 50). However, specifying a single unitary mechanism that would account for the wide range of consistently obtained findings has proven difficult (Benjamin & Tullis, 2010; Delaney et al., 2010; Gerbier & Toppino, 2015; Greene, 1989; Maddox, 2016; Verkoeijen, Rikers, & Schmidt, 2004). Thus, in their basic form, the encoding variability and deficient processing theories cannot readily accommodate the important findings of super-additivity and non-monotonicity (Benjamin & Tullis, 2010; Maddox, 2016). Super-additivity refers to the finding that likelihood of successful performance on a memory test following spaced practice is greater than what would be expected from independent encoding events (Begg & Green, 1988; Benjamin & Tullis, 2010; Ross & Landauer, 1978). This suggests that repetition plays a special role and a dependency between the memory traces is crucial (Bellezza, Winkler, & Andrasik, 1975; Hintzman & Block, 1973). Such trace dependency is not readily accommodated by the encoding variability or deficient processing theories in their basic form, as the more independent repetitions are, the more likely they are to be encoded with different contextual associations and to be processed fully.
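
As a rough illustration of the super-additivity criterion, the sketch below computes the recall probability that would be expected if two presentations acted as independent encoding events and compares it with an observed value. The independence baseline p1 + p2 - p1 * p2, the function names, and all numbers are assumptions made for illustration; they are not data or analyses from the studies cited.

```python
def independent_baseline(p1: float, p2: float) -> float:
    """Probability that at least one of two independent encoding events succeeds."""
    return p1 + p2 - p1 * p2


def is_super_additive(observed: float, p1: float, p2: float) -> bool:
    """True if observed recall after two presentations exceeds the independence baseline."""
    return observed > independent_baseline(p1, p2)


# Hypothetical numbers: each single presentation alone yields 30% recall.
baseline = independent_baseline(0.30, 0.30)
print(round(baseline, 2))                    # 0.51
print(is_super_additive(0.60, 0.30, 0.30))   # True: 0.60 exceeds the 0.51 baseline
```

Under this reading, observed recall above the baseline would indicate that the second presentation contributed more than an independent second chance to encode the item.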

Non-monotonicity refers to the shape of the lag function (an inverted U). Thus, while increasing the interstudy interval initially has a positive effect on learning, there comes a point at which an optimal interstudy interval is reached and beyond which increasing spacing any further actually has a negative effect on memory (Cepeda et al., 2006, 2008). As both contextual variability and amount of processing should monotonically increase the longer the interstudy interval, this finding, too, cannot be readily explained.

To better account for experimental evidence, a number of multiprocess theories have been proposed. These (see, e.g., Benjamin & Tullis, 2010; Delaney et al., 2010; Greene, 1989; Maddox, 2016; Verkoeijen et al., 2004) combine the effects of contextual fluctuation or deficient processing with the central assumption of the study-phase retrieval theory (Delaney et al., 2010; Greene, 1989; Hintzman et al., 1973; Madigan, 1969; Thios & D’Agostino, 1976). The assumption is that an item’s initial presentation must be retrieved at the time of its repetition, or that the repetition of a stimulus must “remind” (Benjamin & Tullis, 2010) of its previous occurrence. Both super-additivity and non-monotonicity can be accommodated with this additional assumption. At very long interstudy intervals, successful retrieval is less likely due to memory trace decay, which would lead to a failure of super-additive effects and, consequently, less effective learning. This, in turn, would show up as the postinflection downward trajectory in learning as a function of spacing.
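
The way a retrieval requirement can turn a monotonic processing benefit into an inverted-U lag function can be sketched with a toy model. The code below is only an illustration under assumed functional forms (exponential growth of processing, exponential decay of retrieval) and assumed parameter values; it is not a model proposed in the literature reviewed here.

```python
import math


def toy_spacing_benefit(lag: float, processing_scale: float = 2.0,
                        forgetting_scale: float = 20.0) -> float:
    """Toy learning benefit of a repetition presented `lag` units after the first exposure."""
    # Assumed: processing becomes fuller as the earlier exposure fades from short-term memory.
    processing = 1.0 - math.exp(-lag / processing_scale)
    # Assumed: the chance that the repetition retrieves the first exposure decays with lag.
    retrieval = math.exp(-lag / forgetting_scale)
    return processing * retrieval


lags = [0, 1, 2, 4, 8, 16, 32, 64]
benefits = [toy_spacing_benefit(lag) for lag in lags]
best_lag = lags[benefits.index(max(benefits))]
print(best_lag)  # 4 with these assumed parameters
```

With these values the benefit is zero for immediate repetition, rises to a peak at an intermediate lag, and falls again at long lags, which mirrors the non-monotonic pattern described above.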

Other important findings can also be accommodated if we make this additional assumption. An important finding is that intentional learning produces larger spacing effects and also shows a longer optimal interstudy interval than incidental learning (Toppino & Bloom, 2002; Verkoeijen, Rikers, & Schmidt, 2005). This makes sense if we assume that intentional learning produces stronger and more durable memory traces. Another important finding is that when the context of study is purposefully changed between repetitions, such experimenter-introduced variability benefits learning of massed repetitions but has an adverse effect on learning of spaced repetitions (Johnston & Uhl, 1976; Verkoeijen et al., 2004). If we assume that a repetition must remind of its previous presentation, then the detrimental effects of experimenter-imposed context change for the spaced condition can be explained by the fact that such change may render repetitions too dissimilar and the increased lag may further make recognition less likely (Benjamin & Tullis, 2010; Verkoeijen et al., 2004).

Investigating Deficient Processing

Repetitions that occur close in time have been consistently shown to engage less attentional processing relative to spaced repetitions in a number of experimental paradigms in psychological research. It has been shown, for example, that less time is devoted to processing or studying stimuli and performing tasks that are repeated immediately, including repeated readings of texts (Hyönä & Niemi, 1990; Krug et al., 1990; Rose, 1984; Wahlheim, Dunlosky, & Jacoby, 2011). Massed repetitions of task performance require less effort as measured by pupil dilations (Magliero, 1983) and secondary task performance (Johnston & Uhl, 1976). Further evidence that massed repetitions receive diminished processing comes from studies that have used techniques such as electroencephalogram and functional magnetic resonance imaging to investigate brain activity during the processing of massed and spaced repetitions (see, e.g., Callan & Schweighofer, 2010; Kim, Kim, & Kwon, 2001; Van Strien et al., 2007; Xue et al., 2011; Zhao et al., 2015).

Past research into deficient processing as an explanation for the spacing effect has also employed self-paced study of stimuli, such as words (Rundus, 1971; Shaughnessy et al., 1972; Zimmerman, 1975). Here, target words are presented one per slide, and participants move from slide to slide at their own pace. The time participants spend on each slide is recorded as a measure of processing time. These studies showed that participants allocated more study time to more widely spaced words and these, in turn, were better remembered. However, because the benefit of spacing was greater than what was predicted from study time, Shaughnessy et al. speculated that participants may have used a strategy of holding off on ending a trial not to appear uninterested, particularly with massed repetitions. Some studies have included recordings of overt rehearsal (Rundus, 1971, Experiment 3; Zimmerman, 1975). Zimmerman (1975) included this specifically to ensure equivalence between nominal and functional study time, reasoning that as long as participants are saying a word aloud, some type of processing must still be going on. However, learning benefits were still underestimated by study time. This led Zimmerman to question the ability of deficient processing to serve as an explanatory mechanism for the spacing effect. It could be argued, however, that overt rehearsal may still not capture certain important differences in processing, such as the fact that the nature of processing may not be constant throughout, with the first time a word is read aloud (either at initial exposure or after a delay) engaging deeper processing than when it is repeated toward the end of a series of consecutive repetitions. Further, this design, too, is not immune to effects of strategies as, in order to avoid appearing uninterested, participants may still repeat a word beyond what they might normally do, which might, again, be more likely with massed repetitions.

The present experiment extends this line of research by using eye tracking to investigate self-paced study of novel L2 words within sentence contexts. Eye-tracking methodology consists in recording what parts of written input a participant looks at and for how long as well as the progression of their eye movements through written discourse. Eye tracking affords millisecond precision in measuring attention allocation to different parts of the visual display. Here, exposure time and processing time are more likely to be equivalent because of the hypothesized tight “eye-mind link” (Just & Carpenter, 1980). Eye-tracking methodology further affords an in-depth investigation into the different stages of processing, allowing us to look separately at early processes involved in word recognition and later processes that may include intentional rehearsal. The many different eye-movement indices that are recorded can further inform about qualitative differences in attention patterns between experimental conditions, which also allows us to infer any strategies adopted by participants in performing a given task.

While the overwhelming majority of spacing effect studies have investigated rote memorization, such as learning of lists of paired associates, benefits of spacing have also been demonstrated with more meaningful materials and higher level learning (Helsdingen, van Gog, & van Merriënboer, 2011; Kapler, Weston, & Wiseheart, 2015; Kornell & Bjork, 2008; Reder & Anderson, 1982; Sobel, Cepeda, & Kapler, 2011; Vlach & Sandhofer, 2012; Wahlheim et al., 2011). It has also been shown that the relevance of deficient processing theories is not limited to rote learning (Krug et al., 1990; Rose, 1984; Wahlheim et al., 2011). In the present study, participants learned novel L2 words within sentence contexts, which constitutes more meaningful learning and is a departure from the traditional word list learning paradigm. Further, repetitions of the same word occurred in different sentence contexts. Encountering target vocabulary in different contexts is a common scenario for language learning. Such context variability has been shown to have the effect of increasing processing for massed repetitions (Dellarosa & Bourne, 1985). Thus, in the present study massed repetitions cannot be processed as inattentively as they might in word list learning because each new sentence context necessitates processing of the target word in a new situation for comprehension of the sentence. However, for spaced repetitions such variability might mean lower likelihood of study-phase retrieval, or reminding (Benjamin & Tullis, 2010; Verkoeijen et al., 2004), which might have a negative effect on learning. The present study investigates whether spacing will still have the effect of increasing attention above and beyond what is engendered by such variety in contexts and whether benefits of spacing will be observed in such diverse sentence contexts.

Attention and L2 Vocabulary Learning

In both incidental and intentional L2 vocabulary learning, more attentional engagement with a word has been shown to result in better learning outcomes (Craik & Lockhart, 1972; Godfroid et al., 2013; Laufer & Hulstijn, 2001; Mohamed, 2017; Schmitt, 2008). Studies employing eye-tracking methodology have consistently shown a positive relationship between the amount of attention a target word receives and learning of the word (Godfroid et al., 2013; Godfroid et al., 2017; Godfroid & Schmidtke, 2013; Mohamed, 2017; Pellicer-Sánchez, 2016). From among the different measures that are used in eye tracking, attentional processing of vocabulary items during reading is investigated by measuring the duration and number of fixations that an area of interest receives. A fixation is when the gaze remains relatively still on an area of interest. It is during fixations that visual information is taken in (Rayner, 2009). Fixation duration is indicative of amount of attentional processing and encoding effort (Rayner, 1998). An area of interest often receives more than one fixation: these may be consecutive or they may be the result of a regression, which refers to revisiting an area of interest after the gaze has moved past it. In investigating novel word processing, the following measures are often employed: first fixation duration, which is the duration of the very first fixation on an area of interest; gaze duration, which is the sum of all fixations in an area of interest before the eyes exit the area of interest either to the right or to the left; and total reading time, which is the sum of all fixations in an area of interest, including those made during regressions.
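
The three measures defined above can be made concrete with a short sketch. It assumes a simplified data format, a chronologically ordered list of (area of interest, duration in ms) pairs, and uses a made-up trial; the function name, the data layout, and the example word are hypothetical and do not reflect the software or materials used in the study.

```python
from typing import Dict, List, Tuple

Fixation = Tuple[str, int]  # (area-of-interest label, fixation duration in ms)


def reading_measures(fixations: List[Fixation], target: str) -> Dict[str, int]:
    """Compute three reading-time measures for `target` from ordered fixations."""
    on_target = [duration for aoi, duration in fixations if aoi == target]
    if not on_target:  # the target was never fixated
        return {"first_fixation": 0, "gaze_duration": 0, "total_time": 0}

    first_index = next(i for i, (aoi, _) in enumerate(fixations) if aoi == target)
    # Gaze duration: consecutive fixations on the target until the eyes first leave it.
    gaze = 0
    for aoi, duration in fixations[first_index:]:
        if aoi != target:
            break
        gaze += duration
    return {
        "first_fixation": fixations[first_index][1],
        "gaze_duration": gaze,
        "total_time": sum(on_target),  # includes fixations made after regressions
    }


# Hypothetical trial: two consecutive fixations on the target, then a later regression to it.
trial = [("the", 180), ("koira", 250), ("koira", 200),
         ("barked", 190), ("koira", 300), ("loudly", 210)]
print(reading_measures(trial, "koira"))
# {'first_fixation': 250, 'gaze_duration': 450, 'total_time': 750}
```

In this example the regression adds 300 ms to total reading time but leaves first fixation duration and gaze duration unchanged, which is why the measures are often read as reflecting earlier and later stages of processing.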

A number of studies have used eye tracking to explore the effects of repetition on attention and learning of novel words (Elgort, Brysbaert, Stevens, & Van Assche, 2017; Godfroid et al., 2017; Joseph, Wonnacott, Forbes, & Nation, 2014; Mohamed, 2017; Pellicer-Sánchez, 2016). In addition to reporting a positive relationship between attention and learning, these have consistently reported a decrease in reading times across repetitions, which is taken as an indication of increased familiarity and, in some studies, even used as a measure of learning (e.g., Joseph et al., 2014). Another very important finding from this line of research is that amount of attention benefits learning above and beyond the number of times a target word is encountered (Godfroid et al., 2017; Mohamed, 2017). This suggests that learning success depends not only on how many times a learner is exposed to a word but also on how these repeated exposures are utilized, or how much attentional processing they receive. Thus, finding ways of inducing more attention to target words is an important endeavor that potentially holds great benefits for teaching and learning vocabulary. Existing research offers valuable insights into how repeated encounters with novel vocabulary are processed and how this, in turn, affects learning. Missing from this line of research, however, is a consideration of any effects that temporal distribution of repeated encounters may have on both learning and attention, as well as, possibly, on the relationship between the two. The present study extends existing research by investigating the effects of this important variable.

The Present Study

The present study used a within-subjects within-items counterbalanced design to explore the following research questions:

  1. Does repetition affect attentional processing of novel L2 words differently under massed and spaced conditions?

  2. Does the massing versus spacing of repeated exposures to novel L2 vocabulary affect learning gains?

  3. Is the effect of spacing, if any, mediated by attention, operationalized as reading times for the target words?

Participants

Participants were 40 undergraduate and graduate students (21 males and 19 females, age: 18–42 years, M = 21, SD = 4.4) from a wide range of majors at Michigan State University, selected randomly from a large number of students who responded to an ad for the study that had been placed through the office of the registrar. All had normal or corrected-to-normal vision. All were native speakers of English. Ninety percent indicated that they had knowledge of a language other than English, with Spanish being reported most often (59%). Thirty-two percent indicated knowledge of more than one foreign language. None had any knowledge of Finnish (the language used in the present study). All were compensated for their time with cash. The sample size was based on Brysbaert and Stevens’s (2018) recommendation for properly powered repeated-measures reaction time experiments that use mixed-effects statistical models.

Materials

The experiment consisted of a study phase, a distractor math task, immediate vocabulary posttests, delayed vocabulary posttests, and a background questionnaire.

Study materials

Twenty-four Finnish words were selected as targets. These were chosen such that they were not cognates of their English translations. The words were divided into two lists, one of which was to be presented in the massed and the other in the spaced conditions, in a counterbalanced design. List A contained 9 nouns and 3 adjectives. List B contained 11 nouns and 1 adjective. The two lists were matched on the number of letters per word: each had three five-letter words, four six-letter words, one seven-letter word, three eight-letter words, and one nine-letter word. All target words denoted simple concepts (Appendix A), whose English translations were within the 5,000 most frequent words according to the BYU Corpus of Contemporary American English (Davies, 2008–).

The Finnish words were embedded in English sentence contexts. This mimicked what is known as the Clockwork methodology (Godfroid et al., 2017; Horst, Cobb, & Meara, 1998). This allowed me to target simple word learning characteristic of novice L2 vocabulary learning stages while using contexts that are more interesting to read. It further helped to control for any effects of sentence reading disfluency spilling over on target words and thus allowed me to better isolate the effects of interest. The target words were never the first or last word of a sentence. The sentences were simple in structure and meaning. Four different sentence contexts were created for each target word for a total of 96 experimental sentences. In addition, there were 24 buffer sentences that did not contain a Finnish word.

Twenty-five percent of the sentences were followed by comprehension questions, many of which followed buffer sentences. Only 10 of the experimental sentences (5 per list) were followed by a comprehension question. In the massed condition, the questions always came after the fourth repetition to preserve the massed nature of presentation. In the spaced condition, comprehension questions occurred across repetitions such that they, as was the case in the massed condition, were dispersed more or less evenly throughout the study phase. None of the questions contained a target word translation to avoid causing additional processing of the target word or its meaning. The purpose of the comprehension questions was to ensure that participants read the sentences for meaning. A participant had to answer at least 80% of the questions correctly for his or her data to be included.

Distractor math task

The math task was a sequence of self-paced paper-and-pencil mathematical equations performed during breaks in the experimental procedure (the 6 min between the study blocks and the 15 min between the study phase and the immediate posttests). A math task was used to fill these time intervals so that the type of cognitive activity in which the participants engaged would be roughly similar across participants. They were asked to prioritize accuracy of answers over speed. They were not allowed to use a calculator but were provided with additional sheets to use as scratch paper.

Vocabulary posttests

Two paper-based tests, a form recognition test and a form–meaning mapping test, were used to measure learning gains. The form recognition test measured learners’ ability to recognize the correct spelling of each target word from among four choices. The three distractor items were created by switching the order of the first two syllables and letters within the syllables to produce plausible Finnish words from a nonnative speaker perspective. Whenever such a scrambling pattern resulted in a more salient distractor, such as when a letter was doubled or the resulting nonword resembled an English word, a different scrambling technique was used, such as using the third syllable for scrambling. The test was presented printed on a single sheet of paper (test sheet 1), where each entry was numbered and included the target word, three distractors, and an “I don’t know” option. The same form recognition test (but with a different randomization) served as the immediate and delayed tests.

The form–meaning mapping test measured learners’ ability to match the target forms with their meanings. With a pen of a different color (to prevent changes to test sheet 1 at this point), the number for each entry on test sheet 1 was to be written next to its corresponding English translation on test sheet 2. Test sheet 2 contained 45 English words. Only 24 of these were the target English words; distractor words were taken from the buffer and practice sentences; thus, all English words on test sheet 2 had been encountered during the study phase. The reason test sheet 1 was used instead of a new sheet with only correct forms on it was to prevent additional exposure to the correct forms, which might have contaminated the results of the delayed form recognition test. Because temporal distribution of practice was of primary interest, I wanted to avoid any spaced practice for the items in the massed condition. Further, because participants would have just finished selecting target forms from among distractors, presenting them with a list of correct forms might serve as a learning opportunity at a time of heightened curiosity about the correct forms. This was confirmed in the piloting phase. The immediate and delayed form–meaning mapping tests were also identical except for order randomization. One point was awarded for each correct and zero points for each incorrect answer on both tests for a total of 24 possible points on each test.

Background questionnaire

A paper-based background questionnaire asked the participants to list all languages they had ever studied or had any knowledge of and to state whether or not any of the studied words had struck them as familiar upon initial exposure.

Procedure

The experiment was conducted in an eye-tracking lab, where each participant met individually with the researcher for two sessions with 48–72 hr in between. The study phase and the immediate posttests were completed in Session 1; the delayed posttests were completed in Session 2. First, each participant read the consent form and indicated his or her consent to participate. Then participants read the on-screen instructions and did a practice block consisting of eight trials. After the practice block, the researcher confirmed with the participant that all was clear before beginning the four experimental blocks of the study phase. The four blocks were separated by three 6-min breaks during which the participants performed the distractor math task.

The Finnish words in the massed condition were repeated in 4 consecutive sentences within the same block. There were three such words per block. All words in the spaced condition occurred in 1 sentence in each block. In order to keep spacing of the repetitions more or less constant in the spaced condition, the order of their occurrence was changed only slightly from block to block, which was done to prevent any (however unlikely) anticipation of their occurrence across the four blocks. Each block started and ended with 1 buffer sentence. The final block always ended with an additional 16 buffer sentences. These served to prevent words in target sentences close to the end of the study phase from having a memorial advantage. Thus, words in the spaced condition were repeated in sentences separated by 25 other sentences and the distractor task. Figure 1 illustrates the distribution of the target words in the study phase. The carrier sentences are highlighted for two of the words, one in the massed condition (it repeats in 4 sentences that are consecutive) and one in the spaced condition (it repeats in 4 sentences across the four blocks). Thus, words in the massed and spaced conditions occurred throughout the four blocks of the experiment. Their average serial position was controlled (M = 60.50 and 60.54 for massed and spaced words, respectively, p = .972).

Figure 1. An illustration of the target word distribution in the study phase. The figure presents 10 sentences at the beginning of each block. Examples of the occurrence distribution for one word in the massed condition (repeating in 4 consecutive sentences) and one word in the spaced condition (repeating in 4 sentences across the four blocks) are highlighted.

Each block took 9–17 min to complete (M = 12.22, SD = 1.73), which together with the 6 min of the distractor task made for a 15- to 23-min interstudy interval for the spaced condition (M = 18.22, SD = 1.73). Experimental blocks were rotated such that all blocks appeared in each of the four positions, which, together with the counterbalancing between conditions, resulted in eight different experimental sequences, one sequence per five participants.
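The counterbalancing scheme described above (four block rotations crossed with two list-to-condition assignments) can be sketched as follows. This is only an illustration under stated assumptions: the article does not say how blocks were rotated, so a cyclic rotation is assumed here, and the block labels, function names, and round-robin allocation of participants are hypothetical.

```python
from itertools import product


def block_rotations(blocks):
    """Cyclic rotations, so each block appears once in every position (assumed scheme)."""
    return [blocks[i:] + blocks[:i] for i in range(len(blocks))]


def build_sequences(blocks=("B1", "B2", "B3", "B4")):
    """Cross the block rotations with the two list-to-condition assignments."""
    # List A massed / List B spaced, and the reverse assignment.
    assignments = [
        {"massed": "A", "spaced": "B"},
        {"massed": "B", "spaced": "A"},
    ]
    rotations = block_rotations(list(blocks))
    return [(tuple(rot), assign) for rot, assign in product(rotations, assignments)]


sequences = build_sequences()

# Hypothetical round-robin allocation: 40 participants over 8 sequences,
# which yields five participants per sequence.
allocation = {participant: participant % len(sequences) for participant in range(40)}
```

With four rotations and two assignments the sketch produces the eight sequences mentioned in the text, and the allocation gives each sequence five participants.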

Participants read the study phase sentences from a computer screen while their eye movements were recorded with the EyeLink 1000 eye tracker (SR Research Ltd.; Mississauga, Canada), which samples at 1000 Hz. While reading, participants were seated in front of the computer monitor at a distance of 66 cm from the screen, where one letter subtended 0.36 degree of horizontal visual angle. To minimize head movements, participants rested their chin and forehead against a head stabilizer. The sentences were presented in black Consolas font, size 18, against a light-gray background. While reading was binocular, only the movements of the right eye were monitored. Each sentence fit on a single line. A 9-point grid calibration procedure was performed before each block of the experiment and repeated as needed, such that the resulting calibration error was always less than 0.5 degree of visual angle.
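The reported viewing geometry can be checked with the standard visual-angle relation, size = 2 × distance × tan(angle / 2). The sketch below applies it to the values given above (66 cm, 0.36 degree per letter); the physical letter width is not reported in the article, so the resulting figure is a derived estimate, and the function names are illustrative.

```python
import math


def size_from_visual_angle(angle_deg, distance_cm):
    """Physical size (cm) that subtends angle_deg at the given viewing distance."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)


def visual_angle_deg(size_cm, distance_cm):
    """Visual angle (degrees) subtended by an object of size_cm at distance_cm."""
    return math.degrees(2.0 * math.atan(size_cm / (2.0 * distance_cm)))


# Derived estimate of the on-screen width of one letter, roughly 0.41 cm.
letter_width_cm = size_from_visual_angle(0.36, 66.0)
```

Under the same relation, the 0.5 degree calibration criterion corresponds to a little more than one letter width.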

The instructions informed the participants that there would be a test on their memory for the form and meaning of the new Finnish words at the end, as in a usual foreign language classroom. They were instructed not only to read each sentence for comprehension but also to try to learn the Finnish words as if this were a Finnish lesson in which words were learned in sentence contexts. Each experimental trial started with the presentation of the English translation for the Finnish word that would appear in the upcoming sentence (or a series of dashes for buffer sentences). As soon as participants felt that they had familiarized themselves with the English translation, they pressed the space bar to proceed, at which point the English word was replaced with a fixation point that the participants were to fixate to trigger the appearance of the sentence. This served as a drift correction procedure to adjust for any drift in participants’ gaze. As soon as the participant’s gaze coincided with the fixation point, the sentence appeared on the screen, with its first letter in the same position where the fixation point had been. The participants pressed the space bar to progress from screen to screen. If a sentence was followed by a comprehension question, they were to press the yes or no key to respond. Figure 2 shows a sample study-phase trial sequence with a comprehension question.

Figure 2. Sample study-phase trial sequence with a comprehension question. The English translation for the upcoming Finnish word is always presented before the target sentence.

Upon completion of the study phase, participants spent 15 min engaged in the distractor task. Then they performed the immediate form recognition and form–meaning mapping posttests, in order. The posttests were untimed. After the completion of the immediate posttests, participants filled out the language background questionnaire. Delayed posttests were held 44–78 hr after the first session and were identical in procedure to the immediate posttests. Participants were not aware beforehand of the content of this second session. The entire experiment took approximately 2 hr to complete.

Analyses and Results

SPSS version 25 (IBM Corp., 2017) was used for all statistical analyses in this study. SPSS version 25, Excel 2013, and PowerPoint 2013 were used for the graphics. Response accuracy on the comprehension questions was acceptable for all participants (82%–100%, M = 96%, SD = 3.4%). There was no significant difference in accuracy between questions that followed massed and spaced repetitions, t (39) = 0.598, p = .552. A visual inspection of the eye-fixation data revealed overall good quality; therefore, no manual adjustment was performed. Values shorter than 60 ms were deleted. This resulted in the loss of 14 (0.36%), 5 (0.13%), and 2 (0.05%) cases for first fixation duration, gaze duration, and total reading time, respectively.

Background questionnaire

Responses on the background questionnaire showed that none of the participants had any prior knowledge of Finnish or any other language of the Finnic language family. Further, all participants indicated that none of the target words had struck them as familiar upon initial exposure.

Research Question 1

Research Question 1 investigated whether temporal distribution of repeated exposures affects attentional processing of L2 words. Because participants studied the words intentionally, total reading time was analyzed as the primary measure of interest. This is a measure of all overt attention that a target word received, including any intentional rehearsal. Total reading time is known as the most pedagogically relevant eye-tracking measure and as the measure that is the most highly correlated with learning outcomes (Godfroid et al., 2017). Further, first fixation duration and gaze duration were analyzed as early measures of lexical processing (Rayner, 1998; Rayner et al., 2006). These are informative about initial stages of word reading before any strategic attention allocation processes have an effect. Particularly, first fixation duration is known to operate too quickly to be under voluntary control (Rayner & Pollatsek, 1987). A Bonferroni correction was used to correct for multiple testing (α = .05/12 = .004).

Total reading time

The total reading time data were positively skewed and had extreme outliers in the upper tail. Figure 3 presents boxplots that show the variance in these data. Figure 4 presents a line graph (based on medians) that shows how total reading time changed with repeated exposures in the two conditions. It is clear from Figure 4 that, in line with previous findings (Elgort et al., 2017; Godfroid et al., 2017; Joseph et al., 2014; Mohamed, 2017; Pellicer-Sánchez, 2016), reading times decreased with repeated exposures. The graph further shows that this downward trend was steeper in the massed than in the spaced condition, suggesting that the words in the massed condition received less attentional processing across repetitions than the words in the spaced condition. The downward trend in the massed condition appears slightly curved, as if the decrease in attentional processing became less pronounced at later repetitions.

Figure 3. Boxplots showing the variance in total reading time (in ms) at each repetition in the two conditions.

Figure 4. Total reading times in milliseconds at each repetition for the two conditions. This figure shows medians.

For inferential statistical analyses, observations with residual values greater than 3 SD from the mean were trimmed (resulting in the loss of 1.5% of the data) and a square root transformation was performed. A mixed-effects growth curve modeling analysis was used to investigate change processes in reading times across repetitions in the two conditions. A mixed-effects framework was adopted to account for the nested structure of the data, as here each participant contributed multiple data points. The intraclass correlation coefficient (ICC) for the effect of participant was .192, which indicates that 19% of the variability in reading times can be attributed to the differences between participants (Hayes, 2006). The ICC for the effect of items was much smaller (.048). The addition of items as a random effect did not improve and even negatively affected model fit; it also interfered with the convergence of some of the models. Because of these considerations, only participants were used as random effects. Further, the addition of random slopes interfered with the convergence of some of the models; only random intercepts were used for simplicity and consistency. The likelihood ratio test was used to assess improvement in model fit with the addition of new parameters.
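The data preparation and the ICC described above can be illustrated with a short sketch. It is not the SPSS procedure used in the study: the function names are hypothetical, the order of trimming and transformation is an assumption (the text does not state it), and the ICC is computed from variance components that would have to come from a fitted random-intercept model.

```python
import math
import statistics


def trim_by_residual(values, fitted, cutoff=3.0):
    """Keep observations whose residual lies within `cutoff` SDs of the mean residual."""
    residuals = [v - f for v, f in zip(values, fitted)]
    mean_r = statistics.fmean(residuals)
    sd_r = statistics.stdev(residuals)
    return [v for v, r in zip(values, residuals) if abs(r - mean_r) <= cutoff * sd_r]


def sqrt_transform(values):
    """Square root transformation, used to reduce positive skew in reading times."""
    return [math.sqrt(v) for v in values]


def icc(var_between, var_residual):
    """Share of total variance attributable to the grouping factor (e.g., participant)."""
    return var_between / (var_between + var_residual)


# Made-up reading times (ms): twenty typical values and one extreme outlier.
reading_times = [300.0] * 20 + [5000.0]
grand_mean = statistics.fmean(reading_times)
fitted = [grand_mean] * len(reading_times)  # intercept-only "model" for illustration

kept = trim_by_residual(reading_times, fitted)
transformed = sqrt_transform(kept)
```

For example, hypothetical variance components of 19.2 (participant) and 80.8 (residual) give `icc(19.2, 80.8)` of .192, the kind of value that is read as 19% of the variability lying between participants.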

An initial model building process showed significant interactions between the effects of condition and both a linear and a quadratic term for repetition, χ2 (1) = 89.064 and 61.413, respectively, ps < .001, suggesting that the two conditions differed in the shape of their trajectories. For this reason, a growth curve model was fit separately for each condition. Table 1 presents fit statistics and parameter estimates for this model building process. In both conditions, the addition of slope significantly improved model fit, χ2 (1) = 607.316, p < .001 for the massed condition and χ2 (1) = 176.589, p < .001 for the spaced condition, suggesting that reading times did not remain constant across repetitions in the two conditions but rather there was a change present in both. The best fitting model for the massed condition was found to be a model with quadratic growth—improvement of model fit with the addition of a quadratic term was χ2 (1) = 154.783, p < .001—confirming the visual impression from Figure 4 of a curve in this trend. For the spaced condition, a linear model showed the best fit—improvement of model fit with the addition of a quadratic term was only χ2 (1) = .037, p = .847—which suggests that the rate of change in this condition was more or less constant within the four repetitions.
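The p values attached to these likelihood ratio tests can be reproduced from the reported statistics. For a chi-square statistic with 1 degree of freedom, the upper-tail probability equals erfc(sqrt(x / 2)), so only the standard library is needed. The sketch below is a minimal check of the reported values, not the software procedure used in the study; the function names are illustrative.

```python
import math


def lrt_statistic(neg2ll_reduced, neg2ll_full):
    """Likelihood ratio statistic: drop in -2 log-likelihood when a parameter is added."""
    return neg2ll_reduced - neg2ll_full


def p_value_df1(chi_square):
    """Upper-tail p value for a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi_square / 2.0))


# Statistics reported in the text for the added quadratic term.
p_spaced_quadratic = p_value_df1(0.037)    # reported as p = .847
p_massed_quadratic = p_value_df1(154.783)  # reported as p < .001
```

The same function applies to the other single-parameter comparisons reported in this section, all of which have 1 degree of freedom.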

Table 1. Fit statistics and parameter estimates for multilevel growth-curve models for the rate of change in reading times in the two conditions

*** p < .001.

To explore differences in reading times in the two conditions at each repetition, a separate mixed-effects analysis was performed for each repetition with condition as the independent variable. Table 2 shows the results of this analysis. Model fit did not improve with the addition of condition as the independent variable for the first repetition, χ2 (1) = 1.126, p = .289. This makes sense as at this repetition the two conditions do not yet differ. For each subsequent repetition, however, model fit improved significantly with the addition of condition as the independent variable, χ2 (1) = 208.832, p < .001; χ2 (1) = 224.687, p < .001; χ2 (1) = 181.860, p < .001, for repetitions two, three, and four, respectively. Thus, at repetitions two, three, and four, reading times were significantly longer for the words in the spaced than in the massed condition, suggesting that the target words in the spaced condition received greater attentional processing. This confirms statistically the visual impression from Figure 4 of a more dramatic decrease in total reading times across repetitions in the massed condition. Cohen’s d effect sizes for the four repetitions were –0.074, 0.619, 0.654, and 0.538, in order. For repetitions two, three, and four, these are moderate-size effects (Cohen, 1988).

Table 2. Fit statistics and parameter estimates from a mixed-effects linear modeling analysis of the effects of condition on reading times at each repetition

*** p < .001.

Further analyses

Table 3 shows descriptive statistics for first fixation duration and gaze duration. Similarly to total reading time, these two measures exhibit a decrease across the repetitions, with a more dramatic decrease in the massed condition. For statistical analyses, residual values greater than 3 SD from the mean were trimmed (resulting in the loss of 1.3% and 1.6% of the data for first fixation duration and gaze duration, respectively) and a square root transformation was performed. As with total reading times, significant interactions were found between the effects of condition and the linear and quadratic terms for repetition for both first fixation duration, χ2 (1) = 7.207, p = .007 and χ2 (1) = 6.200, p = .013, respectively, and gaze duration, χ2 (1) = 74.366 and 59.586, respectively, ps < .001. For this reason, a growth curve model was fit separately in each condition for each measure. For first fixation duration, the addition of slope significantly improved model fit only in the massed condition, χ2 (1) = 31.757, p < .001. While there was, numerically, a downward trend in the spaced condition, it failed to reach statistical significance, χ2 (1) = 3.821, p = .051. The addition of a quadratic term did not significantly improve model fit in the massed condition, χ2 (1) = 1.484, p = .223, unlike what was observed for total reading time. For gaze duration, the addition of slope significantly improved model fit in both conditions, χ2 (1) = 272.620, p < .001 for the massed and χ2 (1) = 12.524, p < .001 for the spaced condition. The addition of a quadratic term improved fit in the massed condition, χ2 (1) = 62.323, p < .001, but not in the spaced condition, χ2 (1) = .377, p = .539, similarly to the pattern observed for total reading time. To investigate differences in first fixation duration and gaze duration between the two conditions, separate analyses were performed at each repetition. 
The addition of condition as an independent variable made a significant improvement to model fit at each repetition except the first for both first fixation duration, χ2 (1) = 3.031, p = .082; 16.151, p < .001; 22.465, p < .001; and 28.061, p < .001 for repetitions one, two, three, and four, respectively; and gaze duration, χ2 (1) = 0.556, p = .456; 126.238, p < .001; 169.992, p < .001; and 192.709, p < .001, in the same order. Thus, the results of the first fixation duration and gaze duration analyses also show significant differences between the two conditions, suggesting less attention in the massed than in the spaced condition, in a similar pattern to total reading time. This suggests that, in this intentional learning context, effects of spacing were not limited to intentional allocation of rehearsal but extended to the less voluntary and less strategic processes involved in initial stages of word reading.

Table 3. Descriptive statistics for first fixation duration and gaze duration across repetitions in the two conditions

A number of other eye-tracking measures were examined descriptively to explore any qualitative differences in attention patterns that may indicate use of different learning strategies. Table 4 presents descriptive statistics on these indices in the two conditions across the four repetitions. It also gives this information for initial exposures (in the massed condition), which were spread evenly across the four experimental blocks. These serve as a baseline representing how an initial encounter with a heretofore unseen L2 word was processed attentionally at the different time points across the study phase and thus allowed me to investigate any effects of serial position. Fixation count is the total number of times a target word was fixated within a sentence. Skipping rate is the percentage of times a word was not fixated at all within a sentence. First-pass skipping rate is the percentage of times a word was initially skipped but returned to after examining other parts of the sentence. Percent regressions in is a measure of how often a word was revisited after examining other parts of the sentence, regardless of whether or not it had been skipped upon initial pass. Revisit time is the amount of time, in milliseconds, that participants spent on a target word after examining other parts of the sentence. Revisit/total reading time gives this amount as a proportion of total reading time.
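To make these indices concrete, the sketch below derives several of them for one target word from a single trial's fixation sequence. It is a simplified illustration, not the eye tracker's reporting software: the input format and function name are hypothetical, gaze duration is approximated as the first contiguous run of fixations on the target, revisit time is taken as all later time on the target, and the 60-ms exclusion described earlier is not applied.

```python
def target_measures(fixations, target):
    """Summarize one trial for one target word.

    fixations: list of (word_index, duration_ms) pairs in temporal order.
    target: index of the target word in the sentence.
    """
    first_run = []     # durations in the first contiguous run on the target
    later = []         # durations on the target after that run has ended
    run_ended = False
    for word, duration in fixations:
        if word == target:
            if run_ended:
                later.append(duration)
            else:
                first_run.append(duration)
        elif first_run:
            # The reader has left the target after fixating it at least once.
            run_ended = True
    total = sum(first_run) + sum(later)
    return {
        "fixation_count": len(first_run) + len(later),
        "skipped": total == 0,
        "first_fixation_duration": first_run[0] if first_run else None,
        "gaze_duration": sum(first_run) if first_run else None,
        "total_reading_time": total,
        "revisit_time": sum(later),
        "revisit_proportion": sum(later) / total if total else None,
    }


# Made-up trial: the target (word 2) is fixated twice, left, and then revisited once.
trial = [(0, 200), (1, 180), (2, 250), (2, 300), (3, 210), (2, 400), (4, 190)]
measures = target_measures(trial, target=2)

# Made-up trial in which the target is never fixated.
skipped_trial = target_measures([(0, 200), (3, 210)], target=2)
```

In the first made-up trial, 400 of the 950 ms spent on the target comes from the revisit, which is the quantity expressed by the revisit/total reading time index.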

Table 4. Descriptive statistics for additional eye-tracking indices for the four repetitions in the two conditions and for initial exposures across the four blocks

Note: a. This is the total number summed across participants. b. TRT, total reading time.

The patterns in these measures converge in suggesting an overall tendency to give less attention to the target words across repetitions by fixating them fewer times, skipping them more, and revisiting them less often and for shorter durations. Further, this trend is, again, less dramatic in the spaced condition, where words seem to have been processed more attentively overall, on all these measures. The baseline data for initial exposures show little evidence of attention declining across the blocks of the experiment, suggesting that the reduction observed across repetitions is due to a repetition effect and not to fatigue or any changes in strategies as the experiment progressed. The patterns further show that much of total reading time was due to revisiting (revisit/total reading time), particularly in early exposures. This is to be expected as the participants studied the words intentionally. Overall, there were few instances of skipping a word entirely, with this number going up in the massed condition with repetition. There is no evidence of skipping rate going up as the experiment progressed, as the total number remains at zero for initial exposures across the experimental blocks. In sum, no unusual rehearsal patterns were found in these additional measures.

Reminding/study-phase retrieval

In the present study, participants encountered L2 words in sentence contexts that differed from repetition to repetition. Recall that context variability was previously shown to negatively impact learning of spaced repetitions (Johnston & Uhl, 1976; Verkoeijen et al., 2004). This is attributed to lower reminding potential of such encounters (Benjamin & Tullis, 2010). It was important to investigate whether a failure of study-phase retrieval may pose a potential threat to learning in the spaced condition. Reading times in the spaced condition showed a significant downward trend across repeated exposures (on gaze duration and total reading time though not on first fixation duration). Such facilitation is usually taken to indicate increased familiarity that comes with repetition (Joseph et al., 2014; Pellicer-Sánchez, 2016; Rayner & Duffy, 1986; Rayner, Raney, & Pollatsek, 1995). Such a repetition effect would, in turn, suggest that the knowledge gained during previous exposures was likely retrieved upon seeing a word again and facilitated processing. One potential confound, however, is that the reduction in reading times could be simply due to a general speedup in reading as the experiment progressed (recall that repetitions in the spaced condition occurred across the blocks and coincided with the number of each block). To isolate true repetition effects, I again used first encounters in the massed condition as a baseline. Recall that first encounters with words in the massed condition occurred across the four blocks. They are, therefore, informative about how much reading time a heretofore unseen word received at the different time points throughout the study phase. 
For this reason, they provide a useful baseline for isolating true effects of repetition from order effects or any effects of fatigue. A linear mixed-effects model was run separately (due to unequal n sizes) for repeated encounters in the spaced condition and baseline first encounters in the massed condition with experimental block as a factor to explore such facilitation more closely. This was done for total reading time, gaze duration, and first fixation duration. Here, reading times in Block 1 were compared to reading times in each subsequent block. The same data cleaning procedures were used as previously. The Bonferroni correction was applied (α = .05/18 = .003). Table 5 shows descriptive statistics and significance values for the differences between the reference (first) block and each subsequent block. For total reading time, the addition of the independent variable significantly improved model fit in the spaced condition, χ2 (1) = 183.190, p < .001, but not in baseline exposures, χ2 (1) = 0.962, p = .327. Further, each repeated encounter was processed significantly more quickly than the initial encounter in the spaced condition while initial massed condition encounters in Block 1 did not differ from initial encounters in any of the subsequent blocks (even if the Bonferroni correction were not applied). Because no significant speedup was found on initial encounters across the study phase, an overall speedup can be ruled out as an explanation for the facilitation observed across repetitions in the spaced condition, suggesting a true repetition effect in the spaced condition.

Table 5. Reading times for repeated spaced exposures and first exposures across the four blocks; the table also shows significance values for the differences between the reference (first) block and each subsequent block

Note: a. p value for the difference between each block and the first (reference) block. * Significant at the Bonferroni corrected α = .003.

For gaze duration, the addition of block as an independent variable also significantly improved model fit in the spaced condition, χ2 (1) = 14.698, p < .001, but not in the baseline, χ2 (1) = 0.996, p = .318. As in total reading time, no significant differences were observed in this measure between initial exposures in Block 1 and initial exposures in each subsequent block, suggesting, again, no overall speedup across time in gaze duration. For the spaced condition, Repetition 2 did not differ significantly from Repetition 1; however, Repetitions 3 and 4 straddled significance at the Bonferroni corrected alpha level. Thus, a repetition effect was observed in later repetitions for gaze duration.

Finally, for first fixation duration, the addition of block as an independent variable did not significantly improve model fit in either condition: the massed condition, χ2 (1) = 0.912, p = .340; the spaced condition, χ2 (1) = 5.338, p = .021.

While no overall speedup in reading times was observed across the study phase on any of the measures, a repetition effect was quite clearly present. This effect was strongest in total reading time, evident only in later repetitions in gaze duration, and only numerically present in first fixation duration, suggesting that recognition of spaced repetitions had not reached automaticity within the four exposures but required more time, particularly with earlier repetitions.

Research Question 2

Research Question 2 investigated whether intentional learning of L2 words in different sentence contexts is affected by how closely together or widely apart repeated exposures occur. Summed correct responses on immediate and delayed form and meaning tests were used to answer this question. While these were count data, the scores in each test presented a nearly normal distribution; therefore, the data were treated as continuous for the analyses. One participant of 40 did not come back for the delayed posttests. The data were treated as missing at random. Cronbach’s α reliability coefficients for the posttests were as follows: form immediate: α = .790; form delayed: α = .776; meaning immediate: α = .868; meaning delayed: α = .897. Because the serial position of Repetition 4 for each word could not be equated across the two conditions, binary logistic regression analyses were conducted with serial position as the independent variable and the scores on each of the four posttests as the dependent variables. Serial position was found not to have a significant effect on any of the posttest scores: form immediate, Wald χ2 (1) = 1.305, p = .254; form delayed, Wald χ2 (1) = 1.238, p = .266; meaning immediate, Wald χ2 (1) = 0.034, p = .854; meaning delayed, Wald χ2 (1) = 0.885, p = .347.
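For readers unfamiliar with the reliability coefficient reported above, Cronbach's α can be computed from a participants-by-items score matrix as k/(k − 1) × (1 − sum of item variances / variance of total scores). The sketch below uses a tiny made-up matrix of 0/1 item scores; it is not the study's data, and the function name is illustrative.

```python
import statistics


def cronbach_alpha(scores):
    """Cronbach's alpha for a score matrix (rows = participants, columns = items)."""
    k = len(scores[0])
    item_variances = [statistics.variance(column) for column in zip(*scores)]
    total_variance = statistics.variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Made-up responses: four participants, three dichotomously scored items.
toy_scores = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
alpha = cronbach_alpha(toy_scores)
```

Each posttest in the study had 24 items scored 0 or 1, so the same computation would apply to a 40 × 24 (or 39 × 24 for the delayed tests) matrix.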

Table 6 presents overall scores achieved on each of the four tests, averaged across participants. It is clear from this table that there was a wide range of scores. Figures 5 and 6 show boxplots for the scores on the four tests in the two conditions. A visual inspection of the boxplots suggests that the scores were higher in the spaced condition on all four tests and also that the scores on the form recognition test were higher than those on the form–meaning mapping test. It is also clear that there was not much change in the scores across time.

Figure 5. Immediate and delayed form recognition posttest scores. These scores are out of 12 possible points and represent the sum of correct answers in each condition, averaged across participants.

Figure 6. Immediate and delayed form–meaning mapping posttest scores. These scores are out of 12 possible points and represent the sum of correct answers in each condition, averaged across participants.

Table 6. Total scores on the immediate and delayed form recognition and form–meaning mapping tests; these scores are collapsed across the two conditions and are out of 24 possible points

A linear mixed-effects analysis exploring the effects of condition (massed, spaced), test type (form, meaning), and retention interval (immediate, delayed) as the independent variables was performed on the raw scores, with participants as random effects. The ICC for the effect of participant was .379, which shows how variable the responses were across participants. This analysis showed significantly higher scores in the spaced relative to the massed condition, χ2 (1) = 126.803, p < .001, and significantly higher scores on the form recognition test than on the form–meaning mapping test, χ2 (1) = 139.757, p < .001, but no effect of retention interval, χ2 (1) = 2.389, p = .122. The addition of interaction terms produced the following results, in order: retention interval by condition, χ2 (1) = 2.152, p = .142; condition by test type, χ2 (1) = 5.763, p = .016; retention interval by test type, χ2 (1) = 1.483, p = .223. The three-way interaction was not significant, χ2 (1) = 0.085, p = .770. Note that there is no significant interaction between retention interval and condition. Note, however, that the interaction between condition and test type is significant. This confirms the visual impression from Figures 5 and 6 of a greater difference in scores between the two conditions on the form–meaning mapping test than on the form recognition test. The significant interaction also means that the effect of test type (performance on the form test being higher overall than on the meaning test) is larger in the massed than in the spaced condition. A linear mixed-effects modeling analysis was further performed for each of the four tests separately with participants as the random effects and condition as the fixed effect. Only random intercepts were used. The Bonferroni correction was applied to adjust for multiple testing (α = .05/5 = .01). 
The ICC for the effect of participant in the four tests was .26, .14, .28, and .21, respectively, for form immediate, form delayed, meaning immediate, and meaning delayed tests. Table 7 presents the results of this analysis. The addition of condition as the independent variable significantly improved model fit as indicated by the difference in -2LL between the models with and without this independent variable: form immediate, χ2 (1) = 37.881, p < .001; form delayed, χ2 (1) = 25.971, p < .001; meaning immediate, χ2 (1) = 50.545, p < .001; meaning delayed, χ2 (1) = 47.310, p < .001. This independent variable was statistically significant (ps < .001) for all four tests. Cohen’s d effect sizes for the effect of condition in the four tests were as follows: form immediate, 1.24; form delayed, 0.98; meaning immediate, 1.94; and meaning delayed, 1.91. These are large effect sizes (Cohen, Reference Cohen1988).
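The model comparisons reported above rest on the likelihood ratio test: the difference in -2LL between two nested models is referred to a χ2 distribution whose degrees of freedom equal the number of added parameters. The sketch below, which is an illustration and not the original analysis script, recomputes p values for several of the reported one-degree-of-freedom statistics. It relies on the identity that, for df = 1, the upper-tail probability equals erfc(√(χ2/2)); the function and variable names are hypothetical.

```python
import math

def lrt_p_value_df1(chi_square):
    """Upper-tail p value of a chi-square statistic with 1 degree of freedom."""
    # A chi-square(1) variable is a squared standard normal, so
    # P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi_square / 2.0))

# Chi-square values copied from the interaction tests reported in the text.
reported = {
    "retention interval by condition": 2.152,
    "condition by test type": 5.763,
    "retention interval by test type": 1.483,
}
for name, chi2 in reported.items():
    print(f"{name}: p = {lrt_p_value_df1(chi2):.3f}")

# Bonferroni-adjusted threshold as stated in the text (.05 divided by 5).
alpha_adjusted = 0.05 / 5
```

Comparing each recomputed p value with `alpha_adjusted` rather than with .05 reproduces the stricter criterion applied to the per-test analyses.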

Table 7. Parameter estimates from a mixed-effects linear modeling analysis of the effects of condition on learning gains as measured by the four posttests

* p < .05. ** p < .01. *** p < .001.

Research Question 3

Research Question 3 investigated whether the positive effect that spacing repeated encounters with novel L2 vocabulary was found to have on learning operates through increased attentional processing of spaced encounters relative to massed encounters, as predicted by the deficient processing account of the spacing effect. A mediation analysis was used to address this question. In mediation, the effect of a predictor variable on the outcome variable (direct effect) operates through a mediator variable (indirect effect). In the present case, I tested whether the direct effect of spacing on learning (as measured by the posttest scores) is mediated by the indirect effect of attention (inferred from the total reading times).

For the present analysis, scores from the four posttests that measured learning were collapsed into one score per condition per participant. Recall that the same basic pattern of results was found across the four posttests. Table 8 shows Pearson correlations among the four tests. These correlations are quite high. Table 8 also presents the results of a principal component analysis, which showed that the four tests load quite highly and quite uniformly on one underlying component. This suggests that the four tests are likely measuring the same underlying construct. In such a case, the scores can be summed or averaged without much loss of information. The present analysis was performed on averaged scores. The reading times were summed across the four exposures for each participant. This sum represented the total attentional processing a word received in the study phase. Figure 7 presents graphically the conceptual structure of the analysis with standardized coefficients for the three simple effects tested separately, including the direct effect of spacing on learning. The indirect effect, which was of primary interest, was tested with the PROCESS Procedure for SPSS Version 3.0 (Hayes, Reference Hayes2018) using bootstrapped 95% confidence intervals with 10,000 bootstrap samples. The partially standardized effect was .3184, bootstrapped standard error = .0979, 95% bootstrapped confidence interval [.1435, .5324]. The confidence interval for the indirect effect does not include zero, which indicates significant mediation. Thus, attention was found to be a significant mediator for the spacing effect in L2 word learning obtained in this study. This confirms the predictions of the deficient processing account. Recall that this account states that the positive effects of spacing on learning are due to increased attentional processing of spaced repetitions relative to massed repetitions.
It can be concluded that spacing of repeated exposures to L2 words positively affects learning through increased attentional processing; that is, spacing has a positive effect on attention, which, in turn, has a positive effect on learning.
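The logic of a bootstrapped test of an indirect effect can be sketched in a few lines of Python. The example below is a simplified illustration under stated assumptions and does not reproduce the PROCESS analysis: it uses synthetic data invented for the purpose, treats observations as independent (ignoring the within-participant structure of the present design), draws 1,000 rather than 10,000 bootstrap samples, and uses a plain percentile interval. All function and variable names are hypothetical. The indirect effect is the product of the slope of the mediator on the predictor (a) and the slope of the outcome on the mediator with the predictor controlled (b); dividing by the standard deviation of the outcome gives the partially standardized effect.

```python
import random
import statistics

def indirect_effect(x, m, y):
    """Product a*b: a = slope of M on X; b = slope of Y on M, controlling for X."""
    n = len(x)
    mean_x, mean_m, mean_y = sum(x) / n, sum(m) / n, sum(y) / n
    # Centered sums of squares and cross-products.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    smm = sum((mi - mean_m) ** 2 for mi in m)
    sxm = sum((xi - mean_x) * (mi - mean_m) for xi, mi in zip(x, m))
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    smy = sum((mi - mean_m) * (yi - mean_y) for mi, yi in zip(m, y))
    a = sxm / sxx
    b = (sxx * smy - sxm * sxy) / (sxx * smm - sxm ** 2)
    return a * b

def partially_standardized(x, m, y):
    """Indirect effect expressed in standard deviations of the outcome."""
    return indirect_effect(x, m, y) / statistics.stdev(y)

def bootstrap_ci(x, m, y, stat=partially_standardized, n_boot=1000, seed=1):
    """Percentile 95% confidence interval obtained by resampling cases."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(
            stat([x[i] for i in idx], [m[i] for i in idx], [y[i] for i in idx])
        )
    estimates.sort()
    return estimates[int(0.025 * n_boot)], estimates[int(0.975 * n_boot) - 1]

# Synthetic data, invented for illustration only (not the study's data).
data_rng = random.Random(0)
x = [i % 2 for i in range(400)]                    # 0 = massed, 1 = spaced
m = [1.0 * xi + data_rng.gauss(0, 1) for xi in x]  # stand-in for attention
y = [1.0 * mi + 0.2 * xi + data_rng.gauss(0, 1)    # stand-in for learning
     for xi, mi in zip(x, m)]

low, high = bootstrap_ci(x, m, y)
print(f"95% CI for the partially standardized indirect effect: [{low:.3f}, {high:.3f}]")
```

As in the analysis reported above, mediation is inferred when the resulting interval excludes zero.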

Figure 7. The path diagram for the mediation analysis. Standardized coefficients for each simple effect are presented alongside the corresponding arrow. The standardized coefficient for the effect of spacing on learning, controlling for attention, appears in parentheses. **p < .01. ***p < .001.

Table 8. Pearson correlation coefficients and principal components analysis results for the four vocabulary posttests

** p < .01.

Discussion

The present study tested the contribution of deficient processing to the spacing effect in intentional L2 vocabulary learning from sentence contexts. Participants studied 24 novel Finnish words by reading each in four different English sentence contexts. The results showed that words that repeated in sentences that were separated by other sentences and an intervening distractor task (for a total of 18 min, on average, between repetitions) were remembered significantly better than words that repeated in consecutive sentences, as measured by immediate and 48- to 72-hr-delayed form recognition and form–meaning mapping posttests. Thus, a spacing effect was obtained under the present conditions. The effect sizes were quite large, suggesting an advantage for studying words in a distributed fashion that is of practical importance. It is significant that such large effect sizes were obtained with only 18 min, on average, between repetitions, and that a difference of this magnitude can be seen in as few exposures as four.

Results from the analyses of total reading times showed that words that occurred in sentences that were spaced received significantly more attention than words that occurred in consecutive sentences. This suggests that temporal distribution of repeated exposures to L2 words plays a role in how much attentional processing the words receive; namely, repeated exposures that are close in time receive less attention. Analyses of first fixation durations and gaze durations as early measures of lexical processing showed a similar pattern to that found in total reading time: massed repetitions received significantly less attentional processing. This shows that, even though participants engaged in intentional study, effects of spacing extended to the less voluntary processes involved in early stages of word reading before any intentional rehearsal may have an effect. The present results suggest, therefore, that both a controlled (Greene, Reference Greene1989) and an automatic (Challis, Reference Challis1993; Toppino, Reference Toppino1991) deficient processing account may be relevant for intentional learning of L2 vocabulary from context. An examination of other eye-tracking indices did not reveal any unusual rehearsal patterns or any differences in eye-movement patterns beyond an overall decrease in attention across repetitions and this decrease being more dramatic in the massed condition.

In the present study, L2 words were encountered in contexts that differed from repetition to repetition. This is how L2 words are commonly encountered in language learning. However, this may have important consequences for both attention and learning. Such difference in contexts likely precluded deficient processing in its strictest sense (as seen in psychology studies with word list learning) as each repeated encounter with a word had to be processed in a new sentence context. It is, therefore, significant that spacing still had the effect of increasing attention above and beyond the additional processing engendered by such context variability and that this effect was not limited to intentional rehearsal.

Another important consequence of context variability is its effects on learning. Previous research in psychology found that contextual change enhances learning of massed repetitions but impedes learning of spaced repetitions because it may contribute to a failure of study-phase retrieval at longer lags (Verkoeijen et al., Reference Verkoeijen, Rikers and Schmidt2004). Words in the spaced condition showed facilitation in reading times across repetitions that could not be attributed to effects of serial position and therefore indicated a repetition effect, suggesting that repeated spaced exposures were processed as repetitions. Thus, repetitions in the spaced condition experienced retrieval that was effortful yet successful. This is believed to be a “sweet spot” for optimal learning in dual mechanism accounts that combine deficient processing with the assumption of study-phase retrieval/reminding (Benjamin & Tullis, Reference Benjamin and Tullis2010). This may account for the large benefits that spacing produced in this study. The fact that memory traces survived between spaced repetitions may be due to the specific combination of the length of the interstudy interval, the fact that learners engaged in intentional learning, which may establish stronger memory traces (Verkoeijen et al., Reference Verkoeijen, Rikers and Schmidt2005), and the relative ease of the task (Bui, Maddox, & Balota, Reference Bui, Maddox and Balota2013; Elgort & Warren, Reference Elgort and Warren2014; Verkoeijen & Bouwmeester, Reference Verkoeijen and Bouwmeester2008).

Finally, the present results showed that amount of attentional processing mediates the spacing effect in L2 vocabulary learning, in line with the predictions of the deficient processing theory of the spacing effect (Cuddy & Jacoby, Reference Cuddy and Jacoby1982; Zechmeister & Shaughnessy, Reference Zechmeister and Shaughnessy1980). The results showed that, even when words are encountered in different contexts, learners process these repeated encounters more attentively if they occur more widely spaced over time, which in turn results in more learning.

Implications for vocabulary learning and teaching

It is widely accepted that attention is important for learning many aspects of an L2, including vocabulary (Gass, Reference Gass1988; Godfroid et al., Reference Godfroid, Boers and Housen2013; Laufer & Hulstijn, Reference Laufer and Hulstijn2001; Mohamed, Reference Mohamed2017; Schmidt, Reference Schmidt1990, Reference Schmidt and Robinson2001; Schmitt, Reference Schmitt2008). Researchers and practitioners alike have been looking for effective ways of inducing more attention to target forms. A number of techniques have been proposed to accomplish this, such as input enhancement (Sharwood Smith, Reference Sharwood Smith1993) and the use of the noticing function of output (Swain, Reference Swain, Cook and Seidlhofer1995), to name just a few. The present results suggest that effective spacing of repeated exposures with L2 forms can be added to this list. In the present study, the length of the interval that separated repeated exposures to novel L2 words had a significant effect on how attentively these exposures were processed. It is not unreasonable to suppose that the finding of more attention to spaced repetitions may generalize to a different time scale and to different learning targets as well as activities that differ from sentence reading; that is, language forms that repeat closely in any kind of input or are focused on in a massed fashion during a lesson may receive less attentional processing. A recommendation can be made to avoid massing repeated exposures or treatments of the same form to keep learners’ attention to target forms at a higher level.

While in the present study participants studied the words intentionally, eye-tracking indices of early lexical processing known to be under less voluntary control suggest that the above recommendation may be useful for incidental vocabulary acquisition contexts as well. Thus, even when a learner is not trying to commit a word to memory but only processes it for recognition and comprehension, repeated exposures that are close together may receive less attentional processing and may, therefore, be not as useful for learning as they would be if they were more widely spaced.

While it is unlikely that increasing interstudy interval will ever have the effect of reducing attention to a target form, generalizing learning effects far beyond the time scale of the present study is less straightforward as learning may be negatively affected when repetitions are spaced too widely (Toppino & Bloom, Reference Toppino and Bloom2002; Verkoeijen et al., Reference Verkoeijen, Rikers and Schmidt2005). Optimal interstudy interval may, further, be shorter the more complex the task (Bui et al., Reference Bui, Maddox and Balota2013; Donovan & Radosevich, Reference Donovan and Radosevich1999; Elgort & Warren, Reference Elgort and Warren2014; Suzuki & DeKeyser, Reference Suzuki and DeKeyser2017; Verkoeijen & Bouwmeester, Reference Verkoeijen and Bouwmeester2008).

The findings of this study also have implications for discussions of how many exposures are needed to learn a word (Nation, Reference Nation1990). The present results are in line with prior studies that have shown that, holding constant the number of exposures to an L2 word, more attentional processing leads to more learning (Godfroid et al., Reference Godfroid, Ahn, Choi, Ballard, Cui, Johnston and Yoon2017; Mohamed, Reference Mohamed2017). Thus, if we can induce more attention to L2 words, such as through effective use of spacing, we may be able to achieve more learning with fewer exposures, which may mean time saved.

Limitations and future directions

The present study has limitations that future research needs to address. This was a first attempt at testing attention as a mediator for the spacing effect in contextual L2 vocabulary learning. The design departed from the traditional word list rote learning paradigm widely used in psychology to investigate deficient processing: in the present study, words were encountered in contexts that differed from repetition to repetition. This approximates more closely the way L2 vocabulary is encountered in language learning but may, as discussed earlier, have important consequences for both attention and learning. So as not to make too drastic a departure from established research, the study aimed to keep the learning process simple. Participants studied words denoting simple and generic concepts such as “butterfly,” “food,” and “city” in a language that was completely novel to them. The use of a novel language and such simple words allowed me to target simple processes of word learning characteristic of novice L2 vocabulary learning stages before subtleties in meaning and stylistic and other distributional features of a word may become important. The use of native language sentence contexts made targeting such novice learning possible while allowing the use of more interesting sentence contexts. Using sentence contexts allowed me to control the spacing of repetitions and to use a counterbalanced design. While these design features allowed me to isolate the effects of interest with more certainty, this level of control creates a limitation because it does not capture all processes involved in authentic L2 reading. Future research needs to investigate these processes in more ecologically valid designs by presenting target words within L2 reading contexts, such as extended L2 texts.
Important variables whose effects need to be systematically investigated with L2 reading include learner proficiency and task complexity (Bui et al., Reference Bui, Maddox and Balota2013; Verkoeijen & Bouwmeester, Reference Verkoeijen and Bouwmeester2008). Some previous failures to observe a spacing effect in SLA have been attributed to task complexity (Suzuki & DeKeyser, Reference Suzuki and DeKeyser2017) or lower proficiency and probable memory trace decay from repetition to repetition (Elgort & Warren, Reference Elgort and Warren2014), which is consistent with a failure of study-phase retrieval. These studies further used longer interstudy intervals (e.g., mostly 24 hr in Elgort & Warren, who investigated incidental learning of vocabulary from reading). By manipulating interstudy interval on the one hand and task difficulty and learner proficiency on the other, future research may observe important interactions. There may even be a situation in which more attention may actually not lead to more learning from repetition (contrary to the consistent finding) because of a failure of study-phase retrieval at longer interstudy intervals.

Future research also needs to explore how temporal distribution affects both attention and learning beyond four exposures. In the present study, the rate of change in reading times had a quadratic shape in the massed condition and a linear shape in the spaced condition. It is likely that the spaced condition will also show a trend that deviates from linearity at a point beyond four repetitions. At Repetition 4 there was still a significant difference in reading times between the two conditions. With more repetitions, there may come a point at which the two lines converge and there is no longer a difference. This point, in turn, will likely depend on the interstudy interval and a combination of other moderator variables in any given learning situation. Finally, the benefits that come from each additional exposure need to be investigated as a function of interstudy interval and other relevant variables (see, e.g., Maddox & Balota, Reference Maddox and Balota2015).

The present findings tell us that if vocabulary is studied repeatedly in a massed fashion, there may come a point at which these repetitions no longer engage as much attention and are no longer as useful for learning as they would be if the words were, instead, to be revisited at a later time. To be able to give more precise recommendations for teaching/learning practice and for materials and syllabus design, more research is needed that investigates systematically what the optimal spacing may be for the acquisition of vocabulary and of other aspects of L2. The optimal interstudy interval will, in turn, depend on other variables present in any given L2 learning situation. These include whether learning is intentional/incidental, the presence/absence of feedback, individual learner differences such as proficiency and working memory capacity, whether the learners are engaged in comprehending input or producing output, and the input/output modality, to name just a few. Future research should not neglect to investigate the underlying mechanisms responsible for the spacing effect in SLA contexts. An understanding of the underlying cognitive operations that are involved will allow us to better understand how and why certain variables that are present in our learning contexts may exert a moderating influence on the relationship between spacing and learning. This, in turn, will allow us to set up inquiries in more useful ways such that they do a better job of informing L2 pedagogy about how best to make use of the memory phenomenon that is the spacing effect.

Appendix A. Target Words

List A

  • aviomies (husband)

  • silta (bridge)

  • tarina (story)

  • savuke (cigarette)

  • puhelin (phone)

  • toimisto (office)

  • kieli (language)

  • valmis (ready)

  • kaupunki (city)

  • taivas (sky)

  • keltainen (yellow)

  • lyhyt (short)

List B

  • avain (key)

  • perhonen (butterfly)

  • elokuva (movie)

  • paita (shirt)

  • rakennus (building)

  • lattia (floor)

  • kirjasto (library)

  • osoite (address)

  • kasvot (face)

  • ruoka (food)

  • valkoinen (white)

  • laukku (bag)

Appendix B. Sample Sentences

A sample experimental sentence with a comprehension question

The article was written in a kieli that he did not know.

  • Could he read and understand the article?

A sample buffer trial with a comprehension question

Her luggage was so heavy that she had to ask for help.

  • Was she probably traveling?

Author ORCIDs

Natalie Koval, 0000-0001-9233-0717

Acknowledgments

This research was supported in part by the Second Language Studies Doctoral Program at Michigan State University. I would like to thank Drs. Patti Spinner and Aline Godfroid for their valuable feedback on the experimental design and data analysis. I am also grateful to MSU CSTAT statistical consulting for feedback on the statistical analyses. I am further grateful to associate editor Dr. Annie Tremblay and my three anonymous reviewers for their valuable comments and suggestions that helped me to greatly improve my contribution.

References

Bahrick, H. P., Bahrick, L. E., Bahrick, A. S., & Bahrick, P. E. (1993). Maintenance of foreign language vocabulary and the spacing effect. Psychological Science, 4, 316–321.
Begg, I., & Green, C. (1988). Repetition and trace interaction: Superadditivity. Memory & Cognition, 16, 232–242.
Bellezza, F. S., Winkler, H. B., & Andrasik, F. (1975). Encoding processes and the spacing effect. Memory & Cognition, 3, 451–457.
Benjamin, A. S., & Tullis, J. G. (2010). What makes distributed practice effective? Cognitive Psychology, 61, 228–247.
Bird, S. (2010). Effects of distributed practice on the acquisition of second language English syntax. Applied Psycholinguistics, 31, 635–650.
Bjork, R. A., & Allen, T. W. (1970). The spacing effect: Consolidation or differential encoding? Journal of Verbal Learning and Verbal Behavior, 9, 567–572.
Bloom, K. C., & Shuell, T. J. (1981). Effects of massed and distributed practice on the learning and retention of second-language vocabulary. Journal of Educational Research, 74, 245–248.
Bower, G. H. (1972). Stimulus-sampling theory of encoding variability. In Melton, A. W., and Martin, E. (Eds.), Coding processes in human memory (pp. 85–123). Washington, DC: Winston.
Brysbaert, M., & Stevens, M. (2018). Power analysis and effect size in mixed effects models: A tutorial. Journal of Cognition, 1, 1–20.
Bui, D. C., Maddox, G. B., & Balota, D. A. (2013). The roles of working memory and intervening task difficulty in determining the benefits of repetition. Psychonomic Bulletin & Review, 20, 341–347.
Callan, D., & Schweighofer, N. (2010). Neural correlates of the spacing effect in explicit verbal semantic encoding support the deficient-processing theory. Human Brain Mapping, 31, 645–659.
Cepeda, N. J., Pashler, H., Vul, E., Wixted, J. T., & Rohrer, D. (2006). Distributed practice in verbal recall tasks: A review and quantitative synthesis. Psychological Bulletin, 132, 354–380.
Cepeda, N. J., Vul, E., Rohrer, D., Wixted, J. T., & Pashler, H. (2008). Spacing effects in learning: A temporal ridgeline of optimal retention. Psychological Science, 19, 1095–1102.
Challis, B. H. (1993). Spacing effects on cued-memory tests depend on level of processing. Journal of Experimental Psychology: Learning, Memory and Cognition, 19, 389–396.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Erlbaum.
Craik, F. I. M., & Lockhart, R. S. (1972). Levels of processing: A framework for memory research. Journal of Verbal Learning and Verbal Behavior, 11, 671684.CrossRefGoogle Scholar
Cuddy, L. J., & Jacoby, L. L. (1982). When forgetting helps memory: An analysis of repetition effects. Journal of Verbal Learning and Verbal Behavior, 21, 451467.CrossRefGoogle Scholar
Davies, M. (2008–). The Corpus of Contemporary American English (COCA): 520 million words, 1990–present. Available online at https://corpus.byu.edu/coca/Google Scholar
Delaney, P. F., Verkoeijen, P., & Spirgel, A. (2010). Spacing and testing effects: A deeply critical, lengthy, and at times discursive review of the literature. In Ross, B. H. (Ed.), Psychology of learning and motivation: Advances in research and theory (pp. 63147). San Diego: Elsevier Academic Press.CrossRefGoogle Scholar
Dellarosa, D., & Bourne, L. E. (1985). Surface form and the spacing effect. Memory & Cognition, 13, 529537.CrossRefGoogle ScholarPubMed
Dempster, F. N. (1988). The spacing effect: A case study in the failure to apply the results of psychological research. American Psychologist, 43, 627634.CrossRefGoogle Scholar
Donovan, J. J., & Radosevich, D. J. (1999). A meta-analytic review of the distribution of practice effect: Now you see it, now you don’t. Journal of Applied Psychology, 84, 795805.CrossRefGoogle Scholar
Ebbinghaus, H. (1885). Über das gedächtnis: Untersuchungen zur experimentellen psychologie. Berlin: Duncker & Humblot.Google Scholar
Elgort, I., Brysbaert, M., Stevens, M., & Van Assche, E. (2017). Contextual word learning during reading in a second language: An eye-movement study. Studies in Second Language Acquisition. Advance online publication.Google Scholar
Elgort, I., & Warren, P. (2014). L2 Vocabulary learning from reading: Explicit and tacit lexical knowledge and the role of learner and item variables. Language Learning, 64, 365414.CrossRefGoogle Scholar
Estes, W. K. (1955). Statistical theory of distributional phenomena in learning. Psychological Review, 62, 369377.CrossRefGoogle Scholar
Gass, S. (1988). Integrating research areas: A framework for second language studies. Applied Linguistics, 9, 198217.CrossRefGoogle Scholar
Gerbier, E., & Toppino, T. C. (2015). The effect of distributed practice: Neuroscience, cognition, and education. Trends in Neuroscience and Education, 4, 4959.CrossRefGoogle Scholar
Glenberg, A. M. (1976). Monotonic and nonmonotonic lag effects in paired-associate and recognition memory paradigms. Journal of Verbal Learning and Verbal Behavior, 15, 116.CrossRefGoogle Scholar
Glenberg, A. M. (1979). Component-levels theory of the effects of spacing of repetitions on recall and recognition. Memory & Cognition, 7, 95112.CrossRefGoogle ScholarPubMed
Glenberg, A. M., & Smith, S. M. (1981). Spacing repetitions and solving problems are not the same. Journal of Verbal Learning and Verbal Behavior, 20, 110119.CrossRefGoogle Scholar
Godfroid, A., Ahn, J., Choi, I., Ballard, L., Cui, Y., Johnston, S., . . . Yoon, H-J. (2017). Incidental vocabulary learning in a natural reading context: An eye-tracking study. Bilingualism: Language and Cognition. Advance online publication.Google Scholar
Godfroid, A., Boers, F., & Housen, A. (2013). An eye for words: Gauging the role of attention in incidental L2 vocabulary acquisition by means of eye tracking. Studies in Second Language Acquisition, 35, 483517.CrossRefGoogle Scholar
Godfroid, A., & Schmidtke, J. (2013). What do eye movements tell us about awareness? A triangulation of eye-movement data, verbal reports and vocabulary learning scores. In Bergsleithner, J. M., Frota, S. N., and Yoshioka, J. K. (Eds.), Noticing and second language acquisition: Studies in honor of Richard Schmidt (pp. 183205). Honolulu, HI: University of Hawai'i, National Foreign Language Resource Center.Google Scholar
Greene, R. L. (1989). Spacing effects in memory: Evidence for a two-process account. Journal of Experimental Psychology: Learning, Memory, and Cognition, 15, 371377.Google Scholar
Greeno, J. G. (1967). Paired-associate learning with short-term retention: Mathematical analysis and data regarding identification of parameters. Journal of Mathematical Psychology, 4, 430472.CrossRefGoogle Scholar
Hayes, A. F. (2006). A primer on multilevel modeling. Human Communication Research, 32, 385410.CrossRefGoogle Scholar
Hayes, A. F. (2018). Introduction to mediation, moderation, and conditional process analysis (2nd ed.). New York: Guilford Press.Google Scholar
Helsdingen, A., van Gog, T., & van Merriënboer, J. (2011). The effects of practice schedule and critical thinking prompts on learning and transfer of a complex judgment task. Journal of Educational Psychology, 103, 383398.CrossRefGoogle Scholar
Hintzman, D. L., & Block, R. A. (1973). Memory for the spacing of repetitions. Journal of Experimental Psychology, 99, 70.CrossRefGoogle Scholar
Hintzman, D. L., Block, R. A., & Summers, J. J. (1973). Modality tags and memory for repetitions: Locus of the spacing effect. Journal of Memory and Language, 12, 229238.Google Scholar
Horst, M., Cobb, T., & Meara, P. (1998). Beyond A Clockwork Orange: Acquiring second language vocabulary through reading. Reading in a Foreign Language, 11, 207223.Google Scholar
Hyönä, J., & Niemi, P. (1990). Eye movements in repeated movements of a text. Acta Psychologica, 73, 259280.CrossRefGoogle ScholarPubMed
IBM Corp. (2017). IBM SPSS Statistics for Windows, Version 25.0. Armonk, NY: Author.Google Scholar
Jacoby, L. L. (1978). On interpreting the effects of repetition: Solving a problem versus remembering a solution. Journal of Verbal Learning and Verbal Behavior, 17, 649667.CrossRefGoogle Scholar
Johnston, W. A., & Uhl, C. N. (1976). The contributions of encoding effort and variability to the spacing effect on free recall. Journal of Experimental Psychology: Human Learning and Memory, 2, 153160.Google Scholar
Joseph, H., Wonnacott, E., Forbes, P., & Nation, K. (2014). Becoming a written word: Eye-movements reveal order of acquisition effects following incidental exposure to new words during silent reading. Cognition, 133, 238–248.
Just, M. A., & Carpenter, P. A. (1980). A theory of reading: From eye fixations to comprehension. Psychological Review, 87, 329–354.
Kang, S., Lindsey, R. V., Mozer, M. C., & Pashler, H. (2014). Retrieval practice over the long term: Should spacing be expanding or equal-interval? Psychonomic Bulletin & Review, 21, 1544–1550.
Kapler, I. V., Weston, T., & Wiseheart, M. (2015). Spacing in a simulated undergraduate classroom: Long-term benefits for factual and higher-level learning. Learning and Instruction, 36, 38–45.
Kim, M., Kim, J., & Kwon, J. S. (2001). The effect of immediate and delayed word repetition on event-related potential in a continuous recognition task. Cognitive Brain Research, 11, 387–396.
Kornell, N., & Bjork, R. A. (2008). Learning concepts and categories: Is spacing the “enemy of induction”? Psychological Science, 19, 585–592.
Krug, D., Davis, B., & Glover, J. A. (1990). Massed versus distributed repeated reading: A case of forgetting helping recall? Journal of Educational Psychology, 82, 366–371.
Landauer, T. K. (1969). Reinforcement as consolidation. Psychological Review, 76, 82–96.
Laufer, B., & Hulstijn, J. H. (2001). Incidental vocabulary acquisition in a second language: The construct of task-induced involvement. Applied Linguistics, 22, 1–26.
Maddox, G. B. (2016). Understanding the underlying mechanism of the spacing effect in verbal learning: A case for encoding variability and study-phase retrieval. Journal of Cognitive Psychology, 28, 684–706.
Maddox, G. B., & Balota, D. A. (2015). Retrieval practice and spacing effects in young and older adults: An examination of the benefits of desirable difficulty. Memory & Cognition, 43, 760–774.
Madigan, S. A. (1969). Intraserial repetition and coding processes in free recall. Journal of Verbal Learning and Verbal Behavior, 8, 828–835.
Magliero, A. (1983). Pupil dilations following pairs of identical and related to-be-remembered words. Memory & Cognition, 11, 609–615.
Mammarella, N., Avons, S. E., & Russo, R. (2004). A short-term perceptual priming account of spacing effects in explicit cued-memory tasks for unfamiliar stimuli. European Journal of Cognitive Psychology, 16, 387–402.
Melton, A. W. (1967). Repetition and retrieval from memory. Science, 158, 532.
Melton, A. W. (1970). The situation with respect to the spacing of repetitions and memory. Journal of Verbal Learning and Verbal Behavior, 9, 596–606.
Mohamed, A. A. (2017). Exposure frequency in L2 reading: An eye-movement perspective of incidental vocabulary learning. Studies in Second Language Acquisition. Advance online publication.
Nakata, T. (2015). Effects of expanding and equal spacing on second language vocabulary learning: Does gradually increasing spacing increase vocabulary learning? Studies in Second Language Acquisition, 37, 677–711.
Nakata, T., & Webb, S. (2016). Does studying vocabulary in smaller sets increase learning? The effects of part and whole learning on second language vocabulary acquisition. Studies in Second Language Acquisition, 38, 523–552.
Nation, I. S. P. (1990). Teaching and learning vocabulary. New York: Newbury House.
Pashler, H., Zarow, G., & Triplett, B. (2003). Is temporal spacing of tests helpful even when it inflates error rates? Journal of Experimental Psychology: Learning, Memory, and Cognition, 29, 1051–1057.
Pavlik, P. I., & Anderson, J. R. (2005). Practice and forgetting effect on vocabulary memory: An activation-based model of the spacing effect. Cognitive Science, 29, 559–586.
Pellicer-Sánchez, A. (2016). Incidental L2 vocabulary acquisition from and while reading: An eye-tracking study. Studies in Second Language Acquisition, 38, 97–130.
Raaijmakers, J. G. W. (2003). Spacing and repetition effects in human memory: Application of the SAM model. Cognitive Science, 27, 431–452.
Rayner, K. (1998). Eye movements in reading and information processing: 20 years of research. Psychological Bulletin, 124, 372–422.
Rayner, K. (2009). Eye movements and attention in reading, scene perception, and visual search. Quarterly Journal of Experimental Psychology, 62, 1457–1506.
Rayner, K., & Duffy, S. A. (1986). Lexical complexity and fixation times in reading: Effects of word frequency, verb complexity, and lexical ambiguity. Memory & Cognition, 14, 191–201.
Rayner, K., & Pollatsek, A. (1987). Eye movements in reading: A tutorial review. In Coltheart, M. (Ed.), Attention and performance (Vol. 12, pp. 327–362). London: Erlbaum.
Rayner, K., Raney, G. E., & Pollatsek, A. (1995). Eye movements and discourse processing. In Lorch, R. F., and O’Brien, E. J. (Eds.), Sources of coherence in reading (pp. 9–36). Hillsdale, NJ: Erlbaum.
Reder, L. M., & Anderson, J. R. (1982). Effects of spacing and embellishment on memory for the main points of a text. Memory & Cognition, 10, 97–102.
Robinson, P. (2003). Attention and memory during SLA. In Doughty, C., and Long, M. H. (Eds.), The handbook of second language acquisition (pp. 631–678). Oxford: Blackwell.
Rogers, J. (2015). Learning second language syntax under massed and distributed conditions. TESOL Quarterly, 49, 857–866.
Rohrer, D., & Pashler, H. (2007). Increasing retention without increasing study time. Current Directions in Psychological Science, 16, 183–186.
Rose, R. J. (1984). Processing time for repetitions and the spacing effect. Canadian Journal of Psychology/Revue Canadienne De Psychologie, 38, 537–550.
Ross, B. H., & Landauer, T. K. (1978). Memory for at least one of two items: Test and failure of several theories of spacing effects. Journal of Verbal Learning and Verbal Behavior, 17, 669–680.
Rundus, D. (1971). Analysis of rehearsal processes in free recall. Journal of Experimental Psychology, 89, 63–77.
Russo, R., & Mammarella, N. (2002). Spacing effects in recognition memory: When meaning matters. European Journal of Cognitive Psychology, 14, 49–59.
Schmidt, R. (1990). The role of consciousness in second language learning. Applied Linguistics, 11, 129–158.
Schmidt, R. (2001). Attention. In Robinson, P. (Ed.), Cognition and second language instruction (pp. 3–32). New York: Cambridge University Press.
Schmitt, N. (2008). Review article: Instructed second language vocabulary learning. Language Teaching Research, 12, 329–363.
Schuetze, U. (2015). Spacing techniques in second language vocabulary acquisition: Short-term gains vs. long-term memory. Language Teaching Research, 19, 28–42.
Seabrook, R., Brown, G. D., & Solity, J. E. (2005). Distributed and massed practice: From laboratory to classroom. Applied Cognitive Psychology, 19, 107–122.
Sharwood Smith, M. (1993). Input enhancement in instructed SLA: Theoretical bases. Studies in Second Language Acquisition, 15, 165–179.
Shaughnessy, J. J., Zimmerman, J., & Underwood, B. J. (1972). Further evidence on the MP-D effect in free-recall learning. Journal of Verbal Learning and Verbal Behavior, 11, 1–12.
Sobel, H. S., Cepeda, N. J., & Kapler, I. V. (2011). Spacing effects in real-world classroom vocabulary learning. Applied Cognitive Psychology, 25, 763–767.
Suzuki, Y., & DeKeyser, R. (2017). Effects of distributed practice on the proceduralization of morphology. Language Teaching Research, 21, 166–188.
Swain, M. (1995). Three functions of output in second language learning. In Cook, G., and Seidlhofer, B. (Eds.), Principle and practice in applied linguistics: Studies in honor of H. G. Widdowson (pp. 125–144). Oxford: Oxford University Press.
Thios, S. J., & D’Agostino, P. R. (1976). Effects of repetition as a function of study-phase retrieval. Journal of Verbal Learning and Verbal Behavior, 15, 529–536.
Toppino, T. C. (1991). The spacing effect in young children’s free recall: Support for automatic-process explanations. Memory & Cognition, 19, 159–167.
Toppino, T. C., & Bloom, L. C. (2002). The spacing effect, free recall, and two-process theory: A closer look. Journal of Experimental Psychology: Learning, Memory, and Cognition, 28, 437–444.
Van Strien, J. W., Verkoeijen, P. P., Van der Meer, N., & Franken, I. H. A. (2007). Electrophysiological correlates of word repetition spacing: ERP and induced band power old/new effects with massed and spaced repetitions. International Journal of Psychophysiology, 66, 205–214.
Verkoeijen, P. P., & Bouwmeester, S. (2008). Using latent class modeling to detect bimodality in spacing effect data. Journal of Memory and Language, 59, 545–555.
Verkoeijen, P. P., & Delaney, P. F. (2008). Rote rehearsal and spacing effects in the free recall of pure and mixed lists. Journal of Memory and Language, 58, 35–47.
Verkoeijen, P. P., Rikers, R. M., & Schmidt, H. G. (2004). Detrimental influence of contextual change on spacing effects in free recall. Journal of Experimental Psychology: Learning, Memory, and Cognition, 30, 796–800.
Verkoeijen, P. P., Rikers, R. M., & Schmidt, H. G. (2005). Limitations to the spacing effect: Demonstration of an inverted U-shaped relationship between interrepetition spacing and free recall. Experimental Psychology, 52, 257–263.
Vlach, H. A., & Sandhofer, C. M. (2012). Distributing learning over time: The spacing effect in children’s acquisition and generalization of science concepts. Child Development, 83, 1137–1144.
Wahlheim, C. N., Dunlosky, J., & Jacoby, L. L. (2011). Spacing enhances the learning of natural concepts: An investigation of mechanisms, metacognition, and aging. Memory & Cognition, 39, 750–763.
Webb, S. (2007). The effects of repetition on vocabulary knowledge. Applied Linguistics, 28, 46–65.
Whitten, W. B., & Bjork, R. A. (1977). Learning from tests: Effects of spacing. Journal of Memory and Language, 16, 465.
Xue, G., Mei, L., Chen, C., Lu, Z., Poldrack, R., & Dong, Q. (2011). Spaced learning enhances subsequent recognition memory by reducing neural repetition suppression. Journal of Cognitive Neuroscience, 23, 1624–1633.
Yin, J. C. P., Del Vecchio, M., Zhou, H., & Tully, T. (1995). CREB as a memory modulator: Induced expression of a dCREB2 activator isoform enhances long-term memory in Drosophila. Cell, 81, 107–115.
Zechmeister, E. B., & Shaughnessy, J. J. (1980). When you know that you know and when you think that you know but you don’t. Bulletin of the Psychonomic Society, 15, 41–44.
Zhao, X., Wang, C., Liu, Q., Xiao, X., Jiang, T., Chen, C., & Xue, G. (2015). Neural mechanisms of the spacing effect in episodic memory: A parallel EEG and fMRI study. Cortex, 69, 76–92.
Zimmerman, J. (1975). Free recall after self-paced study: A test of the attention explanation of the spacing effect. American Journal of Psychology, 88, 277–291.

Figure 1. An illustration of the target word distribution in the study phase. The figure presents 10 sentences at the beginning of each block. Examples of the occurrence distribution for one word in the massed condition (repeating in 4 consecutive sentences) and one word in the spaced condition (repeating in 4 sentences across the four blocks) are highlighted.

Figure 2. Sample study-phase trial sequence with a comprehension question. The English translation for the upcoming Finnish word is always presented before the target sentence.

Figure 3. Boxplots showing the variance in total reading time (in ms) at each repetition in the two conditions.

Figure 4. Total reading times in milliseconds at each repetition for the two conditions. This figure shows medians.

Table 1. Fit statistics and parameter estimates for multilevel growth-curve models for the rate of change in reading times in the two conditions

Table 2. Fit statistics and parameter estimates from a mixed-effects linear modeling analysis of the effects of condition on reading times at each repetition

Table 3. Descriptive statistics for first fixation duration and gaze duration across repetitions in the two conditions

Table 4. Descriptive statistics for additional eye-tracking indices for the four repetitions in the two conditions and for initial exposures across the four blocks

Table 5. Reading times for repeated spaced exposures and first exposures across the four blocks; the table also shows significance values for the differences between the reference (first) block and each subsequent block

Figure 5. Immediate and delayed form recognition posttest scores. These scores are out of 12 possible points and represent the sum of correct answers in each condition, averaged across participants.

Figure 6. Immediate and delayed form–meaning mapping posttest scores. These scores are out of 12 possible points and represent the sum of correct answers in each condition, averaged across participants.

Table 6. Total scores on the immediate and delayed form recognition and form–meaning mapping tests; these scores are collapsed across the two conditions and are out of 24 possible points

Table 7. Parameter estimates from a mixed-effects linear modeling analysis of the effects of condition on learning gains as measured by the four posttests

Figure 7. The path diagram for the mediation analysis. Standardized coefficients for each simple effect are presented alongside the corresponding arrow. The standardized coefficient for the effect of spacing on learning, controlling for attention, appears in parentheses. **p < .01. ***p < .001.

Table 8. Pearson correlation coefficients and principal components analysis results for the four vocabulary posttests