Trying to anticipate what will happen next is a behavior that we seem unable not to engage in. This has led to the characterization of human brains as prediction “engines” or “machines” (Clark, 2013; van Berkum, 2010). The recognition that prediction plays an important, perhaps a fundamental, role in human cognition in general, and in language processing in particular, has transformed the field of psycholinguistics over the past decades, shifting the focus from comprehension as a result of the incremental integration of bottom-up information, to language processing as a proactive mechanism driven in large part by top-down expectations (Ferreira & Chantavarin, 2018). What remains under discussion and debate is just how ubiquitous these proactive mechanisms are, and how they might vary depending on the specific circumstances of language use (e.g., Huettig & Guerra, 2019; Huettig & Mani, 2016; Kuperberg & Jaeger, 2016).
One approach is to ask about the extent to which prediction generalizes beyond traditional groups in psycholinguistic experiments, that is, healthy, native-speaking, young adults pursuing a college education. As Huettig (2015) concluded in a critical review of the role of prediction in language processing, “[t]he study of prediction … has a particularly strong need for more diverse participant populations” (p. 130; see also Arnett, 2008; Henrich, Heine, & Norenzayan, 2010). A number of recent studies have shown substantial variability in the extent to which different types of native language users, such as children, nonstudent adults, and older adults, engage in predictive processing (see Huettig, 2015; Pickering & Gambi, 2018). At the same time, a flourishing but largely separate literature has investigated predictive processing among second language users (for review, see Kaan, 2014). On the assumption that use of a nonnative language (L2) is normal human behavior, characteristic of the majority of language users worldwide, it strikes us as surprising that these literatures have not been more strongly interconnected. In the belief that L2 processing constitutes one of many variations of human language processing, we situate the present study on expectation-based processing of reference among adult L2 users within the wider context of exploring variability between diverse participant populations.
More specifically, this paper aims to contribute to this broader endeavor by investigating the extent to which a particular effect attributed to expectation-based processing previously reported among college-educated native speakers of English generalizes to an equal-sized sample of college-educated L2 speakers of English. We focus on anticipatory processing involving discourse-level expectations, a domain that relative to prediction at the level of lexical and morphosyntactic processing remains underexplored, especially in the context of L2 processing. The remainder of this paper is organized as follows. After presenting a brief overview of the literature on prediction in native language (L1) processing among language users other than college-aged native speakers, we review relevant previous work on expectation-based (L1 and L2) processing at a discourse level and lay out our research questions. We then report findings from a visual-world eye-tracking experiment that was designed to capture proactive coreference expectations. We probe comprehenders’ expectations prior to the encounter of a disambiguating referential expression, thereby allowing for empirical evidence of prediction in the strict sense defined by Pickering and Gambi (2018, p. 1005), namely, that the effect is observed prior to disambiguating information in the input. Findings from L1 speakers of English were previously reported in Grüter, Takeda, Rohde, and Schafer (2018), and showed significant effects indicative of proactive coreference expectations (discussed in more detail below). Here we present results from an equal-sized sample of L2 speakers of English (N = 52), with results showing (a) that the same factor that drove proactive coreference expectations in the L1 group does not appear to do so in the L2 group and (b) that L2 proficiency does not modulate the pattern of effects.
We discuss the findings in light of the broader question of the generalizability of expectation-based processing to a wider population of language users, and to L2 users in particular.
Prediction among language users other than native-speaking college students
The evidence for proactive language processing in healthy adult L1 speakers includes examples of prediction at just about all levels of linguistic representation: phonological form, morphosyntax, structural dependencies, lexical semantics, up to the discourse level. For recent reviews, including critical discussion of proposed mechanisms and limitations of prediction in L1 processing, the reader is referred to Huettig (2015) and Pickering and Gambi (2018). Prediction of upcoming words based on lexical–semantic and morphosyntactic cues has also been shown in children as young as 2 years old (e.g., Lew-Williams & Fernald, 2007; Mani & Huettig, 2012; but see Gambi, Gorrie, Pickering, & Rabagliati, 2018, for later emergence of form-related prediction), with evidence for increasing engagement in predictive processing with increasing age and vocabulary knowledge (Borovsky, Elman, & Fernald, 2012; Fernald, Zangl, Portillo, & Marchman, 2008). This evidence suggests that prediction is part of human language behavior from early on in development, and that “language prediction skills” are honed in tandem with language development more generally. Potential causal relations in this tandem development during childhood meanwhile remain a matter of ongoing debate (e.g., Huettig & Mani, 2016).
Predictive processing has also been investigated in adult L1 speakers who are not college students. Federmeier, Kutas, and Schul (2010) have presented evidence from a series of studies suggesting that healthy adults in their 60s and above engage less in prediction than younger adults (see also Wlotko & Federmeier, 2012). More recently, Huettig and Janse (2016; see also Huettig & Pickering, 2019) have argued for the opposite, namely, that older adults may rely more on prediction, based on findings that age correlates positively with predictive behavior once potentially confounding factors, such as decreased working memory and processing speed, are controlled for. Another relevant factor appears to be L1 literacy and reading experience. Reduced effects of prediction, compared to the typical college student baseline, have been reported in both adults with low literacy (Mishra, Singh, Pandey, & Huettig, 2012) and adults with dyslexia (Huettig & Brouwer, 2015).
The body of research reviewed so far has shown that even among L1 users, there is substantial variability in the extent to which prediction contributes to language processing. Factors such as age, literacy, and education appear to matter and are typically framed in terms of individual differences among members of the wider population of speakers of the language. Research that has looked at the role of prediction in L2 processing, in contrast, has typically framed the difference between L1 and L2 speakers as a categorical contrast between two different populations, rather than treating “L1 versus L2 user status” as an individual difference variable among the population of users of the language. Within this general framing, the dominant research question has been whether or not L2 users are able to perform like L1 users, and what individual difference factors among L2 users lead to “more nativelike” performance. We return to this issue in the Discussion section, where we suggest an alternative perspective on the conceptualization of L1–L2 comparisons. For detailed reviews of prediction in L2 processing, the reader is referred to Kaan (2014), and for a more recent perspective, Schlenter (2019). In sum, findings from this literature have been variable, with some studies reporting effects of prediction in L2 similar to those observed in L1 control groups (e.g., Dijkgraaf, Hartsuiker, & Duyck, 2017; Foucart, Martin, Moreno, & Costa, 2014), while others report no or reduced prediction effects among L2 speakers (e.g., Grüter, Lew-Williams, & Fernald, 2012; Kaan, Dallas, & Wijnen, 2010; Martin et al., 2013; Mitsugi & MacWhinney, 2016).
Individual difference factors that might modulate L2 learners’ engagement in prediction have been a key focus in recent work on L2 processing (see Kaan, 2014, for a programmatic review), with overall L2 proficiency perhaps the most frequently mentioned. While a few earlier studies reported significant modulation of a predictive effect by overall L2 proficiency (e.g., Chambers & Cooke, 2009), more recent work specifically investigating this factor has produced remarkably little evidence in support of proficiency modulating predictive processing. Looking at prediction based on lexical–semantic cues, neither Dijkgraaf et al. (2017) nor Ito, Corley, and Pickering (2018) observed significant effects of proficiency. Similarly, Kim (2018) observed no effects of proficiency, measured through three independent tasks, on L2 learners’ use of implicit causality to preactivate upcoming referents. In the domain of morphosyntax, neither Hopp (2015) nor Mitsugi (2018), focusing on case marking in L2 German and numeral classifiers in L2 Japanese, respectively, found modulation of predictive effects by proficiency. Of note, both of these studies reported significant effects of proficiency on later processes of information integration, indicating that the proficiency measures employed were able to capture relevant variability between participants; yet this variability did not modulate engagement in prediction. In order to further examine the extent to which overall L2 proficiency contributes to expectation-based processing at a discourse level, the present study included independently measured L2 proficiency as a potential predictor in the analysis.
The observation of generally more variable findings regarding prediction among L2 speakers has led to the proposal that L2 speakers may have “reduced ability to generate expectations” during language processing, also known as the RAGE hypothesis (Grüter, Rohde, & Schafer, 2014, 2017; see also Kaan et al., 2010). This hypothesis has been clearly disconfirmed in its strongest form by studies showing prediction effects of comparable magnitude in L2 and L1 groups (e.g., Dijkgraaf et al., 2017). At the same time, differences between L1 and L2 users with regard to engagement in prediction have now been observed in numerous experiments, and to the best of our knowledge these differences have, without exception, been in the direction of weaker and/or delayed effects of prediction in L2 compared to L1 groups. The consistent direction of these effects, when present, strikes us as notable, and in need of explanation. We return to this point in the Discussion section, where we revisit the RAGE hypothesis in light of the findings from the present study.
Proactive processing at the discourse level
The majority of research on prediction in language processing has focused on comprehenders’ uptake of contextual cues to anticipate a specific upcoming word. In most cases, the cues in question immediately precede the target word or phrase, as in cues resulting from verb semantics and argument structure restrictions (eat … the cake) or morphosyntactic marking such as gender marking on a determiner (la … pelota, “the.FEM ball”). Some studies have also targeted contexts where predictions about an upcoming word cannot be attributed to a single preceding cue within the same clause, but where expectations arise as a result of the wider discourse. For example, in a set of experiments with native Dutch-speaking adults (van Berkum, Brown, Zwitserlood, Kooijman, & Hagoort, 2005), participants were presented with contexts such as “The burglar had no trouble locating the secret family safe. Of course, it was situated behind a …” (original in Dutch), which were highly predictive of a specific noun (painting). Taking advantage of gender marking on adjectives preceding indefinite nouns in Dutch, participants then read continuations consisting of either an adjective with gender marking consistent with the predicted noun (… een groot schilderij, “a big.NEU painting.NEU”) or an adjective and noun with different gender (… een grote boekenkast, “a big.COM bookcase.COM”). In an event-related potential (ERP) experiment, results showed a reliable positive deflection in the waveform 50–250 ms after the onset of the inflection on adjectives in the discourse-inconsistent condition (but see Fleur, Flecken, Rommers, & Nieuwland, 2019, for partially inconsistent results in subsequent studies); a self-paced reading study showed slowdowns in the same condition.
Findings such as these suggest that prediction is not limited to immediately adjacent cues and targets, and that expectations may not always arise from a single cue, but from a more complex discourse-level representation of an event.
Just as the source of prediction cannot always be tied to a single cue, the target of an expectation is not always a specific lexical item, although this has been the paradigm case in the experimental literature. This is evident, for example, in the case of expectations about next-mention of previously introduced discourse referents. It is a well-known phenomenon that certain verbs induce biases to remention either their subject or object in a causal dependent clause: people are more likely to continue the sentence Helen feared Sue because … with a mention of Sue (either with the pronoun she or the name Sue) than they are for the sentence Helen frightened Sue because …. In these cases of “implicit causality” (IC; Garvey & Caramazza, 1974; Hartshorne, 2014), the prediction or expectation is neither for a specific lexical item nor is it deterministic. A number of studies have specifically investigated when differential biases for reactivating subject versus object referents of IC verbs emerge during real-time comprehension. Findings from adult L1 speakers of various languages have not been entirely consistent. Most studies have reported effects of referential bias associated with the implicit causality of the preceding verb during or shortly after the pronoun in the continuation sentence (… because she…; see Koornneef, Dotlačil, van den Broek, & Sanders, 2016, for review). Such effects are consistent with expectations built up prior to the encounter of the pronoun, and they have been interpreted explicitly as effects of prediction (e.g., van Berkum, Koornneef, Otten, & Nieuwland, 2007). They are also consistent, however, with explanations based on integration difficulty once the pronoun is encountered, that is, accounts not invoking prediction or expectations.
Of note, three recent studies have extended the investigation of implicit causality in real-time comprehension to bilingual and L2 speakers (Contemori & Dussias, 2018; Kim, 2018; Schlenter, 2019). All three employed the visual-world paradigm, and framed the investigation within a comparison between L1 and L2 groups. All three reported delay and/or reduction of IC biases in the L2 versus the L1 group.
A related finding on coreference processing emerged from an ERP study that examined comprehenders’ well-attested bias to remention goal (vs. source) referents following transfer-of-possession events (Ferretti, Rohde, Kehler, & Crutchley, 2009). As observed in a number of earlier offline continuation studies, following a transfer-of-possession event (e.g., Bob passed the salt to Bill), participants preferentially write continuations starting with the goal (Bill) rather than the source (Bob) of the event (Arnold, 2001; Stevenson, Crawley, & Kleinman, 1994). This bias has been shown to be modulated by whether the event is presented as completed (passed, perfective aspect) or ongoing (was passing, imperfective; Kehler, Kertz, Rohde, & Elman, 2008). This is consistent with processing accounts that emphasize the status of referents in comprehenders’ mental models of events (Ferretti, Kutas, & McRae, 2007; Madden & Ferretti, 2009; Magliano & Schleich, 2000): for a completed transfer-of-possession event that is part of a narrative, the goal or recipient of the transfer act is in greater focus than in an ongoing event, since that referent is now in possession of the transferred theme. Consistent with this, Ferretti et al. (2009) presented evidence showing that participants experienced disruption, indicated by an enhanced P600 component, when a pronoun referred to the source (the less expected referent) of a transfer event, but only when that event was portrayed as completed (Sue handed the timecard to Fred. She … vs. Sue was handing the timecard to Fred. She …).
This effect observed at the pronoun is consistent with an explanation in terms of proactive expectations for reference based on comprehenders’ situation models. Yet like the results of the studies on implicit causality discussed above, which found effects at or shortly after the pronoun in the dependent clause, this effect cannot be interpreted unambiguously as an effect of prediction, as it does not satisfy the more stringent criterion for prediction as an effect measurable prior to disambiguating information in the input (Pickering & Gambi, 2018).
Identifying unambiguous effects of prediction satisfying Pickering and Gambi’s criterion has been a challenge in research on reference processing. In an attempt to probe more directly for predictive effects in the context of the goal-bias with transfer-of-possession events, Grüter et al. (2018) used the visual-world paradigm to measure L1 English-speaking listeners’ eye gaze to event participants during a pause between a sentence about a transfer-of-possession event and a continuation sentence. Results showed a greater preference for looking at the goal after completed than after ongoing transfer events. This preference was temporally dispersed but sustained and emerged well before the onset of the continuation sentence. This suggests that listeners’ expectations about who would be mentioned in the continuation sentence were influenced not just by an overall goal-bias in transfer-of-possession events but that this bias was further modulated by whether or not the event was described as completed, as marked by grammatical aspect. This finding aligns with that of Ferretti et al. (2009), and provides unambiguous evidence that a differential bias is present prior to the encounter of a referential expression, and hence cannot be the unique result of postlexical integration when a pronoun is encountered. We interpret this bias as a reflection of comprehenders’ continuously updated situation models and the relative salience of event participants therein, which in turn informs their expectations about how a discourse will continue.
This interpretation aligns well with findings from the story continuation studies mentioned above, which showed that L1 English speakers’ reference choices are modulated by the event structure, indicated by grammatical aspect, of the preceding transfer-of-possession event. This effect has been replicated with L1 speakers of other languages that mark aspect grammatically, including Japanese (Ueno & Kehler, 2016) and Korean (Kim, Grüter, & Schafer, 2013). Of note, it was not replicated when the same paradigm was used with Japanese- and Korean-speaking L2 learners of English (Grüter et al., 2017), suggesting L2 users rely less on proactive expectations during reference processing, even when event structure is grammatically marked in a similar way in the speakers’ L1.
Yet although Grüter et al. (2017) found no significant effects of aspect on L2 participants’ choice of source or goal referents in their written continuations, it is important to note that aspect did significantly modulate their choice of coherence relation between the context sentence and the continuations they wrote. Specifically, participants were more likely to provide a continuation focusing on the end state of the previous event (occasion or result; see Kehler, 2002, for a typology of coherence relations) following a perfective-marked sentence. This effect was not different from that observed among L1 participants, and demonstrates that the L2 participants in that study were sensitive to some discourse-level implications of grammatical aspect. Grüter et al. (2017) attributed the differential effect of aspect on L2 users’ reference versus coherence relation choices to the temporal sequence of decisions they had to make when actively constructing a continuation, postulating that L2 users considered aspect retroactively, at a later point in the production planning process of the continuation. However, offline measures do not allow for direct evaluation of when aspect-driven effects arise. The present study was conceived to address this question more directly by using a more fine-grained probe of participants’ moment-to-moment referential processing through the visual-world paradigm. We thus pose the following research questions:
RQ1: Does the proactive use of discourse-level information in reference processing documented among L1 speakers generalize to L2 speakers of English drawn from the same wider population of college-educated young adults?
RQ2: Among L2 speakers, does increased language proficiency lead to greater engagement in proactive use of discourse-level cues?
In addition to the visual-world task reported in Grüter et al. (2018), L2 speakers completed three further tasks: two independent measures of English language proficiency, as well as an offline task specifically designed to test knowledge of grammatical aspect in English.
Method
Participants
Our initial criteria for inclusion in the L2 group were (a) answer “no” to the question “Do you consider yourself a native speaker of English?” and (b) first exposure to English at age 6 or older (for final N = 52: M = 10.6, SD = 2.9, range 6–21). No restrictions on participants’ L1 were imposed as our interest was in generalization to L2 users in general, and as previous work had indicated that L2 speakers do not appear to benefit from facilitative L1 transfer with regard to the role of aspect marking on reference processing (Grüter et al., 2017). A total of 67 participants who fit these criteria took part in the study. Two additional participants who answered “yes” to the native-speaker question, but indicated an age of first exposure to English of 6 or above, and did not include English as a language spoken in their childhood homes, were also included. Of these 69 participants, data from 11 were excluded prior to analysis due to nonnormal vision or hearing (N = 1), distraction during the experiment (2), equipment malfunction (1), or poor calibration (7), and data from 6 were excluded after data inspection, due to insufficient valid data in the eye gaze record (see Results section for more detail), leaving 52 participants (33 females) in the final L2 group.
L2 participants were recruited from the University of Hawai‘i student community. They came from a variety of L1 backgrounds, including Bahasa Indonesia (N = 2), Burmese (2), Chinese (24), Croatian (1), French (1), German (2), Italian (1), Japanese (6), Korean (5), Magahi (1), Spanish (2), Swedish (1), Tagalog (1), Thai (1), Urdu (1), and Vietnamese (1). L2 participants completed three independent measures of English proficiency: the Versant English Test (Pearson, 2011), the LexTALE English Test (Lemhöfer & Broersma, 2012), and self-rating of their English language abilities on a scale of 0–10 (in all four subskills separately, as well as overall). Descriptive statistics are provided in Table 1. Pearson correlations between the three proficiency measures were moderate to strong: Versant~LexTALE, r(48) = .65; Versant~Self-rating, r(48) = .68; LexTALE~Self-rating, r(50) = .51; all p < .001. Given these consistent correlations and the fact that the Versant Test was the most comprehensive measure in terms of including direct assessment of all four subskills, overall scores from the Versant Test were used as indices of proficiency in models examining the contribution of proficiency to experimental outcomes. Conversions for Versant Test scores to general level descriptors of the Council of Europe framework (Pearson, 2011) indicate that the majority of L2 participants fall into the midlevel category of Independent User (B1: 13, B2: 16), with a few upper-level Basic Users (A2: 8) and some highest-level Proficient Users (C1: 8, C2: 5).
Table 1. Participant information (means and ranges)
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20220130043414143-0157:S0142716420000582:S0142716420000582_tab1.png?pub-status=live)
a Pearson (2011); values from two participants missing. b Lemhöfer & Broersma (2012).
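The degrees of freedom reported with the correlations above follow from df = n − 2: r(48) corresponds to 50 complete pairs, in line with the two missing Versant scores, and r(50) to all 52 participants. As an illustration only, the following minimal sketch computes a Pearson correlation with pairwise deletion of missing scores. The function name `pearson_r` and all scores are hypothetical; they are not taken from the study's data or analysis scripts.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return (r, df) for paired scores, skipping pairs with a missing value."""
    # Pairwise deletion: keep only participants with both scores present.
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    if n < 3:
        raise ValueError("need at least three complete pairs")
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    # Degrees of freedom for a Pearson correlation are n - 2.
    return sxy / sqrt(sxx * syy), n - 2


# Hypothetical scores for six participants; None marks a missing Versant score.
versant = [45, 52, None, 61, 70, 58]
lextale = [60.0, 71.25, 80.0, 77.5, 90.0, 68.75]

r, df = pearson_r(versant, lextale)
print(f"r({df}) = {r:.2f}")
```

With one of six hypothetical participants missing a score, five complete pairs remain and the reported degrees of freedom are 3, mirroring how two missing values reduce df from 50 to 48 in the correlations involving the Versant Test.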
Results from the L2 group will be compared with those from the L1 group in Grüter et al. (2018; N = 53, 28 females). Criteria for inclusion in the L1 group were (a) answer “yes” to the question “Do you consider yourself a native speaker of English?”; (b) answer “English” to “What language do you feel most comfortable speaking in casual conversation now?”; (c) first exposure to English at age 5 or younger (M = 0.4, range 0–5); and (d) self-ratings for English speaking and listening ability of 8 or higher (out of 10).
Knowledge-of-aspect task
A key assumption underlying the main experiment is that listeners associate perfective aspect with completed events, and imperfective aspect with ongoing events. An offline knowledge-of-aspect task was included to verify that this assumption is justified for the L2 participants in this study. Participants read brief narratives describing events that were either ongoing or completed, and were then asked to judge the truth of a statement about a particular time point in the story. An example is provided in (1). The narrative remained on the screen until the participant made a judgment by pressing “true,” “false,” or “not sure.” We expected test sentences with imperfective aspect, as in (1), to be judged “true” in the ongoing and “false” in the completed condition.
(1) Story beginning:
Brenda is at the hospital visiting Anne. [picture of soup]
This is the bowl of soup that Brenda will feed to Anne.
At 11:00, Brenda is ready with the soup and a spoon.
Ongoing condition: Story end + test sentence
At 11:05, Brenda puts the first spoonful of soup into Anne’s mouth.
In the afternoon, Pikachu says:
“At 11:05, Brenda was feeding the bowl of soup to Anne.”
Completed condition: Story end + test sentence
At 12:00, the bowl is empty and Anne wipes her mouth.
In the afternoon, Pikachu says:
“At 12:00, Brenda was feeding the bowl of soup to Anne.”
The task consisted of a total of 22 items of the type illustrated in (1), including 10 items with an imperfective-marked transfer-of-possession verb following a story in which the transfer was portrayed as ongoing (k = 5; expected judgment: true) or completed (k = 5; expected judgment: false). The transfer-of-possession verbs were the same as those used in the visual-world experiment. Given the existence of cross-linguistic differences in the marking of continuous aspect and the interaction of aspect with verb event classes (e.g., Gabriele, 2009), we considered these two conditions the most critical for determining whether L2 learners understand the function of aspect marking in English. An additional 12 items were included to further assess learners’ interpretations of other aspect, verb class, and event type combinations, including perfective-marked achievement verbs following a completed (k = 4; true) or ongoing (k = 4; false) event, and imperfective-marked accomplishment verbs following a completed event (k = 4; false). Participants were assigned to one of four semirandomly ordered lists counterbalanced for the presentation of verbs in the two critical conditions.
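The item design just described can be summarized as a small data structure. The sketch below is illustrative only: the class and field names are hypothetical, while the condition labels, item counts, and expected judgments are those stated in the text. It checks that the five conditions add up to the 22 items and shows how a response might be scored as an expected judgment, with "not sure" never counting as expected.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Condition:
    aspect: str       # aspect marking on the test sentence
    verb_class: str   # verb class named in the text
    event: str        # how the story portrays the event
    k: int            # number of items
    expected: str     # expected judgment: "true" or "false"
    critical: bool    # one of the two critical conditions?


CONDITIONS = [
    Condition("imperfective", "transfer-of-possession", "ongoing", 5, "true", True),
    Condition("imperfective", "transfer-of-possession", "completed", 5, "false", True),
    Condition("perfective", "achievement", "completed", 4, "true", False),
    Condition("perfective", "achievement", "ongoing", 4, "false", False),
    Condition("imperfective", "accomplishment", "completed", 4, "false", False),
]


def is_expected(condition, response):
    """Score one response; "not sure" never matches an expected judgment."""
    return response == condition.expected


total_items = sum(c.k for c in CONDITIONS)
critical_items = sum(c.k for c in CONDITIONS if c.critical)
print(total_items, critical_items)
```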
L2 participants completed this task immediately after the visual-world experiment. L1 participants did not complete this task, as we had evidence from a prior L1 group (N = 61, drawn from the same population) on L1 performance. When presenting the results from the L2 group on this task, we refer to these previous L1 data as the L1 reference group.
Two L2 participants did not complete this task; we report results from the remaining 50, together with those from the L1 reference group. “Not sure” responses were rare overall (4.5%, cf. 4.8% in the L1 reference group). The mean proportion of expected judgments across all five conditions was 0.81 (SD = 0.12) in the L2 and 0.84 (SD = 0.10) in the L1 group. A logistic mixed-effect regression model (glmer(IsExpected ~ group + (1|subject) + (1|item))) indicated no significant difference between the two groups (B = –.21, z = –1.4, p = .16). Critically, both L1 and L2 speakers judged statements with imperfective aspect predominantly true with ongoing transfer-of-possession events (L1: M = 0.89, SD = 0.15; L2: M = 0.85, SD = 0.19) and predominantly false with completed transfer-of-possession (L1: M = 0.86, SD = 0.17; L2: M = 0.88, SD = 0.22) and other accomplishment events (L1: M = 0.86, SD = 0.19; L2: M = 0.80, SD = 0.27). The reverse was the case for statements with perfective aspect, which were predominantly judged true with completed events (L1: M = 0.74, SD = 0.26; L2: M = 0.82, SD = 0.20) and false with ongoing events (L1: M = 0.80, SD = 0.19; L2: M = 0.86, SD = 0.17). Separate models were then run on the data from each condition. Only one condition, imperfective-marked accomplishments with completed events, showed an L1–L2 difference that was significant at an unadjusted α = .05 level (B = –.80, z = –2.2, p = .03; all other |B| < .47 and p > .14).
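For readers less familiar with the model formula, the between-group model reported above can be written out as follows. This is a sketch under two assumptions that the text does not state explicitly: that the model used a binomial family with a logit link (as "logistic" suggests), and that group was coded with the L1 reference group as baseline. Here i indexes subjects and j indexes items.

```latex
\begin{align*}
\operatorname{logit} \Pr(\mathrm{IsExpected}_{ij} = 1)
  &= \beta_0 + \beta_1\,\mathrm{group}_i + u_i + w_j,\\
u_i &\sim \mathcal{N}(0, \sigma_u^2), \qquad
w_j \sim \mathcal{N}(0, \sigma_w^2),\\
\exp(\hat{\beta}_1) &= \exp(-0.21) \approx 0.81 .
\end{align*}
```

Under these assumptions, the estimate B = –.21 would correspond to odds of an expected judgment in the L2 group about 0.81 times those in the L1 reference group, a difference that was not statistically significant.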
We take these results to provide overall support for the assumption that the L2 participants in this study did not differ significantly from L1 speakers in their interpretation of aspect in sentences such as those in the visual-world experiment. Any between-group differences we might observe in the visual-world experiment are thus unlikely to be attributable to incomplete knowledge of aspect among L2 participants.
Visual-world experiment
In this task, participants listened to two-sentence mini discourses, as in (2), while looking at visual scenes on a screen. Each trial began with a 2000-ms silent preview of the visual scene (Figure 1a), followed by a context sentence (mean duration = 2856 ms, SD = 277), a 2500-ms intersentential pause, a continuation sentence (mean duration = 2755 ms, SD = 331), a 1750-ms pause, and a question. During a second pause of 250 ms, the visual scene changed such that the individual images were replaced with gray boxes (Figure 1b). Participants were instructed to click on the box corresponding to the position of the event participant that best answered the question. This memory component was added to engage participants and disguise the objective of the experiment. Participants’ final responses were not of interest here. Of critical interest in view of the research questions under investigation are listeners’ looks to event participants during the intersentential pause. In particular, in the critical transfer-of-possession events, do we see biases to differentially fixate goals versus sources, and most critically, are these biases modulated by aspect in the context sentence? As reported in Grüter et al. (Reference Grüter, Takeda, Rohde and Schafer2018), such modulation, suggestive of proactive expectations, was observed among L1 speakers of English in this experimental paradigm. Here we ask to what extent the same holds for L2 listeners.
(2) Patrick {gave/was giving} Emma a bottle of nice wine. [context sentence]
{He/She} obviously knew about fancy food and drink. [continuation]
Who knew about fancy food and drink?
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20220130043414143-0157:S0142716420000582:S0142716420000582_fig1.png?pub-status=live)
Figure 1. Example of visual displays.
Materials
The experiment consisted of a total of 60 two-sentence items of the type illustrated in (2), composed of 20 experimental and 40 filler items. Experimental items contained one of 10 transfer-of-possession verbs, each used in 2 items, in a double-object construction with a human subject/source, a human indirect object/goal, and an inanimate direct object/theme. Source and goal always differed in gender, such that the pronominal subject of the continuation sentence (she/he) disambiguated reference at that point. Four versions of each sentence were created by manipulating aspect in the context sentence (perfective/imperfective) and reference of the subject pronoun in the continuation (source/goal of context sentence, disambiguated through gender). The four versions were distributed across four experimental lists in a 2 × 2 Latin square design, such that each participant encountered only one version, for a total of five items in each of four aspect/reference conditions.
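The rotation of item versions across lists can be illustrated with a minimal sketch. It assumes a simple modular rotation of the four aspect/reference conditions over 20 items; the function name `build_lists` and the rotation rule are hypothetical and are not taken from the original materials, which may have assigned versions differently (e.g., with additional constraints on verbs).

```python
from collections import Counter
from itertools import product

# The four aspect x reference versions of each experimental item.
CONDITIONS = list(product(["perfective", "imperfective"], ["source", "goal"]))


def build_lists(n_items=20, n_lists=4):
    """Rotate conditions over items so each list holds one version per item.

    Hypothetical rotation rule: the condition index is (item + list) mod 4.
    """
    return {
        list_id: {
            item: CONDITIONS[(item + list_id) % len(CONDITIONS)]
            for item in range(n_items)
        }
        for list_id in range(n_lists)
    }


lists = build_lists()
# Number of items per condition in the first list.
per_condition = Counter(lists[0].values())
```

Under this rotation, every list contains five items in each of the four conditions, and every item appears in a different condition on each list, which is the property the Latin square design is meant to guarantee.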
The 40 filler items, as well as 6 practice items presented at the beginning of the experiment, were analogous in their overall structure (context–continuation–question) but described non-transfer-of-possession events, including transitive and intransitive verbs from various verb classes. Half of the context sentences in fillers had perfective, half had imperfective aspect. Continuations started with pronouns referring to human or nonhuman participants in the context sentence (she/he/they/it); gender disambiguated reference in some but not all filler items. Questions in fillers asked about various aspects of the context or continuation sentences. A complete list of all items is provided in Grüter et al. (Reference Grüter, Takeda, Rohde and Schafer2018, see online-only Supplementary materials). All sentences were recorded by a female native speaker of American English using a clear speaking style.
All visual scenes contained three areas of interest (AOIs; Figure 1). In experimental items, the three AOIs always represented source, goal, and theme of the transfer-of-possession event. Location of the three AOI types was counterbalanced between items. In filler items, visual scenes contained one or two human event participants, and one or two nonhuman referents from the context sentence, including objects and locations.
Data measurement, treatment and analysis approach
The experiment was conducted using an SMI RED250 eye-tracker sampling at 250 Hz. Eye gaze was recorded and exported through SMI Experiment Suite software, and classified as fixations, saccades, and blinks using the software’s default settings. For further analysis, data were binned into 20-ms samples. Next, we calculated, for each trial, the proportion of sample points containing fixations out of all sample points. Trials with values more than 2 SDs below the mean were excluded. This accounted for 7.1% of the L2 data. As noted earlier, we excluded participants with insufficient valid data, in this case those with fewer than 15 (out of 20) experimental trials remaining after this procedure (N = 6).
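The two exclusion steps described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' script: the text does not say whether the mean and SD were computed over all trials or per participant, nor whether the sample or population SD was used, so the sketch takes a flat list of trial proportions and the sample SD. All function names are hypothetical.

```python
from statistics import mean, stdev


def fixation_proportion(samples):
    """Share of a trial's 20-ms samples that contain a fixation."""
    return sum(1 for s in samples if s == "fixation") / len(samples)


def keep_trial_flags(proportions, n_sd=2.0):
    """Flag trials to keep: those not more than n_sd SDs below the mean.

    Assumption: mean and (sample) SD are computed over the list passed in.
    """
    cutoff = mean(proportions) - n_sd * stdev(proportions)
    return [p >= cutoff for p in proportions]


def has_enough_trials(n_valid_trials, minimum=15):
    """Participant-level criterion: at least 15 of 20 experimental trials."""
    return n_valid_trials >= minimum


# Toy data: nine trials with good tracking and one with heavy track loss.
toy = [0.90, 0.92, 0.88, 0.91, 0.89, 0.90, 0.93, 0.87, 0.90, 0.20]
flags = keep_trial_flags(toy)
```

In the toy data only the last trial falls below the cutoff and is dropped; a participant left with fewer than 15 valid experimental trials would then be excluded by `has_enough_trials`.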
Following the same procedures as in Grüter et al. (Reference Grüter, Takeda, Rohde and Schafer2018), we calculated a “GoalAdvantage” score as a measure of bias to fixate goals versus sources during two time windows in each trial. GoalAdvantage scores were calculated by subtracting the number of 20-ms bins with looks to source from those with looks to goal during each of two time windows. The first window (“Silence”) extends from 500 ms after the offset of the context sentence until 200 ms after the onset of the continuation (following standard procedures to offset analyses of fixations by 200 ms; Matin, Shao, & Boff, Reference Matin, Shao and Boff1993). The second window (“Continuation”) extends from 200 to 1500 ms after the onset of the continuation. The silence window constitutes the critical temporal region of interest for our analysis of anticipatory looking behavior. The continuation window was included in the original analysis to explore potential effects of the disambiguating pronoun and interactions with aspect within a single omnibus analysis avoiding multiple comparisons. No such effects were found in the L1 group (Grüter et al., Reference Grüter, Takeda, Rohde and Schafer2018). We likewise include this window here in order to allow for the most direct comparisons possible.
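As a concrete illustration of the GoalAdvantage computation, the sketch below counts 20-ms bins with looks to goal and to source within a window and takes the difference. Times are expressed relative to the onset of the continuation, so that with the 2500-ms pause the silence window runs from –2000 to 200 ms. The data layout (a mapping from bin onset to fixated AOI), the half-open window boundaries, and the function name are assumptions made for the example.

```python
BIN_MS = 20
PAUSE_MS = 2500

# Window boundaries in ms relative to continuation onset (half-open intervals).
SILENCE = (-PAUSE_MS + 500, 200)   # 500 ms after context offset to 200 ms after onset
CONTINUATION = (200, 1500)


def goal_advantage(bins, window):
    """Number of bins with looks to goal minus bins with looks to source.

    `bins` maps each bin onset (ms) to "goal", "source", "theme", or None.
    """
    start, end = window
    in_window = [aoi for t, aoi in bins.items() if start <= t < end]
    return in_window.count("goal") - in_window.count("source")


# Toy trial: theme until -1000 ms, then goal until 200 ms, then source.
trial = {
    t: ("theme" if t < -1000 else "goal" if t < 200 else "source")
    for t in range(-PAUSE_MS, 1500, BIN_MS)
}
silence_score = goal_advantage(trial, SILENCE)            # 60 goal bins, 0 source bins
continuation_score = goal_advantage(trial, CONTINUATION)  # 0 goal bins, 65 source bins
```

Positive scores indicate more looking at the goal than at the source within the window, negative scores the reverse. Under these boundaries the silence window spans 110 bins and the continuation window 65 bins, which bounds the possible range of the score in each window.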
GoalAdvantage scores over relatively long temporal windows were chosen as the dependent measure for two reasons. First, unlike other effects typically investigated in visual-world experiments, the hypothesized effect of aspect cannot be tied to a unique event in the speech signal. Instead, the effect likely results from a complex combination of multiple cues in the construction of a mental event model—a critical component of discourse processing. We thus had no specific predictions about change in fixations over time during the intersentential pause, and for this reason, opted for the most conservative approach to aggregate data over the entire silence region.Footnote 1 Second, GoalAdvantage scores were more normally distributed than empirical logits (Barr, Reference Barr2008) calculated from the same trial-level data, and thus aligned better with the underlying assumptions of the linear mixed models adopted in our analysis.Footnote 2
Results
All analyses were conducted in R (3.6.0), using the lmerTest package (Kuznetsova, Brockhoff, & Christensen, Reference Kuznetsova, Brockhoff and Christensen2017). Figure 2 presents an overview of fixations in both the L2 and the L1 groups over the course of an entire trial, collapsing across all experimental manipulations. What is perhaps most notable on visual inspection is how similar the overall patterning and timing of fixations are in the two groups; in particular, we see no evidence of overall slower or delayed processing in the L2 group. The most apparent difference is in participants’ looks to the theme during the intersentential pause: L2 listeners appeared more likely than L1 listeners to redirect their gaze away from the (last mentioned) theme to one of the two human referents. For L1 listeners, the proportion of looks to theme reduces below that of either human referent by the onset of the continuation sentence (0 ms) whereas for L2 listeners, this happens much earlier (around –1600 ms). This difference was not expected, and we have no ready explanation for it. However, as our critical focus is on differential looks to source and goal, it has no direct impact on our critical analysis.Footnote 3
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20220130043414143-0157:S0142716420000582:S0142716420000582_fig2.png?pub-status=live)
Figure 2. Time course of fixations by group, collapsed over aspect and reference. Proportions of looks to each area of interest (AOI) are calculated out of all fixations to an AOI. Trials were aligned by the onset of the continuation. (In half the items the subject of the continuation refers to the source, in the other half to the goal. Hence when collapsing over reference, roughly equal fixations to source and goal are expected in the continuation.)
Also apparent in Figure 2 is an overall bias, emerging in both groups during the second half of the intersentential pause, to look more at the goal than the source. This is consistent with the well-documented preference to remention goals after transfer-of-possession events (Arnold, Reference Arnold2001; Stevenson et al., Reference Stevenson, Crawley and Kleinman1994), and suggests that this preference originates in proactive expectations prior to the encounter of any referential expressions in the continuation. We further examine this goal bias, and critically its potential modulation by aspect in the context sentence, through mixed-effect modeling following the same procedures as in Grüter et al. (Reference Grüter, Takeda, Rohde and Schafer2018). We begin by modeling the L2 data alone to address our two primary research questions. Do the effects previously observed among L1 speakers of English generalize to L2 speakers drawn from the same wider population (RQ1), and to what extent does proficiency modulate L2 speakers’ engagement in proactive reference processing (RQ2)? We will then combine the L2 data with the L1 data from Grüter et al. (Reference Grüter, Takeda, Rohde and Schafer2018) to further probe whether the overall effect of aspect remains robust in this larger yet more diverse sample, and whether native-speaker status defined categorically (L1, L2) constitutes a potentially modulating factor.
Analysis of L2 data
We began by fitting the final model for the L1 data (GoalAdv ~ Aspect * Reference * Window + (1 + Aspect * Reference | Subject) + (1 + Aspect * Reference | Item); Grüter et al., Reference Grüter, Takeda, Rohde and Schafer2018, Appendix B) to the L2 data, with aspect (contrast-coded; perfective: –0.5, imperfective: 0.5), reference (contrast-coded; source: –0.5, goal: 0.5), and window (silence/continuation; treatment-coded, reference level: silence) constituting the fixed effects. (Note that when a factor is treatment-coded, all effects not involving this factor are calculated not as main effects but as simple effects at the reference level of this factor. The treatment-coding of window here means that the effects of aspect and reference in the model output reflect the effects of these two factors in the silence window only, i.e., in the time period of critical interest.) This model generated a singular-fit warning and a convergence code of zero, indicating that the model is likely too complex for these data. We thus removed slopes from the random effect terms, and reran the model with only intercepts for random effects. Fixed effects are summarized in Table 2. The only effect that reached significance was the interaction between reference and window, indicating that, as expected, looks to goal and source were affected by the disambiguating pronoun at the onset of the continuation. Critically, no significant effect of aspect was observed (b = 0.25, p = .91). We also note the positive intercept term (b = 3.67, p = .096), indicating a marginal trend toward an overall bias to look at goals during the (reference-level) silence window.Footnote 4
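The consequence of combining a contrast-coded factor with a treatment-coded factor can be made concrete with a small worked example. The sketch below is a simplification: it drops reference and the random effects and considers only a balanced 2 × 2 design (aspect × window), for which the fixed-effect coefficients of a saturated model can be read directly off the cell means. The function name and the cell means are hypothetical illustrations, not values from the study.

```python
def coefficients_from_cell_means(m):
    """Coefficients of y ~ aspect * window for a balanced 2 x 2 design.

    aspect is contrast-coded (perfective = -0.5, imperfective = +0.5);
    window is treatment-coded (silence = 0, continuation = 1).
    `m` maps (aspect, window) to the cell mean.
    """
    ps = m[("perfective", "silence")]
    is_ = m[("imperfective", "silence")]
    pc = m[("perfective", "continuation")]
    ic = m[("imperfective", "continuation")]
    return {
        "intercept": (ps + is_) / 2,                # mean in the silence window
        "aspect": is_ - ps,                         # simple effect of aspect at silence
        "window": (pc + ic) / 2 - (ps + is_) / 2,   # window effect, averaged over aspect
        "aspect:window": (ic - pc) - (is_ - ps),    # change in the aspect effect
    }


# Hypothetical cell means: an aspect effect in silence, none in continuation.
toy_means = {
    ("perfective", "silence"): 6.0,
    ("imperfective", "silence"): 2.0,
    ("perfective", "continuation"): 1.0,
    ("imperfective", "continuation"): 1.0,
}
coefs = coefficients_from_cell_means(toy_means)
```

With these toy means the aspect coefficient is –4, the difference between imperfective and perfective in the silence window alone. Had window been contrast-coded as well, the aspect coefficient would instead be the average of the aspect differences in the two windows (here –2), which is why the coding choice matters for interpreting the aspect term.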
Table 2. L2 data (N = 52): Model statement and summary of fixed effects in linear mixed effects model of GoalAdvantage. Reference level for Window = Silence. Formula: GoalAdv ~ Aspect * Reference * Window + (1 | Subject) + (1 | Item)
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20220130043414143-0157:S0142716420000582:S0142716420000582_tab2.png?pub-status=live)
Note: *** p < .001, ** p < .01, * p < .05.
In order to verify that the previously reported effects in the L1 group remain when the L1 data are submitted to the simplified model fitted to the L2 data, we fit the same model to the previous L1 data (Table 3). The pattern of results reported in Grüter et al. (Reference Grüter, Takeda, Rohde and Schafer2018) remained unchanged: in addition to the expected Reference × Window interaction (b = 8.63, p = .004), the only other fixed effect that reached significance was aspect (b = –5.41, p = .011). The intercept term also remained significant (b = 3.82, p = .015). We further probed whether the simpler model presents a significant decrease in model fit compared to the fuller model previously reported. Model comparison using the anova() function indicated that this was not the case (χ2 = 9.23, df = 18, p = .95). We thus base all further analyses on models with random intercepts only.
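The degrees of freedom in this comparison follow from the number of variance and covariance parameters removed when the random slopes are dropped. The sketch below works through that count and the corresponding chi-square tail probability using only the standard library. It assumes that df equals the difference in the number of estimated random-effects parameters and uses a closed-form expression that holds for even df only; the function names are hypothetical and this is not the code behind R's anova().

```python
import math


def n_covariance_params(k):
    """Variances plus covariances in a k x k random-effects covariance matrix."""
    return k * (k + 1) // 2


def chi2_upper_tail_even_df(x, df):
    """P(X > x) for a chi-square variate, using the closed form for even df."""
    if df % 2 != 0:
        raise ValueError("this closed form requires an even df")
    half = x / 2.0
    return math.exp(-half) * sum(
        half ** i / math.factorial(i) for i in range(df // 2)
    )


# Full model: intercept, Aspect, Reference, Aspect:Reference vary by unit (k = 4).
full = n_covariance_params(4)
# Reduced model: random intercept only (k = 1).
reduced = n_covariance_params(1)
# Two grouping factors (Subject and Item) each lose full - reduced parameters.
df = 2 * (full - reduced)
p = chi2_upper_tail_even_df(9.23, df)
```

The count gives 10 parameters per grouping factor in the fuller model and 1 in the simpler one, hence df = 2 × 9 = 18, matching the reported value, and the tail probability for χ2 = 9.23 comes out close to the reported p = .95.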
Table 3. L1 data (N = 53): Model statement and summary of fixed effects in linear mixed effects model of GoalAdvantage. Reference level for Window = Silence. Formula: GoalAdv ~ Aspect * Reference * Window + (1 | Subject) + (1 | Item)
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20220130043414143-0157:S0142716420000582:S0142716420000582_tab3.png?pub-status=live)
Note: *** p < .001, ** p < .01, * p < .05.
Returning to the L2 data, we next probed for potentially modulating effects of English proficiency by adding participants’ overall scores from the Versant English Test (centered) to the model (Table 4).Footnote 5 Model comparison with the previous model indicated a marginal increase in model fit (χ2 = 14.9, df = 8, p = .07). Unexpectedly, the simple effect of proficiency was significant (b = –2.76, p = .03), showing a decrease in the overall bias to look at goals with increasing proficiency during the (reference-level) silence window. More important, proficiency did not interact with aspect (b = 0.47, p = .87). The only other significant effect, beyond the previously observed and expected Reference × Window interaction, was an unexpected three-way interaction between aspect, reference, and proficiency (b = 11.3, p = .02). To explore this interaction and better understand the contribution of proficiency, we divided the data from the silence region by median split of Versant scores, and inspected the data separately for higher (Versant Score ≥ 60, N = 26) and lower proficiency participants (N = 24). In neither of these subgroups did we find effects of aspect, reference, or an interaction between the two. Given that the factor reference should not be observable during the silence region (i.e., the listener does not yet know whether the pronoun in the continuation will refer to the source or goal), and that we found no further support for an Aspect × Reference interaction in the higher and lower proficiency subgroups, we believe the observed three-way interaction may be spurious and not further informative with regard to our research questions. The simple effect of proficiency in the overall model was reflected in the subgroup models through a significant intercept term in the lower proficiency (b = 7.4, p = .02) but not in the higher proficiency subgroup (b = 1.0, p = .7). Interestingly, it is the pattern in the lower proficiency subgroup that is more similar to what was observed in the L1 data, where a similar overall preference for goals was found.
Table 4. L2 data, including proficiency (N = 50): Model statement and summary of fixed effects in linear mixed effects model of GoalAdvantage. Reference level for Window = Silence. Formula: GoalAdv ~ Aspect * Reference * Window * scale(Versant) + (1 | Subject) + (1 | Item)
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20220130043414143-0157:S0142716420000582:S0142716420000582_tab4.png?pub-status=live)
Note: *** p < .001, ** p < .01, * p < .05.
In sum, modeling of the L2 data following the same procedures as Grüter et al. (Reference Grüter, Takeda, Rohde and Schafer2018) did not reveal the effect of aspect previously observed in an equal-sized data set from L1 speakers drawn from the same wider college-student population. Figure 3 presents a visual illustration of (the absence of) the aspect effect in the L2 and the L1 sample. This visualization indicates that there is little evidence that a potential effect of aspect is present but weaker or noisier in the L2 data. Our analyses also showed that there was no evidence for proficiency interacting with aspect. Given that the distribution of proficiency scores in the L2 group was wide, and the group included a substantial number of participants at the higher end of the proficiency spectrum, these results suggest that it is unlikely that the effect of aspect is one that might emerge with increasing proficiency.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20220130043414143-0157:S0142716420000582:S0142716420000582_fig3.png?pub-status=live)
Figure 3. Time course of fixations by group and aspect, collapsed over reference.
Analysis of combined data set
Combining the L1 and L2 data into a single larger data set provides an additional opportunity for assessing whether the effect of aspect observed among L1 speakers scales up to a more heterogeneous sample of language users. We thus submitted the combined data (N = 105) to the same model as above (GoalAdv ~ Aspect * Reference * Window + (1 | Subject) + (1 | Item)). The effect of aspect did not reach significance (b = –2.56, p = .105). The only significant effects were the expected Reference × Window interaction (b = 13.72, p < .001) and the intercept term (b = 3.74, p = .027). Next, in order to evaluate whether speaker status defined categorically (L1, L2) was a significant predictor of participants’ reference expectations, we added group (contrast-coded and centered) to the model (Table 5). Model comparison indicated that this addition improved model fit (χ2 = 15.7, df = 8, p = .048). The model returned a marginal interaction between group and aspect (b = 5.74, p = .069); the effect of aspect remained nonsignificant (b = –2.58, p = .103). The model also indicated significant interactions between group and reference (b = –6.38, p = .044) and between group, reference, and window (b = 10.20, p = .022). Further investigation of these latter effects revealed a nonsignificant trend in the L1 group for greater goal-bias in the silence region for items whose reference would later be disambiguated toward goal, whereas in the continuation region, the effect of reference, highly significant in both groups, was somewhat stronger in the L2 group. We do not have an explanation for these interaction effects, but do not believe they impact conclusions with regard to the research questions under investigation. No other effects reached significance.
Table 5. L1 and L2 data combined (N = 105): Model statement and summary of fixed effects in linear mixed effects model of GoalAdvantage. Reference level for Window = Silence. Formula: GoalAdv ~ Aspect * Reference * Group * Window + (1 | Subject) + (1 | Item)
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20220130043414143-0157:S0142716420000582:S0142716420000582_tab5.png?pub-status=live)
Note: *** p < .001, ** p < .01, * p < .05.
In sum, analysis of the combined data set (N = 105) showed only a nonsignificant trend in the expected direction for the effect of aspect. This trend was modulated by a marginally significant interaction with group. Analysis of the L2 data alone (N = 52) had shown no effect of aspect, whereas the effect appeared robust in the L1 group (N = 53).
Discussion and conclusion
The goal of this study was to examine to what extent proactive use of discourse-level information in real-time reference processing, as previously observed among L1 speakers, is also characteristic of comprehenders who are not native speakers of the language. To this end, we compared the results of a visual-world eye-tracking experiment with L1 speakers of English, previously reported in Grüter et al. (Reference Grüter, Takeda, Rohde and Schafer2018), with those from an equal-sized group of L2 speakers of English. L2 participants also completed an independent offline task to assess their understanding of the linguistic contrast hypothesized to give rise to the predictive effect in real-time processing. Results from this knowledge-of-aspect task showed that the L2 participants associated perfective aspect with completed events and imperfective aspect with ongoing events and that their judgments for transfer-of-possession events in this offline task did not differ significantly from those of L1 speakers. Differences in the use of aspect to create proactive expectations about reference in real-time processing are thus unlikely to be attributable to limitations in L2 participants’ knowledge of aspect in English.
L2 participants’ fixations on source and goal referents during an intersentential pause following sentences containing perfective- or imperfective-marked transfer-of-possession verbs were submitted to the same analyses as those from L1 speakers who completed the same experiment. Whereas these analyses had revealed a significant effect of aspect among L1 speakers, who fixated more on goal referents following perfective-marked than following imperfective-marked transfer verbs, no such effect was found in the L2 group. Further inspection and visualization of the effect did not indicate any trends in the expected direction, suggesting that it is unlikely that a larger sample with increased power would yield a different outcome. With regard to our first research question (RQ1), we therefore conclude that the effect observed by Grüter et al. (Reference Grüter, Takeda, Rohde and Schafer2018) among native English-speaking college students does not easily generalize to nonnative speakers of English drawn from the same wider population of college-educated young adults.
Further support for this conclusion comes from the analysis of the combined data from L1 and L2 speakers, in which the main effect of aspect was no longer robust. The marginal interaction between aspect and group (p = .069) indicates that the categorical classification of participants as native versus nonnative constitutes a potentially meaningful predictor of listeners’ engagement in expectation-based processing. A prima facie explanation for this pattern lies in participants’ overall English proficiency. L1 speakers’ self-ratings of their overall English proficiency were significantly higher than those among L2 participants (see Table 1; W = 118.5, p < .001). However, if overall English proficiency were a significant factor for engaging in predictive processing, we would expect proficiency measured as a continuous variable to be a significant modulator of the aspect effect within the L2 group. Yet our analyses showed no such modulation, and no evidence for any trends toward an effect of aspect among the more highly proficient L2 participants. This is noteworthy in that our L2 sample showed substantial variability in overall proficiency, covering the range of A2 to C2 levels, and included participants at the highest (C2) level. Moreover, the measure of proficiency used in our analyses (Versant English Test overall scores) correlated moderately to strongly with two other, less comprehensive measures of proficiency commonly used in L2 processing studies (LexTALE, self-ratings), suggesting that the variance captured by this measure is representative of the construct of L2 proficiency as used in other research in the field. In answer to our second research question (RQ2), we thus conclude that increased overall language proficiency did not lead to greater engagement in proactive use of discourse-level cues in this experiment.
The null effect of proficiency may appear surprising as it is a commonly stated assumption that proficiency modulates L2 learners’ engagement in predictive processing (e.g., Kaan, Reference Kaan2014). As discussed above, however, empirical support for this assumption has been remarkably weak, with multiple studies targeting different linguistic phenomena, in different languages, and using a variety of proficiency measures, reporting no significant effects of proficiency on predictive processing (Dijkgraaf et al., Reference Dijkgraaf, Hartsuiker and Duyck2017; Hopp, Reference Hopp2015; Ito et al., Reference Ito, Corley and Pickering2018; Kim, Reference Kim2018; Mitsugi, Reference Mitsugi2018). Proficiency also did not significantly modulate L2 learners’ reference and coherence choices in Grüter et al.’s (Reference Grüter, Rohde and Schafer2017) story continuation study. The null effect of proficiency in the present study is thus consistent with accumulating evidence from recent work suggesting that despite common and perhaps intuitive assumptions, overall proficiency does not appear to play a notable role in L2 speakers’ engagement in expectation-based processing.
A broader goal of this study is to contribute to the understanding of variability among different types of language users with regard to engagement in proactive, expectation-driven mechanisms during language processing. We found that the effect observed among adult L1 speakers did not generalize to L2 users drawn from the same wider population of college-educated young adults. More specifically, unlike L1 users, L2 users did not appear to make use of grammatical aspect as an indicator of event structure to dynamically update their situation model and the discourse expectations associated with it during the intersentential pause. This is consistent with the findings from Grüter et al.’s (Reference Grüter, Rohde and Schafer2017) offline story continuation study, and consistent with the RAGE hypothesis in its broad statement that “L2 speakers have Reduced Ability to Generate Expectations” (2017, p. 224).
What remains unknown is why we see differential reliance on expectations in different types of language users. The original RAGE hypothesis suggested that explanations would lie at the level of L2 users’ ability to generate and continuously update expectations during processing, consistent with reduced-capacity models of L2 processing more generally (e.g., Hopp, Reference Hopp2010). We have come to realize, however, that this is not the only possibility (see also Grüter, Lau, & Ling, Reference Grüter, Lau and Ling2020). There may be circumstances when relying on, or even generating, expectations would not enhance processing success, and thus would not be a processing strategy that rationally optimizes available resources. In other words, under certain circumstances, the benefits of prediction may not outweigh the costs always associated with it, that is, the potential need to revise a proactively created representation that turns out to be false (see also Federmeier, Reference Federmeier2007). For L2 users, this could apply when relevant L2 knowledge is represented differently as a result of how young (L1) versus older (L2) learners extract information from the input (see, e.g., Grüter et al., Reference Grüter, Lew-Williams and Fernald2012, following proposals by Arnon & Ramscar, Reference Arnon and Ramscar2012). In this case, a particular linguistic cue may not be reliable enough within an L2 user’s system of knowledge representations to make the risks that come with launching a prediction worthwhile. Reduced engagement in predictive processing under this scenario would then not necessarily be a reflection of reduced ability to predict, but of reduced utility of prediction under these circumstances.
This perspective is consistent with Kuperberg and Jaeger’s (Reference Kuperberg and Jaeger2016) view of prediction as a utility function, whereby a resource-bound rational comprehender weighs its advantages and disadvantages in a given situation. The extent to which a language user will engage in prediction is then seen as “a function of its expected utility, which, in turn, may depend on comprehenders’ goals and their estimates of the relative reliability of their prior knowledge and the bottom-up input” (p. 32). Exploring the possibility that L2 users’ reduced engagement in prediction in certain domains is not due to differences in ability but to differences in utility of prediction requires a more fundamental reframing of the often implicit research question in the field of L2 processing, namely, whether L2 learners can “achieve L1 processing efficiency.” Instead, we may need to consider that what is most efficient for L1 users might not always be most efficient for L2 users; or, put differently, that reduced engagement in prediction (as compared to L1 speakers) is the most efficient and adaptive processing strategy for L2 users under certain circumstances.
Identifying what exactly these circumstances are is an issue that future work will need to explore. As a direction for future research into modulating factors of prediction in L2 processing, we would like to suggest a reframing of the RAGE hypothesis that allows for the possibility that differential processing outcomes between L1 and L2 speakers could result not only from reduced ability to engage in prediction but also from reduced utility of prediction when weighing its potential costs and benefits in a given knowledge system and context. In other words, we suggest that when reduced engagement in prediction is observed in L2 users vis-à-vis an L1 comparison group, we consider the possibility that this is not deficient but maximally adaptive processing behavior.
To be very clear, the RAGE hypothesis, both in its original formulation and in the updated version we suggest here, does not constitute a scientific hypothesis in the sense that it would allow us to derive clearly falsifiable predictions (no pun intended). Yet we believe we simply do not know enough at this point to be able to formulate a more clearly testable hypothesis concerning the specific circumstances and contexts under which L2 users are predicted to show reduced engagement in or reliance on proactive processing mechanisms. At present, the RAGE hypothesis in its current (and previous) form merely offers a broad statement of the empirical observation that in some, but attestedly not all, circumstances L2 users show reduced effects of expectation-based processing (see Grüter et al., Reference Grüter, Rohde and Schafer2017, p. 224). Further exploratory research is needed to hone our understanding of when L2 users employ predictive mechanisms. We believe the consideration of utility, in the sense of Kuperberg and Jaeger (Reference Kuperberg and Jaeger2016), in addition to ability will be productive in designing such research. The RAGE hypothesis thus constitutes an invitation to further explore the role that predictive mechanisms play in L2 processing, seen as an instance of human language processing more generally, in the hope that such further exploration will eventually allow us to formulate more directly testable predictions about the contexts in which we will or will not see reduced engagement in expectation-based processing among L2 users.
Acknowledgments
This work was supported by Standard Grant BCS-1251450 from the National Science Foundation. We thank Amy J. Schafer, as well as A. L. Blake, Amber Camp, Catherine Gardiner, Victoria Lee, Wenyi Ling, Ivana Matson, Maho Takahashi, and Aya Takeda, for their various contributions to this project, and the reviewers of this journal for their very helpful feedback.