Introduction
For the proficient, adult native listener, sentence processing requires the coordination of multiple interacting sources of linguistic and non-linguistic information, which are used to converge on a target interpretation rapidly and accurately. To what extent are the same sentence processing mechanisms employed and the same sources of information relied upon to a comparable degree by language learners?
The evidence gathered within the last 30 years has uncovered a high degree of continuity between the end-state and the learning human sentence parser. Much like adult native speakers, in fact, both first and second language learners make probabilistic use of multiple cues to interpret sentences in real time (e.g., Hopp, 2010; Omaki & Schulz, 2011; Trueswell & Gleitman, 2007; Ying, 1996), albeit with some important differences. For instance, children rely more heavily on the cues that are most reliable within the target language (e.g., Bates, MacWhinney, Caselli, Devescovi, Natale & Venza, 1984; Slobin & Bever, 1982; Snedeker & Trueswell, 2004), while second language speakers have been shown to rely more strongly on semantic and discourse information than on structural cues (e.g., Clahsen & Felser, 2006). A well-known, striking characteristic of child sentence processing concerns children's difficulties revising initial parsing commitments that are contradicted by late-available evidence (e.g., Trueswell, Sekerina, Hill & Logrip, 1999). Children's systematic inability to revise has been linked to their immature cognitive control and executive function (EF) skills (e.g., Choi & Trueswell, 2010; Novick, Trueswell & Thompson-Schill, 2005; Woodard, Pozzan & Trueswell, 2016).
Specifically, the idea is that domain general EF-skills are engaged when the processing system needs to abandon an initially preferred analysis in favor of a dispreferred one; revision of initial interpretations is thus negatively affected if these cognitive skills are impaired (e.g., in patients) or underdeveloped (e.g., in children).
This proposal lends itself to the straightforward prediction that difficulties revising initial interpretations are a developmental rather than a learner phenomenon, i.e., they should not characterize the processing profiles of adult second language (L2) learners, as cognitive capacities are fully mature in this latter group. The available evidence on this topic is inconclusive: reading studies indicate that adult L2 learners might display particular difficulties revising initial interpretations, especially when such initial interpretations are plausible and supported by the context (Juffs & Harrington, 1996; Williams, Möbius & Kim, 2001; Roberts & Felser, 2011), but such findings have not been consistently replicated across speakers and structures (Juffs, 2004; Williams, 2006; Roberts & Felser, 2011). Furthermore, although reading studies are informative to theories of real-time parsing commitments, differences between L1 and L2 learners may be the product of reading itself, as L2 learners are additionally challenged by having to read in their L2, a factor that could mask or enhance L1 vs. L2 processing differences. Relatedly, the existing results cannot be easily compared to those from L1 children, because L2 studies have focused on written comprehension. For these reasons, we conducted a visual world study that would allow us to compare L2 learners’ sentence processing patterns more directly with those of L1 adults and children.
This topic is important for at least two reasons. First, the extent to which adult L2 learners resemble more closely native adults or child language learners in their ability to recover from garden-paths has implications for current theories of child sentence processing, which, as discussed above, have linked children's difficulties with revision to their underdeveloped EF-skills. According to this cognitive immaturity view, adult L2 learners should not experience selective difficulties with garden-path recovery. To the extent that L2 adults’ revision patterns resemble instead those of child learners, a more general theory of garden-path recovery might be preferable. According to this more general theory, difficulties revising initial interpretations would stem from the cognitive load associated with less automatized processing routines and partially incomplete language representations. Second, a better understanding of processing similarities and differences between adult and child learners might provide important insights into the similarities and differences between L1 and L2 acquisition itself, especially since real-time processing limitations – most notably the difficulty revising initial parsing commitments – have been shown to influence grammar acquisition (see Pozzan & Trueswell, 2015).
The study presented here thus focuses on how adult L2 learners process temporarily ambiguous structures that often require revision of initial interpretations as compared to unambiguous sentences of similar complexity, length, and meaning. For example, a sentence like (1) is hypothesized to be often associated with revision of an initial interpretation and consequent processing difficulties because the prepositional phrase “on the napkin” is initially analyzed as the goal of the action, but this initial interpretation needs to be revised and the prepositional phrase re-interpreted as a nominal modifier once the second prepositional phrase (“into the box”) is heard. The temporary ambiguity, and the subsequent need for revision, is absent in (2), due to the presence of the relative complementizer “that” which guides the parser from the start towards the target modifier interpretation.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20160920223351898-0925:S1366728915000838:S1366728915000838_tabU1.gif?pub-status=live)
Unlike previous studies in the second language acquisition literature (e.g., Juffs & Harrington, 1996; Roberts & Felser, 2011; Williams, Möbius & Kim, 2001; Jacob & Felser, 2015), which have focused exclusively on written comprehension, this study focuses on how L2 adults process and carry out spoken instructions and interact with a visual world while their eye-movements are recorded. Our design and materials were closely modeled on previous garden-path studies. As in those studies, the referential context in which utterances were presented was manipulated alongside the ambiguity manipulation. Referential context affects native adults’ processing and interpretation, in that 2-referent contexts (see Figure (1A)) are typically associated with fewer looks to the incorrect goal (the empty napkin in the figure, which serves as a potential place to put the frog) and fewer signs of processing difficulty as compared to 1-referent contexts (see Figure (1B); e.g., Tanenhaus, Spivey-Knowlton, Eberhard & Sedivy, 1995; Spivey, Tanenhaus, Eberhard & Sedivy, 2002). In contrast, referential effects are not reliably observed in children, possibly because using this cue requires a non-trivial understanding of the speaker's current knowledge state and referential domain (e.g., Snedeker & Trueswell, 2004; Trueswell & Gleitman, 2007; see also Brown-Schmidt, Campana & Tanenhaus, 2005).
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160921032526-91996-mediumThumb-S1366728915000838_fig1g.jpg?pub-status=live)
Figure 1. Example of visual world for “Put the frog on the napkin onto the box” in 2-Referent (1A) and 1-Referent (1B) contexts.
Our L2 participants were adult L1-Italian L2-English speakers; this population was selected because Italian has the same word order as English, as well as the same PP-attachment ambiguity, and comparable referential constraints (see (3)–(4) below). Any differences between the performance (in English) of the native and the L2 adults should thus reflect additional difficulties associated with processing a non-native language.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20160920223351898-0925:S1366728915000838:S1366728915000838_tabU2.gif?pub-status=live)
We predicted that, if difficulties revising initial interpretations are due to cognitive immaturity, adult L2 learners should not show selective difficulties interpreting and revising temporarily ambiguous sentences. If, on the other hand, difficulties revising an initial interpretation are better conceived as a learner phenomenon, potentially arising in response to the greater cognitive demands associated with processing a less proficient language, revision difficulties might appear in L2 adults as well as native children. Furthermore, since adult L2 learners are particularly sensitive to referential and discourse information (e.g., Dekydtspotter, Donaldson, Edmonds, Fultz & Petrush, 2008; Pan & Felser, 2011; Roberts & Felser, 2011), and the semantics and discourse properties of definite determiners are similar in English and Italian, we predicted that 2-referent contexts might be associated with decreased processing costs in adult L2 speakers as compared to 1-referent contexts, even when processing the L2.
Experimental Investigation
Method
Participants
Data from 63 participants were analyzed: 33 adult L1-Italian L2-English speakers, who were tested in Italy (Age Range: 23–60, Mean Age: 30.75 years, SD: 7.52 years; 18 females), and 30 monolingual English adults (Age Range: 18–25, Mean Age: 19.47 years, SD: 1.36 years; 17 females), who were tested at the University of Pennsylvania. L2 speakers’ exposure to English was through a combination of formal instruction and study abroad/professional activities. Their English proficiency, as assessed by the oral comprehension subtest of the Michigan Test of English Language Proficiency (MTELP), was intermediate (Range: 20/45–44/45; Mean: 35/45, SD: 6.3).
Materials and procedure
All materials were pre-recorded by a native speaker of American English. The experiment consisted of two practice trials followed by 24 experimental sentences in a two (temporary ambiguity: ambiguous vs. unambiguous) by two (referential context: 1- vs. 2-referents) design. Each experimental sentence (e.g., “Put the frog on the napkin onto the box”) was followed by either one or two follow-up filler sentences (e.g., “Now move the frog back”; “Now move it up and down”) and intermixed with 36 additional filler trials that began with a non-target sentence (e.g., “Put the pear near the stapler”) and continued with follow-up filler sentences.
Stimulus display was controlled by E-Prime and delivered on a Tobii T120 eye-tracker monitor (display size: 800 × 600). At the beginning of each trial, objects were labeled as they individually appeared in one of the four corners of the display. Participants looked at and clicked on the crosshair in the center of the screen to hear a pre-recorded instruction. They then performed an action by using the mouse to move the objects. Participants could start a response at any point during the instructions, but had only 1500 milliseconds after the end of the instruction to perform their action (see Footnote 1); a “beep” would indicate that the time to complete the instruction was up, at which point the next trial would start automatically. Participants’ eye movements were recorded every 16 ms; act-out performance was recorded by E-Prime and coded manually.
Results
Actions
Participants’ actions were coded following the coding scheme of Trueswell, Sekerina, Hill & Logrip (1999). Actions were coded as ‘correct’ if the Target (e.g., the frog on the napkin in Fig. 1) was moved directly to the Correct Goal (e.g., the box), as ‘incorrect goal’ (IG) if an animal was initially moved to the incorrect goal (e.g., the napkin), or as ‘other’ (see Footnote 2). An ‘incorrect goal’ action suggests that the listener failed to recover the correct parse of the sentence, and instead persisted in interpreting the first PP (e.g., on the napkin) as the goal phrase. Participants’ actions evidenced difficulties revising initial interpretations in response to ambiguous sentences, as ‘incorrect goal’ actions disproportionately occurred in the ambiguous conditions. Critically, as shown in Fig. 2, while L2 learners’ performance was overall more error-prone than that of native speakers, act-out errors were particularly high in response to ambiguous sentences in 1-referent contexts. In these contexts, L2 participants’ error rates were numerically comparable to those observed in children, as illustrated by the comparison with the results of Trueswell et al. (1999).
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160921032526-30315-mediumThumb-S1366728915000838_fig2g.jpg?pub-status=live)
Figure 2. Proportions of actions towards the Incorrect Goal (IG) as a function of ambiguity, referential context, and group.
The observation that L2 adults experienced greater difficulties recovering from garden-paths, especially in 1-referent contexts, is confirmed by the results of a multi-level linear model on e-logit-transformed incorrect goal action rates (see Table 1 for a summary of the model). Overall, L2 speakers produced more incorrect actions than native speakers, ambiguous sentences were associated with higher error rates than unambiguous sentences, and 2-referent contexts were associated with fewer errors than 1-referent contexts. These significant main effects were qualified by three two-way interactions and a three-way interaction; as the estimates in Table 1 indicate, the effect of ambiguity was stronger for L2 speakers than for native speakers, and this differential effect was more pronounced in contexts where referential information did not support the target modifier interpretation (i.e., 1-referent contexts).
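As an illustration of the transformation mentioned above, the following minimal sketch computes an empirical logit from a count of incorrect-goal actions. It assumes the common definition log((y + 0.5) / (n − y + 0.5)); the text does not state the exact formula used, and the function name and the example counts (6 trials per cell, i.e., 24 experimental sentences split evenly over four conditions) are hypothetical.

```python
import math


def empirical_logit(successes: int, trials: int) -> float:
    """Empirical logit, assuming the common definition log((y + 0.5) / (n - y + 0.5)).

    Adding 0.5 to both counts keeps the value finite when a participant
    produces zero or only incorrect-goal actions in a cell.
    """
    if trials <= 0 or not 0 <= successes <= trials:
        raise ValueError("need trials > 0 and 0 <= successes <= trials")
    return math.log((successes + 0.5) / (trials - successes + 0.5))


if __name__ == "__main__":
    # Hypothetical counts: incorrect-goal actions out of 6 trials in one cell.
    for y in (0, 3, 6):
        print(y, round(empirical_logit(y, 6), 3))
```

The transformed values are symmetric around zero (a 50% error rate maps to 0), which is one reason this kind of transform is often preferred to raw proportions as the dependent variable of a linear model.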
Table 1. Mixed effects model of actions towards the IG.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20160920223351898-0925:S1366728915000838:S1366728915000838_tab1.gif?pub-status=live)
Notes: The maximal converging random effect structure of the model included by-subject random intercepts and by-subject random slopes for the effects of referential context and ambiguity.
Table 2. Mixed effects model of looks to the IG.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20160920223351898-0925:S1366728915000838:S1366728915000838_tab2.gif?pub-status=live)
Notes: The maximal converging random effect structure of the model included by-subject and by-item random intercepts, together with by-subject random slopes for the effects of referential context and ambiguity and by-item random slopes for the effects of ambiguity, group and their interaction.
To further explore the nature of the observed three-way interaction, participants’ performance was examined separately for ambiguous and unambiguous sentences. Performance on unambiguous sentences was modulated only by the effect of group (Estimate: .28, S.E. = .06, p <.001), with L2 adults performing more incorrect goal actions than native speakers. In contrast, performance in ambiguous sentences was modulated by the effects of group (Estimate: .69, S.E. = .12, p <.001), referential context (Estimate: .30, S.E. = .07, p <.001), and their interaction (Estimate: .27, S.E. = .07, p <.001): L2 learners’ performance on ambiguous sentences was overall less accurate than that of native speakers, but particularly so in 1-referent contexts (1-referent: Estimate: .95, S.E. = .15, p <.001; 2-referents: Estimate: .42, S.E. = .12, p = .001).
In summary, the act-out results indicate that L2 speakers’ performance, similarly to that of child and adult native speakers, is negatively affected by the presence of a temporary ambiguity. Crucially, however, L2 act-out patterns differed in important ways from those of the native adults tested in this experiment and those reported for children in previous visual world studies: while L2 speakers were overall less accurate than native adults, their performance was particularly error-prone in ambiguous contexts. Differently from the patterns observed for children, however, adult L2 learners benefited from referential cues, as act-out accuracy significantly increased in 2-referent as compared to 1-referent ambiguous contexts.
These findings suggest that failures to revise initial processing commitments are not unique to participants with immature executive functions, but seem to characterize the processing profiles of adult L2 learners as well. The data presented so far, however, do not allow us to conclude this with certainty. It is possible that L2 adults’ particularly high error rates stemmed from increased initial consideration of the non-target goal interpretation during early parsing commitments, rather than from increased difficulties with revision. That is, L2 speakers’ higher error rates might not be due to increased revision difficulties, but rather to an increased initial tendency to interpret the PP “on the napkin” as the goal of the action.
In order to investigate this issue, the next section examines participants’ eye movements before disambiguating information became available in the sentence and could thus be used for revision. Our approach was two-fold: first, to determine whether L2 learners were more likely than native speakers to interpret the PP “on the napkin” as the goal of movement during online processing, we examine whether the two groups differed in terms of their initial consideration of the incorrect goal. This analysis also allows us to determine whether referential information was used by L2 participants early on during sentence processing (as might be expected in 2-referent contexts), and thus helped avoid a garden-path to begin with, or whether it was mainly used to revise an incorrect interpretation. Second, we examine whether the observed act-out differences between L2 and native-speaker participants are eliminated once participants’ initial consideration of the incorrect goal during processing is taken into account.
Eye movements during sentence processing
Trials in which overall track-loss exceeded 40% were excluded from the analyses. Altogether, 3% of the trials were dropped; after these trials were removed, average track-loss was 3%.
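The exclusion criterion can be sketched as follows. This is only an illustration under stated assumptions: each trial is represented as a list of gaze samples, a missing sample (`None`) counts as track-loss, and the proportion of looks to a region is computed over valid samples only. The data layout, the region labels, and all function names are hypothetical; only the 40% threshold comes from the text.

```python
from typing import Optional, Sequence

TRACKLOSS_THRESHOLD = 0.40  # trials above this rate are excluded (from the text)


def trackloss_rate(samples: Sequence[Optional[str]]) -> float:
    """Fraction of samples in a trial with no valid gaze position (None)."""
    if not samples:
        raise ValueError("trial has no samples")
    return sum(s is None for s in samples) / len(samples)


def keep_trial(samples: Sequence[Optional[str]],
               threshold: float = TRACKLOSS_THRESHOLD) -> bool:
    """Keep a trial unless its track-loss exceeds the threshold."""
    return trackloss_rate(samples) <= threshold


def proportion_looks(samples: Sequence[Optional[str]], region: str) -> float:
    """Proportion of valid samples that fall on `region` (assumed definition)."""
    valid = [s for s in samples if s is not None]
    if not valid:
        return float("nan")
    return sum(s == region for s in valid) / len(valid)


if __name__ == "__main__":
    # Hypothetical trial: one region label per gaze sample.
    trial = ["target", "IG", None, "IG", "goal"]
    print(trackloss_rate(trial), keep_trial(trial), proportion_looks(trial, "IG"))
```

Whether lost samples are dropped from the denominator or counted as looks elsewhere is a design choice that affects the resulting proportions; the sketch takes the first option purely for illustration.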
Figure 3 plots the proportion of time participants spent looking at the incorrect goal from the onset of the critical prepositional phrase (e.g., “on the napkin”) until disambiguating information became available to the participant (i.e., the onset of “box”) as a function of group, ambiguity, and referential context. This time period best reflects early real-time processing commitments that might later need to be revised.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160921032526-49492-mediumThumb-S1366728915000838_fig3g.jpg?pub-status=live)
Figure 3. Proportions of looks to the IG as a function of ambiguity, referential context and group during the time window from the onset of the critical prepositional phrase (e.g., “on the napkin”) until disambiguating information (i.e., onset of “box”).
As seen in the figure, mean looking time patterns between native and L2 speakers were fairly similar, with only small differences emerging. This conclusion was supported by subsequent analyses. In particular, the data were analyzed using a multi-level linear model performed on e-logit-transformed looks to the ‘incorrect goal’ referent (for a summary of the relevant model, see Table 2).
Overall, ambiguous sentences were associated with higher consideration of the incorrect goal than unambiguous sentences, and 1-referent contexts with more looks to the incorrect goal than 2-referent contexts. The main effect of group was not significant, indicating similar overall consideration of the incorrect goal for L2 and native adults during this time window. Furthermore, two two-way interactions emerged: one between ambiguity and referential context, and one between group and ambiguity. To better understand these two interactions, participants’ performance was examined separately for ambiguous and unambiguous sentences. A reliable effect of referential context emerged in both ambiguous (Mean Proportion Difference = .05; Estimate = .45, SD = .06, p < .001) and unambiguous sentences (Mean Proportion Difference = .02; Estimate = .15, SD = .06, p = .02), although it was stronger for ambiguous than unambiguous sentences. The effect of group was not reliable in either ambiguous (Mean Proportion Difference: .02; Estimate = .18, SD = .10, p = .09) or unambiguous sentences (Mean Proportion Difference: .00; Estimate = −.04, SD = .09, p = .67), although it was numerically stronger for ambiguous than unambiguous sentences.
These results indicate that differences between how adult L2 learners and native speakers interpret temporarily ambiguous sentences before disambiguation are quite subtle and unlikely to be the only cause of L2 learners’ considerably high act-out error rates associated with ambiguous sentences. To further investigate this issue, for each item, we calculated individual participants’ looking times (measured in e-logits) towards the incorrect goal during the time window preceding disambiguation; we used these looking times to predict act-out errors and calculated residuals for individual items and subjects. Finally, we used a model that contained the main effects of ambiguity, referential context, and language group, together with all interactions, to predict the obtained residuals. The rationale behind this analysis is that if L2 learners’ higher rates of act-out errors associated with ambiguous sentences stem only from increased consideration of the incorrect goal during processing, and not from difficulties with revision, the effect of group and its interaction with ambiguity should not emerge once consideration of the incorrect goal is accounted for in the act-out errors. This was not the case. While consideration of the incorrect goal was a significant predictor of act-out errors (Estimate = .21, SD = .04, p < .001), all critical effects survived once this relationship was taken into account.
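The logic of this two-step residual analysis can be sketched in a deliberately simplified form. The sketch below swaps the multi-level models for ordinary least squares with a single predictor and replaces the second-stage model with a comparison of mean residuals by group; it has no random effects and is not the analysis reported here. All function names and data values are hypothetical and made up for illustration.

```python
from statistics import mean
from typing import Dict, List, Sequence, Tuple


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares for y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("predictor has no variance")
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return my - slope * mx, slope


def residuals(x: Sequence[float], y: Sequence[float]) -> List[float]:
    """Step 1: what remains of y (errors) after removing the effect of x (looks)."""
    intercept, slope = fit_line(x, y)
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def mean_by(labels: Sequence[str], values: Sequence[float]) -> Dict[str, float]:
    """Step 2 (simplified): mean residual per level of a factor."""
    cells: Dict[str, List[float]] = {}
    for label, value in zip(labels, values):
        cells.setdefault(label, []).append(value)
    return {label: mean(vals) for label, vals in cells.items()}


if __name__ == "__main__":
    # Made-up e-logit values, for illustration only.
    looks = [-1.0, 0.0, 1.0, -1.0, 0.0, 1.0]
    errors = [-2.0, -1.5, -1.0, -1.0, -0.5, 0.0]
    group = ["native", "native", "native", "L2", "L2", "L2"]
    print(mean_by(group, residuals(looks, errors)))
```

In this toy data set both groups show the same relation between looks and errors, yet the residuals still differ by group, which is the pattern the rationale above describes: a group effect that remains after early consideration of the incorrect goal is accounted for.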
Taken together, these results suggest that adult L2 learners differ from native speakers in their ability to abandon an incorrect analysis on the basis of later arriving information, above and beyond the presence of differences during early processing commitments. These results also show that both groups of adults use referential information early during sentence processing to help drive parsing commitments.
General Discussion
We asked whether difficulties revising initial processing commitments are a unique feature of the immature parser, or rather a characteristic of the learning parser. A group of adult L2 speakers of English, together with a group of adult native English speaker controls, acted out instructions containing temporary PP-attachment ambiguities, as well as unambiguous sentences. Online processing patterns were similar for adult L2 and native speakers, in that both groups showed higher consideration of the incorrect goal in response to temporarily ambiguous sentences as compared to unambiguous ones, and early signs of sensitivity to referential information. For instance, the interpretation patterns of both groups of adults were modulated by referential context early during sentence processing, in sharp contrast with the pattern observed for children. Adult learners’ ability to use referential information might be the result of L1-transfer, given that in Italian, like English, the use of a definite determiner in a 2-referent context is anomalous in the absence of further modifying information, but might also be related to adult learners’ enhanced sensitivity to contextual and discourse factors (Pan & Felser, 2011); future research on L2 learners whose L1s do not display analogous definiteness contrasts (e.g., Russian) is likely to shed light on this issue.
L2 learners’ act-out patterns were overall less accurate than those of native speakers, with particularly high error rates in response to temporarily ambiguous sentences. More specifically, L2 learners’ error rates on ambiguous sentences that did not benefit from contextual information were comparable to those reported for native children (e.g., Trueswell et al., 1999; Woodard et al., 2016), and consistent with systematic revision failures. These results suggest that difficulties revising initial interpretations are more widespread than previously thought, in that they are not only found in language learners with immature cognitive abilities, but are also observed in those whose EF-skills are fully mature and intact.
How can these results be integrated with previous findings of a robust relationship between revision abilities and EF-skills, across populations and methodologies (e.g., January, Trueswell & Thompson-Schill, 2009; Novick, Trueswell & Thompson-Schill, 2005; Novick, Kan, Trueswell & Thompson-Schill, 2009; Woodard et al., 2016)? It is certainly possible that revision difficulties have different causes in different populations, and hence that the low act-out accuracy associated with temporarily ambiguous sentences in adult L2 speakers simply reflects performance breakdowns associated with complex sentences. However, it seems to us that a more promising line of inquiry would be to integrate the present findings with existing accounts linking garden-path recovery and domain-general cognitive abilities. A way to do so is to capitalize on the proposal that brain structures related to cognitive control (i.e., LIFG and other prefrontal structures) are recruited during the processing of a non-native, non-fully-proficient language system to a larger extent than during the processing of an L1 or a highly proficient L2 (e.g., Abutalebi, 2008). Under this view, difficulties with revision in L2 learners would stem from depletion or overload of the cognitive control network, since processing of a not-fully-proficient language and sentence revision would be competing for the same set of cognitive resources. Thus, while an individual's cognitive resources are a stable characteristic of that individual, the amount of resources that can be allocated to a given task (e.g., revision) depends on the demands of the concurrent tasks (L1 processing vs. L2 processing) that are being performed by the system at any given time.
As this was a purely behavioral study, and time limitations prevented us from collecting measures of EF-skills to correlate with individual subjects’ performance, this proposal is speculative at the moment; it is our hope that future studies will investigate its predictions in more detail. Research on the availability of specific domain-general cognitive resources during L2 processing seems particularly important, especially given recent findings that processing limitations affect language learning trajectories (Pozzan & Trueswell, 2015), that training-related EF improvements correlate with processing improvements for garden-path sentences in native adults (Novick, Hussey, Teubner-Rhodes, Harbison & Bunting, 2014) and L2-children (Pozzan, Woodard & Trueswell, 2014), and that individual differences in EF-skills predict language learning outcomes (Kapa & Colombo, 2014).