1. Introduction
One notion that has received widespread attention in the psycholinguistics of language production is structural priming,Footnote 1 i.e., the fact that speakers tend to re-use structures they have recently comprehended or produced themselves. For instance, all other things being equal, a speaker who has just heard and/or used a ditransitive/double-object (DO) construction (cf. (1)a) and then intends to describe another transfer-of-possession scenario is more likely to use a ditransitive construction again than a prepositional dative/object (PO) construction (cf. (1)b), compared to a speaker who has just heard a prepositional dative construction.
(1)
a. [NP The man] [VP gave [NP Recipient the squirrel] [NP Patient the nuts]].
b. [NP The man] [VP gave [NP Patient the nuts] [PP to [NP Recipient the squirrel]]].
Such priming effects are robust and widespread: they have been obtained with both observational and experimental methods, even over long distances between the first use of a construction (the prime) and the subsequent use (the target), both from production to production and from comprehension to production, in various tasks (picture description, sentence completion, dialog tasks. . .), in many languages and, even more interestingly, between languages (see e.g., Pickering & Ferreira, Reference Pickering and Ferreira2008, for a review).
Structural priming is assumed to have numerous social and cognitive functions; it is seen as a mechanism underlying the creation of mutual intelligibility in dialogue (Pickering & Garrod, Reference Pickering and Garrod2004), a way to facilitate selection and planning processes during language production (e.g., MacDonald, Reference MacDonald2013), and a mechanism of implicit language learning (e.g., Chang, Dell & Bock, Reference Chang, Dell and Bock2006; see Ferreira & Bock, Reference Ferreira and Bock2006, for more information about the functions of structural priming). In addition, it is used as a window into the cognitive mechanisms of utterance planning, based on the following logic: if syntactic information is primable, then this information represents a meaningful processing unit in the production planning process. Results on structural priming have thus played a major role in the advancement of theoretical and computational models of language production beyond the single-word level (e.g., Bock, Reference Bock1986; Pickering & Branigan, Reference Pickering and Branigan1998; Reitter, Keller & Moore, Reference Reitter, Keller and Moore2011).
While the first mentions of priming as a phenomenon in its own right can be found in observational studies, ever since Bock (Reference Bock1986), the study most widely considered the first (experimental) structural priming study,Footnote 2 the vast majority of studies of priming have been experimental in nature. A likely factor in the appeal of experiments is their ability to control and manipulate variables and thus to rule out counter-explanations of structural priming, which is, of course, much more difficult with corpus data (though see below). Indeed, at some point in time, corpus-based priming studies were held in quite low esteem by proponents of experimental methods; the following assessment by Branigan, Pickering, Liversedge, Stewart and Urbach (Reference Branigan, Pickering, Liversedge, Stewart and Urbach1995: 492) can be seen as representative of the then predominant view:
there are several nonsyntactic factors which could lead to repetition. [. . .] Corpora have proved useful as a means of hypothesis generation, but unequivocal demonstrations of syntactic priming effects can only come from controlled experiments (cf. also Pickering & Branigan, Reference Pickering and Branigan1999:136).
So why would we bother to do corpus-based structural priming research? The answer is, first of all, that experiments are often based on relatively artificial language contexts and non-spontaneous language behavior. For instance, by the very nature of the tasks and controlled conditions involved, experimental stimuli often have syntactic characteristics that are atypical in actual discourse (e.g., three full lexical NPs in ditransitive clauses) or expose subjects to distributionally unrealistic linguistic sequences. To be able to draw conclusions in terms of ecological validity, especially with respect to the assumed functions of structural priming, experimental results should be complemented with findings from spontaneous language use (see also Gullberg, Indefrey & Muysken, Reference Gullberg, Indefrey, Muysken, Bullock and Toribio2009, for the importance of validating experimental data with corpus data, and vice versa). Secondly, recent innovations in corpus-based approaches have made it increasingly possible to study structural priming with a high level of internal validity. Not only have corpus-based studies of structural priming increased in number and diversity, but these studies are also based on new resources and methods that have become available, allowing for both broader and deeper study of priming. These include more corpora containing spoken data and, crucially, more powerful statistical tools, which provide more possibilities for ruling out counter-explanations and thus result in stronger conclusions. It is therefore appropriate to (i) take stock of how observational priming research has evolved over the last three decades as well as (ii) discuss the role that observational data can, and maybe should, play in priming research.
In this paper, we will do this with a specific eye towards priming in bilingual situations. There are multiple reasons to adopt a corpus-based approach to priming in bilingual situations. Firstly, even more so than in the monolingual literature, results on structural priming in bilingual situations are almost exclusively based on laboratory experiments. Corpus-based research is needed to validate these results with evidence from spontaneous language use. A welcome development in this case is that the availability of large-scale bilingual corpora is growing, which makes it increasingly easy to perform quantitative analyses of priming on the basis of spontaneous bilingual language use (see e.g., Fricke & Kootstra, Reference Fricke and Kootstra2016; Myslín & Levy, Reference Myslín and Levy2015; Torres Cacoullos & Travis, Reference Torres Cacoullos and Travis2013, Reference Torres Cacoullos and Travis2016; Travis, Torres Cacoullos & Kidd, Reference Travis, Torres Cacoullos and Kidd2017, which will be discussed in Section 4 of this paper). A second reason is that most corpus-based research on bilingual language production has focused on the level of the single sentence / utterance (e.g., Broersma & de Bot, Reference Broersma and De Bot2006; Carter, Deuchar, Davies & Parafita Couto, Reference Carter, Deuchar, Davies and Parafita Couto2011; Poplack, Reference Poplack1980; Poplack, Zentz & Dion, Reference Poplack, Zentz and Dion2012), and not so much on dependencies between utterances in the form of priming. Corpus-based priming research can provide insight into the extent to which priming influences bilingual language production above and beyond these ‘single-sentence’ factors. 
A third reason to study priming in bilingual corpora is that it can substantiate the monolingual priming literature: given that more than half of the world population is bilingual (e.g., Grosjean, Reference Grosjean2010), it is important to investigate whether findings from monolingual priming research also generalize to bilingual situations (see also Fricke & Kootstra, Reference Fricke and Kootstra2016).
The paper is structured as follows. In Section 2 we provide a (by necessity, brief) overview of experimental research on cross-linguistic priming, which serves to showcase the kinds of methods that have been used as well as the main findings generated, and main questions explored. In Section 3, we provide a selective overview of how observational studies of priming from the monolingual literature have evolved from the very earliest approaches till now; again, we will emphasize the kinds of conceptual and methodological steps that facilitated this development, and do not have the ambition to be fully comprehensive in our review.Footnote 3 Finally, Section 4 will conclude by showing and discussing how the corpus-based developments from the monolingual literature can be used to study structural priming in bilingual settings.
2. Cross-linguistic priming
Most research on structural priming in bilinguals is focused on cross-linguistic priming, i.e., the phenomenon that hearing/producing a syntactic structure in one language will increase the probability of producing a related structure in another language. The main theoretical reason why cross-linguistic priming has been studied is that it provides a powerful window into the bilingual mind, especially with respect to the levels of processing at which cross-language activation can occur, which is a key issue in the psycholinguistics of bilingualism (cf. e.g., Hartsuiker & Pickering, Reference Hartsuiker and Pickering2008). That is, if bilinguals’ syntactic choices are primed by the structure of a non-target-language, this can only be explained by assuming the existence of cross-language interaction at the syntactic level. Another reason to study cross-linguistic priming is that it can provide a potential explanatory mechanism for a variety of language contact phenomena, such as code-switching, contact-induced language change, and cross-linguistic interaction in second language learners (see e.g., Kootstra, van Hell & Dijkstra, Reference Kootstra, van Hell and Dijkstra2010, Reference Kootstra, van Hell and Dijkstra2012; Kootstra & Doedens, Reference Kootstra and Doedens2016; Loebell & Bock, Reference Loebell and Bock2003; Muysken, Reference Muysken2013).
The first ground-breaking study on this topic was Loebell and Bock (Reference Loebell and Bock2003), who investigated cross-linguistic priming in the use of the dative alternation and the voice alternation by German–English bilinguals; the former syntactic pattern involves similar structures in both languages, the latter does not. The manipulation in their experimental design involved hearing and repeating a sentence in one language and then describing a picture in the other. For the dative alternation, there was a priming effect, regardless of target language, and a trend of more priming from German to English than vice versa. No priming effects were found for the voice alternation, however. Loebell and Bock interpreted the results within the implicit-learning account of structural priming developed by Bock and Griffin (Reference Bock and Griffin2000) and Chang, Dell, Bock and Griffin (Reference Chang, Dell, Bock and Griffin2000) and argued that “whenever languages share common procedures for building sentence structures, the use of the shared procedure in one language makes it more accessible to the other” (p. 809).
Similar results were obtained with a dialogic picture-description task by Hartsuiker, Pickering and Veltkamp (Reference Hartsuiker, Pickering and Veltkamp2004), who focused on priming of transitive actives, passives, intransitives, and OVS sentences in native speakers of Spanish with moderate or high proficiency in English. The question was whether participants’ syntactic choices in English would be primed by the structure of a Spanish sentence produced just before by a confederate. This was indeed what Hartsuiker et al. found. Unlike Loebell and Bock, they interpreted their findings with regard to Pickering and Branigan's (Reference Pickering and Branigan1998) combinatorial-nodes model of lexical and structural representations in the mental lexicon. Based on their cross-language results, Hartsuiker et al. argued that the most efficient way to account for cross-language priming is by assuming a model in which syntactic representations (i.e., combinatorial nodes) are shared between languages. This study has had a major impact on the development of models of bilingual language processing beyond the single-word level (see Hartsuiker & Bernolet, Reference Hartsuiker and Bernolet2017, for more information on this model and how it is assumed to develop in language learners).
In an attempt to experimentally disentangle cross-linguistic lexical priming from syntactic priming (i.e., the above priming effects could [partly] have been due to the to in datives and the by in passives; cf. also Bock & Loebell, Reference Bock and Loebell1990), Desmet and Declercq (Reference Desmet and Declercq2006) explored whether relative clause attachment, a syntactic choice that minimizes the role of lexical priming in the subjects' syntactic choices, can be primed from Dutch to English. They used a sentence-completion task, in which Dutch native speakers with a very high proficiency in English completed sentences, to determine whether high or low relative clause attachment in Dutch affects production probabilities in English. They indeed found the hypothesized priming effect, and subsequently ruled out that this effect was due to priming of discourse-based representations.
While the above discusses just a few early studies, the area of cross-linguistic priming studies has been growing considerably in the last few years, in terms of fleshing out details, comparing predictions made by theoretical alternatives, exploring factors that boost and/or constrain cross-linguistic priming, and broadening the scope of investigation. For instance, Schoonbaert, Hartsuiker and Pickering (Reference Schoonbaert, Hartsuiker and Pickering2007) studied boosted effects of cross-linguistic priming when the primes and targets contain translation equivalents (see also Cai, Pickering, Yan & Branigan, Reference Cai, Pickering, Yan and Branigan2011); Salamoura and Williams (Reference Salamoura and Williams2007) explored the way in which syntactic structures and thematic-role order interact in cross-linguistic priming; Bernolet, Hartsuiker and Pickering (Reference Bernolet, Hartsuiker and Pickering2007) built on studies that showed that word order on its own can persist in ways that cannot be explained away by conceptual priming, and found that cross-linguistic priming is driven by shared word order (though see Chen, Jia, Wang, Dunlap & Shin, Reference Chen, Jia, Wang, Dunlap and Shin2013); Kantola and van Gompel (Reference Kantola and van Gompel2011) studied within- versus between-language priming and found no differences between the two (though see Travis et al., Reference Travis, Torres Cacoullos and Kidd2017); Bernolet, Hartsuiker and Pickering (Reference Bernolet, Hartsuiker and Pickering2012) found evidence of cognate facilitation effects in cross-language structural priming, based on which they argued for the existence of phonological feedback in sentence production; Fleischer, Pickering and McLean (Reference Fleischer, Pickering and McLean2012) argued on the basis of cross-linguistic priming of the Polish voice alternation that bilinguals construct a language-independent level of information structure; and Kootstra and Doedens (Reference Kootstra and Doedens2016) found evidence of cumulative forms of cross-linguistic priming both within and between experimental blocks. Also, work has been done on priming in code-switching (Kootstra et al., Reference Kootstra, van Hell and Dijkstra2010, Reference Kootstra, van Hell and Dijkstra2012) and on within-language priming in the L2 (cf. e.g., Flett, Branigan & Pickering, Reference Flett, Branigan and Pickering2013; Gries & Wulff, Reference Gries and Wulff2005, Reference Gries and Wulff2009; McDonough, Reference McDonough2006; McDonough & Trofimovich, Reference McDonough and Trofimovich2011; Nitschke, Kidd & Serratrice, Reference Nitschke, Kidd and Serratrice2010; Nitschke, Serratrice & Kidd, Reference Nitschke, Serratrice and Kidd2014; Shin & Christianson, Reference Shin and Christianson2009, Reference Shin and Christianson2012; see also Schoonbaert, Hartsuiker & Pickering, Reference Schoonbaert, Hartsuiker and Pickering2007). Like the studies on cross-linguistic priming, these studies have found strong evidence of priming, thus confirming that the mechanism of structural priming plays an important role in multiple forms of bilingual speech.
As noted, almost all research on structural priming in bilinguals has been experimental, a development that is somewhat at odds with the growing number of studies on within-language priming that use corpus data. Recently, however, large-scale bilingual corpora have become accessible and the first corpus-based priming studies have been done (e.g., Fricke & Kootstra, Reference Fricke and Kootstra2016; Torres Cacoullos & Travis, Reference Torres Cacoullos and Travis2013, Reference Torres Cacoullos and Travis2016; Travis et al., Reference Travis, Torres Cacoullos and Kidd2017). To be able to interpret and appreciate this corpus-based work with reference to previous research on within-language and cross-linguistic priming (in Section 4), it is first necessary to recap the evolution of corpus work in priming research, which is the topic of the next section.
3. The development of corpus-based studies on within-language priming
3.1. Early corpus-based studies
In this section, we will briefly discuss a few of the earliest corpus-based publications that either mention priming or study it directly (even though not yet necessarily using the term priming). Typically, the first observational study of priming that is mentioned (cf. e.g., Pickering & Ferreira Reference Pickering and Ferreira2008: 428, who also mention earlier studies that mentioned priming effects in passing) is Schenkein (Reference Schenkein and Butterworth1980). Schenkein discussed repetitions of topical, inflectional, structural, or thematic material in a conversation between burglars over walkie-talkies.
Another study, which was concerned with the voice alternation of active vs. passive, is Weiner and Labov (Reference Weiner and Labov1983). Their concern was to identify the factors that make speakers choose passive structures over active ones. Their study was based on interview data from 21 speakers from working-class white neighborhoods in Philadelphia; for statistical analysis, they used a version of Varbrul analysis, an outdated variant of binary logistic regression (cf. Johnson, Reference Johnson2009, for discussion). Well ahead of nearly all work on alternations at the time, they explored many predictors of passive choices, both external (such as style, sex, ethnicity, and class of the speakers) and internal (such as givenness of the logical object). For the present purposes, the most important predictors they included were those of structural parallelism and preceding passives – i.e., priming – but it is worth pointing out that they also already explored the role of the distance between prime and target as well as the possibility that priming effects may be cumulative, an issue to which we will return below. Among other things, they found that “preceding passive proves to be an independent and powerful conditioning factor” (p. 52), which can be seen as evidence in the direction of structural priming.
Probably the most systematic observational study of priming predating most experimental work (with the exception of Levelt & Kelter, Reference Levelt and Kelter1982) is Estival (Reference Estival1985), apparently the first study to use the notion of priming. Estival followed up on Weiner and Labov by exploring the voice alternation, but she focused on priming and attempted to partial out various potentially confounding effects such as
− repetitions resulting from discourse structure (e.g., question-answer sequences, denials, corrections. . .) as well as lexical repetitions;
− the availability of multiple competing referents;
− the alignment of co-referential NPs into identical argument positions.
Even after correcting for these confounds, she still found a robust effect of priming, which shows already at this early stage that it is possible to document and explore priming in observational data.
While these are the studies that are commonly cited as the earliest attempts to study priming effects on the basis of observational data, it is worth pointing out that there is some corpus-based work that predates even the above references. To our knowledge, the earliest study specifically dedicated to priming (under a different name, though, and arguably not fully structural in nature) is Sankoff and Laberge (Reference Sankoff, Laberge, Sankoff and Laberge1978). They start out from the observation that, in previous (sociolinguistic) work, something like priming was usually ignored: “[Since Labov (Reference Labov1970), it has] become accepted practice to treat successive occurrences of a variable, even in the same utterance, as independent binomial trials”. Based on this observation, they explored three forms in the pronominal system of Montreal French and studied the degree to which different speakers switch from one realization of a variable/alternation to the other. They represented these speaker-specific switch rates in plots which feature the proportion of one variant on the x-axis, the rate with which the speaker switches to that variant on the y-axis, and each speaker as a point in the coordinate system; cf. Figure 1 as an example, which reveals priming by the fact that most speakers are below the main diagonal, which represents the null hypothesis of random switching. In some sense, thus, their approach is a descriptive version of runs tests (cf. Sheskin, Reference Sheskin2011: Test 10) per speaker.
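The logic of such a speaker-by-speaker plot can be sketched in a few lines of Python. This is a minimal illustration and not Sankoff and Laberge's own procedure: the function name, the toy sequences, and the operationalization of the switch rate (the share of tokens following a different variant that are realized as the variant of interest) are assumptions made here for illustration.

```python
from typing import Optional, Sequence, Tuple


def proportion_and_switch_rate(
    seq: Sequence[str], variant: str
) -> Tuple[float, Optional[float]]:
    """Return (proportion of `variant`, rate of switching to `variant`).

    The switch rate is computed over all tokens whose predecessor was NOT
    `variant`, i.e., over the only positions where a switch to it can occur.
    It is None if the speaker never offers such an opportunity.
    """
    if not seq:
        raise ValueError("empty sequence")
    proportion = sum(tok == variant for tok in seq) / len(seq)
    # Tokens that directly follow a different variant.
    opportunities = [cur for prev, cur in zip(seq, seq[1:]) if prev != variant]
    if not opportunities:
        return proportion, None
    switch_rate = sum(cur == variant for cur in opportunities) / len(opportunities)
    return proportion, switch_rate


# Hypothetical speakers: one who repeats variants in runs, one who alternates.
speakers = {
    "persistent": list("AAAABBBBAAAABBBB"),
    "alternating": list("ABABABABABABABAB"),
}

for name, seq in speakers.items():
    x, y = proportion_and_switch_rate(seq, "A")
    position = "below" if y is not None and y < x else "on or above"
    print(f"{name}: proportion={x:.2f}, switch rate={y:.2f} ({position} the diagonal)")
```

If successive choices were independent, a speaker's switch rate should approximate the overall proportion of the variant (the main diagonal); a speaker who tends to repeat the previous variant falls below it.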
While all these observational studies yielded interesting results – and the studies by Weiner and Labov and by Estival in particular were conceptually quite advanced and foreshadowed much later work – after Bock's influential (1986) experimental study, it seems that the study of structural priming was left to experimental psycholinguists for the next 20 years, until developments in cognitive/usage-based linguistics led to a first renaissance of priming studies using observational data, which is the topic of the next section.
3.2. The second wave of corpus-based studies
After a 20-year hiatus, the second wave of corpus-based studies of within-language priming arose out of work by Gries (Reference Gries2003 [2000], Reference Gries2005, Reference Gries and Schönefeld2011), which was inspired by developments in work in the domains of cognitive and usage-based linguistics as well as psycholinguistics, and Szmrecsanyi (Reference Szmrecsanyi2005, Reference Szmrecsanyi2006), which was inspired by work in variationist sociolinguistics and psycholinguistics. This section discusses these studies as well as their implications for the then subsequent third and current wave of corpus-linguistic work on priming.
3.2.1. Gries (Reference Gries2005)
Gries (Reference Gries2005) was one of the first studies introducing the second wave of corpus research of priming. This study followed up on first priming-related remarks in Gries (Reference Gries2003 [2000]), one of the first corpus-based multifactorial studies of syntactic alternations outside of sociolinguistics that also took priming into consideration. The study arose in the then emerging area where corpus linguistics and cognitive/usage-based linguistics and psycholinguistics overlap, and paved the way for countless similar studies (of which Bresnan, Cueni, Nikitina, & Baayen, Reference Bresnan, Cueni, Nikitina, Baayen, Bouma, Krämer and Zwarts2007, is probably the most widely cited). Gries (Reference Gries2005) reports on two case studies, one on the dative alternation (as was exemplified in (1) above); the other on the alternation of particle placement (which is exemplified here in (2)).
(2)
a. [NP The squirrel] [VP picked up [NP Patient the nuts]].
b. [NP The squirrel] [VP picked [NP Patient the nuts] up].
Both case studies were based on several thousand examples each from the British Component of the International Corpus of English (ICE-GB), a one-million-word corpus (60% spoken data, 40% written data) that is part-of-speech tagged and syntactically-parsed. Gries used a two-pronged strategy. First, he used a multifactorial statistical approach, for which he annotated a variety of predictors, which he then entered (with their statistical interactions) into a general linear model. As for the predictors, it is useful to distinguish between alternation predictors, i.e., linguistic/contextual predictors that have been argued to govern an alternation (often factors such as length, givenness, definiteness, etc.), and priming predictors, i.e., predictors that have to do with the nature of priming (such as distance between prime and target and others to be discussed below). Using that terminology, both case studies of Gries (Reference Gries2005) can be summarized as involving
− one alternation predictor, namely Medium, whether the examples are from spoken or written data;
− several priming predictors such as CPrime (the construction/variant used in the prime), Distance (the distance between prime and target, here measured numerically in the ICE-GB's parse units rather than categorically as in Bock & Griffin, Reference Bock and Griffin2000), SpeakerID (is the speaker of the target the same as that of the prime?), and then a variety of predictors that code how similar the prime is to the target: VFormID (is the verb form the same?), VLemmaID (is the verb lemma the same?), plus, for particle placement, VPartID (is the particle the same?) and PhrasVID (is the phrasal verb the same?);
− the response CTarget (the construction/variant used in the target).
Gries’ main findings for the dative alternation were that there was an overall priming effect, the size of which was very similar to the one reported in Bock (Reference Bock1986). The effect was independent of Medium but participated in interactions with several other predictors. For instance, compatible with Pickering and Branigan (Reference Pickering and Branigan1998: Exp. 1), priming was stronger when the verb form/lemma was the same, i.e., when the prime and target are similar. In addition, there was an effect that priming decays logarithmically with the distance between prime and target.
For particle placement, the findings were on the whole similar, but a bit more complex. For example, a significant interaction Medium:VLemmaID:CPrime revealed that priming in writing was stronger when the lemma was the same than when it was not. Also, it turned out that the tendency to re-use the same construction with the same verb was stronger when the speaker changed, and again there was a logarithmic decay of priming with Distance.
In addition to the multifactorial statistical approach, the second new and major aspect of Gries' work was his application of distinctive collexeme analysis (DCA, Gries & Stefanowitsch, Reference Gries and Stefanowitsch2004) from cognitive/usage-based linguistics, or Construction Grammar, to priming. Starting out from (i) Potter & Lombardi's (Reference Potter and Lombardi1998: 278) suggestion that priming effects might be different for different verbs and (ii) the recognition that other aspects of processing are highly lexically-specific (e.g., Direct Object (DO) / Sentential Complement (SC) parsing preferences; cf. Garnsey, Pearlmutter, Myers & Lotocky, Reference Garnsey, Pearlmutter, Myers and Lotocky1997), he explored whether particular verbs in the target slot are more or less likely to be primed for one construction or the other. Specifically, a DCA involves (i) creating for each verb attested in at least one of the two constructions of an alternation a table such as the one exemplified in Table 1 and (ii) computing from that table an association measure that quantifies a notion called (distinctive) collexeme strength, i.e., whether verb v prefers to occur with construction x or y and how strongly. Many different association measures are available, but most publications use the negative base-10 logarithm of the p-value of a Fisher-Yates exact test (-log10 p); in what follows, we will refer to this variable as VTargetPref, because it quantifies the (degree of) constructional preference of the verb in the target slot.
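The two DCA steps can be illustrated with a short Python sketch. It is a minimal sketch under stated assumptions, not the implementation used in the studies discussed: the 2×2 layout (verb v vs. all other verbs, construction x vs. construction y), the function names, and the counts are hypothetical, and the two-sided Fisher-Yates exact test is computed directly from hypergeometric probabilities using only the standard library.

```python
from math import comb, log10
from typing import Tuple


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher-Yates exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d
    denom = comb(n, col1)

    def prob(k: int) -> float:
        # Hypergeometric probability that the top-left cell equals k.
        return comb(row1, k) * comb(n - row1, col1 - k) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - (n - row1)), min(row1, col1)
    # Sum over all tables that are at most as probable as the observed one.
    p = sum(prob(k) for k in range(lo, hi + 1) if prob(k) <= p_obs * (1 + 1e-9))
    return min(p, 1.0)


def collexeme_strength(a: int, b: int, c: int, d: int) -> Tuple[str, float]:
    """Return (preferred construction, -log10 p) for one verb.

    a: verb v in construction x      b: verb v in construction y
    c: other verbs in construction x d: other verbs in construction y
    """
    n = a + b + c + d
    expected_a = (a + b) * (a + c) / n
    preferred = "x" if a > expected_a else "y"
    p = fisher_exact_two_sided(a, b, c, d)
    strength = float("inf") if p == 0.0 else -log10(p)
    return preferred, strength


# Hypothetical counts for one verb (not taken from any corpus).
preferred, strength = collexeme_strength(30, 10, 70, 190)
print(f"preferred construction: {preferred}, collexeme strength: {strength:.2f}")
```

Running the function once per verb yields one value per verb, which could then serve as a predictor of the kind referred to here as VTargetPref; whether the direction of the preference is encoded in the sign of the value or kept as a separate label, as in this sketch, is a design choice.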
Gries found in both case studies that verbs are differently likely to be primed towards one construction or the other in a way that is correlated with their constructional preferences as computed from a DCA; specifically, verbs in the target are more likely to be primed in the direction of the construction they ‘prefer’ and ‘resist’ priming towards the other construction. The main implications of this work are, therefore, the possibility to study priming in a corpus-based fashion with multifactorial statistics and the advice that future work on priming should take lexically-specific preferences (more) into consideration.
3.2.2. Szmrecsanyi (Reference Szmrecsanyi2005, Reference Szmrecsanyi2006)
The work by Szmrecsanyi went beyond that of Gries (Reference Gries2003, Reference Gries2005) in essentially three different ways. First, he widened the scope of persistence – the term he prefers over priming – by distinguishing two different kinds of it:
− α-persistence, where the use of a specific variant of Z increases the likelihood that the same variant of Z will be used again, which is straightforward priming;
− β-persistence, where the use of a pattern Z* that is parallel/similar to one variant of Z increases the likelihood that that variant of Z will be used again (cf. below for an example).
Second, for each of his case studies, he included a larger number of alternation predictors, which ensures that the variability in the constructional choices that they explain cannot be claimed by the priming predictors, which in turn makes the results for the priming predictors more reliable (since they cannot ‘get credit’ for accounting for variability that is better explained by non-priming factors). Third, he used a better-suited statistical approach, namely binary logistic regression (from the family of generalized linear models), which does more justice to the distributional characteristics of the data. In this section, we will briefly discuss two case studies from Szmrecsanyi (Reference Szmrecsanyi2005) and refer to Szmrecsanyi (Reference Szmrecsanyi2005, Reference Szmrecsanyi2006) for more discussion.
Szmrecsanyi's (Reference Szmrecsanyi2005) first case study involves comparison choice as exemplified in (3).
(3)
a. The squirrel solved the trickier problem.
b. The squirrel solved the more tricky problem.
His analysis of 533 instances of comparison choices in the context-governed part of the BNC involved a variety of predictors:
− the alternation predictors of Length (the length of the synthetically inflected form), Morphology (does the adjective base begin with un-?), Stress (is the polysyllabic adjective stressed on the final syllable?), Frequency (of the adjective), Syntax (is the adjective used attributively?), DegreeMod (is the adjective preceded by a degree modifier?), Complement (is the adjective followed by a prepositional or infinitival complement?);
− the priming predictors of MoreTrigger (a measure of β-persistence: does the form more occur within the preceding 25 words, even if not in a comparative structure?), Distance (the distance between prime and target), and CPrime (α-persistence: the comparison choice used in the prime).
Szmrecsanyi found that many of the alternation predictors exhibit the effects one would expect from previous literature but, more interestingly for our present purposes, he also found effects of CPrime:Distance (i.e., a priming effect that logarithmically decays with increasing distance) as well as of the β-persistence predictor MoreTrigger (i.e., a preceding more that is not part of an analytic comparative also triggers analytic comparatives).
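How a β-persistence predictor such as MoreTrigger might be coded can be sketched as follows. This is a hypothetical operationalization, not Szmrecsanyi's annotation procedure: the function name, whitespace tokenization, and case-insensitive matching are assumptions; only the 25-word window comes from the description above.

```python
from typing import Sequence


def more_trigger(tokens: Sequence[str], target_index: int, window: int = 25) -> bool:
    """True if the form 'more' occurs among the `window` tokens before the target."""
    start = max(0, target_index - window)
    return any(tok.lower() == "more" for tok in tokens[start:target_index])


# Toy utterances; the target is the comparative adjective in each.
with_more = "we need more time but the trickier problem remains".split()
without_more = "the squirrel solved the trickier problem".split()

print(more_trigger(with_more, with_more.index("trickier")))
print(more_trigger(without_more, without_more.index("trickier")))
```

Note that the check deliberately ignores whether the preceding more is part of a comparative structure, since the predictor is meant to capture the mere presence of the form.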
Szmrecsanyi then revisits particle placement, the alternation exemplified earlier in this paper in (2). He coded data from the Freiburg English Dialect Corpus (FRED) for, again, a sizable number of predictors, arriving ultimately at 1048 annotated instances:
-
− the alternation predictors of DefiniteDO (is the DO definite?), NewsValueDO (has the referent of the DO been mentioned before?), SyllablesDO (length of the DO in syllables), ComplexityDO, Literalness (is the meaning of the construction literal/spatial or idiomatic?), DirectionalPP (is the construction followed by a directional PP?), VTargetPref (what is the constructional preference of the verb in the target based on CollStrength?), and DialectArea (in FRED);
-
− the priming predictors of VLemmaID, Distance, CPrime, and SentenceLength (the length of the sentence containing the target).
He found that alternation predictors have the effects expected from Gries (Reference Gries2003 [2000]). In addition, CPrime interacted with other priming predictors: priming was stronger with identical verb lemmas, which replicates Pickering and Branigan (Reference Pickering and Branigan1998) and Gries (Reference Gries2005), priming decayed with increasing prime-target distance, and the more complex the sentence containing the target, the weaker the priming effect. On the whole, however, the priming predictors again significantly increased the classification accuracy of the statistical models.
In sum, Szmrecsanyi's studies are very interesting in that they show that priming effects are observed for alternations less studied in the experimental literature and can be modeled well when the effects of many alternation predictors are statistically controlled for; they involve β-persistence as well as lexically-specific effects, and they increase with increased prime-target similarity.
3.2.3. Gries (Reference Gries and Schönefeld2011 [2008])
The study by Gries – first presented in 2008 and then published in 2011 – consists of re-analyses of the data of Gries (Reference Gries2005). It is a methodological paper and mainly relevant here in how it reveals the crucial importance of choosing the right type of statistical analysis. Specifically, Gries discusses three levels of granularity at which the effects of priming predictors on priming can be studied. First, at the coarsest level of granularity, one can theoretically do simple cross-tabulation and explore prime and target construction frequencies with a chi-squared test and odds ratios; as pointed out above, these results are similar to Bock (Reference Bock1986).Footnote 4
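The coarsest level of analysis can be sketched in a few lines of Python. The counts below are invented purely for illustration, and the function name is an assumption; like any simple cross-tabulation, the sketch ignores the fact that the data points are not independent.

```python
def odds_ratio_and_chi2(table):
    """table = [[a, b], [c, d]]: rows = prime construction, columns = target construction."""
    (a, b), (c, d) = table
    n = a + b + c + d
    odds_ratio = (a * d) / (b * c)
    # Pearson chi-squared for a 2x2 table, without continuity correction
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return odds_ratio, chi2

# Invented counts: rows = prime (DO, PO), columns = target (DO, PO)
counts = [[60, 40], [30, 70]]
odds_ratio, chi2 = odds_ratio_and_chi2(counts)
```

An odds ratio above 1 in such a table indicates that targets tend to repeat the construction of the prime.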
Second, one can, as Szmrecsanyi did, use a generalized linear model into which all sorts of predictors (and ideally their interactions) are entered to determine the effects of predictors in a truly multifactorial setting. Using such an analysis, Gries shows that a variety of predictors reach standard levels of significance: Medium, Distance, CPrime:VLemmaID, CPrime:VFormID, CPrime:SpeakerID, all of which have the expected effects; the overall model, while including only priming predictors, is significant with an R² of 0.25 and a classification accuracy of 63.7%; again, this model strictly speaking suffers from a dependence-of-data-points issue; cf. note 4.
Third and most importantly, however, one can compute a generalized linear mixed effects model (GLMM), i.e., a regression that, here minimally, includes varying adjustments to intercepts for corpus files (to heuristically approximate authors/speakers and maybe registers) and verbs (for lexically-specific effects) and that, therefore, addresses the fact that the data points are not independent (see also e.g., Baayen, Davidson & Bates, Reference Baayen, Davidson and Bates2008; Jaeger, Reference Jaeger2008, in which similar statistical techniques are introduced with reference to experimental data). Choosing this statistically most appropriate tool has two important consequences:
-
− the coefficients of the predictors remaining in the final model are much more precise, boosting the classification accuracy to 89.8%;
-
− since now the varying adjustments (significant according to Likelihood ratio tests) ‘take care of’ many idiosyncratic effects, the number of significant fixed-effect predictors is much smaller: the only relevant effect now is CPrime:VFormID; an additional comparison reveals that it is especially medium-frequency verbs whose classification accuracy is boosted (often by more than 50%).
While Gries (Reference Gries and Schönefeld2011 [2008]) does not advance priming research much in terms of relevant predictors, it does show the importance attached to the choice of statistical methods: accounting for lexically-specific and speaker-specific variation enhances the statistical model's precision, but may also lead to alternation and priming predictors not reaching standard levels of significance anymore.
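In schematic form, a GLMM with varying intercepts for corpus files and verbs can be written as follows. The notation is introduced here purely for illustration and is not taken from Gries's paper: y_i codes the constructional choice in target i, x_ik are the fixed-effect predictors, and f(i) and l(i) pick out the corpus file and the verb lemma of data point i.

```latex
\operatorname{logit} P(y_i = 1) = \beta_0 + \sum_{k} \beta_k x_{ik} + u_{f(i)} + v_{l(i)},
\qquad u_f \sim \mathcal{N}(0, \sigma_u^2), \quad v_l \sim \mathcal{N}(0, \sigma_v^2)
```

The adjustments u and v absorb file-specific and verb-specific idiosyncrasies, which is why fewer fixed-effect predictors may remain significant once they are included.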
3.2.4. Interim summary of early and second-wave studies
The above shows how corpus-based priming research has become increasingly sophisticated. Gries (Reference Gries2005) was the first multifactorial corpus-based study of priming and the first to study lexically-specific effects in more detail, but did not include sufficiently many alternation predictors and used a sub-optimal statistical tool (ANOVA). Szmrecsanyi's work improved on this by including alternation predictors as well as β-persistence and using logistic regression modeling, but did not take lexically-specific and speaker-specific variation into consideration to any great extent. Gries (Reference Gries and Schönefeld2011 [2008]) then again did not include alternation predictors, but showed how GLMMs help to deal with lexically-specific and speaker-specific variation and to identify truly relevant determinants of priming, that is, determinants that remain significant in a model that includes lexically-specific and speaker-specific effects.
All of these studies show that priming can be studied corpus-linguistically, that such studies do not necessarily inflate priming results as may have been feared (because of the noisiness and collinearity that are much more characteristic of corpus data than of experimental data), and that different types of persistence may be distinguished. In addition, corpus data allow the researcher to study more words, prime-target distances, registers, or any other kind of moderator variables than most experimental studies would, as well as to explore the phenomenon in ecologically more valid scenarios: one can easily include lexically-specific frequencies and baseline frequency effects in the analyses and avoid exposing subjects to unnatural stimuli or stimulus distributions potentially leading to within-experiment learning effects (e.g., Schütze, Reference Schütze1996: Section 5.2.3, Gries & Wulff, Reference Gries and Wulff2009, Jaeger, Reference Jaeger2010, Doğruöz & Gries, Reference Doğruöz and Gries2012, Torres Cacoullos & Travis, Reference Torres Cacoullos and Travis2013, and others), which should therefore also be included in the statistical modeling of priming effects in experimental studies (see e.g., Kootstra & Doedens, Reference Kootstra and Doedens2016, for an example of this). Given the resulting complexity, the movement towards GLMMs, which is now also becoming the standard in experimental studies, is a welcome development: they
-
− avoid conflating individual data points into proportions per lexical item and/or participant which make it difficult, for instance, to explore within-subject accumulative priming effects of the type explored by Gries and Wulff (Reference Gries and Wulff2009);
-
− avoid different ANOVAs on different constructional choices (as in Savage et al., Reference Savage, Lieven, Theakston and Tomasello2003) or successive experiments by allowing one to combine datasets and probe interactions between the predictors and a variable coding for datasets; the corpus-linguistic parallel to this would be to not do separate analyses on different speakers or different corpora, but include indicator variables for corpora and speakers as predictors or random effects;
-
− avoid unnecessary methodological decisions such as the factorization of numerical data;
-
− provide a state-of-the-art approach towards handling data points that exhibit dependencies including crossed random effects (speakers and/or lexical items) as well as nested random effects (registers/conversations/speakers) and can handle data even if they violate assumptions of repeated-measures ANOVAs (such as sphericity).
Current corpus-based studies of priming address many of these issues in many ways to be discussed now.
3.3. Recent and current developments
This section discusses a variety of recent and current – third-wave, so to speak – corpus-based studies of priming. A first sub-section is devoted to discussing proposals for additional variables thought to be correlated with priming (Section 3.3.1); another is concerned with recent developments broadening the scope of priming studies in various ways (Section 3.3.2).
3.3.1. Additional variables: similarity, cumulativity, surprisal
One interesting extension of the second-wave work discussed above (in particular Szmrecsanyi's) is the second case study of Snider (Reference Snider, Taatgen and van Rijn2009),Footnote 5 in which he explores in more detail the role of similarity in priming. More specifically, a variety of studies (e.g., Pickering & Branigan, Reference Pickering and Branigan1998; Gries, Reference Gries2005, Reference Gries and Schönefeld2011; Szmrecsanyi, Reference Szmrecsanyi2005, Reference Szmrecsanyi2006) have shown that prime-target similarity enhances priming, but the focus of these studies was essentially on the variable VLemmaID. Snider's approach to similarity is much more global: he compares each prime to the corresponding target using the multi-feature Gower metric as a distance measure, which can compare the similarity of two objects based on many categorical and/or numeric features characterizing the objects. He proceeds to test the hypothesis that “two exemplars that are more similar in the sense that they share more features and have a lower [distance] between them, are more likely to prime” (p. 818), where the features included in his data were all predictors of the dative alternation that Bresnan et al.’s (Reference Bresnan, Cueni, Nikitina, Baayen, Bouma, Krämer and Zwarts2007) data were annotated for.
He reports results of a GLMM that included all predictors of the dative alternation and CPrime:VLemmaID, but also the interaction CPrime:GowerDistance. Wald z-scores and Likelihood-Ratio tests revealed that, as hypothesized, CPrime:GowerDistance had a significant effect: “[when] the prime construction is PO, the PO construction is 10.6 times more likely in the target for every one-unit decrease in the distance between the prime and target in feature space” (p. 819), and this is above and beyond the effect of CPrime:VLemmaID. This is an interesting finding because the generality of this similarity effect can help understand the more specific effects of VFormID and VLemmaID reported in earlier studies and may perhaps also be related to Szmrecsanyi's β-persistence (a maybe tenuous connection, however, that Snider does not make).
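A Gower-type distance over mixed features can be sketched as follows. This is a minimal, unweighted version without missing-value handling; the feature names and values are hypothetical and do not reproduce Snider's actual feature set.

```python
def gower_distance(x, y, numeric_ranges):
    """Average per-feature dissimilarity between two exemplars.

    x, y: dicts mapping feature name -> value (same keys in both).
    numeric_ranges: dict mapping each numeric feature to its range (max - min)
    in the data; features absent from it are treated as categorical.
    """
    total = 0.0
    for feature in x:
        if feature in numeric_ranges:
            rng = numeric_ranges[feature]
            # Numeric feature: absolute difference scaled by the observed range
            total += abs(x[feature] - y[feature]) / rng if rng else 0.0
        else:
            # Categorical feature: 0 if identical, 1 otherwise
            total += 0.0 if x[feature] == y[feature] else 1.0
    return total / len(x)

# Hypothetical annotations of a prime and a target
prime = {"verb": "give", "recipient_pronominal": True, "theme_length": 2}
target = {"verb": "give", "recipient_pronominal": False, "theme_length": 5}
d = gower_distance(prime, target, numeric_ranges={"theme_length": 10})
```

A distance of 0 means that prime and target agree on every annotated feature; larger values mean less similarity and, on Snider's hypothesis, less priming.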
Another extension of previous work is pursued in Jaeger and Snider (Reference Jaeger, Snider, Love, McRae and Sloutsky2008), which explores the notion of Cumulativity, i.e., the degree to which there is syntactic priming “beyond the most recent structure” (p. 1062), which would be unexpected by transient activation accounts of priming (which assume that priming effects are relatively short-lived). Again, while they do not make that connection, this is essentially the type of cumulativity already alluded to much earlier by Weiner and Labov (Reference Weiner and Labov1983) (and later studied by Gries and Wulff, Reference Gries and Wulff2009, under the name SelfToRatio). Their case study of the voice alternation, passivizable actives and passives from the Penn Treebank that had a preceding prime and involved verbs occurring >9 times in the data, showed that priming is sufficiently long-lived and cumulative, which relates back to Weiner and Labov's proposals and also makes a convincing case for the fact that “cumulativity of syntactic persistence cannot be reduced to the rather unnatural distributions of structures that participants were exposed to in previous laboratory experiments” (p. 1064).
The final extension to be mentioned here is concerned with a currently hot notion in psycholinguistic research on production and comprehension: surprisal. Following Hale (Reference Hale2001:4), who in turn refers back to work as early as Attneave (Reference Attneave1959), surprisal can be defined as “the combined difficulty of disconfirming all disconfirmable structures at a given word” or, more mathematically, $\log_2\bigl(1/p(\mathrm{word}_i \mid \mathrm{word}_{i-1}, \ldots, \mathrm{context})\bigr)$ or, equivalently, $-\log_2 p(\mathrm{word}_i \mid \mathrm{word}_{i-1}, \ldots, \mathrm{context})$; thus, surprisal is a heuristic measure of processing difficulty. In the same study of the voice alternation, Jaeger and Snider (Reference Jaeger, Snider, Love, McRae and Sloutsky2008) hypothesize that surprisal manifests itself in priming as the degree to which a construction primes more if it is less expected given its (lexical) context. It is worth pointing out that their notion of Surprisal is essentially the application of the earlier-discussed notion of VTargetPref, which is the constructional preference of the verb in the target, to the prime, i.e., VPrimePref: Surprisal/VPrimePref essentially quantifies the constructional preference of the verb in the prime and is statistically strongly correlated with the association of the verb used in the prime to the two constructions. Jaeger and Snider (Reference Jaeger, Snider, Love, McRae and Sloutsky2008) indeed found a surprisal-sensitivity effect for passives, but not for (the much more frequent) actives (see also Chang, Dell & Bock, Reference Chang, Dell and Bock2006, or Reitter et al., Reference Reitter, Keller and Moore2011). This is an interesting account of the inverse frequency effect reported in priming studies and provides further evidence for lexically-specific effects (see Bernolet & Hartsuiker, Reference Bernolet and Hartsuiker2010, for an experimental approach to the same question).
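As a rough illustration of the definition, the surprisal of a construction given the verb can be estimated from relative frequencies. The sketch below assumes simple, nonzero corpus counts (all invented) and a hypothetical function name; the estimation procedure actually used by Jaeger and Snider may differ.

```python
import math

def surprisal_bits(count_verb_in_construction: int, count_verb_total: int) -> float:
    """Surprisal, in bits, of a construction given the verb: -log2 p(construction | verb)."""
    p = count_verb_in_construction / count_verb_total
    return -math.log2(p)

# Invented counts: a verb attested 100 times, 5 of them in the passive
passive_surprisal = surprisal_bits(5, 100)   # rare structure, high surprisal
active_surprisal = surprisal_bits(95, 100)   # frequent structure, low surprisal
```

On the surprisal hypothesis, the passive prime in this toy example would be expected to prime more strongly than the active one.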
In addition to this study, Jaeger and Snider (Reference Jaeger and Snider2013) revisited the notion of surprisal in a larger theoretical context and tested it on, again, Bresnan et al.’s (Reference Bresnan, Cueni, Nikitina, Baayen, Bouma, Krämer and Zwarts2007) dative alternation data. They explored the hypothesis that “the strength of syntactic priming in language production is a function of the prediction error – “the deviation between what is observed and expectations prior to the observation” – given context-dependent expectations given both prior and recent experience” (p. 60). They analyzed approximately 1000 instances of the dative alternation annotated with regard to twelve alternation predictors (including semantic and structural properties of the theme and the recipient) as well as seven priming predictors: CPrime, VLemmaID, Distance (and their interactions), Cumulativity, and Surprisal/VPrimePref. The results showed that all alternation predictors work as expected but, more importantly, there was a marginally significant effect of CPrime and a significant effect of CPrime:Surprisal/VPrimePref: “the more surprising a PO prime, the more likely the target is to be a PO, but [. . .] the more surprising the DO structure, the more likely it is to be repeated, which means a PO is less likely in the target” (p. 64). This is compatible with their theoretical account in terms of comprehenders continuously adapting their expectations about incoming signals to deal with complex and noisy input in linguistic communication settings.
3.3.2. Broadening the scope
A final development to discuss is that research on priming has widened in scope considerably. For instance, all work discussed so far is concerned with priming by fluent native speaker adults, but there are now some first studies of priming effects in other speakers such as during first language acquisition in children. Gerard, Keller and Palpanas (Reference Gerard, Keller, Palpanas, Ohlsson and Catrambone2010) appear to be the first to study priming in L1 acquisition using corpus data (see Huttenlocher, Vasilyeva & Shimpi, Reference Huttenlocher, Vasilyeva and Shimpi2004, for an early experimental study). They tested two hypotheses, (i) that overall priming increases with age (as more abstract syntactic representations become available, which is the opposite of the prediction that structural priming effects are larger in less skilled speakers and in children; cf. Flett, Reference Flett2006; Messenger, Branigan & McLean, Reference Messenger, Branigan and McLean2011) and (ii) that the lexical boost effect – essentially the interaction CPrime:LemmaID or more general similarity between prime and target – decreases with age (as children become less dependent on particular lexical items in their production). In their first case study, Gerard et al. explored the voice alternation on the basis of data from the CHILDES database (covering children between the ages of 2;0 and 7;6); approximately 400 data points were annotated for several priming predictors: CPrime, Age (the age of the child with precision to the day), LexBoost (the ratio of the number of words in common between prime and target to the total number of words in the target), and SpeakerID. Interestingly, they used a GLMM with nested random effects: varying adjustments to intercepts for each child, which were nested into the part of the CHILDES database (to account for potential effects of annotation (dis)preferences). 
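The LexBoost ratio described above can be sketched as follows. The sketch assumes whitespace-tokenized, case-sensitive word forms and counts target tokens (rather than types) that recur in the prime; these choices, like the example sentences, are assumptions and need not match Gerard et al.'s operationalization.

```python
def lex_boost(prime_words, target_words):
    """Share of target tokens whose word form also occurs in the prime."""
    prime_vocab = set(prime_words)
    shared = sum(1 for w in target_words if w in prime_vocab)
    return shared / len(target_words)

# Invented prime and target utterances
prime = "the ball was kicked by the boy".split()
target = "the cat was chased".split()
score = lex_boost(prime, target)   # "the" and "was" recur: 2 of 4 target tokens
```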
They found a priming effect in combination with similarity in the shape of a significant interaction CPrime:LexBoost, but no effect at all of Age (neither as a main effect nor in an interaction); the former is compatible with, but the latter is in contrast to, the experimental findings of Rowland et al. (Reference Rowland, Chang, Ambridge, Pine and Lieven2012), who also found a facilitative effect of lexical similarity increasing with age, but a non-significant tendency for priming to decrease with age; Gerard et al. speculate that the lack of an age effect in their study may in part be due to the extreme rarity of passives in the corpus data.
Another way in which research has become broader is in terms of how priming is studied. The vast majority of the existing work has been on one particular constructional alternation – the voice and the dative alternation being the key objects of study – but in the last few years research has turned to studying the repetitions of any kind of phrase structure rule, not just semantically/functionally near-equivalent constructions. The first study of this kind seems to be Reitter, Moore and Keller (Reference Reitter, Moore and Keller2006), who explored the Switchboard corpus and the HCRC Map Task corpus with regard to repetitions of phrase structure rules (excluding verbatim repetitions of phrases). GLMMs were used to model the frequencies of target rules given prime rules at different values of Distance (measured in utterances or seconds); the hypothesis is that p(target|prime, Distance d) is greater than p(target|¬prime, Distance d). In spontaneous conversation, they found within-speaker priming, but not between-speaker priming, an overall effect of log(distance), and an interaction of Distance with Role (within vs. between speakers); in task-oriented conversation, they found priming and Distance effects, but no interaction of the two; on the whole, measuring Distance in seconds or utterances made no large difference.
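The comparison of p(target|prime, Distance d) with p(target|¬prime, Distance d) can be illustrated with raw relative frequencies. The sketch below is a strong simplification under stated assumptions: utterances are represented as sets of rule labels, the data are invented, and no regression modeling is involved, unlike in Reitter et al.'s GLMMs.

```python
def repetition_probs(utterances, distance):
    """Estimate p(target | prime) and p(target | no prime) at a fixed distance.

    utterances: list of sets of phrase structure rules, one set per utterance.
    For every rule and every utterance i, the prime is present if the rule
    occurs in utterance i - distance; the target if it occurs in utterance i.
    """
    rules = set().union(*utterances)
    primed = primed_hits = unprimed = unprimed_hits = 0
    for i in range(distance, len(utterances)):
        for rule in rules:
            in_target = rule in utterances[i]
            if rule in utterances[i - distance]:
                primed += 1
                primed_hits += in_target
            else:
                unprimed += 1
                unprimed_hits += in_target
    return primed_hits / primed, unprimed_hits / unprimed

# Invented toy dialogue: each utterance reduced to the rules it uses
utterances = [
    {"NP -> DT NN", "VP -> VB NP"},
    {"NP -> DT NN", "VP -> VB NP"},
    {"NP -> DT NN", "PP -> IN NP"},
    {"NP -> PRP", "PP -> IN NP"},
]
p_primed, p_unprimed = repetition_probs(utterances, distance=1)
```

In this toy example a rule is twice as likely to appear when it was used in the immediately preceding utterance (2/3 vs. 1/3), which is the pattern the priming hypothesis predicts.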
Dubey, Keller and Sturt (Reference Dubey, Keller and Sturt2008) is a similar case in point. In their first corpus study, they explored priming of coordinate NPs (i.e., also a specific construction) in corpus data (the Wall Street Journal corpus and the Brown corpus in the Penn Treebank, Rel. 2), but in their second corpus study, they went beyond coordinate structures and explored six NP categories in any kind of structure but the already studied coordinate ones, both within and across sentences. They, too, used a regression approach and found priming effects in both corpora for most NP categories; overall, the priming effect appears to be stronger in the coordinate structures than in arbitrary structural configurations, an effect that is compatible with Snider's (Reference Snider, Taatgen and van Rijn2009) finding regarding similarity.
Case study 2 of Gerard et al. (Reference Gerard, Keller, Palpanas, Ohlsson and Catrambone2010) is an interesting application of the priming-of-arbitrary-structures approach to L1 acquisition. They identified all structures from the CHILDES database that consist of three levels in their dependency parse and occur 20 or more times in the corpus, which left approximately 4300 unique structures for the analysis. Each of these structures was a target that may or may not have been primed from somewhere in the preceding 15 utterances; complete lexical repetitions were discounted. A GLMM was fit with a binary dependent variable (coding whether or not repetition of structure was observed at some distance) and with the predictors Distance, Age, SpeakerID, LexBoost, and Frequency (the logged frequency of the structure in the corpus). The results showed that several predictors' interactions with Distance were significant. Space does not permit a discussion of all findings so let us mention only that Distance:Frequency showed that less frequent structures show strong adaptation (reminiscent of Jaeger & Snider's work on surprisal), and that Distance:Frequency:Age revealed that “the inverse-frequency effect is stronger for older children than for younger children” (p. 222).
A study that goes beyond this is Moscoso del Prado (Reference Moscoso del Prado2013). He used data from the Tübingen Spoken Treebanks of manually tagged and parsed natural dialogs in English, German, and Japanese to highlight a shortcoming of the vast majority of experimental studies, namely their failure to consider the fact that turn-taking in natural dialogue is so tightly organized: given that dialog turns are usually either perfectly synchronized or even overlap in time, a speaker must have begun planning his turn before the interlocutor has finished speaking. Thus, incoming syntactic structures may not arrive in time for inclusion in the next turn, which should be reflected in a time-sensitive effect of SpeakerID. Among many other things, he indeed finds that comprehension-to-production priming of arbitrary structures is delayed by one sentence relative to production-to-production priming, with the results being comparable in all three languages studied. This effect may help understand both Weiner and Labov's (Reference Weiner and Labov1983) finding that the effect of givenness is strongest after a similar one-clause delay and Bock, Dell, Chang and Onishi's (Reference Bock, Dell, Chang and Onishi2007: 452f) finding that “the magnitude of the persistence effect was smallest at lag 0 [. . .] Other experiments have found similarly weakened immediate effects”. This means that delays in processing complex structures should be an important part of priming research, and that production and comprehension work in parallel during natural dialogue (for other work on priming of arbitrary structures, cf. Reitter, Reference Reitter2008, or Reitter et al., Reference Reitter, Keller and Moore2011).
4. From within-language priming to cross-language priming
The review in the previous section shows the potential of a corpus-based approach to structural priming. Especially over the last 10 years, corpus-based studies of priming have not only demonstrated that priming can be studied corpus-linguistically, but they have also injected a variety of ideas into priming research that have left their mark on the field:
-
− the exploration of Distance and Cumulativity in more fine-grained ways than experiments typically allow for, and their potential roles for distinguishing between transient-activation and implicit-learning accounts of priming;
-
− the recognition that priming effects are correlated with lexically-specific preferences of elements in the target (VTargetPref), which was later also observed for the prime (Surprisal/VPrimePref);
-
− the role of similarity for priming (from VLemmaID via LexBoost to Snider's use of a multi-feature similarity metric);
-
− the degree to which lexical and structural priming – in spite of their differences – may nonetheless be affected by similar characteristics or, more provocatively, to which priming effects are merely artificial/epiphenomenal (see Healey, Purver & Howes, Reference Healey, Purver and Howes2014);
-
− the exploration of priming effects that are not tied to one particular alternation alone;
-
− the possibility to explore priming in relation to language acquisition.
In addition, the field has benefited a lot from methodological advances in the statistical analysis of priming data. While mixed-effects and other kinds of regression modeling are of course applied to experimental and corpus data alike, they have perhaps been even more useful for corpus-based studies than for experimental studies, given that corpus data are often skewed and involve so many different covariates and potentially intervening variables. Mixed-effects models are able to handle these ‘noisy’ aspects of corpus data much better than the ANOVA approaches that dominated the field until 2008, and have thus contributed greatly to improving the validity of corpus-based structural priming research.
The many innovations that have taken place in corpus-based studies of priming are not only interesting in how they extend the scope of priming research descriptively – they are also interesting in how they provide a fertile ground for theorizing about the scope and function of priming. For example, a particularly well-known model is the Interactive Alignment Model (Pickering & Garrod, Reference Pickering and Garrod2004). This model provides a specification of the processing levels (semantic, syntactic, lexical, phonological, and phonetic) involved in producing and comprehending utterances in dialogue. It assumes that comprehension and production are based on activation of the same representations, which leads to interactive alignment when recently activated representations are re-used. Structural priming is just one (level of) manifestation of alignment between speakers and hearers in general. The interactive alignment model is fully compatible with some of the results discussed above, like Reitter et al.’s (Reference Reitter, Moore and Keller2006) finding of more priming in task-oriented than spontaneous conversation (more empirical support is discussed in Reitter & Moore, Reference Reitter and Moore2014). While this account is still being fine-tuned and explored empirically, it provides a promising framework in which to situate all sorts of priming effects, monolingual and cross-linguistic ones (see e.g., Fricke & Kootstra, Reference Fricke and Kootstra2016; Kootstra et al., Reference Kootstra, van Hell and Dijkstra2010).
This brings us to the final part of our paper: how do these developments relate to corpus-based approaches to cross-linguistic priming, and are there specific opportunities and/or limitations with respect to studying priming in bilingual corpora? Although anything that corpus-based approaches have so far brought to the study of within-language priming could in principle inform and carry over to cross-linguistic priming, there is an important complicating factor in bilingual corpora: bilingual corpora are ‘messier’ in that they contain a lot of code-switching and rarely contain the clean between-language prime-target sequences as they are investigated in experimental paradigms (see also Fricke & Kootstra, Reference Fricke and Kootstra2016). This necessarily calls for a broad approach to cross-linguistic priming, beyond syntactic choice from one language to the other in clean, unilingual sentences. While this can be seen as a limitation, it can also lead to an enrichment. That is, similar to the recent broadening of the scope of within-language priming studies (see Section 3.3.2), studies of bilingual priming with a different dependent variable than syntactic choice in unilingual sentences can provide insight into the generalizability of priming in bilinguals.
Another complicating factor about bilingual corpora is that annotation is not straightforward: even something as widespread as part-of-speech tagging may become very difficult in corpora that feature a lot of code-switching. In addition, many of the above variables may need to be considered from the perspectives of both languages involved: while the role and statistical treatment of some predictors (such as Cumulativity, Distance, SentenceLength, and the time course of interaction) may not change in a cross-linguistic priming setting, many others will have to be tweaked or extended considerably. One of the less complex changes involves an issue that has already been considered in experimental approaches: VLemmaID and VFormID now cannot literally evaluate whether lemmas/forms in prime and target are identical, but ‘only’ whether they are translation equivalents. A more complex change involves Surprisal/VPrimePref and VTargetPref, where researchers now would have to compute those for the relevant words and their constructional contexts in both languages to determine how speakers behave when the prime uses a construction with a verb v that is not strongly associated with it (meaning high surprisal, i.e., stronger prime strength), but when the lexical item one would most likely use for v in the other language has a different constructional preference (see Kootstra & Doedens, Reference Kootstra and Doedens2016, for how this was done in an experimental context). The most complex changes involve measuring the similarity between prime and target: it is not obvious how notions such as LexBoost or Snider's multi-feature similarity metric would be best applied in cross-linguistic priming studies.
Additional issues are concerned with the degree to which the speakers studied speak both languages: is one language dominant, if so by how much, and how can such knowledge be quantified reliably (see also Costa, Pickering & Sorace, Reference Costa, Pickering and Sorace2008)?
While these and other issues pose challenges, they are not insurmountable, and the complementary benefits that corpus data provide in relation to experimental data at least with regard to some of the above-mentioned issues should be incentives to try and tackle them. Indeed, the first corpus-based studies of cross-linguistic priming have now been done, and have been able to tackle some of the issues raised above. Most of this corpus-based research on cross-linguistic priming has been done by Torres Cacoullos, Travis, and colleagues (e.g., Torres Cacoullos & Travis, Reference Torres Cacoullos and Travis2011, Reference Torres Cacoullos and Travis2013, Reference Torres Cacoullos and Travis2016; Travis et al., Reference Travis, Torres Cacoullos and Kidd2017). They collected a large corpus of natural speech from New Mexican bilinguals, consisting of about 29 hours of speech from 41 bilinguals of varying ages (the NMSEB corpus; Torres Cacoullos & Travis, in preparation). Using this corpus, they focused on the overt expression of the Spanish first person singular subject pronoun (“yo”; subject pronouns are in many cases optional in Spanish, but nearly always expressed in English) as their dependent variable, thus broadening the scope of dependent variables that have been studied in cross-linguistic priming research. Building on innovations in statistical techniques, they analyzed their corpus on the basis of multivariate statistical techniques and, in their latest publication, on the basis of mixed-effects modeling. They found that the overt expression of “yo” was influenced by both within-language and cross-language priming, where cross-language priming tended to be weaker in strength and shorter-lived than within-language priming.
In addition, they found that priming effects depended on the type of verb that was used in the construction: ‘cognition verbs’ (e.g., think, know) were less susceptible to priming than other types of verbs, because the use of “yo” with these verbs is much more frequent than with other verbs. This finding relates closely to the verb-specific priming findings from the within-language priming literature discussed earlier in this paper, and shows how priming is influenced by usage-based patterns of language. Indeed, when it comes to theoretical advances, Torres Cacoullos and colleagues take an interesting perspective by explicitly connecting the psycholinguistic notion of priming with usage-based aspects of language, thereby broadening the scope of priming. A final notable finding from the work of Torres Cacoullos and colleagues is that the persistence of priming of “yo” appeared to depend on subject continuity in the conversation: priming was stronger when speakers continued to talk about the same referent. This is an important finding, because it shows that priming effects can be influenced by conversational factors, which are relatively difficult to investigate on the basis of experiments.
In addition to the work of Torres Cacoullos and Travis, the only other study specifically focused on cross-linguistic priming on the basis of a large-scale corpus that we know of is Fricke and Kootstra (2016). Fricke and Kootstra analyzed the Bangor-Miami corpus, a large corpus containing 56 conversations of about 30 minutes between 84 English–Spanish bilinguals from Miami, which has been automatically tagged for parts of speech and language membership of each word in each utterance (http://bangortalk.org.uk/). Building on previous experimental work on priming of code-switching (e.g., Kootstra, van Hell & Dijkstra, 2010, 2012), Fricke and Kootstra were interested in the question of whether priming of code-switching also occurs in spontaneous discourse, and especially whether the occurrence and grammatical form of code-switching are influenced by some of the same key priming variables as in within-language priming. Using mixed-effects modeling as their statistical technique, they found that both bilinguals’ tendency to code-switch and the grammatical aspects of the code-switched sentence (i.e., the matrix language) were influenced by both short-term and cumulative forms of priming and by lexical overlap between the prime and target. They also found that self-priming (production-to-production) tended to be stronger than other-priming (comprehension-to-production). Thus, building on recent developments from corpus-based priming research, Fricke and Kootstra found that a number of hallmark findings from the within-language priming literature also apply to bilinguals’ tendency to code-switch and to the grammatical form of the code-switched sentences they produced.
These findings not only enrich previous experimental research on cross-linguistic priming and code-switching, but also strengthen the generalizability of key findings from within-language priming research.
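As an illustration of how short-term priming, cumulative priming, and the self- versus other-priming distinction might be coded from a conversation, consider the following minimal sketch. It is not the coding scheme used by Fricke and Kootstra; the `Utterance` class, the function name, and the operationalizations (distance in utterances to the most recent code-switched utterance, a simple count of all preceding code-switched utterances) are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    speaker: str         # hypothetical speaker label
    code_switched: bool  # whether the utterance contains a code-switch


def priming_predictors(utterances, target_index):
    """Code illustrative priming predictors for the utterance at target_index.

    distance:   number of utterances back to the most recent code-switched
                utterance (None if there is none)
    self_prime: True if that prime was produced by the target's speaker
    cumulative: count of all code-switched utterances before the target
    """
    target = utterances[target_index]
    cumulative = sum(u.code_switched for u in utterances[:target_index])

    # Search backwards for the most recent code-switched utterance.
    prime_index = None
    for i in range(target_index - 1, -1, -1):
        if utterances[i].code_switched:
            prime_index = i
            break

    if prime_index is None:
        return {"distance": None, "self_prime": None, "cumulative": cumulative}

    return {
        "distance": target_index - prime_index,
        "self_prime": utterances[prime_index].speaker == target.speaker,
        "cumulative": cumulative,
    }
```

Predictors coded in this way could then serve as fixed effects in a mixed-effects logistic regression with whether or not the target utterance is code-switched as the outcome and, for example, speakers and conversations as random effects.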
A final bilingual corpus study that is worth mentioning is Myslín and Levy (2015). In a corpus of natural Czech–English bilingual discourse, Myslín and Levy used mixed-effects logistic regression to analyze what motivates bilinguals to switch languages in sentences. Although structural priming was not their primary variable of interest, they did find results that are strongly related to priming: focusing specifically on sentences in which the final word was code-switched, they observed that the language membership of earlier mentions of this final word in the previous discourse strongly influenced language choice in the target sentence. This influence of the language membership of previous mentions of a word can be seen as a form of lexical cohesion, which is strongly related to, for example, the priming results by Torres Cacoullos, Travis, and colleagues (e.g., Torres Cacoullos & Travis, 2011, 2013, 2016; Travis et al., 2017; see also Angermeyer, 2002).
All in all, it is evident that the number of bilingual corpora available is increasing, and that it is possible to perform quantitative analyses on them, using advanced statistical modeling. The studies discussed in this section show the potential of a corpus-based approach to priming in a bilingual setting. They inform many aspects of both unilingual priming research and research on cross-linguistic interactions in bilingualism, and show that results from the experimental literature on bilingual priming extend to real-life situations.
5. Conclusion
Fifteen years ago, corpus-based research on priming was virtually non-existent: not much corpus work had happened since the first wave of largely variationist studies of priming, whereas experimental research on priming was thriving. We think it is fair to say that this picture has changed considerably. Corpus-based priming research has developed tremendously, and conclusions drawn from this research now not only serve as a means of hypothesis generation for experimental priming studies, but actually enrich and validate structural priming research in many ways. What is more, as is evident from recent studies, it is entirely possible to study cross-linguistic priming on the basis of bilingual corpora, leading to new insights into priming that go beyond the idea of prime-in-language-A-only, target-in-language-B-only as it has been investigated in most experimental studies. Thus, observational data, their benefits, and the new methods they helped pioneer are here to stay and will hopefully continue to break new ground in priming research, be it experimental or observational, be it within languages or between languages.