Predicting head-marking variability in Yucatec Maya relative clause production

ELISABETH NORCLIFFE; T. FLORIAN JAEGER

doi:10.1017/langcog.2014.39

Published online by Cambridge University Press: 09 December 2014

ELISABETH NORCLIFFE*: Max Planck Institute for Psycholinguistics
T. FLORIAN JAEGER: Department of Brain and Cognitive Sciences, University of Rochester, and Department of Computer Science, University of Rochester
*Address for correspondence: Elisabeth Norcliffe, Max Planck Institute for Psycholinguistics, PO Box 310, 6500 AH Nijmegen, The Netherlands. e-mail: elisabeth.norcliffe@mpi.nl

Abstract

Recent proposals hold that the cognitive systems underlying language production exhibit computational properties that facilitate communicative efficiency, i.e., an efficient trade-off between production ease and robust information transmission. We contribute to the cross-linguistic evaluation of the communicative efficiency hypothesis by investigating speakers’ preferences in the production of a typologically rare head-marking alternation that occurs in relative clause constructions in Yucatec Maya. In a sentence recall study, we find that speakers of Yucatec Maya prefer to use reduced forms of relative clause verbs when the relative clause is more contextually expected. This result is consistent with communicative efficiency and thus supports its typological generalizability. We compare two types of cue to the presence of a relative clause, pragmatic cues previously investigated in other languages and a highly predictive morphosyntactic cue specific to Yucatec. We find that Yucatec speakers’ preferences for a reduced verb form are primarily conditioned on the more informative cue. This demonstrates the role of both general principles of language production and their language-specific realizations.

Keywords

cross-linguistic sentence production; communicative efficiency; morphosyntactic variation; head-marking; Yucatec Maya

Type: Research Article
Information: Language and Cognition, Volume 8, Issue 2, June 2016, pp. 167-205

DOI: https://doi.org/10.1017/langcog.2014.39
Copyright: Copyright © UK Cognitive Linguistics Association 2014

1. Introduction

A definitional property of any semiotic system is that signals map onto meanings. In the case of human language, this mapping is not always isomorphic: signals that are formally distinct can be near meaning-equivalent. A speaker of English may, with equal veridicality, announce either that she gave some money to a charity, or that she gave a charity some money (Bresnan, Cueni, Nikitina, & Baayen, 2007). Whether she expresses her intention to help the neighbor to shift the boxes, or to help the neighbor shift the boxes, her intention is arguably the same (Mair, 2002; Rohdenburg, 2006). Consequently, in the process of transforming a prelinguistic message into words, speakers must frequently select between alternate ways of expressing the same idea. Such ‘choice points’ have been of particular interest in language production research, because identifying the factors governing a speaker’s preference for one alternant over another can shed light on the linguistic encoding processes involved in speaking.

It is by now well established that production choices are affected by pressures inherent to the production system, such as the relative ease with which words and structures can be retrieved and assembled (see Jaeger & Norcliffe, 2009, for a review). For example, speakers prefer grammatical alternatives that allow easily retrievable referents to be mentioned earlier in their utterances (Bock & Irwin, 1980; Bock & Warren, 1985; Branigan, Pickering, & Tanaka, 2008; Prat-Sala & Branigan, 2000; Tanaka, Branigan, & Pickering, 2011). They also show a preference for mentioning optional elements, such as disfluencies (Clark & Fox Tree, 2002; Shriberg & Stolke, 1996) and function words (Ferreira & Dell, 2000; Jaeger, 2010; Jaeger & Wasow, 2006; Roland, Elman, & Ferreira, 2006) when upcoming information is more difficult to plan.

A central question in language production research is the extent to which linguistic encoding processes are also influenced by pressures beyond those inherent to the planning and retrieval of elements for production. Specifically, do the communicative goals of the speaker influence encoding processes, and consequently also have an impact on production choices? Transmitting a message in a way that will be intelligible to the hearer is, of course, a prerequisite for successful communication, and it is generally agreed that some aspects of utterance planning must therefore be influenced by the intention to be understood. For example, speakers will typically speak louder in noisier environments (van Summers, Pisoni, Bernacki, Pedlow, & Stokes, 1988) or modify their vocabulary and speech style when speaking with children (Snow, 1977). While such adjustments are arguably the consequence of communicative goals, it is less clear whether these goals can influence the decisions made during incremental linguistic encoding. These linguistic encoding processes are standardly assumed to be largely automatic and information encapsulated (Bock & Levelt, 1994; Levelt, 1989). In particular, the stage of linguistic encoding we focus on in this paper, grammatical encoding, is often assumed to be largely, or solely, affected by pressures inherent to production planning (Arnold, 2008; Ferreira, 2008; Ferreira & Dell, 2000; MacDonald, 2013).

An alternative view holds that communicative goals do influence linguistic encoding, including grammatical encoding. Specifically, the view we pursue here holds that speakers aim to strike an efficient balance between ease of production and robust transfer of the intended message (Jaeger, 2013). We will refer to this as communicatively efficient language production or, in short, communicative efficiency. Intuitively, an efficient communicative system trades off the average effort and time required during the en- and decoding of the message against the average likelihood of successful transfer of that message (for related perspectives, see Lindblom, 1990; Pellegrino, Coupé, & Marsico, 2011; Zipf, 1949). This intuition can be formalized in terms of information theory (Shannon, 1948) and related mathematical frameworks for communication (for examples and discussion, see Ferrer i Cancho, 2005; Ferrer i Cancho & Díaz-Guilera, 2007; Genzel & Charniak, 2002; Levy & Jaeger, 2007; Maurits, Perfors, & Navarro, 2010; Piantadosi et al., 2011; van Son & Pols, 2003). Here, we contribute to this literature by assessing the cross-linguistic applicability of communicative efficiency at the level of grammatical encoding.

The hypothesis of communicatively efficient language production has received support from a variety of studies that have linked the contextual expectedness of linguistic units to their realization. Specifically, these studies measure the information carried by linguistic units, which stands in a log-reciprocal relation to the contextual probability of the linguistic unit (Shannon, 1948). At the phonetic level, English speakers articulate highly informative words with longer duration and more articulatory detail (e.g., Aylett & Turk, 2004; Bell, Jurafsky, Fosler-Lussier, Girand, Gregory, & Gildea, 2003; Gahl & Garnsey, 2004; Pluymaekers, Ernestus, & Baayen, 2005). The same patterns have been found for the realization of individual segments even if the informativity of the word is held constant (Aylett & Turk, 2006; van Son & van Santen, 2005). Phonological processes, such as optional epenthesis processes in Dutch, have also been shown to be affected by informativity (Tily & Kuperman, 2012).
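
The log-reciprocal relation between information and contextual probability can be illustrated with a minimal sketch. The function name and the two probabilities below are illustrative assumptions and are not taken from any of the studies cited.

```python
import math

def surprisal_bits(probability: float) -> float:
    """Shannon information (surprisal) in bits: log2(1 / p)."""
    if not 0.0 < probability <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return math.log2(1.0 / probability)

# Hypothetical contextual probabilities (illustrative values only).
expected_unit = surprisal_bits(0.5)      # unit that is highly expected in context
unexpected_unit = surprisal_bits(0.125)  # unit that is less expected in context

# The less expected unit carries more information.
print(expected_unit, unexpected_unit)
```

On this measure, a unit that is certain in its context (probability 1) carries no information, which is the limiting case in which a reduced form loses nothing.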

The same preference for providing more linguistic signal when the meaning encoded by that form is contextually unexpected has been observed in the realization of referential expressions. For example, speakers are more likely to omit optional arguments when their semantics is inferable from the verb (Resnik, 1996; see also Brown & Dell, 1987). Similarly, speakers are more likely to choose more reduced expressions when the referent is expected in context (e.g., pronoun vs. lexical noun: Tily & Piantadosi, 2009; abbreviated vs. full nouns, like math vs. mathematics: Mahowald, Fedorenko, Piantadosi, & Gibson, 2013).

Finally, and most germane to the goals of this paper, there is also evidence that morphological and syntactic processes, which are standardly assumed to be part of grammatical encoding, are subject to similar preferences. For example, Frank and Jaeger (2008) investigated auxiliary and negation contraction in conversational English and found that speakers produced more reduced forms (e.g., doesn’t rather than does not) in contexts where the meaning was expected (for related data, see also Bybee & Schiebman, 1999). This preference seems to extend to the omission of optional function words that mark the beginning of constituents (Jaeger, 2006, 2010, 2011; Wasow, Jaeger, & Orr, 2011). For example, in many varieties of English, finite object-extracted relative clauses allow the omission of the relativizer that, as in (1):

(1) This is the cake [_RC(that) I like best]

When such a relative clause is expected, given the type of determiner, adjective, or head noun of the noun phrase it is modifying, speakers are less likely to produce the optional that (Jaeger, 2006; Wasow et al., 2011). Consider, for example, the difference between (2a) and (2b):

(2) a. That’s the best thing (that) they found
b. That’s a good book (that) they found

Example (2a) differs from (2b) in the definite determiner, the superlative adjective, and the semantically light head noun. In a corpus of conversational English, Wasow et al. (2011) found that all three factors increased the probability that the noun phrase (e.g., the best thing) is modified by a relative clause. For all three factors, increased probability of a relative clause also correlated with a lower preference of speakers to produce that (for the latter correlation, see also Fox & Thompson, 2007).

These studies support the idea that the cognitive systems underlying human language production exhibit computational properties that facilitate an efficient trade-off between ease and robust information transmission (for further discussion, see Jaeger, 2013). This view bears a strong affinity to functional approaches to typology and language change, where it has been argued that the shapes of grammars themselves derive from frequency patterns in language use, via diachronic change (Bybee, 1988; Croft, 2000; Haiman, 1983; Haspelmath, 1999, 2004; Hawkins, 2004; Keller, 1994; Zipf, 1949). Over time, frequently co-occurring elements may fuse together and become phonologically reduced (Haspelmath, 2004; Traugott & Heine, 1991). These diachronic processes are argued to be motivated by cognitive principles of ‘economy’ or ‘utility’ (Givón, 1991, 1992; Haiman, 1983; Haspelmath, to appear; Zipf, 1949): because frequent concepts are more predictable, less effort is required to convey them to the hearer. The functional/typological notion of utility bears a strong conceptual resemblance to communicative efficiency, pointing to the possibility that the preferences observed during language production create a bias that, accumulated over generations, contributes to language change (see Jaeger & Tily, 2011, and references therein).
Preliminary support for this idea comes from recent research showing that lexicons (Piantadosi et al., 2011) and grammatical systems (Gibson, Piantadosi, Brink, Bergen, Lim, & Saxe, 2013; Maurits et al., 2010) across languages exhibit properties expected from communicatively efficient systems (for a critique, see Ferrer i Cancho & del Prado Martín, 2011). A particularly striking demonstration is provided by Wedel, Jackson, and Kaplan (2013), who find that the probability of phonological mergers (i.e., the loss of a phonological contrast) is dependent on the informativity of that contrast in the language, measured as the number of words that the contrast serves to distinguish (an overview of related findings is given in Hume & Mailhot, 2013).

Communicative efficiency therefore holds explanatory potential not just for patterns of real-time language use, but also for the shape of grammars. However, both the claim that the language production system is designed to be communicatively efficient, and the notion that this in turn can account for typological patterns, only go through insofar as it can be demonstrated that, across languages, speakers’ on-line production choices are regulated by communicative efficiency. As with psychological research more generally (Henrich, Heine, & Norenzayan, 2010), research on speaker preferences in production has been undertaken on only a handful of phenomena, and a handful of languages, chief among them English (for an overview, see Jaeger & Norcliffe, 2009). For information-theoretic approaches to language production in particular, the empirical base is further limited, since this work tends to require larger databases on the basis of which informativity can be estimated (Bell, Brenier, Gregory, Girand, & Jurafsky, 2009; Piantadosi et al., 2011; Resnik, 1996). To the extent that cross-linguistic investigations within information-theoretic frameworks exist, they have thus mostly focused on lexical and sublexical properties, i.e., levels of linguistic description for which units are more frequent (e.g., Graff & Jaeger, 2009; Pellegrino, Coupé, & Marsico, 2011; Piantadosi et al., 2011; Qian & Jaeger, 2012; Wedel et al., 2013).
Above the lexical level, some suggestive cross-linguistic support for communicative efficiency in fact comes from the comprehension research finding that comprehenders expect speakers to produce reduced forms when the meaning the form conveys is contextually expected, and less reduced forms otherwise. Odawa comprehenders, for example, appear to expect the production of full NPs as opposed to pro-dropped NPs in contexts where the referent in question has less expected combinations of semantic features (Christianson & Cho, 2009). Similarly, in Choguita Rarámuri (Uto-Aztecan), redundancy in morphological marking has been shown to help listeners only when the meaning conveyed by the morphology is unexpected in its context (Caballero & Kapatsinski, to appear).

Here, we aim to contribute to the cross-linguistic evaluation of the communicative efficiency hypothesis. Specifically, we investigate a type of morphosyntactic reduction in a head-marking language. Typologists draw a distinction between two types of morphosyntactic marking found across languages, reflecting whether syntactic relations within the phrase are coded on the head of the phrase (e.g., the verb), or on the dependents (e.g., the arguments of the verb; Nichols, 1986; Nichols & Bickel, 2011). ‘Dependent-marking’ languages, like Japanese, mark grammatical relations on dependents, via case marking. ‘Head-marking’ languages mark grammatical relations on heads. Head-marking languages have received relatively little attention in research on language production, in particular, in work on sentence production (for a notable exception, see Christianson & Ferreira, 2005). We take advantage of a cross-linguistically rare type of head-marking variability that exists in Yucatec Maya, to test the predictions of communicative efficiency against this understudied language type. Our results suggest that robust information transfer is a factor influencing morphosyntactic production in Yucatec, and thus support the idea that across languages of markedly different structural types, communicative efficiency influences production choices in real-time language use. Next we review the grammatical properties of Yucatec Maya relevant for our purpose. Then we spell out the predictions of communicative efficiency accounts for the alternation we investigate. Following that we present a sentence recall study that investigates speaker preferences in Yucatec sentence production.

1.1. Morphosyntactic variation in Yucatec Maya

Yucatec Maya is spoken by around 700,000 people in the Yucatán peninsula of Mexico, and in parts of Belize and Guatemala. The canonical word order of the language is generally taken to be VOS (Bohnemeyer, 2009), although the most frequently observed word order in transitive sentences with two lexically realized arguments in spoken narratives seems to be SVO (Gutiérrez-Bravo & Monforte, 2010).

Yucatec Maya is a strictly head-marking language, meaning that heads of phrases (e.g., verbs, possessed nouns, or prepositions) are morphologically marked for their dependents by means of person markers. For example, in the transitive clause in (3), the morpheme -ech indicates that the object is second person.^{Footnote 1} The lack of overt object marking in (4) indicates that the object of the transitive clause is third person. In both examples, the morpheme -u(y) indicates that the transitive verb’s subject is third person (the y which follows u only occurs when the verb begins with a vowel). The subject marker u forms a phonological word with the aspect marker that precedes it (in the examples below, the imperfective marker k-). This unit is sometimes described as a prefix (Hanks, 1990), sometimes as a clitic (Bohnemeyer, 2002; Lehmann, 1998; Verhoeven, 2007); this is indicative of the fact that it exhibits certain mixed properties of both classes. In the examples below, its semi-bound relationship to the verb is represented by the use of the equals symbol, following the standard in Mayan linguistics.

(3) k-uy=il-ik-ech le máak-o’
impv-3=see-inc-2.sg def man-part
‘The man sees you.’
(4) k-uy=il-ik le sina’an le máak-o’
impv-3=see-inc def scorpion def man-part
‘The man sees the scorpion.’

The variability we investigate relates to Yucatec’s head-marking. In subject-extracted relative clauses, the verb in the relative clause may take one of two forms. One form is the regular head-marked verb form of main clauses, as described above. The other form is morphologically simpler: the subject marker u is omitted, together with the preverbal aspect marking k-. This alternation is restricted to subject relative clauses in which the relative clause verb is transitive (i.e., clauses of the type the boy that ate the tortilla). According to native Yucatec speakers, under these grammatical conditions, the use of either verb form is acceptable, and there is no discernible difference in meaning between the two forms (Norcliffe, 2009b). In Mayanist linguistics, this alternation is referred to as the ‘Agent Focus’ alternation, and the simpler verb form is referred to as the ‘Agent Focus’ verb, because its use is restricted to syntactic contexts that involve the relativization or focusing of an agent (in this case, the subject of a transitive clause; see, e.g., Aissen, 1999). To highlight parallels between this morphosyntactic choice and the literature on syntactic reduction, we will refer to the simpler alternant as reduced and the morphologically complex verb alternant as full. Example (5) gives an example of a relative clause with the full verb form; (6) gives an example with the reduced alternant (glosses are simplified for readability).

(5) le máak k-uy=il-ik le sina’an
the man see_FULL the scorpion
‘The man sees the scorpion’
(6) le máak il-ik le sina’an
the man see_RED the scorpion
‘The man sees the scorpion’

1.2. Predictions

If Yucatec speakers follow the same principles of communicatively efficient language production that seem to drive speakers’ preferences in English and typologically related languages, then they should favor shorter expressions when less information is sufficient to successfully convey their intended message. Specifically, when producing relative clauses, speakers should be more likely to use the reduced relative clause verb when the information conveyed by the verbal morphology is more contextually expected.

Here we focus on the predictions this makes at the constituent level. In Yucatec Maya, the relative clause verb occurs at the onset of the relative clause. As in the case of optional that-mentioning at the onset of English object-relative clauses, Yucatec speakers should prefer to use the reduced form of the relative clause verb when the relative clause is more expected in its context.^{Footnote 2}

2. Experimental design

To assess whether Yucatec Maya speakers prefer to produce reduced verbs when relative clauses are more contextually expected, participants were asked to complete a sentence recall task, in which they heard sentences and later recalled them out loud. The central idea behind sentence recall is based on early findings that memory of syntactic form decays faster than that of the semantic gist of a sentence (Bates, Masling, & Kintsch, 1978; Sachs, 1967), thereby allowing the study of form alternations. This method is effective in eliciting examples of tightly controlled stimuli that are not easily elicited through more ecologically valid means, such as picture description. Ferreira and Dell (2000) employed a sentence recall paradigm like the one employed here to study English that-mention. Their findings have been confirmed by corpus studies of conversational speech (Jaeger, 2010; see also Roland, Dick, & Elman, 2007).

Inspired by the findings of Wasow et al. (2011), our design set out to manipulate the expectedness of a relative clause through the type of determiner or quantifier in the modified NP (definite vs. universal vs. indefinite). As is common for sentence recall experiments, this manipulation of interest was crossed with the recall factor, i.e., the relative clause verb form (reduced or full). This recall factor is not of theoretical interest. It merely serves to balance the number of times each item is presented with either of the two verb forms, thereby reducing the chance of floor or ceiling effects. An example item in all six conditions is given below, where boldface indicates the two manipulations.

UNIV-REDUCED
(7) Manuel-e’ t-uy=u’ub-ah tuláakal musiko pax-ik le marimba-o’
Manuel-top prv-3sg=listen-cmp univ musician play-inc def marimba-part
‘Manuel listened to every musician that was playing the marimba’
DEF-REDUCED
(8) Manuel-e’ t-uy=u’ub-ah le musiko pax-ik le marimba-o’
Manuel-top prv-3sg=listen-cmp def musician play-inc def marimba-part
‘Manuel listened to the musician that was playing the marimba’
INDEF-REDUCED
(9) Manuel-e’ t-uy=u’ub-ah hun túul musiko pax-ik le marimba-o’
Manuel-top prv-3sg=listen-cmp indef musician play-inc def marimba-part
‘Manuel listened to a musician that was playing the marimba’
UNIV-FULL
(10) Manuel-e’ t-uy=u’ub-ah tuláakal musiko k-u=pax-ik le marimba-o’
Manuel-top prv-3sg=listen-cmp univ musician impv-3sg=play-inc def marimba-part
‘Manuel listened to every musician that was playing the marimba’
DEF-FULL
(11) Manuel-e’ t-uy=u’ub-ah le musiko k-u=pax-ik le marimba-o’
Manuel-top prv-3sg=listen-cmp def musician impv-3sg=play-inc def marimba-part
‘Manuel listened to the musician that was playing the marimba’
INDEF-FULL
(12) Manuel-e’ t-uy=u’ub-ah hun túul musiko k-u=pax-ik le marimba-o’
Manuel-top prv-3sg=listen-cmp indef musician impv-3sg=play-inc def marimba-part
‘Manuel listened to a musician that was playing the marimba’
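
The crossing of the determiner manipulation with the recall factor can be sketched as follows. This is only an illustration of the 3 x 2 structure: the variable and function names are our own, and the sketch makes no claim about how items were actually assigned to lists or participants.

```python
from itertools import product

# Factor levels as described in the text; the label strings mirror the
# condition names used in examples (7)-(12).
DETERMINER_TYPES = ("UNIV", "DEF", "INDEF")  # manipulation of interest
VERB_FORMS = ("REDUCED", "FULL")             # recall factor (balancing only)

def build_conditions():
    """Cross the two factors to obtain the six conditions of one item."""
    return [f"{det}-{verb}" for verb, det in product(VERB_FORMS, DETERMINER_TYPES)]

conditions = build_conditions()
for label in conditions:
    print(label)
```

The loop lists the six condition labels in the same order in which the example item is presented.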

The three-way determiner contrast was designed as a means of distinguishing between two different instances of communicative efficiency accounts. Because Yucatec is a language for which no large-scale syntactically annotated corpora exist, alternative strategies are required to estimate relative clause probabilities. We took advantage of two cues to relative clause modification.

The first draws on previous work on that-omission in English described above (Wasow et al., 2011). Wasow and colleagues observed that the determiner or quantifier of a noun phrase predicts how likely the noun phrase is to be modified by a relative clause. For pragmatic reasons, universally quantified and definite NPs have a higher probability of being modified by a relative clause than indefinite NPs. Universal assertions expressed with the universal quantifiers all and every, for example, are usually true only if applied over restricted domains. Thus, (13a) is true for many more VPs than (13b). The use of a relative clause therefore allows speakers to avoid making overly general claims.

(13) a. Every linguist we know VP
b. Every linguist VP

For indefinites the opposite is true. (14a) is true for many more VPs than (14b), since (14a) is true if VP holds of any linguist, whereas (14b) is true only if it holds of a linguist we know.

(14) a. A linguist VP
b. A linguist we know VP

Finally, definite determiners generally indicate that the referent of the NP they introduce can be uniquely identified given the linguistic and non-linguistic context (Hawkins, 1978; Heim, 1982). Identifiability often requires further information about the referent than is expressed by the noun alone. A relative clause constitutes one means by which such additional information can be provided. Thus, (15b) can more successfully refer to a unique individual than (15a):

(15) a. The linguist
b. The linguist I told you about

Thus, the semantic/pragmatic restrictions of a, the, and every make them effective cues to the relative probability of a following relative clause. Based on the assumption that these pragmatics carry over into Yucatec, we predicted that the full verb form will be more likely when it occurs in a relative clause modifying an indefinite compared to definite and universally quantified NPs.^{Footnote 3} We had no clear expectations about differences between universals and definites (Wasow et al., 2011, did not report significant differences between these two).

The second cue to relative clause modification is Yucatec-specific, and pertains to the distribution of a clause-final particle that is only found in combination with NPs marked as definite with the determiner le. When a clause contains a definite (le-marked) NP, the clause is obligatorily followed by a deictic particle. In simple transitive clauses with a clause-final definite object NP, this means that the particle immediately follows the object NP. This is shown in example (16), where the deictic particle, -o’, is in bold face, and glossed part:^{Footnote 4}

(16) Manuel-e’ t-uy=u’ub-ah le musiko-o’
Manuel-top prv-3sg=listen-cmp def musician-part
‘Manuel listened to the musician’

When a relative clause modifies a definite NP, the deictic particle no longer occurs directly after the noun (here, musiko), but rather at the end of the entire modifying relative clause:

(17) Manuel-e’ t-uy=u’ub-ah le musiko pax-ik le marimba-o’
Manuel-top prv-3sg=listen-cmp def musician play-inc def marimba-part
‘Manuel listened to the musician that was playing the marimba’

The absence of a deictic particle after the definite noun (e.g., musiko in (17)) thus provides a cue that postnominal modification, such as a relative clause, is to be expected. Crucially, no such deictic particle is present in indefinite or universally quantified NPs:^{Footnote 5}

(18) Manuel-e’ t-uy=u’ub-ah tuláakal musiko
Manuel-top prv-3sg=listen-cmp univ musician
‘Manuel listened to every musician’
(19) Manuel-e’ t-uy=u’ub-ah hun túul musiko
Manuel-top prv-3sg=listen-cmp indef musician
‘Manuel listened to a musician’

In the case of universally quantified and indefinite NPs, the absence of the particle after the head noun is therefore uninformative: it should not lead to a greater expectation for an upcoming relative clause because these NP types never combine with the particle under any circumstances.
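
Why the absence of the particle is informative only after definite nouns can be sketched with Bayes' rule. All probabilities below are hypothetical placeholders chosen for illustration; no corpus estimates exist for Yucatec, and the function name is our own.

```python
def posterior_rc(prior_rc: float,
                 p_absent_given_rc: float,
                 p_absent_given_no_rc: float) -> float:
    """P(relative clause | no particle after the noun), by Bayes' rule."""
    evidence = (p_absent_given_rc * prior_rc
                + p_absent_given_no_rc * (1.0 - prior_rc))
    return p_absent_given_rc * prior_rc / evidence

PRIOR = 0.2  # hypothetical prior probability of a relative clause

# Definite NP (assumed values): the particle is always missing after the noun
# when a relative clause follows, but only rarely missing otherwise.
definite = posterior_rc(PRIOR, p_absent_given_rc=1.0, p_absent_given_no_rc=0.1)

# Universal or indefinite NP: the particle never occurs, so its absence is
# equally likely with or without a relative clause.
other = posterior_rc(PRIOR, p_absent_given_rc=1.0, p_absent_given_no_rc=1.0)

print(definite > PRIOR)   # absence raises the expectation of a relative clause
print(abs(other - PRIOR) < 1e-12)  # absence leaves the expectation unchanged
```

Whatever values are assumed, the posterior equals the prior whenever the two likelihoods are identical, which is the sense in which the cue is uninformative for universally quantified and indefinite NPs.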

Our logic regarding the relative informativity of the absence of the particle follows from two fairly uncontroversial assumptions. First, that word boundaries can typically be inferred with relative certainty through a mixture of bottom-up cues (e.g., word-level and phrase-level prosodic contours) and top-down knowledge (e.g., implicit knowledge about grammatical phonological sequences as well as the lexical inventory of a language). From this it follows that the absence of a particle can usually be inferred. Second, that comprehension is not noise-free. This means that no cue is recognized with absolute certainty, meaning that a cue can still be informative (and thus helpful) even after other cues have already provided the ‘same’ information. Thus, even though the speaker may already have begun to produce the relative clause by the time the comprehender has inferred the particle’s absence, this absence may nevertheless be informative about the relative clause that is under production. Indeed, it is by now rather broadly accepted in research on language comprehension that comprehension is not a deterministic serial search (for evidence, see, e.g., ‘right context effects’, reviewed in Dahan, 2010).^{Footnote 6}

Working from either the pragmatic or the morphosyntactic-cue based estimate, communicative efficiency theories predict that Yucatec Maya speakers should be less likely to choose the full verb form when the relative clause is headed by a definite NP by comparison with an indefinite NP, because, according to either estimate, relative clauses should be more expected as modifiers of definite NPs. Where the pragmatic and the morphosyntactic-cue based estimates diverge is in the predictions they lead to regarding universally quantified NPs. If our pragmatic-based estimate is most reflective of Yucatec speakers’ own probability estimates, then they should have a higher expectation for a relative clause after universally quantified NPs by comparison with indefinite NPs. Therefore, they should be less likely to choose a full verb form for relative clauses headed by universally quantified NPs than for those headed by indefinite NPs. If, however, the expectancy of a relative clause is driven more by morphosyntactic cues present in the utterance, rather than by general pragmatic biases, then we should not expect a difference in verb form choice for relative clauses headed by universally quantified NPs compared to those headed by indefinite NPs, because the morphosyntactic cue singles out definite NPs only.

The results of our experiment should therefore also shed light on the extent to which different types of cues to sentence structure affect language production. In Yucatec, the morphosyntactic cue is highly informative. If speakers condition their preferences between full and reduced forms differentially on different cues, based on the informativity that these cues carry about the relevant linguistic structure (in this case, the presence of a relative clause), we should see that Yucatec speakers are more likely to use the reduced form whenever a relative clause modifies a definite NP.

3. Method

3.1. Materials

The stimuli were developed by the first author in collaboration with a native-speaker consultant. Stimuli consisted of digitized recordings of spoken Yucatec sentences, read by an adult male native Yucatec speaker. They comprised twenty-four experimental items and thirty-two fillers.

For each item, the relative clause was embedded as the object of a transitive matrix clause. In all sentence frames the participants of the matrix clause were humans: the subject was a personal name, and the object of the matrix clause (the head of the relative clause) was a familiar occupation. The embedded object (the object of the relative clause, e.g. ‘marimba’ in (12)) was always inanimate. This property of the stimuli was critical in order to avoid global ambiguity. Yucatec relative clause constructions with full relative clause verbs can have either a subject or an object relative clause reading. Restricting our stimuli to event types involving human agents acting on inanimate objects eliminated the possibility of ambiguity. The definiteness of the embedded object nouns was balanced within items.

At least one filler intervened between any two of the items. The fillers were sentences consisting of intransitive clauses with locative or prepositional phrases. Six Latin-square designed lists were created so that each participant heard one and only one condition from each of the twenty-four items.
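The list construction described above can be illustrated with a minimal sketch. It assumes that the six lists correspond to six conditions formed by crossing three determiner types with two stimulus verb forms; this crossing, the condition labels, and the simple rotation scheme are assumptions made for illustration and are not a description of how the original lists were built.

```python
from collections import Counter
from itertools import product

# Assumed condition set: 3 determiner types x 2 stimulus verb forms = 6 conditions.
DETERMINERS = ["definite", "indefinite", "universal"]
VERB_FORMS = ["full", "reduced"]
CONDITIONS = list(product(DETERMINERS, VERB_FORMS))


def latin_square_lists(n_items, conditions):
    """Return one {item_index: condition} mapping per list.

    Conditions are rotated across items so that, within a list, every item
    appears in exactly one condition and, across lists, every item appears
    in every condition exactly once.
    """
    n_conditions = len(conditions)
    lists = []
    for list_index in range(n_conditions):
        assignment = {
            item: conditions[(item + list_index) % n_conditions]
            for item in range(n_items)
        }
        lists.append(assignment)
    return lists


lists = latin_square_lists(24, CONDITIONS)
per_list_counts = [Counter(assignment.values()) for assignment in lists]
```

With twenty-four items and six conditions, this rotation gives each hypothetical list four items per condition.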

3.2. Participants

Thirty-six Yucatec–Spanish bilingual speakers (23 women and 13 men) took part in the experiment. Participants were aged between 18 and 23 (mean = 19.7, SD = 1.3). All participants were undergraduate students at the Universidad de Oriente (UNO), a state university in Yucatán, located just outside of the town of Valladolid. UNO is attended by students living throughout the state of Yucatán, with the bulk of the population concentrated in Valladolid and surrounding villages. All participants were computer literate.

To assess participants’ daily exposure to Yucatec, we asked all participants to estimate the number of hours they spoke Yucatec each day (less than three, three–six, more than six). According to the estimates, 16% of participants spoke on average less than three hours a day; 14% spoke between three and six hours a day, and 70% spoke over six hours a day. We also gathered information about the village that each participant came from, as a potential indicator of micro-dialectal differences. In total, nineteen different villages were represented in the participant pool. Each participant was paid to take part in the experiment, which lasted no more than 45 minutes.

3.2.1. Procedure

The task was aurally presented sentence recall, programmed and run with Exbuilder (E. Longhurst, University of Rochester). Participants sat in front of a laptop computer in a private classroom, wearing a headset with a head-mounted microphone, while another participant (also a native speaker of Yucatec Maya) sat opposite them. Participants were instructed to direct their speech towards their partner, so that their partner could imagine the scenes being described. Each trial consisted of three parts:

1. Listen and repeat: the participant heard a sentence and immediately repeated it.
2. Video distractor task: the participant saw a short animated video clip of a simple event, which they described in a single sentence.
3. Sentence recall: the participant heard a prompt (a portion of the original sentence), and had to recall the original sentence from part one.

For each trial, the procedure was as follows: after pressing the space bar to initiate the experiment, participants proceeded to the first part of the trial, in which they first heard a sentence. When they were ready to speak, they advanced with another space-bar press. A green icon of a mouth appeared in the center of the screen, which signaled that they could begin speaking. They repeated the sentence, and then advanced by means of another space-bar press to the second part of the trial, the video distractor task. A brief (2–3 second) animated video was presented, which depicted a simple action scene, typically of a human agent acting on either an inanimate or an animate object (e.g., a man throwing a ball) (created using EFrontier’s Poser software).^{Footnote 7} Once the video ended, the green icon of a mouth appeared on the screen, which again signaled to the participants to begin speaking, this time in order to describe the video clip. Once they had completed their description, they advanced once more with a space-bar press, to part three of the trial. At this point, participants heard the sentence prompt, which was intended to provide them with a cue to recall the target sentence from part one of the trial. On critical trials, the prompt always consisted of the subject and the verb of the matrix clause of the target sentence. That is, if the original sentence was “Rodrigo laughed at the diner that spilled the drink”, the prompt was “Rodrigo laughed”. Prompts were cut from the original sound clip. Following the prompt, participants then recalled the complete target sentence out loud. They then advanced to the next trial by pressing the space bar. Responses (for all three parts of each trial) were recorded onto the computer.

The task began with a training session consisting of six practice sentences, none of which contained the target structure. Instructions were presented by the experimenter in Spanish. Once the experimenter was satisfied that the participant and his/her partner had understood the task and were comfortable with the computer commands, the participants completed the actual experiment.

3.2.2. Scoring and exclusions

All 864 experimental responses from the sentence recall part of the experiment were transcribed and scored by the first author, with the assistance of native Yucatec speaker consultants. To identify failures to recall the stimulus, we annotated whether the modified NP and the relative clause verb as well as the embedded NP in the relative clause were produced correctly. The exclusions of incomplete or inaudible responses resulted in 19.7% data loss. For the remaining data, scoring of the verb form revealed that participants produced one of the two intended forms (full or reduced) in 92.4% of all cases. Out of the 53 cases that contained alternative forms (e.g., involving different types of aspectual marking), only one was deemed sufficiently similar in meaning and behavior (involving t-, the perfective aspect marker, which can be omitted like k-). This left 651 cases.

Following previous work, we applied two additional criteria for inclusion of participant data in the analysis (Ferreira & Dell, 2000; Jaeger, 2010; Jaeger, Furth, & Hilliard, 2012). Neither of these criteria affected the results reported below. First, we only included participants that showed evidence of the alternation. That is, we excluded data from 7 participants who did not produce both the reduced and the full form at least once in the experiment. Additionally, we excluded data from 11 participants because of 50% or more data loss. This left 438 cases from 22 participants. The exclusions together resulted in 46.3% data loss overall, which is within normal bounds for sentence recall experiments (e.g., Ferreira & Dell, 2000).
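The two participant-level criteria can be sketched as a small filter. The data layout (a list of participant/verb-form pairs, with `None` marking a lost or unusable response), the function name, and the toy data are hypothetical; only the two criteria themselves come from the description above.

```python
from collections import defaultdict


def apply_participant_exclusions(trials, loss_cutoff=0.5):
    """Keep usable trials from participants who meet both inclusion criteria.

    trials: list of (participant_id, verb_form) pairs, where verb_form is
    'full', 'reduced', or None for a lost or unusable response
    (hypothetical layout).
    A participant is kept only if they produced both forms at least once
    and lost less than `loss_cutoff` of their responses.
    """
    by_participant = defaultdict(list)
    for participant_id, verb_form in trials:
        by_participant[participant_id].append(verb_form)

    kept = []
    for participant_id, forms in by_participant.items():
        usable = [form for form in forms if form is not None]
        loss = 1 - len(usable) / len(forms)
        shows_alternation = {"full", "reduced"} <= set(usable)
        if shows_alternation and loss < loss_cutoff:
            kept.extend((participant_id, form) for form in usable)
    return kept


# Toy data: p1 is kept, p2 never alternates, p3 loses exactly half the trials.
toy_trials = [
    ("p1", "full"), ("p1", "reduced"), ("p1", "full"), ("p1", None),
    ("p2", "full"), ("p2", "full"), ("p2", "full"), ("p2", "full"),
    ("p3", "full"), ("p3", "reduced"), ("p3", None), ("p3", None),
]
kept_trials = apply_participant_exclusions(toy_trials)
```

Because the cutoff is applied as "50% or more", a participant with exactly half of their responses lost is excluded in this sketch.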

Overall, the full form was produced more than the reduced form (254 full tokens vs. 184 reduced tokens). Speakers produced 3 instances of full verb forms with the perfective aspect marker t- (out of 273 full responses). We collapsed over these response types in our coding, treating both t- and k- as full responses. There was a high degree of individual variation in rates of full verb mentioning. The proportion of full verb productions ranged between 10% and 93% (median = 54%).

For the remaining analyzable cases, we annotated the article of the modified NP, which was manipulated by design. This annotation revealed that participants sometimes changed the determiner of the modified NP. Specifically, participants sometimes changed the form of the indefinite determiner from hun túul to hun p’éel. Indefinite determiners in Yucatec are composed of the unstressed numeral hun ‘one’, together with a classifier morpheme, the choice of which depends on the properties of the noun it combines with. Túul is used for ‘self-segmenting shapes’, and typically (but not categorically) combines with animate referents, while p’éel, the most neutral classifier, tends to be used by default for discrete inanimate referents (Lucy, 1992). The modified NPs in our stimuli were always animate, and were consistently presented with the classifier form túul. However, while human referents tend to occur with the túul classifier in natural speech, this is not a fixed grammatical rule.^{Footnote 8} Because both classifier forms in combination with the numeral hun are used to indicate indefinite reference, we kept both response types, treating both as indefinite.

Participants also frequently added a definite determiner to the universal quantifier (in 29% of the utterances), producing tuláakal le ‘all the’ instead of tuláakal ‘every, all’, as in (20) below. We decided to keep both response types. Given the relevance of the definite determiner for the predictions of communicative efficiency accounts, we return to this point below.

(20) Manuel-e’ t-uy=u’ub-ah tuláakal le musiko pax-ik le marimba-o’
Manuel-top prv-3sg=listen-cmp univ def musician play-inc def marimba-part
‘Manuel listened to all the musicians that were playing the marimba’

Table 1 summarizes the distribution of determiner types in the responses depending on the determiner presented in the stimulus.

Table 1. Counts of determiner types actually produced by the determiner type presented in the recall stimulus

3.3. Analysis

We employed mixed logit regressions (Breslow & Clayton, 1993; Jaeger, 2008) to analyze the probability of full over reduced relative clause verbs in the recall responses. All analyses were conducted using the lme4 package (Bates, Maechler, & Bolker, 2012) in R (R Development Core Team, 2005). All analyses reported below employed the maximum random effect structure justified by model comparison.^{Footnote 9} If not mentioned otherwise, the same results were obtained using maximal converging random effect structure. There were no signs of multicollinearity.
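For orientation, the general shape of a mixed logit model can be written as follows. This is a generic sketch and not the specific model fitted here: the symbols are introduced only for illustration, and the random-effect structure shown (by-participant and by-item intercepts only) is an assumption, since the structure actually used was selected by model comparison.

```latex
% Generic mixed logit model (illustrative notation, intercept-only random effects assumed)
% i indexes participants, j indexes items
\[
\log \frac{P(\text{full}_{ij})}{1 - P(\text{full}_{ij})}
  = \mathbf{x}_{ij}^{\top} \boldsymbol{\beta} + u_i + w_j,
\qquad
u_i \sim \mathcal{N}(0, \sigma_u^2), \quad
w_j \sim \mathcal{N}(0, \sigma_w^2)
\]
```

Here $\mathbf{x}_{ij}$ stands for the coded predictors for the response of participant $i$ to item $j$, $\boldsymbol{\beta}$ for the fixed-effect coefficients, and $u_i$ and $w_j$ for participant and item adjustments to the intercept. A positive coefficient corresponds to higher log-odds of a full verb form.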

3.4. Results

We present two types of analysis. In the first we treated the experimental design factors (determiner type of the modified NP presented) as predictors. That is, we analyze the data based on the properties of the original stimulus to be recalled. However, speakers sometimes deviated from the original form, while still producing the desired structure (see above). We therefore also analyzed the effect of the determiner actually produced by the speaker. For both sets of analyses, the effects reported below also held when additional controls were included in the analyses that accounted for potential confounds introduced by data loss (see ‘Appendix A’).

3.4.1. By design factors

In our first analysis, the determiner type of the modified NP presented was our predictor of interest. This was Helmert coded (contrasting definite vs. indefinite, and universal vs. the average of definite and indefinite). The recall variable (whether the full or reduced verb form was presented) was sum-coded.
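The coding scheme just described can be made concrete with a small sketch. The two contrasts are the ones named above; the particular numeric scaling, the sign conventions, and the names used in the code are assumptions chosen for illustration.

```python
# Helmert-style contrasts for the three determiner types (scaling and signs assumed):
#   contrast 1: indefinite vs. definite
#   contrast 2: universal vs. the average of definite and indefinite
HELMERT = {
    "definite":   (-0.5, -1 / 3),
    "indefinite": (0.5, -1 / 3),
    "universal":  (0.0, 2 / 3),
}

# Sum coding for the verb form presented in the stimulus (scaling assumed).
SUM_VERB = {"full": 0.5, "reduced": -0.5}


def encode(determiner, stimulus_verb):
    """Map one trial's design factors to its three numeric predictors."""
    contrast_1, contrast_2 = HELMERT[determiner]
    return (contrast_1, contrast_2, SUM_VERB[stimulus_verb])


# Each contrast column is centered, and the two columns are orthogonal.
column_sums = [sum(row[k] for row in HELMERT.values()) for k in range(2)]
dot_product = sum(row[0] * row[1] for row in HELMERT.values())
```

Under this scaling, a one-unit difference on the first contrast separates indefinite from definite NPs, and a one-unit difference on the second separates universal NPs from the mean of the other two types.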

As shown in Figure 1, full verb forms were produced less when the modified NP in the stimulus was definite, compared to when it was indefinite or universally quantified. In the model, this resulted in a significant effect of definiteness of the modified NP on verb form choice (p < .05; see Table 2). There was no significant effect of universal quantification of the modified NP. No interaction terms were significant. Table 2 and Figure 1 additionally show that there was a significant effect of the stimulus verb form on verb form choice: unsurprisingly, speakers were more likely to produce a full verb form when recalling a sentence if the critical verb in the stimulus was also a full verb form.

Fig. 1. Relative clause verb productions by the relative clause verb form and the NP head type presented during encoding.

Table 2. Mixed logit regression of relative clause verb productions by design factors

3.4.2. By modified NP type produced by speakers

Our second analysis was based on the type of modified NP speakers actually produced. As described above, participants frequently added a definite determiner to universally quantified NPs, producing tuláakal le ‘all the’ instead of tuláakal ‘all/every’. To assess the effects of both definiteness and universal, we considered cases of modified NPs that contained a universal quantifier (with or without an additional definite article) as ‘universal’ and treated cases of modified NPs that included definite articles (with or without the universal quantifier) as ‘definite’. These two contrasts were sum-coded, as was the recall variable (following the first analysis).
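The recoding of produced NPs into two crossed features can be sketched as follows. The determiner strings are the forms cited in the text; the function name, the token-based matching, and the ±0.5 sum-coding values are assumptions for illustration.

```python
def sum_code(flag):
    """Sum-code a binary feature as +0.5 (present) or -0.5 (absent); scaling assumed."""
    return 0.5 if flag else -0.5


def code_produced_np(determiner_phrase):
    """Return (definite, universal) sum-coded features for a produced determiner.

    An NP counts as 'universal' if it contains the quantifier tuláakal
    (with or without le) and as 'definite' if it contains the definite
    article le (with or without tuláakal).
    """
    tokens = determiner_phrase.split()
    is_definite = "le" in tokens
    is_universal = "tuláakal" in tokens
    return (sum_code(is_definite), sum_code(is_universal))


coded = {
    phrase: code_produced_np(phrase)
    for phrase in ["le", "tuláakal", "tuláakal le", "hun túul", "hun p’éel"]
}
```

Coding the two properties as separate features is what allows tuláakal le responses to contribute to the estimates of both definiteness and universal quantification, and makes their interaction estimable.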

As shown in Figure 2, definite determiner responses both without the universal quantifier (left panel) and with the universal quantifier (right panel) were associated with lower proportions of full verb forms. In the model (Table 3), this is reflected in definiteness once again emerging as a significant predictor of verb form choice: full verb forms were produced less with definite modified NPs (either universally quantified or not) compared to modified NPs that did not have a definite article (p < .05). Universal quantification was not itself a significant predictor of verb form choice, and neither was the interaction between universal quantification and definiteness. Finally, there was again a significant effect of the stimulus verb form on verb form choice: speakers were more likely to produce a full verb form when recalling a sentence if the critical verb in the stimulus was also a full verb form.

Fig. 2. Relative clause verb productions depending on relative clause verb form presented during encoding and the NP head type actually produced.

Table 3. Mixed logit regression of relative clause verb productions depending on definiteness and universal quantification of modified NP produced by speaker

4. Discussion

Standard psycholinguistic accounts of grammatical encoding assume that grammatical encoding is subject largely or only to pressures inherent to production planning, such as the retrieval of lexical and grammatical information (Arnold, 2008; Ferreira, 2008; Ferreira & Dell, 2000; MacDonald, 2013). An alternative view holds that even these highly automatic processes are affected, directly or indirectly, by communicative goals. This idea has received some support from a number of studies showing that morphological elements can be reduced or omitted if the meaning they encode is expected in context (e.g., negation and auxiliaries in conversational English: Frank & Jaeger, 2008; optional case-marking in Japanese: Kurumada & Jaeger, 2013). Similarly, optional function words have been found to be more likely to be produced when the constituent they introduce is unexpected in context (e.g., Jaeger, 2010, 2011; Wasow et al., 2011). The goal of the present work was to test the typological generalizability of this communicative efficiency hypothesis by investigating Yucatec Maya speakers’ preferences in the production of a typologically rare head-marking alternation that occurs in relative clauses. We found that speakers were more likely to choose longer (head-marked) verb forms over simpler forms when the relative clause presented in the stimulus or produced by speakers was headed by an indefinite NP rather than a definite NP. Under the assumption that relative clauses in Yucatec are less expected as modifiers of indefinite NPs (paralleling findings for English; Wasow et al., 2011), this result is consistent with the predictions of communicative efficiency: speakers choose longer forms when they provide an additional signal to less predictable constituents. Below, we first summarize our results in more detail. Then we discuss to what extent our results are compatible with standard accounts that attribute speakers’ preferences during grammatical encoding solely to production ease. We also discuss a potential alternative explanation of our results in terms of conventionalized probabilistic preferences, rather than on-line decisions during language production. We tentatively conclude that our findings cannot be reduced to any of these competing explanations. We close with a discussion of the mechanisms that allow grammatical encoding to strike an efficient balance between production effort and communicative goals.

In work carried out on better-studied languages, reliable estimates of frequency distributions of construction types can be obtained from large-scale syntactically annotated corpora. For under-studied languages like Yucatec, such corpora are often not available. We therefore considered two alternative means of estimating the contextual expectedness of relative clauses in Yucatec. First, Wasow et al. (2011) have argued for English that relative clauses are less likely following indefinite NPs by comparison with definite NPs, due to the particular semantic/pragmatic properties of the different determiner types. Because definite determiners generally serve to indicate that the referent of the NP is contextually unique, a relative clause provides a means of providing the additional information necessary to successfully refer to a unique individual. This is not necessary for indefinite NPs, which do not have this uniqueness restriction. On the assumption that these pragmatic effects are not language-specific, relative clauses should be less expected following indefinite NPs in Yucatec also. Second, in Yucatec, definite noun phrases are obligatorily marked with a phrase-final deictic clitic. If a definite noun phrase is modified by a relative clause, the clitic occurs at the end of the entire modifying relative clause, rather than directly after the noun. The absence of a particle after a noun is therefore a reliable signal to the likelihood of an upcoming relative clause in the case of definite NPs alone. On language-internal morphosyntactic grounds, then, definite NPs are more informative as to the likelihood of an upcoming relative clause than indefinite NPs.

While both the pragmatic-based and the morphosyntactic-based estimate of relative clause likelihood are consistent with the observed definiteness effect, it is the morphosyntactic-based estimate that best captures our results overall. On the pragmatic account, universally quantified NPs should also create a higher expectation for a relative clause than indefinite NPs: further restriction by means of a relative clause is preferable in the case of universally quantified NPs, because universal assertions are generally only true of restricted sets. In terms of the morphosyntax of Yucatec Maya, by contrast, universally quantified NPs pattern like indefinite NPs: the absence of the deictic particle is equally uninformative in both cases. The fact that we found no effect of universal quantification on verb form choice, independently of the definiteness effect, suggests therefore that speaker sensitivity to the relative expectedness of the relative clause is driven primarily by language-specific morphosyntactic cues available within the utterance, rather than by overall pragmatic biases associated with the different determiner types. Thus, our results suggest that Yucatec speakers prefer the reduced relative clause verb form whenever the relative clause is more probable in its context, as determined by Yucatec-specific morphosyntactic cues. This pattern of preferences is consistent with communicative efficiency accounts of sentence production.

If confirmed by future work, the preferential weighing of the morphosyntactic cue might suggest that speakers condition their production preferences more on more informative cues. While the informativity of cues has received some attention in research on language comprehension, little is known about how cue informativity affects language production (but see Jaeger, 2006; Post & Jaeger, 2010; Qian & Jaeger, 2012). Post and Jaeger (2010) investigated how the informativity of different cues to a word’s identity affected that word’s phonetic reduction. Following the procedure of Levy and Jaeger (2007), Post and Jaeger extracted a large number of words from a corpus of conversational speech and built a smoothed maximum entropy classifier predicting words based on cues in the preceding context (e.g., the preceding word). This classifier weighted each feature based on the information it contains about word identity. Post and Jaeger then compared classifiers trained on different sets of features in terms of how well the word predictability estimates derived from the classifier predicted the degree of phonetic reduction observed for each word in their corpus. They found that cues that were better predictors of word identity (i.e., more informative cues) also led to better models of phonetic reduction. This suggests that speakers’ preferences between reduced and full forms are affected more strongly by more informative cues (for similar evidence from English that-mention, see also Jaeger, 2006). The current results are compatible with this interpretation, thereby motivating future work on this question.

On the assumption that Yucatec speakers’ morphosyntactic choices were influenced by communicative considerations, this raises the question as to why we should have found this effect in our experiment, given that speakers’ utterances were elicited under conditions that did not promote natural interactive communicative behavior. This is, indeed, generally true of sentence production experiments (including sentence recall tasks like ours). Importantly, however, unlike many previous studies, our experiment involved an audience: participants were asked to recall the sentences out loud to a native Yucatec-speaking partner. While the task of this audience was not defined (they were not explicitly told to do anything but listen), previous work suggests that simply having an audience present can affect production choices. Ferreira and Dell (2000) found that whether or not speakers were engaged in a communicative task affected overall rates of that-complementizer production: when recalling complement clause constructions, speakers produced that more often when they were speaking to an addressee who was asked to rate the clarity of their utterances, compared to when no addressee was present. Lockridge and Brennan (2002) found that in retelling stories to naive addressees, speakers were more likely to mention atypical instruments, and to mention them early, when the addressee did not see a picture illustrating the main action and the instrument, compared to when they could see such a picture. More generally, it is plausible to assume that participants transfer their expectations about talking to someone into the experiment. It thus seems plausible to assume that at least some of our participants considered their audience as sufficiently real, and therefore that speakers’ production choices could, in principle, have been influenced by communicative pressures.

4.1. Could production ease explain our results?

Grammatical encoding is typically assumed to be solely, or predominantly, affected by pressures inherent to production planning (Arnold, 2008; Ferreira, 2008; Ferreira & Dell, 2000; MacDonald, 2013). These accounts are sometimes referred to as production ease accounts. Here we do not dispute that grammatical encoding processes can be affected by production ease; robust evidence for this conclusion comes from a large cross-linguistic body of research on accessibility effects on syntactic preferences (see Jaeger & Norcliffe, 2009, for a review). Indeed, in ‘Appendix A’ we consider evidence suggesting that production ease also affects Yucatec relative clause production. We argue, however, that the observed Yucatec production preferences are not reducible to production ease.

We briefly elaborate on this point. In a review of the field, MacDonald (2013) identifies three preferences she attributes to production ease: producing available material early (‘availability-based production’), reusing previously produced material (‘priming’), and avoidance of similarity-based interference. While all three of these preferences have received empirical support (for references, see MacDonald, 2013; for discussion, see also Jaeger, 2013), we submit that they cannot account for the current findings. We discuss the three preferences in turn.

First, definiteness has been found to affect speakers’ preferences in preceding choice points (e.g., Ferreira & Dell, 2000; Jaeger & Wasow, 2006; Roland et al., 2006). For example, speakers tend to omit that more often before definite, compared to indefinite, complement clause subjects. This may be a consequence of definite NPs being available for articulation earlier than indefinite NPs: definiteness has been linked to the ease with which referents are retrieved from memory (Bock & Warren, 1985). However, in the current study, the critical morphosyntactic choice (the choice between realizing or omitting ku- on the relative clause verb) follows our definiteness manipulation. Availability-based production makes no predictions in this case (cf. Ferreira & Dell, 2000). In ‘Appendix A’, we present additional analyses, finding that the definiteness of the embedded NP does affect whether speakers produce ku. That is, there is evidence for availability-based production in Yucatec Maya, but it does not explain the critical results obtained here.

Second, it is well known that speakers show a tendency to repeat recently produced material. For example, there is ample evidence of morphological and syntactic priming (e.g., Bock, 1986; Gries, 2005; Pickering & Branigan, 1998; Szmrecsányi, 2005) including in optional that-omission (Ferreira, 2003; Jaeger, 2010). However, in the current study, exposure to stimuli with the full and the reduced verb form was counterbalanced within the manipulation of interest, so that priming cannot account for our results.

Finally, similarity-based interference has been found to affect production, in that speakers seem to prefer grammatical structures that avoid the production of similar referential expressions in close sequential proximity (e.g., Gennari, Mirković, & MacDonald, 2012). So, it is possible that speakers would produce the longer verb form (with ku) in order to increase the distance between two noun phrases that have the same definiteness features (i.e., when both are definite, or both are indefinite). In ‘Appendix A’ we present additional analyses which take into consideration the definiteness of the embedded NP. These did not reveal any effect of semantic interference. We thus tentatively conclude that production ease is unlikely to account for our results.

4.2. Conventionalized preferences or on-line pressures?

Another possible explanation for the observed Yucatec production preferences could be that they reflect conventionalized probabilistic knowledge. On this view, the production behavior in our experiment would be the outcome of speakers’ fine-grained probabilistic knowledge of the linguistic conventions of their language: speakers have learned from exposure to their speech community that reduced verb forms are more frequent in relative clauses that are headed with definite NPs than with indefinite NPs, and they recapitulate these distributional patterns in their own productions. There is, indeed, thought-provoking evidence that on-line pressures alone are insufficient to explain patterns across varieties of English, and thus that variability in production may sometimes reflect probabilistic linguistic knowledge (Bresnan & Hay, 2007; Rosenbach, 2002, 2003). Such knowledge could take the form of a probabilistic grammar (where probabilities are associated with conventional rules or constraints; see, e.g., Hale, 2003; Manning, 2003), or, as in the case of exemplar-based models, could consist of stored chunks of previously experienced language (Bod, 1998). We cannot rule out the possibility that the patterns of preferences observed in our experiment are (partially) conventionalized.^{Footnote 10} Note, however, that even if the observed patterns were to be completely conventionalized, this still leaves open the question of how they came to be conventionalized and, specifically, what biases explain the direction of the conventionalization (cf. Jaeger, 2006, 2010, for discussion). Our results suggest that a bias for robust information transfer can account for the direction of the effects we find: relative clauses are less predictable when headed by indefinite NPs; speakers therefore preferentially use (or have conventionalized the preferential use of) full verb forms (which convey a more robust signal) in these contexts.

4.3. what is the mechanism?

How does a bias for robust information transfer come to affect production processes? One possibility is that speakers simulate the beliefs of their interlocutor, and make production decisions that take into account these estimates. Given the computational demands of speaking, this is often assumed to be unlikely (e.g., Ferreira, 2008; but see Tanenhaus, 2013). An alternative hypothesis is that communicative goals affect production preferences through lifelong and mostly implicit learning (Jaeger & Ferreira, 2013; Kurumada & Jaeger, ms.).^{Footnote 11} On this view, speakers attend to feedback about the perceived communicative success of their own utterances, and then integrate this feedback into future production plans. This feedback may involve speakers’ own evaluation of their productions (self-monitoring), as well as implicit and explicit feedback from their interlocutors. In contexts where the information is expected, speakers will be less likely to receive negative feedback from the use of less robust signals (e.g., reductions or omissions). Where, however, the intended message is less expected or less inferable, speakers will be more likely to receive some kind of negative feedback from less robust signals. This feedback will then, it is hypothesized, affect their subsequent production choices (for further discussion, see Kurumada & Jaeger, ms.).

A learning account along these lines has the theoretical advantage that it would not require speakers to continuously simulate their interlocutors. It receives empirical support from research on articulation suggesting that speakers monitor themselves, and integrate their own feedback into their later productions (e.g., Houde, 1998; Tourville, Reilly, & Guenther, 2008; Villacorta, Perkell, & Guenther, 2007). At the level of grammatical encoding, research on this question is still largely lacking; some initial support comes from a study by Roche, Dale, and Kreuz (2010), who found that English speakers in an interactive dialogue-based task were more likely to adjust their productions to avoid syntactic ambiguity when their previous productions were not communicatively successful. While this result encourages the view that communicative biases can enter language production through learning, further research is required to establish the extent to which this is possible, and the aspects of linguistic encoding that are most likely to be affected by learning. These remain promising and exciting directions for future study.

5. Conclusion

A powerful aspect of the hypothesis of communicative efficiency is that its predictions are not bound to any particular level of linguistic representation. Consistent with this hypothesis, we have provided evidence that the effects of efficient information transmission are visible in the production of a typologically rare morphosyntactic alternation found in Yucatec Maya. These findings strengthen the possibility that communicative efficiency is a general computational strategy that shapes language production, both across language structures and across languages.

More generally, in the spirit of Christianson and Ferreira (2005), who, to the best of our knowledge, were the first to bring controlled sentence production research to ‘the field’, our study contributes to a small but growing body of work investigating sentence production in understudied languages (see, among others, Butler, Bohnemeyer, & Jaeger, 2014; Norcliffe, 2009a; Norcliffe & Konopka, in press; Norcliffe, Konopka, Brown, & Levinson, forthcoming; Santesteban, Pickering, & Branigan, 2013; Sauppe, Norcliffe, Konopka, Van Valin Jr., & Levinson, 2013). Head-marking languages constitute one particular language type that has traditionally been beyond the purview of psycholinguistic research, and about which little is currently known from a cognitive perspective (Jaeger & Norcliffe, 2009). Our study represents a small step towards increasing this knowledge, and, concomitantly, extending the empirical base against which psycholinguistic theories can be evaluated.

appendix A

In this appendix, we present additional follow-up analyses based on the form of the embedded NP. As discussed in the main text, the form of the embedded NP is critical when evaluating an account of our findings in terms of similarity-based interference (Gennari et al., 2012). Additionally, both availability-based production accounts (e.g., Ferreira & Dell, 2000) and ambiguity-avoidance accounts predict that the form of the embedded NP affects speakers’ preference for the full relative clause verb form (with ku-). We first evaluate the availability-based and ambiguity-avoidance accounts and then test the similarity-based interference account.

The original stimuli in our experiment held the embedded NP (the object of the relative clause verb) constant within an item, and balanced the determiner type of the NP across items. Data loss led to some imbalance, however, and speakers did not consistently reproduce the determiner of the embedded NP exactly as they heard it in the input (Table 4). For that reason we conducted an additional analysis that included the actual determiner of the embedded NP as a control variable. For this, we used Helmert coding, contrasting bare vs. indefinite determiners, and definite determiners vs. the average of bare and indefinite determiners produced. No random slopes were included for this variable because it was not a design variable (Barr, Levy, Scheepers, & Tily, 2013). The model was otherwise identical to the second analysis reported above.
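The Helmert contrasts for the produced determiner can be illustrated with a short sketch. This is not the authors' analysis script: the text specifies only which levels are compared, so the numeric scaling of the contrasts (±1/2, and −1/3 vs. 2/3) and all names below are assumptions made for illustration.

```python
from fractions import Fraction

# Levels of the produced embedded-NP determiner, in the order used below.
LEVELS = ("bare", "indefinite", "definite")

# Assumed scaling (not stated in the text): each contrast sums to zero.
# Contrast 1: bare vs. indefinite.
# Contrast 2: definite vs. the average of bare and indefinite.
HELMERT = {
    "bare":       (Fraction(-1, 2), Fraction(-1, 3)),
    "indefinite": (Fraction(1, 2),  Fraction(-1, 3)),
    "definite":   (Fraction(0),     Fraction(2, 3)),
}

def code_determiner(determiner):
    """Return the two contrast values for one produced determiner type."""
    return HELMERT[determiner]

def contrast_column(index):
    """Return one contrast as a list of values across LEVELS."""
    return [HELMERT[level][index] for level in LEVELS]

def dot(xs, ys):
    """Dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(xs, ys))

c1 = contrast_column(0)  # bare vs. indefinite
c2 = contrast_column(1)  # definite vs. mean(bare, indefinite)
```

Under this assumed scaling, the first coefficient corresponds to the indefinite-minus-bare difference and the second to the difference between definite determiners and the mean of the other two levels, both on the logit scale. The two columns are centered and mutually orthogonal.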

table 4. Counts of actually produced determiner types of embedded NP by the determiner type presented in the recall stimulus

As shown in Table 5, our main predictor of interest (definiteness of the head NP) remains significant when the definiteness of the embedded NP is included as a control variable. Additionally, embedded NP definiteness also emerges as a significant predictor of verb form choice: the full verb form was less likely to be produced if the embedded NP was definite, rather than indefinite or bare.

table 5. Mixed logit regression of relative clause verb productions depending on definiteness and universal quantification of modified NP and definiteness of embedded NP produced by speaker

Two mutually compatible explanations for this definiteness effect for the embedded NP deserve consideration. First, as described briefly in the discussion section, definiteness has been linked to the ease with which the referent, the NP, or the mapping between them are retrieved from memory (Bock & Warren, 1985). This would predict that definite embedded NPs might be available for articulation earlier than indefinite embedded NPs. Availability, or ease of production of upcoming material, is known to affect sentence planning (Bock & Warren, 1985; Branigan et al., 2008; Brown-Schmidt & Konopka, 2008; Ferreira, 1996; Ferreira & Dell, 2000; Prat-Sala & Branigan, 2000). For example, when upcoming material is not available for production, speakers are more likely to lengthen preceding words (Fox Tree & Clark, 1997) or insert additional words (Clark & Fox Tree, 2002), including optional function words, such as optional that in English complement or relative clauses (Ferreira & Dell, 2000; Roland et al., 2006). Specifically, there is evidence that speakers are less likely to produce that if an object relative clause starts with a definite NP, compared to an indefinite NP (Elsness, 1984; Jaeger & Wasow, 2006; Tottie, 1995). However, these results come from conversational speech, in which definiteness is correlated with previous mention (Prince, 1981). Since previous mention has been shown to affect that-production (Ferreira & Dell, 2000; Jaeger & Wasow, 2006; Temperley, 2003), it is thus unclear whether definiteness has any direct effect on availability. In our experiment, both definite and indefinite expressions were previously mentioned (due to the nature of the procedure, in which the critical data comes from recall of previously produced sentences).

An alternative interpretation holds that the strong association between full verb mention and the definiteness of the embedded NP is a (perhaps partly grammaticalized) consequence of ambiguity avoidance. Recall that the full verb form with ku- is compatible with an interpretation as either an object- or subject-extracted relative clause, while the reduced verb form only allows a subject-extracted relative clause interpretation. In our stimuli, we made the object-extracted interpretation very implausible, by keeping the embedded NP inanimate in all items (e.g., yielding implausible object-extracted interpretations such as ??Rodrigo yelled at a/the/every waitress that the soup spilled). Still, these stimuli lead to a temporary ambiguity: until the semantics of the embedded NP are processed, the sentence remains compatible with two interpretations. Compared to an indefinite or bare NP, a definite determiner on the embedded NP following the full verb form may temporarily bias listeners towards the (unintended) object-extracted interpretation, in so far as definite NPs are more likely to be subjects. Producing the reduced verb form before definite embedded NPs ameliorates this problem, because it is only compatible with the subject-extracted interpretation. This possibility is consistent with proposals advanced in the Mayanist linguistics literature that the verbal alternation is driven by ambiguity avoidance, that is, by the necessity to accurately convey the grammatical functions assigned to the relative clause verb’s arguments (see, e.g., Gutiérrez-Bravo & Monforte, 2009, for Yucatec, and for related Mayan languages, see Aissen, 2003, for Tzotzil, and Mondloch, 1978, for K’iche’).

The possibility that Yucatec speakers are making production choices in ways that allow them to alleviate temporary ambiguity, and thus to guarantee the successful transmission of their intended message, is broadly compatible with the idea that the principle of robust information transfer can influence processes of grammatical encoding. Producing the reduced verb form will increase the probability that listeners will correctly infer the grammatical function assigned to the relative clause head NP and to the embedded NP, and in this way will facilitate robust information transfer of the intended meaning. From the perspective of communicative efficiency, however, it is interesting to note that the alternant that guarantees the most faithful transmission of the intended message is the reduced alternant. No trade-off is therefore required between production ease (a preference for shorter, more accessible forms) and signal robustness: it is the shorter, less morphologically complex form that provides the most robust signal, in terms of clarifying the intended meaning of the structure.

While ambiguity avoidance might explain the embedded NP effect, it is unlikely to account for our main finding reported above, namely that the full relative clause verb form is more likely following indefinite modified NPs. Given that indefinite NPs are less likely to be subjects compared to definite NPs, if the choice between verb forms was driven by ambiguity avoidance alone, speakers should prefer the unambiguous (reduced) verb form where the modified NP is indefinite, because in such cases listeners should be biased towards an (unintended) object-extracted interpretation. Instead, speakers prefer the full form in such contexts. As noted in the ‘Discussion’, this is accounted for by communicative efficiency on the assumption that relative clauses are less expected following indefinite NPs: speakers prefer longer forms when the information conveyed by that form is less contextually expected.

Finally, we tested whether similarity-based interference could account for the effects of both the modified and the embedded NP. Recent cross-linguistic work has found that speakers’ choice between different types of relative clauses (e.g., a passive subject-extracted relative clause, rather than active object-extracted relative clause; Gennari et al., 2012) may be driven by a preference to produce semantically similar NPs further apart from each other. Comparable interference effects have been found in comprehension (Gordon, Hendrick, Johnson, & Lee, 2006; Gordon, Hendrick, & Levine, 2002; Lewis & Nakayama, 2001).

Here we tested whether similarity-based interference affected Yucatec speakers’ preference to produce or omit ku-. First, we tested whether adding an interaction between the definiteness of the modified NP and the three-way contrast of the embedded NP improved the model reported above. This was not the case (χ²(2) = 2.54, p > .28). Second, we asked whether adding a predictor to the above model that coded whether the modified and embedded NP were similar (i.e., both definite or both indefinite and both not universally quantified) improved the model. This too was not the case (χ²(1) = 2.26, p > .13). The current data thus reveal no evidence for similarity-based interference.
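The p-values for the two model comparisons can be checked against the reported χ² statistics. The following is a minimal sketch, assuming each statistic is referred to a χ² distribution with the stated degrees of freedom; the function name is hypothetical, and only the closed forms for one and two degrees of freedom are implemented.

```python
import math

def chi2_upper_tail(statistic, df):
    """Upper-tail probability of a chi-square variate (df = 1 or 2 only)."""
    if df == 1:
        # A chi-square(1) variate is a squared standard normal.
        return math.erfc(math.sqrt(statistic / 2.0))
    if df == 2:
        # A chi-square(2) variate is exponential with mean 2.
        return math.exp(-statistic / 2.0)
    raise ValueError("this sketch only covers df = 1 and df = 2")

# Interaction of modified-NP definiteness with the three-way embedded-NP contrast.
p_interaction = chi2_upper_tail(2.54, 2)

# Single predictor coding whether the modified and embedded NP were similar.
p_similarity = chi2_upper_tail(2.26, 1)

print(round(p_interaction, 3), round(p_similarity, 3))
```

Both values fall just above the reported bounds (p > .28 and p > .13), so neither comparison approaches conventional significance thresholds.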

appendix B: experimental stimuli

1. Juane’ tu yilah tuláakal artista (ku) póolik hun p’éel tùunich
Juane’ tu yilah le artista (ku) póolik hun p’éel tùunicho’
Juane’ tu yilah hun túul artista (ku) póolik hun p’éel tùunich
‘Juan watched every/the/a artist that was carving a stone’
2. X-Maríae’ tu yáantah tuláakal pasahero (ku) b’isik hun p’éel maleta
X-Maríae’ tu yáantah le pasahero (ku) b’isik hun p’éel maletao’
X-Maríae’ tu yáantah hun túul pasahero (ku) b’isik hun p’éel maleta
‘Maria helped every/the/a passenger that was carrying a suitcase’
3. Pedroe’ tu ts’ú’uts’ah tuláakal xunáan (ku) k’áak’tik le b’ak’o’
Pedroe’ tu ts’ú’uts’ah le xunáan (ku) k’áak’tik le b’ak’o’
Pedroe’ tu ts’ú’uts’ah hun túul xunáan (ku) k’áak’tik le b’ak’o’
‘Pedro kissed every/the/a lady that was roasting the meat’
4. X-Laurae’ tu atendertah tuláakal máak (ku) manik anis
X-Laurae’ tu atendertah le máak (ku) manik aniso’
X-Laurae’ tu atendertah hun túul máak (ku) manik anis
‘Laura served every/the/a person who was buying anis’
5. X-Juliae’ tu p’o’ah tuláakal x-ch’úupal (ku) tóokik le si’o’
X-Juliae’ tu p’o’ah le x-ch’úupal (ku) tóokik le si’o’
X-Juliae’ tu p’o’ah hun túul x-ch’úupal (ku) tóokik le si’o’
‘Julia washed every/the/a girl that was burning the wood’
6. Robertoe’ tu entrevistartah tuláakal periodista (ku) ts’íib’tik hun p’éel articulo
Robertoe’ tu entrevistartah le periodista (ku) ts’íib’tik hun p’éel articuloo’
Robertoe’ tu entrevistartah hun túul periodista (ku) ts’íib’tik hun p’éel articulo
‘Roberto interviewed every/the/a journalist that was writing an article’
7. Jaimee’ tu chukah tuláakal h-ts’òon (ku) p’e’esik kéeh
Jaimee’ tu chukah le h-ts’òon (ku) p’e’esik kéeho’
Jaimee’ tu chukah hun túul h-ts’òon (ku) p’e’esik kéeh
‘Jaime discovered every/the/a hunter that was skinning deer’
8. X-Carlae’ tu yáalkab’tah tuláakal winik (ku) páanik chí’ikam
X-Carlae’ tu yáalkab’tah le winik (ku) páanik chí’ikamo’
X-Carlae’ tu yáalkab’tah hun túul winik (ku) páanik chí’ikam
‘Carla followed every/the/a man that was digging up jicama’
9. X-Gabrielae’ tu b’ó’otah tuláakal h-p’o’ (ku) tikinkúunsik nòok’
X-Gabrielae’ tu b’ó’otah le h-p’o’ (ku) tikinkúunsik nòok’o’
X-Gabrielae’ tu b’ó’otah hun túul h-p’o’ (ku) tikinkúunsik nòok’
‘Gabriela paid every/the/a laundryman that was drying clothes’
10. Miguele’ tu t’anah tuláakal doktor (ku) b’èetik le ts’àako’
Miguele’ tu t’anah le doktor (ku) b’èetik le ts’àako’
Miguele’ tu t’anah hun túul doktor (ku) b’èetik le ts’àako’
‘Miguel called every/the/a doctor that was preparing the medicine’
11. X-Adrianae’ tu ká’ansah tuláakal xí’ipal (ku) presentartik hun p’éel examen
X-Adrianae’ tu ká’ansah le xí’ipal (ku) presentartik hun p’éel exameno’
X-Adrianae’ tu ká’ansah hun túul xí’ipal (ku) presentartik hun p’éel examen
‘Adriana taught every/the/a boy who was taking an exam’
12. Rafaele’ tu hé’elsah tuláakal choofer (ku) manehartik hun p’éel camioneta
Rafaele’ tu hé’elsah le choofer (ku) manehartik hun p’éel camionetao’
Rafaele’ tu hé’elsah hun túul choofer (ku) manehartik hun p’éel camioneta
‘Rafael stopped every/the/a driver that was driving a truck’
13. X-Valeriae’ tu b’estirtah tuláakal k’ohá’an (kuy) uk’ik le harabeo’
X-Valeriae’ tu b’estirtah le k’ohá’an (kuy) uk’ik le harabeo’
X-Valeriae’ tu b’estirtah hun túul k’ohá’an (kuy) uk’ik le harabeo’
‘Valeria dressed every/the/a patient that was taking the syrup’
14. Victore’ tu kastigartah tuláakal h-xòok (ku) ts’uutsik hun p’éel chamal
Victore’ tu kastigartah le h-xòok (ku) ts’uutsik hun p’éel chamalo’
Victore’ tu kastigartah hun túul h-xòok (ku) ts’uutsik hun p’éel chamal
‘Victor punished every/the/a student that was smoking a cigarette’
15. Raquele’ tu machah tuláakal h-ch’úuk (ku) grabartik le tsikb’alo’
Raquele’ tu chuukah le h-ch’úuk (ku) grabartik le tsikb’alo’
Raquele’ tu chuukah hun túul h-ch’úuk (ku) grabartik le tsikb’alo’
‘Raquel caught every/the/a spy that was recording the conversation’
16. Manuele’ tu yú’ub’ah tuláakal musiko (ku) paxik le marimbao’
Manuele’ tu yú’ub’ah le musiko (ku) paxik le marimbao’
Manuele’ tu yú’ub’ah hun túul musiko (ku) paxik le marimbao’
‘Manuel listened to every/the/a musician that was playing the marimba’
17. Luciae’ tu yawatah tuláakal pàal (kuy) óokoltik hun p’éel wáah
Luciae’ tu yawatah le pàal (kuy) óokoltik hun p’éel wáah
Luciae’ tu yawatah hun túul pàal (kuy) óokoltik hun p’éel wáah
‘Lucia shouted at every/the/a boy that was stealing a tortilla’
18. Rodrigoe’ tu k’eyah tuláakal x-meesera (ku) wekik le sopao’
Rodrigoe’ tu k’eyah le x-meesera (ku) wekik le sopao’
Rodrigoe’ tu k’eyah hun túul x-meesera (ku) wekik le sopao’
‘Rodrigo scolded every/the/a waitress that was spilling the soup’
19. Alejandroe’ tu kashtah tuláakal x-kó’olel (ku) hit’ik hun p’éel xàak
Alejandroe’ tu kashtah le x-kó’olel (ku) hit’ik hun p’éel xàako’
Alejandroe’ tu kashtah hun túul x-kó’olel (ku) hit’ik hun p’éel xàak
‘Alejandro found every/the/a woman that was weaving a basket’
20. Alvaroe’ tu poch’ah tuláakal xìib’ (ku) chakik k’éek’en
Alvaroe’ tu poch’ah le xìib’ (ku) chakik k’éek’eno’
Alvaroe’ tu poch’ah hun túul xìib’ (ku) chakik k’éek’en
‘Alvaro insulted every/the/a man that was boiling pig’
21. Celiae’ tu ts’onah tuláakal h-wàach (ku) kaláantik le hòolo’
Celiae’ tu ts’onah le h-wàach (ku) kaláantik le hòolo’
Celiae’ tu ts’onah hun túul h-wàach (ku) kaláantik le hòolo’
‘Celia shot every/the/a soldier that was guarding the entrance’
22. Josee’ tu tá’akah tuláakal h-kòonol (ku) konik xí’im
Josee’ tu tá’akah le h-kòonol (ku) konik xí’imo’
Josee’ tu tá’akah hun túul h-kòonol (ku) konik xí’im
‘Jose protected every/the/a vendor that was selling corn’
23. Rosae’ tu chíimpoltah tuláakal actor (ku) mèentik le teatroo’
Rosae’ tu chíimpoltah le actor (ku) mèentik le teatroo’
Rosae’ tu chíimpoltah hun túul actor (ku) mèentik le teatroo’
‘Rosa congratulated every/the/a actor that was performing the play’
24. Fernando tu ché’ehtah tuláakal turista (ku) t’anik maya’-t’àan
Fernando tu ché’ehtah le turista (ku) t’anik maya’-t’àan
Fernando tu ché’ehtah hun túul turista (ku) t’anik maya’-t’àan
‘Fernando laughed at every/the/a tourist that was speaking Maya’

Footnotes

1 The following abbreviations are used in the Yucatec Maya example glosses: ‘=’ = clitic boundary; 1/2/3 = 1st/2nd/3rd person; cmp = completive; def = definite determiner; full = full verb form; impv = imperfective; inc = incompletive; indef = indefinite determiner; part = particle; red = reduced verb form; sg = singular; top = topic particle; univ = universal quantifier.

2 In addition to signaling the onset of a relative clause, the full verb form also encodes referential information about the relative clause verb’s subject argument. Communicative efficiency accounts therefore predict that Yucatec speakers’ preference between the full and reduced form is also affected by the expectedness of the subject referent of the relative clause. Note that a similar prediction can be made for English that-mentioning, since the choice of that over, for example, who or which, also conveys additional referential and structural information. Here, we focus exclusively on the expectedness of the relative clause constituent.

3 While we know of no particular challenge to the assumption that speakers across languages would be guided by the same basic pragmatic principles at this general level, this assumption remains to be tested in future research (e.g., by collecting sufficiently large corpora of Yucatec).

4 To be specific, two different clause-final deictic particles are triggered by definite NPs: -o, which marks proximal deixis (in space and time), and -a, which marks distal deixis. The choice between the two in discourse has been connected to the ‘evidential status’ of the proposition (see Hanks, 1990). The precise deictic values of these particles are not relevant for present purposes, merely the fact that clauses are obligatorily followed by one or the other particle when that clause contains a definite NP. For clarity, we also note that there can be no more than one clause-final particle per clause. Thus, if a clause contains multiple definite noun phrases, this will still only trigger a single particle at the end of the clause. Note, finally, that in the examples in this section, as well as in our experimental stimuli (see ‘Appendix B’), the subject of the main clause (e.g., Manuel in (17)), occurs in a sentence-initial ‘topic’ position, which is syntactically external to the matrix clause. Such topics always trigger their own clause-final clitic (-e).

5 As we discuss below, universally quantified NPs can be combined with a definite determiner, in which case the absence of a deictic particle becomes informative. Our stimuli never combined universal quantification with a definite determiner, but participants occasionally produced this pattern.

6 In fact, right context effects are expected under ideal observer models of language processing and speech perception (e.g., Levy et al., 2009; see also Dahan, 2010).

7 We chose to use a video distractor task after mixed success with alternative distractor tasks in pilot studies. While it is possible that the video descriptions may have interfered with the productions of the critical morpheme in the sentence recall phase of the trial, there is no reason to assume that this potential confound would have been unequally distributed across conditions. Moreover, the effects we report in this paper have been replicated in another paradigm that does not use a language-based distractor task (Norcliffe, 2009a).

8 Lucy (1992) observes that the use of p’éel in place of túul with human referents can, in certain pragmatic contexts, reflect subtle differences in referent construal, suggesting, for example, that the human referent is not known to the speaker. The occasional shifts we observed from the use of hun túul to hun p’éel in our experiment may therefore reflect differences in construal. It is also possible that the shifts to p’éel were a consequence of priming from the classifier type of the embedded inanimate NP in the listen and repeat phase of the trials.

9 Following the procedure outlined at <http://hlplab.wordpress.com/2011/06/25/more-on-random-slopes/>.

10 In Jaeger (2006), the second author found independent effects of collocation and predictability on that omission. This is compatible with the idea that the pattern we observe in Yucatec Maya is at least partially driven by on-line pressures.

11 Manuscript under revision (2014), online <https://www.academia.edu/5142210/Kurumada_C._and_Jaeger_T.F._resubmitted_Communicative_efficiency_in_language_production_Optional_case-marking_in_Japanese._submitted_for_publication>.

References

Aissen, J. (1999). Agent focus and inverse in Tzotzil. Language, 75, 451–485.

Aissen, J. (2003). Differential coding, partial blocking and bi-directional OT. In Nowak, P. & Yoquelet, C. (Eds.), Proceedings of the 29th Annual Meeting of the Berkeley Linguistics Society (pp. 1–16). Berkeley: Berkeley Linguistics Society.

Arnold, J. (2008). Reference production: production-internal and addressee-oriented processes. Language and Cognitive Processes, 23(4), 495–527.

Aylett, M. P., & Turk, A. (2004). The smooth signal redundancy hypothesis: a functional explanation for relationships between redundancy, prosodic prominence and duration in spontaneous speech. Language and Speech, 47(1), 31–56.

Aylett, M. P., & Turk, A. (2006). Language redundancy predicts syllabic duration and the spectral characteristics of vocalic syllable nuclei. Journal of the Acoustical Society of America, 119, 3048–3058.

Barr, D. J., Levy, R., Scheepers, C., & Tily, H. (2013). Random effects structure for confirmatory hypothesis testing: keep it maximal. Journal of Memory and Language, 68(3), 255–278.

Bates, D., Maechler, M., & Bolker, B. (2012). Lme4: Linear mixed-effects models using s4 classes. R package version 0.999999-0, online <http://CRAN.R-project.org/package=lme4>.

Bates, E., Masling, M., & Kintsch, W. (1978). Recognition memory for aspects of dialogue. Journal of Experimental Psychology: Human Learning and Memory, 4(3), 187–197.

Bell, A., Brenier, J. M., Gregory, M., Girand, C., & Jurafsky, D. (2009). Predictability effects on durations of content and function words in conversational English. Journal of Memory and Language, 60(1), 92–111.

Bell, A., Jurafsky, D., Fosler-Lussier, E., Girand, C., Gregory, M., & Gildea, D. (2003). Effects of disfluencies, predictability, and utterance position on word form variation in English conversation. Journal of the Acoustical Society of America, 113(2), 1001–1024.

Bock, J. K. (1986). Syntactic persistence in language production. Cognitive Psychology, 18, 355–387.

Bock, J. K., & Irwin, D. E. (1980). Syntactic effects of information availability in sentence production. Journal of Verbal Learning and Verbal Behavior, 19, 467–484.

Bock, J. K., & Levelt, W. J. M. (1994). Language production: grammatical encoding. In Gernsbacher, M. A. (Ed.), Handbook of psycholinguistics (pp. 945–984). London: Academic Press.

Bock, J. K., & Warren, R. K. (1985). Conceptual accessibility and syntactic structure in sentence formulation. Cognition, 21(1), 47–67.

Bod, R. (1998). Beyond grammar: an experience-based theory of language. Stanford: CSLI Publications & Cambridge University Press.

Bohnemeyer, J. (2002). The grammar of time reference in Yukatek Maya. Munich: LINCOM.

Bohnemeyer, J. (2009). Linking without grammatical relations in Yucatec Maya: alignment, extraction and control. In Helmbrecht, J., Nishina, Y., Shin, Y.-M., Skopeteas, S., & Verhoeven, E. (Eds.), Form and function in language research: papers in honour of Christian Lehmann (pp. 185–214). Berlin: Mouton de Gruyter.

Branigan, H. P., Pickering, J., & Tanaka, M. (2008). Contributions of animacy to grammatical function assignment and word order during production. Lingua, 118, 172–189.

Breslow, N., & Clayton, D. G. (1993). Approximate inference in generalized linear mixed models. Journal of the American Statistical Association, 88(9), 9–25.

Bresnan, J., Cueni, A., Nikitina, T., & Baayen, H. (2007). Predicting the dative alternation. In Bouma, G., Kraemer, I., & Zwarts, J. (Eds.), Cognitive foundations of interpretation (pp. 69–94). Amsterdam: Royal Netherlands Academy of Science.

Bresnan, J., & Hay, J. (2007). Gradient grammar: an effect of animacy of the syntax of give in New Zealand and American English. Lingua, 118(2), 245–259.

Brown, P., & Dell, G. S. (1987). Adapting production to comprehension: the explicit mention of instruments. Cognitive Psychology, 19(4), 441–472.

Brown-Schmidt, S., & Konopka, A. E. (2008). Little houses and casas pequeñas: message formulation and syntactic form in unscripted speech with speakers of English and Spanish. Cognition, 109(2), 274–280.

Butler, L., Bohnemeyer, J. B., & Jaeger, T. F. (2014). Plural marking in Yucatec Maya at the syntax-processing interface. In Machicao y Priemer, A., Nolda, A., & Sioupi, A. (Eds.), Zwischen Kern und Peripherie [Between core and periphery: studies on peripheral phenomena in language and grammar] (Studia Grammatica 76), (pp. 181–208). Berlin: Akademie-Verlag.

Bybee, J. (1988). The diachronic dimension in explanation. In Hawkins, J. A. (Ed.), Explaining language universals (pp. 350–379). Oxford: Blackwell.

Bybee, J., & Schiebman, J. (1999). The effect of usage on degree of constituency: the reduction of don’t in American English. Linguistics, 37, 575–596.

Caballero, G., & Kapatsinski, V. (to appear). Perceptual functionality of morphological redundancy in Choguita Rarámuri (Tarahumara). In Norcliffe, E., Harris, A., & Jaeger, T. F. (Eds.), The cross-linguistic study of language understanding and production (Special Issue of Language, Cognition and Neuroscience).

Christianson, K., & Cho, H. (2009). Interpreting null pronouns (pro) in isolated sentences. Lingua, 119, 989–1008.

Christianson, K., & Ferreira, F. (2005). Planning in sentence production: evidence from a free word-order language (Odawa). Cognition, 98, 105–135.

Clark, H. H., & Fox Tree, J. E. (2002). Using uh and um in spontaneous speech. Cognition, 84, 73–111.

Croft, W. (2000). Explaining language change: an evolutionary approach. London: Pearson.

Dahan, D. (2010). The time course of interpretation in speech comprehension. Current Directions in Psychological Science, 19(2), 121–126.

Elsness, J. (1984). That or zero? A look at the choice of object clause connective in a corpus of American English. English Studies, 65, 519–533.Google Scholar

Ferreira, V. S. (1996). Avoid ambiguity! (if you can). CRL Technical Reports, 18, 3–13.Google Scholar

Ferreira, V. S. (2003). The persistence of optional complementizer mention: why saying a ‘that’ is not saying ‘that’ at all. Journal of Memory and Language, 48, 379–398.Google Scholar

Ferreira, V. S. (2008). Ambiguity, accessibility and a division of labor for communicative success. Psychology of Learning and Motivation, 49, 209–246.Google Scholar

Ferreira, V. S., & Dell, G. S. (2000). Effect of ambiguity and lexical availability on syntactic and lexical production. Cognitive Psychology, 40(4), 296–340.Google Scholar

Ferrer i Cancho, R. (2005). Zipf’s law from a communicative phase transition. European Physical Journal B – Condensed Matter and Complex Systems, 47(3), 449–457.CrossRef Google Scholar

Ferrer i Cancho, R., & del Prado Martín, F. M. (2011). Information content versus word length in random typing. Journal of Statistical Mechanics: Theory and Experiment, 2011(12), L12002.Google Scholar

Ferrer i Cancho, R., & Díaz-Guilera, A. (2007). The global minima of the communicative energy of natural communication systems. Journal of Statistical Mechanics: Theory and Experiment, 2007(06), P06009.Google Scholar

Fox, B., & Thompson, S. A. (2007). Relative clauses in English conversation: relativizers, frequency and the notion of construction. Studies in Language, 3, 293–326.CrossRef Google Scholar

Fox Tree, J. E., & Clark, H. H. (1997). Pronouncing ‘the’ as ‘thee’ to signal problems in speaking. Cognition, 62, 151–167.CrossRef Google Scholar PubMed

Frank, A., & Jaeger, T. F. (2008). Speaking rationally: uniform information density as an optimal strategy for language production. In Love, B. C.McRae, K., & Sloutsky, V. M. (Eds.), Proceedings of the 30th Annual Meeting of the Cognitive Science Society (CogSci08) (pp. 939–944). Austin, TX: Cognitive Science Society.Google Scholar

Gahl, S., & Garnsey, S. M. (2004). Knowledge of grammar, knowledge of usage: syntactic probabilities affect pronunciation variation. Language, 80(4), 748–775.Google Scholar

Gennari, S. P., Mirković, J., & MacDonald, M. C. (2012). Animacy and competition in relative clause production: a cross-linguistic investigation. Cognitive Psychology, 65, 141–176.CrossRef Google Scholar PubMed

Genzel, D., & Charniak, E. (2002). Entropy rate constancy in text. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (pp. 199–206). Stroudsburg, PA: Association for Computational Linguistics.Google Scholar

Gibson, E., Piantadosi, S. T., Brink, K., Bergen, L., Lim, E., & Saxe, R. (2013). A noisy channel account of cross linguistic word order variation. Psychological Science, 24(7), 1079–1088.CrossRef Google Scholar

Givón, T. (1991). Markedness in grammar: distributional, communicative and cognitive correlates of syntactic structure. Studies in Language, 15(2), 335–370.Google Scholar

Givón, T. (1992). On interpreting text-distributional correlations: some methodological issues. In Payne, D. L. (Ed.), Pragmatics of word order flexibility (pp. 305–320). Philadelphia: John Benjamins.Google Scholar

Gordon, P. C., Hendrick, R., Johnson, M., & Lee, Y. (2006). Similarity-based interference during language comprehension: evidence from eye tracking during reading. Journal of Experimental Psychology: Learning, Memory, & Cognition, 32(6), 1304–1321.Google Scholar

Gordon, P. C., Hendrick, R., & Levine, W. H. (2002). Memory-load interference in syntactic processing. Psychological Science, 13(5), 425–430.Google Scholar

Graff, P., & Jaeger, T. F. (2009). Locality and feature specificity in OCP effects: evidence from Aymara, Dutch, and Javanese. In Bochnak, R.Nicola, N.Klecha, P.Urban, J.Lemieux, A., & Weaver, C. (Eds.), Proceedings of the Main Session of the 45th Meeting of the Chicago Linguistic Society (pp. 1–15). Chicago, IL: Chicago Linguistics Society.Google Scholar

Gries, S. T. (2005). Syntactic priming: a corpus-based approach. Journal of Psycholinguistic Research, 34(4), 365–399.Google Scholar

Gutiérrez-Bravo, R., & Monforte, J. (2009). Focus, agent focus and relative clauses in Yucatec Maya. In Avelino, H., Coon, J., & Norcliffe, E. (Eds.), New perspectives in Mayan linguistics (MIT Working Papers in Linguistics, 59).Google Scholar

Gutiérrez-Bravo, R., & Monforte, J. (2010). On the nature of word order in Yucatec Maya. In Camacho, J., Gutiérrez-Bravo, R., & Sánchez, L. (Eds.), Information structure in languages of the Americas (pp. 139–170). Berlin: Mouton de Gruyter.Google Scholar

Haiman, J. (1983). Iconic and economic motivation. Language, 59, 781–819.Google Scholar

Hale, J. (2003). Grammar, uncertainty and sentence processing. Unpublished PhD thesis, Johns Hopkins University.Google Scholar

Hanks, W. F. (1990). Referential practice, language and lived space among the Maya. Chicago: Chicago University Press.Google Scholar

Haspelmath, M. (1999). Optimality and diachronic adaptation. Zeitschrift für Sprachwissenschaft, 18(2), 180–205.Google Scholar

Haspelmath, M. (2004). Explaining the ditransitive person-role constraint: a usage-based account. Constructions, (2), online <http://www.digijournals.de/constructions/articles/35>.Google Scholar

Haspelmath, M. (to appear). On system pressure competing with economic motivation. In MacWhinney, B., Malchukov, A. L., & Moravcsik, E. A. (Eds.), Competing motivations.Google Scholar

Hawkins, J. A. (1978). Definiteness and indefiniteness: a study in reference and grammaticality prediction. New Jersey & London: Humanities Press & Croom Helm.Google Scholar

Hawkins, J. A. (2004). Efficiency and complexity in grammars. Oxford: Oxford University Press.Google Scholar

Heim, I. (1982). The semantics of definite and indefinite noun phrases. Unpublished PhD thesis, University of Massachusetts, Amherst.Google Scholar

Henrich, J., Heine, S. J., & Norenzayan, A. (2010). The weirdest people in the world? Behavioral and Brain Sciences, 33(2/3), 61–83.Google Scholar

Houde, J. F. (1998). Sensorimotor adaptation in speech production. Science, 279(5354), 1213–1216.Google Scholar

Hume, E., & Mailhot, F. (2013). The role of entropy and surprisal in phonologization and language change. In Yu, A. (Ed.), Origins of sound patterns: approaches to phonologization (pp. 29–50). Oxford: Oxford University Press.Google Scholar

Jaeger, T. F. (2006). Redundancy and syntactic reduction in spontaneous speech. Unpublished PhD thesis, Stanford University.Google Scholar

Jaeger, T. F. (2008). Categorical data analysis: away from ANOVAs (transformation or not) and towards logit mixed models. Journal of Memory and Language, 59, 434–446.CrossRef Google Scholar PubMed

Jaeger, T. F. (2010). Redundancy and reduction: speakers manage syntactic information density. Cognitive Psychology, 61, 23–62.Google Scholar

Jaeger, T. F. (2011). Corpus-based research on language production: information density and reducible subject relatives. In Bender, E. M. & Arnold, J. E. (Eds.), Language from a cognitive perspective: grammar, usage, and processing (pp. 161–197). Stanford, CA: CSLI Publications.Google Scholar

Jaeger, T. F. (2013). Production preferences cannot be understood without reference to communication. Frontiers in Psychology, 4, 230.Google Scholar

Jaeger, T. F., & Ferreira, V. S. (2013). Seeking predictions from a predictive framework. Behavioral and Brain Sciences, 36(4), 31–32.Google Scholar

Jaeger, T. F., Furth, K., & Hilliard, C. (2012). Incremental phonological encoding during unscripted sentence production. Frontiers in Psychology, 3.Google Scholar

Jaeger, T. F., & Norcliffe, E. (2009). The cross-linguistic study of sentence production: state of the art and a call for action. Language and Linguistic Compass, 3(4), 866–887.Google Scholar

Jaeger, T. F., & Tily, H. (2011). On language utility: processing complexity and communicative efficiency. Wiley Interdisciplinary Reviews: Cognitive Science, 2(3), 323–335.Google Scholar

Jaeger, T. F., & Wasow, T. (2006). Processing as a source of accessibility effects on variation. In Cover, R. & Kim, Y. (Eds.), Proceedings of the 31st Annual Meeting of the Berkeley Linguistics Society (BLS) (pp. 169–180). Ann Arbor, MI: Sheridan Books.Google Scholar

Keller, R. (1994). Language change: the invisible hand in language. London: Routledge.Google Scholar

Kurumada, C., & Jaeger, T. F. (2013). Communicatively efficient language production and case-marker omission in Japanese. In Knauff, M.Pauen, M.Sebanz, N., & Wachsmuth, I. (Eds.), The 35th Annual Meeting of the Cognitive Science Society (Cogsci13) (pp. 858–863). Austin, TX: Cognitive Science Society.Google Scholar

Lehmann, C. (1998). Possession in Yucatec Maya. Munich: LINCOM Europa.Google Scholar

Levelt, W. J. M. (1989). Speaking: from intention to articulation. Cambridge, MA: MIT Press.Google Scholar

Levy, R., Bicknell, K., Slattery, T., & Rayner, K. (2009). Eye movement evidence that readers maintain and act on uncertainty about past linguistic input. Proceedings of the National Academy of Sciences, 106(50), 21086–21090.Google Scholar

Levy, R., & Jaeger, T. F. (2007). Speakers optimize information density through syntactic reduction. In Schlökopf, B., Platt, J., & Hoffman, T. (Eds.), Advances in neural information processing systems (NIPS) 19 (pp. 849–856). Cambridge, MA: MIT Press.Google Scholar

Lewis, R., & Nakayama, M. (2001). Syntactic and positional similarity effects in the processing of Japanese embeddings. In Nakayama, M. (Ed.), Sentence processing in East Asian Languages (pp. 85–113). Stanford, CA: CSLI Publications.Google Scholar

Lindblom, B. (1990). Explaining phonetic variation: a sketch of the H&H theory. In Hardcastle, W. & Marchal, A. (Eds.), Speech production and speech modeling (pp. 403–439). Dordrecht: Kluwer.Google Scholar

Lockridge, C. B., & Brennan, S. E. (2002). Addressees’ needs influence speakers’ early syntactic choices. Psychonomic Bulletin and Review, 9(3), 550–557.Google Scholar

Lucy, J. (1992). Grammatical categories and cognition: a case study of the linguistic relativity hypothesis. Cambridge: Cambridge University Press.Google Scholar

MacDonald, M. C. (2013). How language production shapes language form and comprehension. Frontiers in Psychology, 4(226), 1–16.Google Scholar

Mahowald, K., Fedorenko, E., Piantadosi, S. T., & Gibson, E. (2013). Info/information theory: speakers choose shorter words in predictive contexts. Cognition, 126(2), 313–318.CrossRef Google Scholar PubMed

Mair, C. (2002). Three changing patterns of verb complementation in Late Modern English: a real-time study based on matching text corpora. English Language and Linguistics, 6(1), 105–131.Google Scholar

Manning, C. (2003). Probabilistic syntax. In Bod, R.Hay, J., & Jannedy, S. (Eds.), Probabilistic linguistics (pp. 289–341). Cambridge, MA: MIT Press.Google Scholar

Maurits, L., Perfors, A., & Navarro, D. (2010). Why are some word orders more common than others? A uniform information density account. Advances in Neural Information Processing Systems, 23, 1585–1593.Google Scholar

Mondloch, J. (1978). Disambiguating subjects and objects in Quiche Mayan. Journal of Mayan Linguistics, 1, 3–19.Google Scholar

Nichols, J. (1986). Head-marking and dependent-marking grammar. Language, 62, 56–119.Google Scholar

Nichols, J., & Bickel, B. (2011). Locus of marking: whole language typology. In Dryer, M. S. & Haspelmath, M. (Eds.), The world atlas of language structures online. Max Planck Digital Library, chapter 25.Google Scholar

Norcliffe, E. (2009a). Head-marking in usage and grammar: a study of variation and change in Yucatec Maya. Unpublished PhD thesis, Stanford University.Google Scholar

Norcliffe, E. (2009b). Revisiting agent focus in Yucatec. In Avelino, H.Coon, J., & Norcliffe, E. (Eds.), New perspectives in Mayan Linguistics (Vol. 59). MIT Working Papers in Linguistics.Google Scholar

Norcliffe, E., & Konopka, A. E. (in press). Vision and language in cross-linguistic research on sentence production. In Mishra, R. K., Srinivasan, N., & Huettig, F. (Eds.), Attention and vision in language processing. New York: Springer.Google Scholar

Norcliffe, E., Konopka, A. E., Brown, P., & Levinson, S. C. (forthcoming). Word order affects the time-course of sentence formulation in Tzeltal. In Norcliffe, E., Harris, A., & Jaeger, T. F. (Eds.), The cross-linguistic study of language understanding and production (Special Issue of Language, Cognition and Neuroscience).Google Scholar

Pellegrino, F., Coupé, C., & Marsico, E. (2011). A cross-language perspective on speech information rate. Language, 87(3), 539–558.Google Scholar

Piantadosi, S. T., Tily, H., & Gibson, E. (2011). Word lengths are optimized for efficient communication. Proceedings of the National Academy of Sciences, 108(9), 3526–3529.Google Scholar

Pickering, M. J., & Branigan, H. P. (1998). The representation of verbs: evidence from syntactic priming in language production. Journal of Memory and Language, 39(4), 633–651.Google Scholar

Pluymaekers, M., Ernestus, M., & Baayen, R. H. (2005). Lexical frequency and acoustic reduction in spoken Dutch. Journal of the Acoustical Society of America, 118, 2561–2569.Google Scholar

Post, M., & Jaeger, T. F. (2010). Word production in spontaneous speech: availability and communicative efficiency. Poster presented at The 23rd CUNY Sentence Processing Conference, NYC, NY.Google Scholar

Prat-Sala, M., & Branigan, H. P. (2000). Discourse constraints on syntactic processing in language production: a cross-linguistic study of English and Spanish. Journal of Memory and Language, 42(2), 168–182.Google Scholar

Prince, E. F. (1981). Toward a taxonomy of given–new information. In Cole, P. (Ed.), Radical pragmatics (pp. 223–256). New York: Academic Press.Google Scholar

Qian, T., & Jaeger, T. F. (2012). Cue effectiveness in communicatively efficient discourse production. Cognitive Science, 36(7), 1312–1336.Google Scholar

R Development Core Team (2005). R: a language and environment for statistical computing. Vienna: R Foundation for Statistical Computing.Google Scholar

Resnik, P. (1996). Selectional constraints: an information-theoretic model and its computational realization. Cognition, 61, 127–159.Google Scholar

Roche, J., Dale, R., & Kreuz, R. J. (2010). The resolution of ambiguity during conversation: more than mere mimicry? In Ohlsson, S. & Catrambone, R. (Eds.), Proceedings of the 32nd Annual Conference of the Cognitive Science Society (pp. 206–211). Austin, TX: Cognitive Science Society.Google Scholar

Rohdenburg, G. (2006). The role of functional constraints in the evolution of the English complementation system. In Dalton-Puffer, C.Kastovsky, D.Ritt, N., & Schendl, H. (Eds.), Syntax, style and grammatical norms (pp. 143–166). Bern: Peter Lang.Google Scholar

Roland, D., Dick, F., & Elman, J. L. (2007). Frequency of basic English grammatical structures: a corpus analysis. Journal of Memory and Language, 57(3), 348–379.Google Scholar

Roland, D., Elman, J. L., & Ferreira, V. S. (2006). Why is that? Structural prediction and ambiguity resolution in a very large corpus of English sentences. Cognition, 98(3), 245–272.CrossRef Google Scholar

Rosenbach, A. (2002). Genitive variation in English: conceptual factors in synchronic and diachronic studies (Topics in English Linguistics 42). Berlin & New York: Mouton de Gruyter.Google Scholar

Rosenbach, A. (2003). Aspects of iconicity and economy in the choice between the s-genitive and the of-genitive in English. In Rohdenburg, G. & Mondorf, B. (Eds.), Determinants of grammatical variation in English (Topics in English Linguistics) (pp. 379–411). Berlin: De Gruyter.Google Scholar

Sachs, J. S. (1967). Recognition memory for syntactic and semantic aspects of connected discourse. Perception and Psychophysics, 2, 437–442.Google Scholar

Santesteban, M., Pickering, M. J., & Branigan, H. P. (2013). The effects of word order on subject–verb and object–verb agreement: evidence from Basque. Journal of Memory and Language, 68(2), 160–179.Google Scholar

Sauppe, S., Norcliffe, E., Konopka, A. E., Van Valin, R. D. Jr., & Levinson, S. C. (2013). Dependencies first: eye tracking evidence from sentence production in Tagalog. In Knauff, M.Pauen, M.Sebanz, N., & Wachsmuth, I. (Eds.), Proceedings of the 35th Annual Meeting of the Cognitive Science Society (pp. 1265–1270). Austin, TX: Cognitive Science Society.Google Scholar

Shannon, C. (1948). A mathematical theory of communications. Bell Systems Technical Journal, 27(4), 623–656.CrossRef Google Scholar

Shriberg, E., & Stolke, A. (1996). Word predictability after hesitations: a corpus-based study. In Bunnell, H. T. & Idsardi, W. (Eds.), Proceedings of the Fourth International Conference on Spoken Language Processing (pp. 1868–1871). Philadelphia, PA: IEEE.CrossRef Google Scholar

Snow, C. E. (1977). Mother’s speech research: from input to interactions. In Snow, C. E. & Ferguson, C. A. (Eds.), Talking to children: language input and acquisition (pp. 31–49). Cambridge: Cambridge University Press.Google Scholar

Szmrecsányi, B. M. (2005). Language users as creatures of habit: a corpus-linguistic analysis of persistence in spoken English. Corpus Linguistics and Linguistic Theory, 1(1), 113–150.Google Scholar

Tanaka, M., Branigan, H. P., & Pickering, M. J. (2011). The production of head-initial and head-final languages. In Yamashita, H., Hisore, Y., & Packard, J. (Eds.), Processing and producing head-final structures (pp. 113–129). Dordrecht: Springer.Google Scholar

Tanenhaus, M. K. (2013). All P’s or mixed vegetables? Frontiers in Psychology, 4(234).Google Scholar

Temperley, D. (2003). Ambiguity avoidance in English relative clauses. Language, 79(3), 464–484.Google Scholar

Tily, H., & Kuperman, V. (2012). Rational phonological lengthening in spoken Dutch. Journal of the Acoustical Society of America, 132(6), 3935–3940.Google Scholar

Tily, H., & Piantadosi, S. (2009). Refer efficiently: use less informative expressions for more predictable meanings. In Proceedings of the Workshop on Production of Referring Expressions: Bridging the gap between computational and empirical approaches to reference, CogSci 2009.Google Scholar

Tottie, G. (1995). The man ø I love: an analysis of factors favouring zero relatives in written British and American English. In Melchers, G. & Warren, B. (Eds.), Studies in Anglistics (pp. 201–215). Stockholm: Almqvist and Wiksell.Google Scholar

Tourville, J., Reilly, K. J., & Guenther, F. H. (2008). Neural mechanisms underlying auditory feedback control of speech. NeuroImage, 39(3), 1429–1443.Google Scholar

Traugott, E. C., & Heine, B. (1991). Introduction. In Traugott, E. C. & Heine, B. (Eds.), Approaches to grammaticalization, Vol. I (pp. 1–14). Amsterdam: Benjamins.Google Scholar

van Son, R. J. J. H., & Pols, L. C. W. (2003). How efficient is speech? In Berkman, E. H. (Ed.), Proceedings of the Institute of Phonetic Sciences, Vol. 25 (pp. 171–184). Amsterdam: IFA.Google Scholar

van Son, R. J. J. H., & van Santen, J. P. H. (2005). Duration and spectral balance of intervocalic consonants: a case for efficient communication. Speech Communication, 47, 464–484.Google Scholar

van Summers, W., Pisoni, D. B., Bernacki, R. H., Pedlow, R. I., & Stokes, M. A. (1988). Effects of noise on speech production: acoustic and perceptual analyses. Journal of the Acoustical Society of America, 84(3), 917–928.Google Scholar

Verhoeven, E. (2007). Experiental Cconstructions in Yucatec Maya. Amsterdam & Philadelphia: John Benjamins Publishing Company.Google Scholar

Villacorta, V. M., Perkell, J. S., & Guenther, F. H. (2007). Sensorimotor adaptation to feedback perturbations of vowel acoustics and its relation to perception. Journal of the Acoustical Society of America, 122(4), 2306–2319.Google Scholar

Wasow, T., Jaeger, T. F., & Orr, D. (2011). Lexical variation in relativizer frequency. In Simon, H. & Wiese, H. (Eds.), Expecting the unexpected: exceptions in grammar (pp. 175–195). Berlin/New York: De Gruyter.Google Scholar

Wedel, A., Jackson, S., & Kaplan, A. (2013). Functional load and the lexicon: evidence that syntactic category and frequency relationships in minimal lemma pairs predict the loss of phoneme contrasts in language change. Cognition, 128(2), 179–186.Google Scholar

Zipf, G. K. (1949). Human behavior and the principle of least effort. Cambridge, MA: Addison-Wesley Press.Google Scholar

Table 1. Counts of determiner types actually produced by the determiner type presented in the recall stimulus

Fig. 1. Relative clause verb productions by the relative clause verb form and the NP head type presented during encoding.

Table 2. Mixed logit regression of relative clause verb productions by design factors

Fig. 2. Relative clause verb productions depending on relative clause verb form presented during encoding and the NP head type actually produced.

Table 3. Mixed logit regression of relative clause verb productions depending on definiteness and universal quantification of modified NP produced by speaker

Table 4. Counts of actually produced determiner types of embedded NP by the determiner type presented in the recall stimulus

Table 5. Mixed logit regression of relative clause verb productions depending on definiteness and universal quantification of modified NP and definiteness of embedded NP produced by speaker

