
Variation in English auxiliary realization: A new take on contraction

Published online by Cambridge University Press:  22 March 2013

Laurel MacKenzie*
Affiliation:
University of Manchester

Abstract

English auxiliary contraction has received much attention in the linguistic literature, but our knowledge of this variable has remained limited due to the absence of a thorough corpus study. This paper examines contraction of six auxiliaries in two corpora, considering three distinct phonological shapes in which they occur and the implications for an analysis of the grammatical processes that underlie the surface alternation in form. I argue that the data best support a two-stage analysis of contraction, one under which variation in the morphology is followed by phonetic and phonological processes. Moreover, I show that this particular analysis explains a number of patterns in the data that would otherwise be accidental. In this way, I underscore the importance of approaching the study of variable phenomena with both quantitative data and formal analysis.

Type
Research Article
Copyright
Copyright © Cambridge University Press 2013

Integrating quantitative data and formal analysis can provide valuable insights into the nature of variable linguistic phenomena. This has been effectively demonstrated by works such as Guy (1991) and Labov (1969). Each investigator, having identified the factors conditioning application of a variable phenomenon (t/d deletion in the former case; copula deletion in the latter), provided a formal model of that phenomenon that explains why surface forms show the distribution they do (patterns of t/d deletion reflect the application of a deletion rule at multiple levels of the phonology; patterns of copula deletion reflect a previous stage of contraction). In this way, these investigators are able to provide a fuller picture of these phenomena than would be possible were quantitative data and grammatical analysis not combined.

In this paper, I detail a new study in this same vein. I document the range and distribution of surface forms of English auxiliaries and provide a model of auxiliary contraction that accounts for these forms and their distribution. The study presented here relies on natural-speech corpus data, making it a much-needed addition to the existing body of work on contraction, which has drawn data primarily from native-speaker judgments (e.g., Inkelas & Zec, 1993; Kaisse, 1983; Zwicky, 1970). Even though simple introspection can lead us to correctly enumerate the possible forms in which auxiliaries may surface, I show that the quantitative data take us beyond this, adjudicating between multiple possible analyses of these forms. Moreover, I demonstrate that without an appropriate model of the processes underlying auxiliary realization, certain asymmetries observed in the data are inexplicable. The quantitative data and the formal analysis thus inform each other in ways that result in a more comprehensive picture of this phenomenon than has been provided to date.

The present paper does not constitute the first corpus study of auxiliary contraction, but it is the broadest one thus far. The low frequency of some auxiliaries in natural speech has led many previous researchers to focus exclusively on contraction of the copula (Labov, 1969; Walker & Meechan, 1999), overlooking any questions about unity of process that are raised by the contraction of other auxiliaries. Work that does consider other auxiliaries (McElhinny, 1993) suffers from low token counts. But the recent advent of massive speech corpora has facilitated the study of low-frequency morphosyntactic phenomena, making newly available the tools for a large-scale corpus study of auxiliary realization such as the one described here.

The phenomenon of contraction, in which an auxiliary surfaces with some phonological material missing, implicates several levels of a grammatical derivation. Accordingly, the analyses that have been put forth in prior work range from treating contraction as solely a phonological process (e.g., Labov, 1969; Zwicky, 1970) to treating it as a morphosyntactic one (e.g., Kaisse, 1983). Here, I argue that the quantitative data I present are best explained by an analysis under which variation is localized in two places. Specifically, I propose that a variable allomorphic alternation is followed by phonetic and phonological processes. Analyses that propose only one locus of variation fail to provide a sufficient account of the quantitative findings.

There are two major conclusions to be drawn from this study. The first is that an analysis of contraction necessitates an articulated model of the grammar, one in which objects pass through multiple levels of representation. The second is that quantitative data and formal analysis must go hand in hand in the study of linguistic variation. Only with a holistic approach like the one taken here can we move beyond simply documenting patterns of variation and provide an explanation for their existence.

DEFINING THE VARIABLE

Overview of the phenomenon

The phenomenon under study in this paper is the variation in phonological shape of English auxiliaries. In essence, a form of an auxiliary with all its segmental material intact alternates with one or more forms that are missing phonological material. This variation in auxiliary shape is pervasive in spontaneous speech and affects a number of different verbs. The examples in (1) show six auxiliaries variably surfacing without at least their initial consonant, if not also their vowel.Footnote 1

  (1) Variation in auxiliary form attested in the Switchboard corpus (Godfrey, Holliman, & McDaniel, 1992)

    a. had: We [əd] secured the tents real well, even if we[d] done it in the dark at eleven o'clock at night. (sw_1181)Footnote 2

    b. has: Well, I'm sure it[s] been done! I'm sure it [həz] been done. (sw_1060)

    c. have: They [əv] locked their benefits in to the point that, once they[v] served two terms, they're on gravy train anyway. (sw_1180)

    d. is: Yeah, Salzburg[z] nice. Austria[z] nice. Europe [ɪz] nice! (sw_1151)

    e. will: If I walk, it [əl] be ten degrees warmer, but it [wəl] last twenty minutes. (sw_1146)

    f. would: And she [wʊd] do things, and she [wʊd] donate little things, and she[d] help clean up the tables. (sw_1121)

My interest in this paper is in the process or processes that cause an auxiliary to surface with phonological material missing in this way. I will refer to this alternation between phonologically intact and phonologically deficient forms pretheoretically as contraction, but I will argue later that a number of different processes contribute to this alternation in auxiliary shape.

The verbs that exhibit contraction in Standard English, as it is defined here, are had, has, have, will, would, is, are, am, does, and did (these last two only in wh-questions). In this paper, I focus on only the first six of these. The decision to narrow the scope of this work to these six verbs alone was made because these six verbs are the only ones that show variation of the type exhibited in (1) after both pronoun and nonpronoun (that is, full NP) subjects. This subsequently allows an examination of the effect of subject type on contraction, which will form a large part of the analysis to be presented herein. Am, does, and did, by contrast, surface after a set of environments too limited to allow such an analysis of subject type. Similarly, are is not treated here because it was found to show no variation of the type exhibited in (1) after nonpronoun subjects, where it surfaces near categorically with its vowel present (McElhinny, 1993). Its variation in form is thus limited to postpronoun contexts, again precluding an analysis of the effect of subject type on contraction for this auxiliary.

To reiterate, contraction is defined here as the variable absence of phonological material, as exemplified by the alternations in (1). Accordingly, I do not examine variation in an auxiliary's vowel quality, such as the alternation between [hæv] and [həv]. For this reason, the modals can, could, and should are not considered here. The only alternation in form these modals display is between a full (e.g., [kæn]) and a reduced vowel (e.g., [kən]); they do not surface in forms that are missing one or more segments (e.g., *[æn], *[n]) and are thus not relevant to the study at hand.

Classification of surface forms

Because I have restricted the phenomenon at issue to the variable surfacing of auxiliaries with phonological material absent, three types of surface forms are logically possible. An auxiliary may surface with no segments missing (such as the forms attested in (1b), (1d), (1e), and (1f)). An auxiliary may also surface with only its initial consonant missing (as attested in (1a), (1c), and (1e)). And, an auxiliary may surface with both its initial consonant and its vowel missing (as attested in (1a), (1b), (1c), (1d), and (1f)). Though the complete deletion of some auxiliaries (namely, is and are) has been attested in some dialects of English (Labov, 1969), the speakers whose data were used for the present study do not have this variant.

The three distinct types of surface forms will be referred to in the subsequent discussion as follows. Forms with no segments missing I term full; forms with the initial consonant missing, intermediate; and forms with both initial consonant and vowel missing, contracted.Footnote 3 For completeness, the phonological shape of each form type for each auxiliary under study is enumerated in (2) to (4).

  (2) Full forms: no segments missing

    a. had: [hæd], [həd]Footnote 4   b. has: [hæz], [həz]   c. have: [hæv], [həv]

    d. is: [ɪz], [əz]   e. will: [wɪl], [wəl]   f. would: [wʊd], [wəd]

  (3) Intermediate forms: lacking an initial consonantFootnote 5

    a. had: [əd]   b. has: [əz]   c. have: [əv]

    d. will: [əl]   e. would: [əd]

  (4) Contracted forms: lacking their initial consonant and their vowel

    a. had: [d]   b. has: [z], [s]   c. have: [v]

    d. is: [z], [s]   e. will: [l]   f. would: [d]

Note that the auxiliaries in (4b) and (4d), has and is, each have two contracted forms whose appearance is predictable from the voicing of the preceding segment.
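The three-way classification in (2) to (4) can be restated as a small lookup, offered here as a minimal sketch rather than as the coding procedure actually used. The transcriptions are copied from (2) to (4); the names `FORMS`, `VOICELESS`, `classify`, and `contracted_has_is` are hypothetical, and the set of voiceless segments is an illustrative assumption, not an exhaustive inventory.

```python
# Surface shapes copied from (2)-(4) in the text; "is" has no intermediate form.
FORMS = {
    "had":   {"full": {"hæd", "həd"}, "intermediate": {"əd"}, "contracted": {"d"}},
    "has":   {"full": {"hæz", "həz"}, "intermediate": {"əz"}, "contracted": {"z", "s"}},
    "have":  {"full": {"hæv", "həv"}, "intermediate": {"əv"}, "contracted": {"v"}},
    "is":    {"full": {"ɪz", "əz"},   "intermediate": set(),  "contracted": {"z", "s"}},
    "will":  {"full": {"wɪl", "wəl"}, "intermediate": {"əl"}, "contracted": {"l"}},
    "would": {"full": {"wʊd", "wəd"}, "intermediate": {"əd"}, "contracted": {"d"}},
}

# Assumption: a small illustrative set of voiceless, non-sibilant consonants.
VOICELESS = {"p", "t", "k", "f", "θ"}


def classify(aux, shape):
    """Return 'full', 'intermediate', or 'contracted' for one auxiliary's surface shape."""
    for category, shapes in FORMS[aux].items():
        if shape in shapes:
            return category
    raise ValueError(f"{shape!r} is not a listed form of {aux!r}")


def contracted_has_is(preceding_segment):
    """Choose [s] after a voiceless segment and [z] otherwise.

    Sibilant-final hosts are not handled, since contracted forms are
    described in the text as illicit there.
    """
    return "s" if preceding_segment in VOICELESS else "z"
```

The lookup is keyed by auxiliary because the same shape can belong to different categories: [əz] counts as a full form of is but as an intermediate form of has.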

As the discussion will reveal, the fact that these three types of forms are distinguishable on the surface is not necessarily evidence that they are distinct in any deeper linguistic sense. On the contrary, I argue that surface intermediate forms are not uniquely represented underlyingly. Instead, the three-way alternation in form observed on the surface will be argued to be traceable back to an underlying two-way distinction, between one short and one long allomorph for each auxiliary. Surface contracted forms will be argued to have their source in underlying short allomorphs, and surface full forms in underlying long allomorphs, but intermediate forms will be shown to be best represented as derived from these two underlying forms, not as a distinct allomorph underlyingly.

METHODOLOGY

A total of 4524 tokens of the auxiliaries had, has, have, is, will, and would, uttered by 330 unique speakers, were pulled at random from Switchboard.Footnote 6 Switchboard is a corpus consisting of 240 hours (3 million words) of short (5 to 10 minutes) telephone conversations between strangers on assigned topics, such as sports, movies, travel, and political issues such as gun control. Conversation dyads were paired by a robotic operator; no two speakers were paired more than once. Data collection was carried out between 1991 and 1992. A total of 542 unique speakers participated in Switchboard, of whom 55% were men, 60% were under age 40, and 89% were college-educated. The Switchboard project was carried out by Texas Instruments, and many participants were TI employees, resulting in a somewhat skewed geographical distribution of participants: 29% were from the South Midland dialect region, where TI is located.

Tokens for the present study were located by searching transcripts, but were hand-coded based on audio. Initially, 500 randomly selected tokens of each of the six auxiliaries under study were extracted. The decision was made to select tokens at random rather than code all tokens for particular speakers because many speakers did not participate in more than a few conversations, meaning that it was very rare for a speaker to have participated in more than approximately 20 minutes of conversation. With so little data from each individual speaker, large numbers of auxiliaries uttered by a single speaker would be difficult to come by. For that reason, the decision was made simply to see what factors conditioning contraction were visible in the larger community of native English speakers, rather than on a speaker-specific level.

Once 500 randomly selected tokens of each auxiliary were coded from Switchboard, tokens of each auxiliary after certain subjects that were not well represented in the random sample (for instance, full NPs) were targeted. (This targeting was accomplished by a script that searched for auxiliaries when not following pronoun, quantifier, or wh-word subjects; the output was then culled through by hand to leave only post-NP auxiliaries.) As a result, the final body of tokens coded is not an accurate representation of the distribution of auxiliaries after particular subjects in Switchboard overall, but instead is biased toward boosting the token counts of auxiliaries in less common contexts.
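The two sampling phases described above can be sketched as follows. This is an illustration under stated assumptions, not the script used for the study: the token records, the field names (`id`, `aux`, `subject_type`), and both function names are hypothetical, and the hand-culling step that followed the targeted search is not modeled.

```python
import random


def sample_tokens(tokens, per_aux=500, seed=0):
    """Phase 1: draw up to `per_aux` tokens of each auxiliary at random."""
    rng = random.Random(seed)
    by_aux = {}
    for tok in tokens:
        by_aux.setdefault(tok["aux"], []).append(tok)
    sample = []
    for aux in sorted(by_aux):
        pool = by_aux[aux]
        sample.extend(rng.sample(pool, min(per_aux, len(pool))))
    return sample


def boost_rare_subjects(tokens, sample,
                        excluded_subjects=("pronoun", "quantifier", "wh")):
    """Phase 2: add unsampled tokens whose subject is not of an excluded type.

    In the study, the output of this step was then checked by hand so that
    only post-NP auxiliaries remained; that manual step is omitted here.
    """
    seen = {tok["id"] for tok in sample}
    extra = [tok for tok in tokens
             if tok["id"] not in seen
             and tok["subject_type"] not in excluded_subjects]
    return sample + extra
```

Because the second phase adds tokens selectively, the combined set overrepresents less common contexts, which is the bias the text acknowledges.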

To address concerns raised by (a) the fact that acoustic cues may be difficult to hear in telephone speech and (b) the artificiality of the conversational situation in Switchboard, another set of data was collected from the Philadelphia Neighborhood Corpus of LING560 Studies (henceforth PNC; Labov & Rosenfelder, 2011). The PNC comprises 40 years of sociolinguistic interviews with native Philadelphians carried out by students at the University of Pennsylvania beginning in 1972. Rather than selecting tokens at random from this corpus, as was done for the Switchboard corpus, I identified a demographically diverse set of speakers who participated in conversations exceeding 30 minutes in length, and all tokens of each auxiliary under study were coded for each selected speaker.Footnote 7 This resulted in a database of 4685 tokens, collected from 40 unique speakers. Again, all tokens were hand-coded based on audio, either by the author or by a linguistically trained undergraduate research assistant.

The environments in which contracted forms are blocked from surfacing have been the subject of much discussion in the linguistic literature. An auxiliary is blocked from surfacing in its contracted (i.e., single-consonant, as in (4)) form when it precedes a movement or a deletion site (King, 1970; Labov, 1969) or appears in a comparative subdeletion construction (Anderson, 2008; Bresnan, 1975) or a pseudo-cleft (Kaisse, 1983). Though seldom discussed in this literature, intermediate forms of auxiliaries are also blocked in these environments (and see MacKenzie, 2012b, for corpus data supporting these intuitions). Consequently, no variation of the type under study in this paper is attested in these environments. Auxiliaries may surface only in their full form. These environments are accordingly outside the envelope of variation, and all tokens in such environments have been omitted from study.

There exists another set of environments in which variation in auxiliary form is restricted. Auxiliaries may not surface in their contracted form when this would create an unacceptable cluster or a geminate (5a).Footnote 8 Additionally, contracted forms of the auxiliaries had, have, will, and would are illicit when preceded by a coordinated or embedded pronoun (5b). Some investigators (e.g., Kaisse, 1983) have also judged contracted forms of these four auxiliaries to be illicit after nonpronoun subjects, even where they would be phonotactically acceptable (5c), though this judgment is not universally shared (e.g., McElhinny, 1993; Zwicky, 1970). The corpus data presented herein will confirm these intuitions concerning the inadmissibility of contracted forms in environments (5a) to (5c), but they will demonstrate that intermediate forms are nonetheless permitted there. Because some variation in phonological shape is thus attested, I retain the environments enumerated in (5) in this study; they are within the envelope of variation.

  (5) Environments in which contracted forms are illicit, but intermediate forms are not

    a. It *[l] ~ [əl] ~ [wɪl] be a while before we get there.

    a′. The cheese *[z] ~ [əz] ~ [hæz] gone bad.

    b. John and I *[v] ~ [əv] ~ [hæv] got it.

    b′. The guy next to you *[l] ~ [əl] ~ [wɪl] speak first.

    c. Sue *[d] ~ [əd] ~ [wʊd] go.

    c′. John *[d] ~ [əd] ~ [hæd] gone.

    c′′. Sue *[l] ~ [əl] ~ [wɪl] go.

    c′′′. All three *[v] ~ [əv] ~ [hæv] made it.

Finally, all negated tokens were excluded from study, because some auxiliaries show a wider envelope of variation when negated. For instance, negated is can surface as is not, isn't, and 's not (on which see Tagliamonte & Smith, 2002; Yaeger-Dror, Hall-Lew, & Deckert, 2002). Also excluded were tokens in which the auxiliary was contrastively stressed, tokens in which the auxiliary was fronted to begin a yes-no question, tokens of nonfinite have (either following a modal, as in would have, or in a nonfinite construction such as seem to have), auxiliaries with an elided subject, and tokens in which a pause or adverb separated the auxiliary from its subject.
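The exclusions described in this section amount to a filter on coded tokens. The sketch below restates them as a predicate. The `Token` record and its field names are hypothetical, invented for illustration; the actual coding scheme is not given in the text.

```python
from dataclasses import dataclass


@dataclass
class Token:
    """One coded auxiliary token; every field name here is an assumption."""
    aux: str
    before_movement_or_deletion_site: bool = False
    comparative_subdeletion: bool = False
    pseudo_cleft: bool = False
    negated: bool = False
    contrastive_stress: bool = False
    fronted_in_yes_no_question: bool = False
    nonfinite_have: bool = False
    elided_subject: bool = False
    pause_or_adverb_after_subject: bool = False


def in_envelope(tok):
    """Return True if none of the exclusion criteria listed in the text applies."""
    exclusions = (
        tok.before_movement_or_deletion_site,
        tok.comparative_subdeletion,
        tok.pseudo_cleft,
        tok.negated,
        tok.contrastive_stress,
        tok.fronted_in_yes_no_question,
        tok.nonfinite_have,
        tok.elided_subject,
        tok.pause_or_adverb_after_subject,
    )
    return not any(exclusions)
```

The environments in (5) are deliberately absent from the exclusion list, since intermediate forms still vary with full forms there.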

FINDINGS

As discussed, contracted forms are somewhat restricted in their distribution. Most auxiliaries do not surface in their contracted form after subjects that are not personal pronouns. For that reason, I separate the data for presentation as follows. Tokens with personal pronoun subjects are kept distinct from tokens with full NP subjects.Footnote 9 Additional motivation for doing so comes from previous findings that subject type (specifically, pronoun versus NP) has an effect on rate of copula contraction (Labov, 1969; McElhinny, 1993). In other words, subject type appears to have both a categorical and a gradient effect on contraction, depending on which auxiliary is at issue.

Figures 1A and 1B show the distribution of each form coded for each auxiliary under study after personal pronouns in Switchboard and the PNC, respectively. Figures 2A and 2B show the same for tokens after NP subjects.

Figure 1. Distribution of forms after pronoun subjects, in the Switchboard (A) and Philadelphia Neighborhood (B) corpora. Pronoun subjects were defined as detailed in the text.

Figure 2. Distribution of forms after NP subjects, in the Switchboard (A) and Philadelphia Neighborhood (B) corpora.

Only environments that allow the full range of forms of an auxiliary to surface are included in Figures 1A and 1B. This means that:

  • No auxiliary is included that follows an embedded or coordinated pronoun, environments in which contracted forms for several auxiliaries are illicit (5b).

  • The auxiliaries had, will, and would were not coded after the pronoun it, the only consonant-final subject pronoun in English. Their full range of variants is not possible in this environment, as contracted forms may not surface (5a).Footnote 10

  • All forms of have followed by got have been omitted, as they are found to surface categorically in their contracted forms after pronouns in this corpus.

These decisions were made to prevent certain host-auxiliary combinations from biasing the results. For instance, will after a vowel-final pronoun subject (e.g., he) may surface in the full range of attested forms: contracted [l], intermediate [əl], full [wɪl]. On the other hand, after the pronoun it, the contracted form may not surface for phonotactic reasons (as addressed in (5a)), but the intermediate and full forms are allowed. Tokens of will after it are thus not directly comparable to tokens of will after vowel-final pronouns. So, to keep tokens of it will from artificially inflating the distribution of variants, they are not included in Figure 1 (though I will return to the distribution of forms of will after it in a later section).

The criteria for inclusion in Figures 2A and 2B were less stringent, as follows:

  • All tokens of had, have, will, and would after NPs were examined, as these auxiliaries were all expected to show variation between full and intermediate forms (5c).

  • Tokens of has and is were omitted after sibilant-final NPs. Again, this avoids biasing the results: this context shows a narrower range of variation than NPs that do not end in sibilants, because contracted forms of is and has are phonotactically illicit after sibilants (5a).
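The inclusion criteria for the two figures can be summarized as a pair of predicates. This is a sketch under stated assumptions: the function and parameter names are hypothetical, and each token is assumed to have been hand-coded already for the properties passed in.

```python
def include_in_figure1(aux, pronoun, embedded_or_coordinated=False,
                       followed_by_got=False):
    """Pronoun-subject tokens: keep only contexts allowing the full range of forms."""
    if embedded_or_coordinated:
        return False  # contracted forms of several auxiliaries are illicit here (5b)
    if pronoun == "it" and aux in {"had", "will", "would"}:
        return False  # contracted forms may not surface after "it" (5a)
    if aux == "have" and followed_by_got:
        return False  # reported as categorically contracted after pronouns
    return True


def include_in_figure2(aux, np_ends_in_sibilant):
    """NP-subject tokens: only has and is after sibilant-final NPs are dropped."""
    if aux in {"has", "is"} and np_ends_in_sibilant:
        return False  # contracted forms are phonotactically illicit here (5a)
    return True
```

For example, `include_in_figure1("will", "it")` is false while `include_in_figure1("is", "it")` is true, reflecting that it's is a licit contracted form whereas a contracted form of will after it is not.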

ANALYSIS OF FORMS

We have now seen the phonological shapes auxiliaries may surface in, and we have an idea of their distribution after two different types of subjects. This section discusses how to incorporate these findings into an analysis of the grammatical processes underlying contraction.

The majority of recent analyses of contraction have treated the surface alternation between single-consonant contracted forms and phonologically intact full forms as stemming from an underlying alternation between two suppletive allomorphs (Anderson, 2008; Close, 2004; Inkelas & Zec, 1993; Kaisse, 1983; Wilder, 1997; and see Kaisse, 1985, for arguments against an analysis under which contracted forms are derived via the phonology). There are thus two forms stored in memory for each auxiliary, which I term here the short allomorph and the long allomorph. Long allomorphs are generally understood to be the phonologically complete form of the auxiliary in question, for instance, /hæv/ for have. The surface full form is thus a faithful representation of this allomorph (modulo any vowel reduction to schwa). Researchers have disagreed on whether the other, short allomorph is represented underlyingly as a syllabic consonant (/əv/ for have) or as a single consonant (/v/ for have). Either way, the short allomorph has less phonological material than the long one does.

This earlier work has provided little analysis of our intermediate forms. However, as demonstrated in Figures 1 and 2, intermediate forms are very frequent in actual use and surface in a number of different environments. This necessitates a new analysis of contraction, one that takes intermediate forms and, particularly, their distribution into account.

Several treatments of these intermediate forms are conceivable. Intermediate forms could represent the faithful surface representation of an allomorph distinct from an auxiliary's short and long allomorphs, meaning that contraction is underlyingly a three-way alternation. This resembles the approach taken by Ogden (1999), who proposed one unique stored form for each attested surface form of an auxiliary. Or, intermediate forms could represent the faithful surface representation of an auxiliary's short allomorph, if each auxiliary showed an underlying alternation between a phonologically complete form (e.g., /hæv/) and a syllabic consonant (e.g., /əv/). Under this two-allomorph approach, surface contracted forms (e.g., [v]) would be derived from short allomorphs via a vowel-deletion process (though not from long ones; see Kaisse, 1985). Alternatively, intermediate forms could be the derivative ones, if the underlying alternation were between a phonologically complete form and a single-consonant form (e.g., /v/). Under this approach, two ways of deriving intermediate forms are conceivable. Processes of initial consonant deletion and vowel reduction could derive them from an auxiliary's long allomorph (e.g., /hæv/ → [əv]), whereas a process of schwa epenthesis could derive them from an auxiliary's short allomorph (e.g., /v/ → [əv]).

I will show that the quantitative data argue for one analysis of intermediate forms in particular. Under the proposed analysis, each auxiliary has two underlying allomorphs (one that is phonologically complete and one that consists of only a single consonant), and intermediate forms derive from each. Under this treatment, intermediate forms are the output of phonological processes that have operated on the allomorph inserted by the morphology; auxiliary realization is thus the output of two stages of processes. The next section describes this analysis in detail; I then demonstrate how this analysis accounts for a number of distributional patterns evident in the data presented in Figures 1 and 2.

Analysis

I follow previous investigators in proposing an underlying alternation for each auxiliary under study, between one long and one short allomorph. Each auxiliary's long allomorph contains all segmental material; each short allomorph consists of a single consonant (Table 1).Footnote 11

Table 1. Long and short allomorphs for six auxiliaries

This analysis implements the idea, following Kaisse (1985), that surface contracted forms cannot be generated by phonological rules having operated on the long allomorph. In other words, surface contracted forms presume underlying insertion of a short allomorph. I also assume that surface full forms presume underlying insertion of the long allomorph. No phonological processes add material to a short allomorph to generate a surface full form.

To see how intermediate forms emerge from this analysis, the first step is to refer back to their distribution in the surface data. A close inspection of Figures 1 and 2 reveals that not every host/auxiliary combination permits intermediate forms to surface. Descriptively, intermediate forms are found:

  • As forms of had (and, infrequently, has, have, and will) after vowel-final pronouns.

  • As forms of all auxiliaries (except is, which has no intermediate form) after NPs.

These environments can be sorted into two categories, which I call CF-YES (as exemplified in (6)) and CF-NO (as exemplified in (7)):

  (6) CF-YES: Intermediate forms that surface where contracted forms are also possible:

    a. Pronoun + had: intermediate he [əd] surfaces alongside contracted he'd [hid]

    b. NP + has: intermediate Sue [əz] surfaces alongside contracted Sue's [suz]

  (7) CF-NO: Intermediate forms that surface where contracted forms are not possible

    . . . for reasons of phonotactics, seen previously in (5a):

    a. Consonant-final subject + had/have/will/would: for example, intermediate it [əl] has no contracted counterpart it'll *[ɪtl]

    . . . for reasons that cannot be attributed to phonotactics, seen previously in (5b, 5c):

    b. Vowel-final NP subject + had/have/would/will: for example, intermediate Sue [əd] has no contracted counterpart Sue'd *[sud]
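The sorting of environments into CF-YES and CF-NO in (6) and (7) can be paraphrased as a decision rule. The sketch below is one reading of the descriptive generalizations in (5) to (7), not a statement of the formal analysis. All names are hypothetical, and the three-way coding of the host's final segment is a simplifying assumption.

```python
# Auxiliaries that have an intermediate form according to (3); "is" does not.
HAS_INTERMEDIATE = {"had", "has", "have", "will", "would"}


def contracted_form_possible(aux, simple_pronoun_host, host_final):
    """Decide whether a contracted form is licit in this environment.

    host_final is 'vowel', 'sibilant', or 'consonant' (non-sibilant).
    simple_pronoun_host is True for a personal pronoun that is neither
    embedded nor coordinated.
    """
    if aux == "has":
        # Only sibilant-final hosts block contracted has (5a').
        return host_final != "sibilant"
    # had, have, will, would: licit only after simple vowel-final pronouns.
    return simple_pronoun_host and host_final == "vowel"


def environment(aux, simple_pronoun_host, host_final):
    """Label an environment in which an intermediate form surfaces."""
    if aux not in HAS_INTERMEDIATE:
        raise ValueError(f"{aux!r} has no intermediate form")
    if contracted_form_possible(aux, simple_pronoun_host, host_final):
        return "CF-YES"
    return "CF-NO"
```

Under this reading, he + had and Sue + has come out as CF-YES, while it + will and Sue + had come out as CF-NO, matching the examples in (6) and (7).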

I will lay out an analysis under which CF-YES-type intermediate forms are of a different source than CF-NO-type intermediate forms. Specifically, CF-NO intermediate forms derive from a short allomorph that fails to syllabify with its host, whereas CF-YES intermediate forms do not have their source in short allomorphs at all. They are long allomorphs that have lost their initial consonant. The next section describes CF-NO intermediate forms in more detail; the subsequent section addresses CF-YES intermediate forms.

Under this analysis, intermediate forms constitute a hybrid category, with surface intermediate forms derived from each of the two underlying allomorphs. This differentiates intermediate forms from the other attested surface forms. Surface contracted forms come only from short allomorphs; surface full forms come only from long allomorphs. I will subsequently demonstrate that the hybrid analysis of intermediate forms accounts for the quantitative facts in a way that models that attribute intermediate forms to one allomorph only cannot.

Intermediate forms from underlyingly short allomorphs

An important point to recognize in our analysis of intermediate forms is that an auxiliary's failure to surface in its contracted form does not imply that its short allomorph was not inserted. In other words, presence of a contracted form on the surface implies insertion of a short allomorph, but not the other way around. Insertion of a short allomorph does not imply the presence of a surface contracted form. This is because phonological processes may operate subsequent to the insertion of the short allomorph, changing its surface form.

Specifically, I propose, extending an idea from Ogden (1999), that a process of schwa epenthesis is operative whenever a short form is inserted that cannot syllabify with its host. This then results in that short allomorph surfacing as an intermediate form. Short allomorphs of auxiliaries are thus parallel in their behavior to the English regular present and past tense suffixes, traditionally analyzed as single-consonant /-z/ and /-d/, respectively (Anderson, 1973; Baković, 2005; Benus, Smorodinsky, & Gafos, 2004; Fromkin, 2000; Pinker & Prince, 1988; Yip, 1988). After sibilants and alveolar stops, respectively, these suffixes cannot surface as is, and so the single consonant gains a preceding epenthetic schwa that allows it to surface (giving, e.g., plural church[əz] and past tense patt[əd]).Footnote 12 When we see intermediate forms where contracted forms are not phonotactically licit, then, we are seeing a short allomorph that has been phonologically altered, via schwa epenthesis, to surface as an intermediate form.

Taking this a step further, this analysis can be extended to other instances in which contracted forms fail to surface, even when phonotactics are not the cause. Forms like three [əv] and Sue [əd], which have no counterparts *[θriv] and *[sud] (but cf. well-formed grieve and sued), may again be the result of a contracted form failing to syllabify with its host, with a [ə] repair. An obvious question here is why the short allomorph should fail to surface in its contracted form in these cases in which the contracted form should be phonotactically acceptable. It may be related to the similar failure of contracted forms to surface after conjoined and embedded pronouns (5b): John and I've (*[v]) got it and The guy next to you'll (*[l]) speak first are both illicit with a contracted form, but acceptable with an intermediate form instead.Footnote 13 For instance, under a treatment of morphology that incorporates cyclic spell-out (Embick, 2010), it could be proposed that a short allomorph must be spelled out in the same cycle as its host in order to surface as a contracted form. Pronouns and short allomorphs would be spelled out in the same cycle, but full NPs, including those that contain embedded pronouns, would be spelled out in a separate cycle from the auxiliary that follows them, such that host and short allomorph would not be spelled out in the same cycle. The short allomorph would then lack a host to syllabify with, and schwa epenthesis would be required as a repair. Whatever the precise source of the effect, the judgment is clear and is corroborated by the quantitative data. Intermediate forms surface in CF-NO environments, but contracted forms are conspicuously absent. This complementary distribution is accounted for when intermediate forms of the CF-NO type are analyzed as short allomorphs that have undergone schwa epenthesis.
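The proposal for CF-NO intermediate forms reduces to a single conditional repair, sketched below. The function name and the Boolean parameter are hypothetical. Whether a short allomorph can syllabify with its host is treated as given input, since the text leaves the precise source of that failure open.

```python
def realize_short(short, syllabifies_with_host):
    """Surface shape of a single-consonant short allomorph.

    Schwa is epenthesized only as a repair, when the consonant cannot
    syllabify with its host; otherwise the consonant surfaces as is.
    """
    return short if syllabifies_with_host else "ə" + short


# he + /d/: syllabifies with the pronoun host, so the contracted form surfaces.
he_had = realize_short("d", syllabifies_with_host=True)
# it + /l/: no licit syllabification, so the result is intermediate, as in (5a).
it_will = realize_short("l", syllabifies_with_host=False)
```

The same function covers the suffix parallel drawn in the text: /-z/ after church and /-d/ after pat would both be called with `syllabifies_with_host=False`.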

Intermediate forms from underlyingly long allomorphs

In the previous section, I argued that intermediate forms in environments in which a contracted form is illicit could still be traced to an underlyingly short allomorph. This was used to account for intermediate forms of the CF-NO type. But CF-YES intermediate forms are attested as well, surfacing alongside contracted forms.

One possible explanation for CF-YES intermediate forms is that they are of the same source as CF-NO forms. That is, a short allomorph has been inserted, and a schwa has been epenthesized. There is no obvious reason why schwa epenthesis would be necessary in this type of environment, though, because contracted forms are able to surface here. Short allomorphs are thus clearly able to syllabify to their host without a schwa repair, and I assume that schwa epenthesis applies only when necessary for syllabification. The quantitative facts to follow will also provide additional evidence that these intermediate forms should not be treated as short allomorphs with schwa epenthesis.

Instead, these intermediate forms should be treated as underlying long allomorphs that have been phonologically reduced such that they resemble intermediate forms derived from a short allomorph plus schwa epenthesis. This can be effected by two independently attested phonetic reduction processes: (i) /h/-deletion, which deletes /h/ when word-initial in an unstressed syllable, and (ii) vowel reduction, which reduces unstressed vowels to schwa (Kaisse, 1985). Both processes commonly affect function words in conversational speech, giving us another way of deriving intermediate forms, this time from long allomorphs of /h/-initial auxiliaries.

A crucial component of this analysis, first put forward by Kaisse (1985) and corroborated by the quantitative data presented herein, is that there is no process that deletes initial /w/ in a fashion analogous to /h/-deletion. Intermediate forms of the /w/-initial auxiliaries will and would must have their source only in schwa epenthesis on short allomorphs. That is, they are CF-NO type only. Where the long allomorph of these auxiliaries is inserted, I assume, it uniformly surfaces with its initial consonant intact. Evidence supporting this will follow.

Summary

The analysis developed here posits two sources of surface intermediate forms: phonetic /h/-deletion on long allomorphs, and schwa epenthesis on short allomorphs. This account of intermediate forms maintains an underlying bipartite distinction between long and short allomorphs, despite there being a tripartite distinction in phonological shape on the surface. It does so by making reference to two stages of processes: the first, the alternation in the morphology; the second, a set of phonetic and phonological processes that act on the allomorph inserted at the first stage. Figures 3 and 4 summarize the source of each auxiliary's surface forms after pronoun and NP subjects, respectively.Footnote 14

Figure 3. Sources of surface forms after personal pronoun subjects.

Figure 4. Sources of surface forms after noun phrase subjects.

Note that, after NP subjects (Figure 4), intermediate forms of have and had can come from both schwa epenthesis on short allomorphs and /h/-deletion on long allomorphs. In other words, they are of ambiguous origin. This fact will come into play when we examine the effect of the length of a NP subject on auxiliary realization.
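The two-stage derivation summarized above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, its arguments, and the string labels are hypothetical and are not part of the analysis itself. The Boolean `can_syllabify` stands in for whatever phonotactic or structural condition allows a short allomorph to attach to its host, and `h_deleted` stands in for the variable application of /h/-deletion (together with vowel reduction) to a long allomorph. The auxiliary is is left out because it has no intermediate form distinct from its full form (note 5).

```python
# Illustrative sketch only; names and labels are assumptions, not the paper's formalism.
H_INITIAL = {"has", "have", "had"}   # long allomorphs begin with /h/
W_INITIAL = {"will", "would"}        # long allomorphs begin with /w/


def surface_form(aux, allomorph, can_syllabify, h_deleted=False):
    """Map a stage-1 allomorph choice to a stage-2 surface shape.

    allomorph: "short" or "long" (the morphological alternation).
    can_syllabify: whether a short allomorph can attach to its host as is.
    h_deleted: whether variable /h/-deletion applied (relevant to long forms only).
    """
    if aux not in H_INITIAL | W_INITIAL:
        raise ValueError(f"auxiliary not covered by this sketch: {aux}")
    if allomorph == "short":
        # Schwa epenthesis is a repair: it applies only when syllabification fails.
        return "contracted" if can_syllabify else "intermediate"
    if allomorph == "long":
        # Only /h/-initial auxiliaries can lose their initial consonant;
        # no analogous /w/-deletion is assumed.
        if aux in H_INITIAL and h_deleted:
            return "intermediate"
        return "full"
    raise ValueError(f"unknown allomorph: {allomorph}")


# A long allomorph of would stays full even if the reduction flag is set.
print(surface_form("would", "long", can_syllabify=True, h_deleted=True))
# A short allomorph of will after a host it cannot syllabify with.
print(surface_form("will", "short", can_syllabify=False))
```

On this sketch, intermediate forms of had and have can be reached along both branches, whereas intermediate forms of will and would can be reached only along the short-allomorph branch.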

EVIDENCE SUPPORTING THE PRESENT ANALYSIS

The analysis presented herein explains several facts concerning the distribution of forms in Figures 1 and 2. Here, I go through them one by one and identify the aspects of the present analysis that account for them.

The findings presented in the next two sections are attested in both Switchboard and the PNC; accordingly, data from both are presented here. However, due to the infrequent occurrence of several auxiliaries with NP subjects in the PNC (as evidenced by the low token counts in Figure 2B), not enough data is available to permit PNC replication of the subject length findings presented in the third section.

Intermediate forms after pronouns

The current analysis accounts for two facts concerning the distribution of intermediate forms after pronouns (Figure 1):

  • The more full forms an /h/-initial auxiliary displays, the more intermediate forms it displays, with had displaying more full and intermediate forms and have and has markedly fewer full and intermediate forms. This follows naturally under the current analysis, under which intermediate forms of /h/-initial auxiliaries after pronouns are derived exclusively from /h/-deletion on the long allomorph (Figure 3). The more long allomorphs have been inserted, the more intermediate forms there will be.

  • Would conspicuously displays no intermediate forms after vowel-final pronouns, and the same is effectively true for will, with the exception of three tokens (see note 14). This follows naturally from the fact that there is no process that would generate intermediate forms of /w/-initial auxiliaries after a vowel-final pronoun (Figure 3). Short allomorphs of will and would after vowel-final pronouns will surface as contracted forms with no need for a schwa epenthesis repair. Long allomorphs of will and would will surface as full forms, with no process of /w/-deletion attested that would remove their initial consonant and cause them to surface as intermediate forms.Footnote 15

Rates of occurrence of forms

The present analysis also explains certain parallels in the rates of occurrence of forms:

  • After vowel-final pronouns (Figure 1), would surfaces in its contracted form at a much lower rate than all other auxiliaries except had.Footnote 16 This low rate of contracted forms of would is echoed after NPs (Figure 2), where would surfaces in its intermediate form at a very low rate compared with had, have, and will (the other auxiliaries that alternate only between full and intermediate forms after NPs).Footnote 17 The present analysis, which treats postpronoun contracted forms of would as underlyingly of the same source as post-NP intermediate forms of would, can account for this parallelism. All that is required is a general dispreference for short allomorphs of this auxiliary, and each surface form will appear at a low rate as expected.

  • Along the same lines, another auxiliary that appears in its contracted form at a particularly low rate after pronoun subjects is had. By the same reasoning as was employed for would—that the relative rate of use of an allomorph will be consistent regardless of the nature of its host and hence regardless of its eventual surface form—we would expect intermediate forms of had after NPs to surface at a comparably low rate. The rate of intermediate forms of had after NPs, in fact, does not appear as low as expected, particularly when compared with that of would. However, if we take into account the fact that intermediate forms of had have an additional source—namely, /h/-deletion applying to the long allomorph (Figure 4)—we can account for the unexpectedly high number of intermediate forms of post-NP had. Some are the short allomorph, having been inserted at a low rate, as it was after pronouns, and gaining a vowel through schwa epenthesis; others are the long allomorph, having lost its initial consonant to surface as a homophonous intermediate form. The relative rate of intermediate form occurrence for post-NP had is thus in keeping with there being two derivational sources of this form.

  • The distribution of forms of had after pronouns is quite comparable to the distribution of forms of has after NPs. Namely, each surfaces with full forms appearing at roughly 1.5 times the rate of intermediate forms. In each case, intermediate forms are hypothesized to be the output of /h/-deletion on the long allomorph. Conversely, full forms are hypothesized to be the long allomorph having not undergone /h/-deletion (the process being variable). If /h/-deletion is indeed a phonetic process (Kaisse, 1985), we would expect it to apply at a consistent rate irrespective of an auxiliary's identity. The comparable rates of intermediate form appearance (that is, of /h/-deletion application) across these two different environments support this. In neither corpus is the ratio of full to intermediate forms found to be significantly different between postpronoun had and post-NP has (Switchboard: χ² = .424, p = .492; PNC: χ² < .001, p = .98), as we would expect if a process of /h/-deletion were applying at a consistent rate in each context to generate those intermediate forms.
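A comparison of this kind, between the full-to-intermediate ratios in two environments, amounts to a test of independence on a 2 × 2 table. The following is a minimal sketch of such a test, not the procedure actually used for the figures reported above: the function name is hypothetical, the counts in the usage lines are invented for illustration, and no continuity correction is applied, because the text does not say whether one was used.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Rows are environments (e.g., postpronoun had vs. post-NP has);
    columns are surface forms (full vs. intermediate).
    Assumes no row or column total is zero.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# Invented counts: identical 60:40 ratios in both environments give chi2 = 0.
print(chi_square_2x2(60, 40, 30, 20))
# Invented counts with clearly different ratios.
print(chi_square_2x2(10, 20, 20, 10))
```

A nonsignificant result here is what a uniform rate of /h/-deletion would lead us to expect, although it cannot by itself establish that the two rates are identical.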

Under the present analysis, intermediate forms surface where contracted forms are phonotactically impossible. Furthermore, if intermediate forms are simply the phonological exponent of short allomorphs, then, if two hosts are equivalent in all but their final consonant, we should expect a short allomorph to be inserted at the same rate after each of them, with those short allomorphs being realized as intermediate forms in the case where they cannot surface as contracted, and as contracted in the other environment. Put simply, intermediate and contracted forms should be in complementary distribution, surfacing at the same rate, when two hosts differ only in their phonology. Tests of this proposal follow.

  • The short allomorph of will, /l/, can surface with no phonological modification (i.e., as the contracted form) after vowel-final pronouns, but after the consonant-final pronoun it, the short allomorph would be phonotactically illicit in its contracted form. If schwa epenthesis is indeed a way of resolving this phonotactic incompatibility, we should expect to see intermediate forms of will appearing after it at a comparable rate to contracted forms of will after vowel-final personal pronouns (e.g., he, she, I). This is borne out in both corpora, as shown in Figures 5A and 5B. Contracted forms of will are prevalent after vowel-final pronouns; after it, it is intermediate forms that are prevalent.Footnote 18

  • The short allomorph of has, /z/, can surface as the contracted form after all NPs except those that end in a sibilant. After a sibilant, only two realizations of has are acceptable: the intermediate form and the full form. The full form, under the present analysis, is the faithful surface reflex of the long allomorph, whereas the postsibilant intermediate form has two sources: (i) the short allomorph, having gained a vowel via schwa epenthesis in order to surface; and (ii) the long allomorph, having lost its initial consonant via /h/-deletion. Once again, all other things being equal, we expect the short allomorph to be inserted at the same rate in each environment: that is, after sibilant-final NPs as well as after NPs that end in other segments that do not necessitate schwa epenthesis. Likewise, /h/-deletion, as a phonetic process, is predicted to apply at the same rate in each environment. From this follows the prediction that, for has, the rate of intermediate forms after sibilant-final NPs will equal the rate of intermediate forms plus the rate of contracted forms after non-sibilant-final NPs. This is clearly borne out in Figure 6 (χ² = .03, p = .863). (There is not enough data to permit replication of this finding in the PNC.)
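The prediction for has can be made explicit with a small calculation. The sketch below assumes, as the text does, that the short allomorph is inserted at one rate and that /h/-deletion applies at one rate regardless of the host's final segment. The function name and the two rates passed in are hypothetical; the value .5 for short allomorph insertion is invented purely for illustration.

```python
def has_distribution(p_short, p_h_del, after_sibilant):
    """Expected proportions of surface forms of has after an NP subject.

    p_short: assumed rate of short-allomorph insertion (same in both environments).
    p_h_del: assumed rate of /h/-deletion on the long allomorph.
    """
    p_long = 1 - p_short
    interm_from_long = p_long * p_h_del      # long allomorph, /h/ deleted
    full = p_long * (1 - p_h_del)            # long allomorph, /h/ retained
    if after_sibilant:
        # The short allomorph cannot surface as is, so it gains a schwa.
        return {"full": full,
                "intermediate": p_short + interm_from_long,
                "contracted": 0.0}
    return {"full": full,
            "intermediate": interm_from_long,
            "contracted": p_short}


sib = has_distribution(0.5, 0.37, after_sibilant=True)
non = has_distribution(0.5, 0.37, after_sibilant=False)
# Predicted identity: interm(sibilant) = interm(non-sibilant) + contracted(non-sibilant)
print(sib["intermediate"], non["intermediate"] + non["contracted"])
```

The identity holds for any choice of the two rates, which is why the comparison in Figure 6 can serve as a test of the analysis rather than of particular parameter values.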

Figure 5. Distribution of forms of will after pronoun subjects, in the Switchboard (A) and Philadelphia Neighborhood (B) corpora.

Figure 6. Distribution of forms of has after NP subjects in the Switchboard corpus.

Patterning of forms by length of NP subject

Further support for the analysis presented here comes from the distribution of forms after subjects of varying length. The hypothesis under examination is that an auxiliary's short allomorph is less likely after longer subjects. (Evidence that this hypothesis is on the right track comes from the clear difference in short allomorph insertion rate between pronoun and NP subjects, found in this study as well as many others, beginning with Labov, 1969.) However, because we cannot directly see the rate of insertion of the short allomorph, we have to extrapolate it from the rate of occurrence of the corresponding surface forms. This can be done by way of the correspondences laid out in Figure 4.

This investigation was carried out for the auxiliaries has, have, is, and will, after NP (i.e., nonpronoun) subjects only. The remaining two auxiliaries, had and would, surface at such a high rate of full forms (the long allomorphs) that there are not enough tokens of the short allomorph to merit study.

Figure 7 opposes, for each of these four auxiliaries, the hypothesized surface manifestation of its short allomorph to the hypothesized surface manifestation(s) of its long allomorph. For instance, of the three attested surface forms of the auxiliary has, the contracted one is hypothesized to derive from the underlying short allomorph, whereas the full and intermediate ones are both proposed to derive from the long allomorph, given variable /h/-deletion (Figure 4). Accordingly, contracted forms have been opposed to full and intermediate forms for that auxiliary. Each point plotted in Figure 7 represents a single token, coded for the number of orthographic words in its subject.Footnote 19

Figure 7. Distribution of surface forms of four auxiliaries after NP subjects. Each point represents one token, coded for phonological shape (cont. = contracted, interm. = intermediate) and number of words in its subject. Smoothing line fit via generalized linear modeling. Values on the y-axis represent the fitted proportion of contraction for a given subject length. The choice of which forms are opposed to which differs by auxiliary for reasons explained in the text.

Three auxiliaries—has, is, and will—show a clear effect of subject length, with the surface reflex of each auxiliary's short allomorph tapering off in use as its subject increases in word count (p < .01 for each auxiliary based on regressions with predictors as detailed in note 17). This occurs regardless of whether an auxiliary's short allomorph surfaces as a contracted form (as it does for is and has) or as an intermediate form (as it does for will).

However, the plot for have, in which intermediate forms have been opposed to full, is a clear outlier in this set of four, with no tapering off of intermediate forms as subject length increases (p = .167). This is fully expected under the current analysis, which attributes intermediate forms of have to two sources. As shown in Figure 4, the short allomorph of have surfaces as intermediate after NPs; additionally, the long allomorph of have may also surface as intermediate, as its surface reflex is subject to /h/-deletion. Intermediate forms of have are thus of ambiguous origin, and there is no way to separate their two sources on the surface. As a result, they fail to show the same subject length effect, with a number of them—by hypothesis, those that were full underlyingly—continuing to surface after long subjects. This plot, then, can be taken as clear evidence of a fundamental difference between intermediate forms of have and those of will.
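The contrast between have and will can be illustrated with a toy model. Everything numerical in the sketch below is an assumption: the logistic curve for short allomorph insertion, its intercept and slope, and the function names are invented, and the /h/-deletion rate of .37 is simply the approximate figure reported later in the paper. The point is only to show how a second, length-insensitive source of intermediate forms changes the expected pattern.

```python
import math


def p_short(length, intercept=1.0, slope=-0.5):
    """Hypothetical probability of short-allomorph insertion by subject length (in words)."""
    return 1 / (1 + math.exp(-(intercept + slope * length)))


def p_intermediate_will(length):
    """Intermediate forms of will come only from the short allomorph."""
    return p_short(length)


def p_intermediate_have(length, p_h_del=0.37):
    """Intermediate forms of have come from the short allomorph
    or from the long allomorph with /h/-deletion."""
    s = p_short(length)
    return s + (1 - s) * p_h_del


for n_words in (1, 5, 10, 20):
    print(n_words,
          round(p_intermediate_will(n_words), 3),
          round(p_intermediate_have(n_words), 3))
```

Under these assumptions, intermediate forms of will approach zero after long subjects, whereas those of have decline more gently and level off at the /h/-deletion rate. The sketch thus predicts an attenuated length effect for have rather than none at all, which is compatible with the absence of a detectable effect in Figure 7.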

Another finding that can be gleaned from Figure 7 is that speakers' use of /h/-deletion does not taper off with increasing subject length: those intermediate forms of have that surface after long subjects are hypothesized to come from /h/-deletion, meaning that it must still be operative. This hypothesis is corroborated by the patterning of full and intermediate forms of post-NP has, plotted in Figure 8. By hypothesis, intermediate forms of has after NPs are (exclusively) the long allomorph having undergone /h/-deletion. If /h/-deletion is not disfavored with increasing subject length, there should be no evidence of intermediate forms of has tapering off as subject length increases, and Figure 8 confirms this (p = .834). Again, this is in keeping with Kaisse's (1985) characterization of /h/-deletion as a phonetic process. We would expect it to be local in its conditioning, and an effect of subject length would be surprising.

Figure 8. Distribution of full and intermediate forms of has after NP subjects.

Finally, the different patterns displayed by intermediate forms of will and those of have offer confirmation that no phonetic rule of /w/-deletion exists to turn full forms of will into intermediate ones. If intermediate forms of will came from two sources, we should see them patterning like those of have. The fact that they do not, and that they instead pattern precisely like contracted forms of is and has, lends support to the proposal that these three surface forms are analogously represented.

ALTERNATIVE ANALYSES

The analysis of auxiliary realization presented in this paper proposes that surface intermediate forms of auxiliaries are derivative, generated via the application of phonetic and phonological processes to the form output by the morphology. Now, I consider alternatives to this analysis and show that they fail to provide sufficient explanation of the quantitative findings.

Alternative analysis: One underlying form per auxiliary

One conceivable alternative analysis is that each auxiliary is represented underlyingly by one form only, rather than an alternation between two forms. This would effectively reduce contraction to a cascade of phonological rules (initial consonant deletion, vowel deletion) deriving single-consonant forms from forms with all segmental material intact. But the finding that will and would fail to surface in their intermediate form after pronouns but do so after NPs would require complicated entailments to be set up between the required rules. For instance, when initial consonant deletion applies to postpronoun would, vowel deletion must categorically apply afterward (to account for the nonexistence of intermediate forms of would after pronouns [Figure 1]), but when initial consonant deletion applies to post-NP would, vowel deletion must never apply afterward (to account for the lack of contracted forms in this environment [Figure 2]). The subject length effect illustrated in Figure 7 is also less amenable to a phonological analysis, which would require the length effect to be localized in two places—in vowel deletion (to account for the patterns displayed by is and has) as well as in /w/-deletion (to account for the patterns displayed by will)—rather than simply in the insertion of the short allomorph, as in the analysis argued for here.

Alternative analysis: Three underlying forms per auxiliary

An additional alternative analysis is effectively the opposite of the one given in the previous section and is reminiscent of Ogden (1999). Each surface form has its source in an independently stored underlying form, such that contraction is an underlying three-way alternation. Under this approach, the differential behavior of intermediate forms by subject length (Figure 7) becomes an accident: there is no explanation for why intermediate forms of will pattern differently from those of have. Under the analysis argued for here, which allows for two ways of deriving intermediate forms, this finding has a principled explanation. The two auxiliaries' intermediate forms are generated in different ways. The distribution of intermediate forms given in Figures 5 and 6 also loses meaning under a three-form model, which has no explanation for why intermediate forms and contracted forms surface in complementary distribution, with intermediate forms appearing where contracted forms are illicit. Deriving intermediate and contracted forms from the same source (the short allomorph) in this case again eliminates the accidental nature of this finding.

Alternative analysis: Two underlying forms per auxiliary; intermediate forms from only one

A final alternative analysis is the “nonhybrid” version of the one put forth here. Specifically, this would treat each auxiliary as displaying a bipartite alternation underlyingly between short and long allomorphs but would derive intermediate forms from only one of those allomorphs. This again fails to provide a coherent explanation of the subject length facts. If intermediate forms are derived exclusively from long allomorphs for all auxiliaries, then intermediate forms of will would not be predicted to show the subject length effect that they do, because the length effect, based on data from is and has, is operative only on short allomorphs. Conversely, if intermediate forms are derived exclusively from short allomorphs for all auxiliaries, we are left to explain why intermediate forms of have do not show an effect of subject length. Again, the hybrid model of intermediate forms provides a simple explanation for the difference in patterning of intermediate forms of will and have.

In essence, alternative approaches are left to stipulate the sources of patterns that are easily accounted for by the analysis proposed here.

CONCLUSIONS AND EXTENSIONS

This paper has provided a novel analysis of auxiliary realization in English, one that is based on consideration of the phonological forms in which auxiliaries occur in spontaneous speech and the distribution of those forms. I have developed a model that treats contraction—that is, the surface alternation in auxiliary shape—as the output of two stages of processes. Contraction has its source in an alternation between allomorphs, but later phonological and phonetic processes may obscure the shape of these allomorphs, such that an underlying bipartite distinction becomes tripartite on the surface.

This analysis makes sense of a number of distributional patterns observed in the quantitative data that would otherwise have no clear explanation. The patterning of forms with regard to subject length, the nonexistence of intermediate forms of some auxiliaries in certain environments, and the rate of occurrence of intermediate forms in others all receive a principled explanation given a two-stage model of contraction under which intermediate forms have two derivational sources. The quantitative data, in turn, has allowed us to come down decisively in favor of one analysis in particular. As such, the present study serves as an important demonstration of the value of quantitative data for linguistic theory and vice versa. With both quantitative data and formal analysis, we are able to explain patterns of variation, rather than simply documenting them.

This study also serves as a reminder of the complexity of variable phenomena that implicate multiple levels of a grammatical derivation. Auxiliary contraction lends itself to morphosyntactic, morphological, and phonological analyses, as evidenced by the variety of approaches adopted in previous literature (see MacKenzie, 2012b, for a review). A model of the surface alternation that attributed variation to only one of those stages would fail to provide a satisfactory account of the patterns that we find in the corpus data. Modeling other such higher-level grammatical variables as exhibiting variation at multiple levels of the grammar (for instance, t/d-deletion: see Fruehwald, 2012) may similarly give us new insight into why they show the conditioning patterns that they do.

Additionally, data on auxiliary realization can speak to questions about the architecture of the grammar. For instance, the present analysis relies on a process of phonetic /h/-deletion to generate intermediate forms of some auxiliaries. A phonetic process that affects multiple lexical items like this speaks to questions of grammatical modularity. Does this process show lexical specificity that would contradict proposals that the phonetics operates independently of the morphology (e.g., Bermudez-Otero, 2010)? In fact, the rate of application of /h/-deletion is remarkably steady across auxiliaries, hovering around .37 in both corpora (Table 2).

Table 2. Rate of /h/-deletion in two environments and two corpora

Note: Intermediate forms are hypothesized to be the output of /h/-deletion applying to the long allomorph; full forms, the faithful surface manifestation of the long allomorph. Accordingly, /h/-deletion rate is calculated as the number of intermediate forms out of the combined number of full and intermediate forms.
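The calculation described in the note to Table 2 is simple enough to state as code. The function name below is hypothetical, and the counts in the usage line are invented for illustration; they are not the counts underlying Table 2.

```python
def h_deletion_rate(n_full, n_intermediate):
    """Rate of /h/-deletion among long-allomorph tokens.

    Intermediate forms count as applications of /h/-deletion and full forms
    as non-applications, so the rate is intermediate / (full + intermediate).
    """
    total = n_full + n_intermediate
    if total == 0:
        raise ValueError("no long-allomorph tokens to compute a rate from")
    return n_intermediate / total


# Invented counts: 63 full and 37 intermediate tokens.
print(h_deletion_rate(63, 37))
```

Note that this calculation is appropriate only where intermediate forms have a single source in the long allomorph, as with had after pronouns and has after NPs.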

Additional data on the rate of /h/-deletion in other, nonauxiliary environments, for instance as it affects the pronouns he and himself, will speak to this question further (and see MacKenzie & Yang, 2012, for preliminary results on this). An integrated approach to the study of variation, one that combines grammatical analysis and quantitative data (see also MacKenzie, 2012b), will thus provide new and important insights into the nature of variable phenomena.

Footnotes

1. The examples in (1) include the verb would, which is technically a modal, but I refer to these verbs as auxiliaries for convenience.

2. Numbers in parentheses are speaker identification numbers from the Switchboard corpus, which is described in greater detail in the Methodology section.

3. Intermediate forms are so named because they are phonologically in between an auxiliary's full and contracted forms.

4. As indicated previously, I am not investigating differences in auxiliary vowel quality in the present paper; accordingly, any token that retains its initial consonant and its vowel was coded as full regardless of its vowel quality.

5. There is no intermediate form of is distinct from its full form to be coded, hence the gap here.

6. Specifically, this was implemented as follows. All tokens matching a particular search query were identified in the corpus; they were shuffled using the random module in Python; and then a fixed number of tokens from the beginning of that shuffled set were selected and analyzed.

7. This method of collecting data was used to enable examination of social correlates of contraction, which will not be taken up in the present paper (but see MacKenzie, 2012b).

8. This fact is unsurprising given the nature of English phonotactics, but it should not be neglected in a precise delimitation of the scope of variation.

9. Expletive, wh-word, demonstrative pronoun, and quantifier subjects are not included in either group. Though these elements are pronounlike, they were set aside to keep from complicating the analysis. See MacKenzie (2012b) for discussion of these environments.

10. This glosses over the fact that had and would are found to surface—on occasion—in what appears to be their contracted form after it, accompanied by consonant cluster simplification: so, [ɪd] for it'd. As an additional variable process affecting auxiliary realization, this deserves further examination, but it cannot be addressed in the present study. To keep things simple, I omit this environment for these auxiliaries.

11. This is not the only way of representing the short allomorph; see note 12 for discussion.

12. There is another way of analyzing the single consonant–syllabic consonant alternation seen in English past and present morphology. Bloomfield (1933) and Borowsky (1986, 1987) treat these suffixes as syllabic underlyingly, as /-əz/ and /-əd/, with a deletion process operating to remove the schwa where applicable. Such an analysis would work equally well for contraction; likewise, most investigators who discuss the past and plural alternations acknowledge that either analysis would be feasible.

13. This fact constitutes important evidence that contraction is sensitive to syntactic structure. The range of phonological forms after embedded pronouns is restricted in a way that it is not after nonembedded pronouns. Contraction is thus affected by more than just surface strings.

14. Figure 3 allows no way of generating intermediate forms of will after vowel-final pronouns, yet Figure 1A reveals that this combination does occur in Switchboard, although extremely infrequently. The precise count is 3 of 427 times. Of these three tokens, one sounds slightly as if there may be something [w]-like; this requires closer acoustic inspection. The other two are uttered with some hesitation; it may be that the speaker is drawing out the form so that it sounds like two syllables. No intermediate forms of will were found after vowel-final pronouns in the Philadelphia Neighborhood Corpus (N = 539).

15. One alternative explanation that could be put forth to account for the lack of intermediate forms of would is one that implicates homophony avoidance: intermediate forms of would are disfavored because they would be homophonous with intermediate forms of had. But this is contradicted by the fact that homophonous contracted forms of the two do nonetheless surface. Additionally, because would is by necessity followed by an infinitive, whereas had is by necessity followed by a past participle, the context surrounding these auxiliaries consistently disambiguates them, with the rare exception of verbs that display homophony between the infinitive and the past participle (e.g., cost, hit).

16. This is confirmed by separate mixed-effects logistic regression analyses, one per corpus, with fixed effects of speaker sex, subject pronoun, and auxiliary identity, as well as random effects of speaker identity and following word. With contracted form appearance as the dependent variable and would as the reference level for the factor auxiliary identity, positive coefficients are returned for aux = has, have, is, and will and negative coefficients for aux = had (p ≤ .002 in all cases).

17. This is confirmed by separate mixed-effects logistic regression analyses, one per corpus, with fixed effects of speaker sex, subject length in words, preceding consonant versus vowel, grammatical class of preceding word, and auxiliary identity, as well as random effects of speaker identity, preceding word, and following word. With rate of intermediate form occurrence as the dependent variable and would as the reference level for the factor auxiliary identity, positive coefficients are returned for aux = had, have, and will (p ≤ .002 in all cases).

18. In Switchboard, the ratio of full forms to contracted forms after vowel-final pronouns does differ significantly from the ratio of full forms to intermediate forms after it: χ² = 7.534, p = .006. However, an effect of pronoun identity on contraction has been found for other auxiliaries (see Krug, 1998). So, all other things being equal, intermediate forms of will after it should surface at the same rate as contracted forms of will after vowel-final pronouns, but all other things are not necessarily equal, because pronoun identity may be having an effect that is impossible to disentangle from phonology. That being said, the rate of short allomorph insertion is still comparable across the two environments, and perhaps the rates will equalize with additional tokens of it will. Additionally, the analogous comparison with the PNC data (Figure 5B) does not find a significant difference between the two environments: χ² = .006, p = .938.

19. See MacKenzie (2012b) for tests of other measures of subject length, including counts of syllables, prosodic words, and syntactic nodes. Orthographic word count was found in that work to be the strongest predictor of contraction out of all measures of subject length tested; accordingly, I use it here.
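The sampling procedure described in note 6 (shuffle all matching tokens with Python's random module, then take a fixed number from the front of the shuffled set) might be sketched as follows. The function name, the optional seed, and the token labels in the usage lines are assumptions added for illustration; the note does not say whether a seed was fixed.

```python
import random


def sample_tokens(tokens, n, seed=None):
    """Shuffle all matching tokens and keep the first n of the shuffled set.

    seed is a hypothetical addition that makes the draw reproducible.
    """
    rng = random.Random(seed)
    shuffled = list(tokens)      # copy, so the caller's list is left untouched
    rng.shuffle(shuffled)
    return shuffled[:n]


# Invented token identifiers standing in for corpus search results.
matches = [f"token_{i}" for i in range(1000)]
subset = sample_tokens(matches, 50, seed=1)
print(len(subset))
```

Taking the first n items of a uniformly shuffled list yields a simple random sample without replacement, equivalent in distribution to a call to random.sample.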

REFERENCES

Anderson, Stephen R. (1973). Remarks on the phonology of English inflection. Language and Literature 1:33–52.
Anderson, Stephen R. (2008). English reduced auxiliaries really are simple clitics. Lingue e Linguaggio 7:169–186.
Baković, Eric. (2005). Antigemination, assimilation and the determination of identity. Phonology 22:279–315.
Benus, Stefan, Smorodinsky, Iris, & Gafos, Adamantios. (2004). Gestural coordination and the distribution of English geminates. Penn Working Papers in Linguistics 10:33–46.
Bermudez-Otero, Ricardo. (2010). Morphologically conditioned phonetics? Not proven. Paper presented at On Linguistic Interfaces II; University of Ulster, December 2–4, 2010.
Bloomfield, Leonard. (1933). Language. Chicago: University of Chicago Press, 1984.
Borowsky, Toni. (1986). Topics in the lexical phonology of English. Ph.D. dissertation, University of Massachusetts, Amherst.
Borowsky, Toni. (1987). Antigemination in English phonology. Linguistic Inquiry 18:671–678.
Bresnan, Joan. (1975). Comparative deletion and constraints on transformations. Linguistic Analysis 1:25–74.
Close, Joanne. (2004). English auxiliaries: A syntactic study of contraction and variation. Ph.D. dissertation, University of York.
Embick, David. (2010). Localism versus globalism in morphology and phonology. Cambridge: The MIT Press.
Fromkin, Victoria A., ed. (2000). Linguistics: An introduction to linguistic theory. Malden: Blackwell.
Fruehwald, Josef. (2012). Redevelopment of a morphological class. In Fruehwald, J. (ed.), University of Pennsylvania Working Papers in Linguistics 18(1):77–86.
Godfrey, John J., Holliman, Edward C., & McDaniel, Jane. (1992). SWITCHBOARD: Telephone speech corpus for research and development. Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing 1:517–520.
Guy, Gregory R. (1991). Explanation in variable phonology: An exponential model of morphological constraints. Language Variation and Change 3:1–22.
Inkelas, Sharon, & Zec, Draga. (1993). Auxiliary reduction without empty categories: A prosodic account. In Moore, C. & Bradlow, A. (eds.), Working Papers of the Cornell Phonetics Laboratory 8:205–253.
Kaisse, Ellen M. (1983). The syntax of auxiliary reduction in English. Language 59:93–122.
Kaisse, Ellen M. (1985). Connected speech: The interaction of syntax and phonology. New York: Academic Press.
King, Harold V. (1970). On blocking the rules for contraction in English. Linguistic Inquiry 1:134–136.
Krug, Manfred. (1998). String frequency: A cognitive motivating factor in coalescence, language processing, and linguistic change. Journal of English Linguistics 26:286–320.
Labov, William. (1969). Contraction, deletion, and inherent variability of the English copula. Language 45:715–762.
Labov, William, & Rosenfelder, Ingrid. (2011). The Philadelphia Neighborhood Corpus.
MacKenzie, Laurel. (2012a). English auxiliary contraction as a two-stage process: Evidence from corpus data. In Choi, J., Hogue, E. A., Punske, J., Tat, D., Schertz, J., & Trueman, A. (eds.), Proceedings of WCCFL 29. Somerville: Cascadilla Press. 152–160.
MacKenzie, Laurel. (2012b). Locating variation above the phonology. Ph.D. dissertation, University of Pennsylvania.
MacKenzie, Laurel. (forthcoming). Locating linguistic variation: A case study of English auxiliary contraction. In LaCara, N., Fainleib, L., & Park, Y. (eds.), Proceedings of NELS 41.
MacKenzie, Laurel, & Yang, Charles. (2012). English auxiliary realization and the independence of morphology and phonetics. Paper presented at NWAV 41, Indiana University, October 25–28, 2012.
McElhinny, Bonnie S. (1993). Copula and auxiliary contraction in the speech of White Americans. American Speech 68:371–399.
Ogden, Richard. (1999). A declarative account of strong and weak auxiliaries in English. Phonology 16:55–92.
Pinker, Steven, & Prince, Alan. (1988). On language and connectionism: Analysis of a parallel distributed processing model of language acquisition. Cognition 28:73–193.
Tagliamonte, Sali, & Smith, Jennifer. (2002). “Either it isn't or it's not”: Neg/aux contraction in British dialects. English World-Wide 23:251–281.
Walker, James A., & Meechan, Marjory. (1999). The decreolization of Canadian English: Copula contraction and prosody. In Jensen, J. & Van Herk, G. (eds.), Actes du Congrès annuel de l'Association canadienne de linguistique 1998/Proceedings of the 1998 Annual Conference of the Canadian Linguistic Association. 431–441.
Wilder, Chris. (1997). English finite auxiliaries in syntax and phonology. In Black, J. R. & Motapanyane, V. (eds.), Clitics, pronouns and movement. Philadelphia: John Benjamins Publishing Co. 321–362.
Yaeger-Dror, Malcah, Hall-Lew, Lauren, & Deckert, Sharon. (2002). It's not or isn't it? Using large corpora to determine the influences on contraction strategies. Language Variation and Change 14:79–118.
Yip, Moira. (1988). The obligatory contour principle and phonological rules: A loss of identity. Linguistic Inquiry 19:65–100.
Zwicky, Arnold M. (1970). Auxiliary reduction in English. Linguistic Inquiry 1:323–336.

Figure 1. Distribution of forms after pronoun subjects, in the Switchboard (A) and Philadelphia Neighborhood (B) corpora. Pronoun subjects were defined as detailed in the text.

Figure 2. Distribution of forms after NP subjects, in the Switchboard (A) and Philadelphia Neighborhood (B) corpora.

Table 1. Long and short allomorphs for six auxiliaries

Figure 3. Sources of surface forms after personal pronoun subjects.

Figure 4. Sources of surface forms after noun phrase subjects.

Figure 5. Distribution of forms of will after pronoun subjects, in the Switchboard (A) and Philadelphia Neighborhood (B) corpora.

Figure 6. Distribution of forms of has after NP subjects in the Switchboard corpus.

Figure 7. Distribution of surface forms of four auxiliaries after NP subjects. Each point represents one token, coded for phonological shape (cont. = contracted, interm. = intermediate) and number of words in its subject. Smoothing line fit via generalized linear modeling. Values on the y-axis represent the fitted proportion of contraction for a given subject length. The choice of which forms are opposed to which differs by auxiliary for reasons explained in the text.

Figure 8. Distribution of full and intermediate forms of has after NP subjects.

Table 2. Rate of /h/-deletion in two environments and two corpora