1. INTRODUCTION
The future temporal reference (FTR) sector has long been of interest to scholars as an area in which competing forms exist to express the same meaning – i.e. to refer to an event which has not yet taken place. The main competing forms in modern French are the inflected future (IF), e.g. je chanterai, and the periphrastic future (PF), e.g. je vais chanter. The present tense is also used with future temporal reference (usually with a disambiguating temporal adverb), e.g. demain je chante. Many studies have explored the meaning of the forms, frequently attempting to uncover a unique meaning or function for each (e.g. Imbs, 1968; Vet, 1980; Wales, 1983, 2014; Confais, 1995; Abouda and Skrovec, 2015, inter alia). Others seek to examine the evidence for change in this sector (e.g. Blondeau, 2006; Poplack and Dion, 2009; Wagner and Sankoff, 2011; Poplack, Lealess and Dion, 2013, inter alia), with the periphrastic future argued to be in the process of replacing the inflected future (Söll, 1983; Blondeau, 2006; Poplack and Dion, 2009; Wagner and Sankoff, 2011).
Most studies tend to look at patterns for all verbs together. There are indications, however, that not all verbs behave in the same way: individual preferences for some verbs may be quite different from overall patterns. While it is possible to take this into account in statistical modelling, the extent to which some verbs may differ in their behaviour merits further attention. In particular, avoir and être are often mentioned as behaving quite differently from the majority of verbs (see e.g. Söll, 1983; Sundell, 1991; Poplack and Turpin, 1999 and discussion in Section 2.2). These two verbs, two of the most frequent in the French language, also exhibit exceptional behaviour in other domains of morphosyntax: for example, in relation to past tenses and the persistence of the passé simple (see below). This study investigates future temporal reference with avoir and être to establish how they differ from other verbs with respect to FTR, to attempt to understand what might lie behind this exceptionality, and to shed light on their behaviour in other domains of morphosyntax.
2. CONTEXT AND VARIANTS
The French inflected future, e.g. je chanterai, evolved from a Vulgar Latin periphrastic form consisting of an infinitive followed by an inflected form of habere (‘to have’), e.g. cantare habeo. It can be found in some of the earliest texts of French (such as the Serments de Strasbourg of 842 AD; see Comeau 2015: 345 for examples and discussion). This inflected future has co-existed for centuries alongside a periphrastic form constructed with aller + infinitive (e.g. je vais chanter). Fleischman (1982: 82) notes that the temporal meaning of PF dates from around the thirteenth or fourteenth century, becoming generalized in colloquial speech in the fifteenth century, and admitted to polite conversation and literary discourse during the sixteenth and seventeenth centuries.
In grammars and many scholarly studies of French, the approach has typically been to seek a unique meaning for each competing form – or to look for ‘form-function symmetry’ (Poplack and Dion, 2009). In grammars, a common and enduring argument has been that the periphrastic future is used for events which are temporally closer, and the inflected future for events that are more distant – hence the term ‘futur proche’ or ‘futur prochain’ (a term first used by the Abbé Antonini in his grammar of 1753). In fact, a huge range of meanings has been suggested in grammars for each form, with much overlap; a meaning proposed for the PF in one volume may be assigned to IF in another (see Poplack and Dion 2009 for a thorough survey of comments in grammars on FTR). In some key studies of FTR in French, the difference between IF and PF has been framed in terms of the ‘visée prospective’ that the PF offers: while the IF is said to be detached from the moment of speaking, the PF retains a connection to the speaker’s present (see, for example, Blanche-Benveniste et al., 1990; Fleischman, 1982; Jeanjean, 1988; also Abouda and Skrovec, 2017 for a recent fine-grained semantic analysis of these forms).
Work on FTR in the variationist paradigm takes a different approach. The competing forms are viewed as variants of the same underlying variable, and the realisation of one or other form in a given context is affected by a number of linguistic and extra-linguistic factors. D. Sankoff (1988: 153–156) argues that while it may be possible for a linguist to examine closely particular examples and detect some nuance of meaning which distinguishes them, in many contexts, these nuances are not relevant either in the intentions of the speaker, or in the interpretation of the listener. Thus, the forms are argued to be semantically equivalent in these contexts – i.e. differences in referential value are neutralized (see Poplack and Dion, 2009: 569).
Establishing equivalence of meaning is unproblematic at the level of phonology, since variants are not meaning-bearing. But where morphosyntactic (or lexical, or discourse-pragmatic) variants are concerned, meaning is necessarily involved, and the difficulty is in establishing that they are synonymous in every context. This issue gave rise to fierce debate in sociolinguistics from the late 1970s onwards, which will not be rehearsed here (cf. Blanche-Benveniste, 1977; Lavandera, 1978; Romaine, 1984; García, 1985; Cheshire, 1987; Winford, 1996; Gadet, 1997; see also Coveney, 2007). Since this time, the number of studies looking at variation ‘above and beyond the level of phonology’ (Sankoff, 1980) has grown significantly. It is now generally accepted that sociolinguistic variables can exist at any level of language, and that, with proper circumscribing of the variable context (see below), and careful examination of the variants in the contexts in which they occur, the study of them provides vital information about the ways in which variation at different levels of language is structured.
Tagliamonte (2012) notes that another way of establishing that variants are equivalents is to look at whether they are in complementary distribution: that is, where more of one is found, proportionally less of the other is found. This is demonstrably the case with the future variants (see below for details of quantitative studies). What is of interest, then, is to uncover the structured patterns of variation in usage of the competing forms, and to understand the factors that influence which variant is realised in a given context. Though variation does not always imply change, it is a necessary precondition for change to be possible (Weinreich, Labov and Herzog, 1968: 188). Hence, in this sector of long-standing variation, many studies have also been interested in establishing whether there is change taking place.
2.1. Previous studies
Quantitative studies of FTR in the French of France include Kahn (1954), which looks at spoken Parisian French and reports overall proportions of 72.3% IF and 27.3% PF. In another study of spoken Parisian French, François (1974) reports a slightly more elevated use of PF of 38.3% (with IF at 61.7%). Lorenz (1989) examines the same variety, reporting 57.1% IF and 42.9% PF.Footnote 1 Gougenheim et al. (1964) report 64.5% IF and 35.5% PF in a study which uses data obtained from speakers from a variety of locations (including a small number from outside France). Jeanjean’s (1988) results concern the spoken French of Aix-en-Provence, where proportions of 57.8% IF and 42.2% PF are found. While these studies can give an idea of variation in the FTR sector in the French of France, and a sense of whether change might be happening over time (note the evolution of Parisian usage observable from 1954 [72.3% IF, Kahn] to 1974 [61.7% IF, François] to 1989 [57.1% IF, Lorenz]), a degree of caution is necessary when making comparisons. These studies take very different methodological approaches (in particular, it is not always clear whether modal uses of FTR variants are excluded), and the corpora used are also relatively small and unrepresentative (for example, only 101 tokens in total in Kahn, and 47 in François).
More recent work has addressed the question of change in FTR using larger oral corpora. Abouda and Skrovec’s (2015) study uses data drawn from the ESLO (Enquêtes Sociolinguistiques à Orléans) corpus of spoken French (see also Abouda and Skrovec, 2017). They establish that PF has indeed become more common over the forty-year period examined (with IF accounting for 58% of uses in the earlier period examined, but only 28% in the later period), but that it is not a straightforward case of PF taking over all functions of IF. Rather, each form is argued to have certain temporal and modal functions. The authors argue that while PF is taking over some of the modal functions of IF, the latter retains some semantic ‘niches’. For example, IF is favoured for uses termed générique by the authors, where the future ‘[…] présente le procès comme une prédication constante, caractérisant une classe d’individus’ (Abouda and Skrovec, 2015: 15). The authors stress that in their account this category is separate from typicalisation, though elsewhere the label générique may be used for both of these (Damourette and Pichon, 1911–1939), while still other accounts call such uses illustratif (Bres and Labeau, 2013). While examples are given, the precise definition of these categories remains somewhat opaque, and it is also worth noting that in many variationist studies, such uses would be excluded, as they do not refer to future time (being categorized instead as gnomic or habitual uses; see Section 3.2).
The first variationist study of the French of France is Roberts (2012), based on the Beeching corpus of spoken French from the 1980s/1990s (Beeching, 2002; with data from a variety of locations, including Brittany, Paris, and southern France). With overall proportions of 58.8% PF and 41.2% IF, Roberts’ study attests to the continued vitality of IF in this variety, despite the preference for PF overall (note the reversal of the proportions of PF/IF compared to the older studies described above). In another variationist study, Villeneuve and Comeau (2016) report on Vimeu French, a variety spoken in northern France (alongside Picard), with overall proportions of 62.2% PF and 37.8% IF.
Evidence from across the Atlantic strongly suggests that change is taking place in Canadian varieties. Synchronic studies of Canadian varieties spoken around the St Lawrence River, in Quebec and Ontario, have repeatedly shown that use of PF in spoken language far outweighs IF, to the extent that PF is now described as the default variant (Poplack and Turpin, 1999; Poplack, 2001; Poplack and Dion, 2009). For example, Emirkanian and Sankoff (1986) found proportions of 71% PF and 29% IF in Montréal.Footnote 2 Poplack and Turpin (1999) report similar proportions for Ottawa-Hull: 78.4% PF and 21.6% IF.Footnote 3 In Hawkesbury, Ontario, Grimm and Nadasdi (2011) found 86.5% PF and 13.5% IF based on data from 1978, rising to 89.5% PF and 10.5% IF in data from 2005 (Grimm, 2010).
Studies taking a diachronic perspective show that this preference is getting stronger over time. Poplack and Dion (2009) compare data from the Ottawa-Hull corpus and the nineteenth-century Récits du français québécois d’autrefois (Poplack and St-Amand 2007), and demonstrate that the proportion of PF used has increased from 61.3% in the nineteenth century to 78.4% in the twentieth century.Footnote 4 A note of caution is sounded by Wagner and Sankoff (2011), however. In their panel study of Montréal French, they uncovered an age-grading effect which seems to be slowing the pace of change in this variety: two-thirds of the speakers on the panel (n=60) actually increased their use of IF over time. Wagner and Sankoff view these results not as ‘vitiating an apparent time interpretation, [but as] indicating that the rate of change may be slightly overestimated if age grading acts in a retrograde direction’ (2011: 275). Blondeau (2006) reports similar results for a cohort of 12 Montréal speakers interviewed at three different times over a 25-year period.
In other Canadian varieties, however, the IF is still in widespread use. Indeed, Acadian varieties of French (those spoken in the Atlantic Provinces of New Brunswick, Newfoundland and Labrador, Prince Edward Island, and Nova Scotia) frequently show a strong preference for IF. For example, Comeau, King and LeBlanc (2016: 25) note that for the varieties spoken in Baie Sainte-Marie (Nova Scotia), L’Anse-à-Canards (Newfoundland), and the Iles de la Madeleine (Quebec), IF occurs at a rate of 38%, 24% and 39% respectively. King and Nadasdi (2003), examining usage in two communities on PEI, Abram-Village and Saint-Louis, and l’Anse-à-Canards, Newfoundland, found an overall preference for IF of 53%, leading them to conclude that claims of the decline of the inflected future ‘in Canadian French in general seem premature’ (2003: 332; emphasis in original).
2.2. Lexical differences
The studies discussed above present data on a range of verbs. A small number of studies discuss the distribution of future variants with individual verbs. In Söll’s (1983) analysis of alternation of IF and PF in child language, the overall distribution of forms was 34% IF (n=155) and 66% PF (n=299).Footnote 5 These figures are reversed for être, with 67% IF (n=24) and 33% PF (n=12). Söll attributes this strong preference for IF to the extremely high frequency of être and its use as an auxiliary, two factors which, he argues, are related to its meaning. Avoir showed no strong preference either way in Söll’s study, which raises the question of why not, if the explanation for this pattern lies in high frequency and use as an auxiliary. Söll does not interrogate this further, and it must be noted that the observation lacks explanatory power.
Sundell (1991) is an investigation of written French (the corpus consists of 50 contemporary novels), thus not directly comparable to many of the studies here. Nevertheless, it has some intriguing findings in relation to individual verbs: in particular, Sundell uncovers a strong preference for IF with both être and avoir. The overall distribution of variants in Sundell’s study was 70% IF (n=4362) and 30% PF (n=1914). Sundell separates his tokens into three different types: ‘Futurs non déterminés’, ‘Complément de temps’ and ‘Négation’. He focuses particularly on the effect of grammatical person on the distribution of forms. Overall, he observes a much higher frequency of IF for futures within the ‘Complément de temps’ and ‘Négation’ categories (85% and 83% IF respectively), but the figures for grammatical person follow more or less the same lines: Sundell observes that third person singular is generally found with a much higher rate of IF. This is particularly the case for avoir and être, where IF accounts for 90% and 82% of third person singular tokens respectively.
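The percentages reported for Söll (1983) and Sundell (1991) follow directly from the token counts quoted above. As a minimal sketch (the function name and dictionary layout are illustrative choices, not part of either study), the rounding can be checked as follows:

```python
def percentages(counts):
    """Return each variant's share of the total, rounded to a whole percent."""
    total = sum(counts.values())
    return {variant: round(100 * n / total) for variant, n in counts.items()}

# Token counts as quoted in the text above.
soll_all_verbs = {"IF": 155, "PF": 299}
soll_etre = {"IF": 24, "PF": 12}
sundell_all_verbs = {"IF": 4362, "PF": 1914}

for label, counts in [
    ("Söll, all verbs", soll_all_verbs),
    ("Söll, être", soll_etre),
    ("Sundell, all verbs", sundell_all_verbs),
]:
    print(label, percentages(counts))
```

Working from counts rather than percentages also makes the very different sample sizes visible: the être figures rest on 36 tokens, against several thousand in Sundell’s written corpus.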
Sundell breaks this down further by examining the nature of the subject in tokens with third person singular. Impersonal il has the highest rate of IF use, 77% – while on has the highest rate of PF use, 50% (overall figures for third person singular are 68% IF and 32% PF). For impersonal il, Sundell finds that in fact PF is extremely rare with anything other than falloir (‘il va falloir’) and weather verbs. The three forms which account for the largest number of IF forms are il y aura, il faudra and il sera (accounting for almost 90% of IF forms with impersonal il between them). Sundell’s data point to the existence of certain ‘formulaic’ uses, whereby one variant is vastly more likely than the other by virtue of the ‘pre-formed’ nature of these sequences (cf. Coveney, 1996, in relation to ‘pre-formed sequences’ in negation). Third person forms are among the most frequent, and are found with much higher rates of IF overall, and especially avoir and être. Certain forms account for very high proportions of these IF tokens: e.g. il y aura, il faudra and il sera in the case of IF with impersonal il.
Similarly, Poplack and Turpin observe that a small set of ‘highly frequent and morphologically-irregular verbs’ (1999: 155–156) including vouloir, pouvoir, savoir, revenir, être, avoir show an association with IF (albeit not very strong) in their data, and that these are the same verbs which are ‘associated with the (otherwise non-productive) use of subjunctive morphology in the same corpus’ (1999: 156). Roberts (2012: 105) also observes that ‘a number of verbs appear to be selected almost categorically with the IF (e.g., avoir ‘to have’ 80%, n=31) and the PF (e.g., travailler ‘to work’: 88%, n=7)’.Footnote 6
Though these studies note certain preferences for individual verbs, including a strong preference for IF with both avoir and être (perhaps more pronounced for the latter), in-depth analysis of what might underlie these patterns is lacking.Footnote 7 This study seeks to address this gap by focusing specifically on avoir and être, using data drawn from the ESLO corpus of spoken French. It seeks to address the following questions:
Are the patterns of usage of the main FTR variants with avoir and être the same as those found in other studies which average data across all verbs?
If not, how are the patterns different?
Do the factors identified as significant in other studies have the same effect?
If not, how is it different?
What could explain any exceptional behaviour of these two verbs?
Can this shed any light on the morphosyntactic behaviour of avoir and être in other domains?
3. METHODOLOGY
3.1. Corpus
The ESLO corpus of spoken French is composed of interviews recorded in Orléans in two periods: between 1968 and 1974 (ESLO1), and between 2008 and 2014 (ESLO2).Footnote 8 We henceforth refer to these as the 1968 corpus and the 2008 corpus. The details of the two sub-corpora are presented in Table 1. The search interface allows the selection of different types of situation (interviews, family meals, shop interactions, debates, etc.); for the purposes of comparability, only interviews (entretiens) were selected for this study.
Table 1. ESLO corpus details

3.2. Circumscribing the variable context
As in other variationist studies of FTR in French (e.g. Blondeau, 2006; Comeau, 2015; Poplack and Turpin, 1999; Poplack and Dion, 2009; Roberts, 2012, 2016; Villeneuve and Comeau, 2016; Wagner and Sankoff, 2011), this study considers variation in the forms used to express future meaning; it does not examine the totality of meanings that can be conveyed using morphologically future forms. Edmonds et al. (2017) advocate a concept-oriented approach to analysing FTR, in order to capture all verb forms used to refer to future time, arguing that such an approach avoids a priori assumptions based on morphological form. While such an approach is clearly valuable, in this study, to allow comparability with other variationist work on FTR, we focus on the main variants used to express future time, i.e. the inflected and the periphrastic future.
A small number of previous studies also include the Futurate Present (FP), but this form does not feature in the analysis here, for two main reasons. First, it is marginal in quantitative terms (other studies have found it accounts for less than 10% of tokens referring to future time; e.g. Poplack and Turpin 1999 – 7%; Poplack and Dion 2009 – 9%; though see Tremblay et al. 2019 for a discussion of FP in text messages). Second, it seems to be relatively stable over time. As Comeau and Villeneuve note, ‘if there is ongoing linguistic change in how French expresses FTR, it primarily affects the IF and PF’ (2016: 234). In addition, as noted by Blondeau (2006: 74), FP appears virtually categorically with an accompanying future adverbial (one of the factor groups considered here), which serves to disambiguate the usage of a present tense form referring to future time.
Searches of the ESLO (sub) corpora were conducted for all forms of avoir and être in the inflected future and periphrastic future (taking account of any material which may intervene between auxiliary and infinitive, e.g. il va en avoir). Since the number of tokens in the third person singular was far greater than for any other grammatical person/number, with a total of 1,190 tokens, the analysis concerns only data for this category.Footnote 9
The first step in circumscribing the variable context is to eliminate any tokens of IF and PF which do not refer to future time. These exclusions fall into a number of categories, illustrated in the examples below: habitual actions (1); spatial movement (2); ‘gnomic’ utterances, i.e. timeless truths (3) (Fleischman, 1982: 132 following Greenberg, 1978, as cited in Wagner and Sankoff, 2011: 280; see also Imbs, 1960: 47); fixed expressions/sayings (4); hypothetical utterances (5).Footnote 10 Other exclusions included tokens where the interviewer primed the speaker; mis-assigned tokens (error in transcription); fragments, repetitions and self-corrections; quoted speech; otherwise unclear tokens which could not be reliably categorized. After these exclusions, 649 tokens remained.
(1) il y a des jours où on travaillera beaucoup plus euh que d’habitude et puis il y a certains jours où euh ce sera très très calme très relax (1968-MK532)
(2) on fera construire par là mais là on va être ici un bon moment (1968-PY94)
(3) euh disons que dans une maison de jeunes où y a beaucoup de fils d’ouvriers y aura peu de fils de de bourgeois (1968-RF211)
(4) on dit toujours ça sera toujours Paris hein (2008-FJ30)
(5) parce que si on compte sur les élus sur la mairie sur les institutions y aura rien (2008-MF363)
A second step is to exclude any invariant contexts (i.e. which are found with 0% or 100% of one or other variant, and thus no variation). In many studies of FTR in Laurentian French,Footnote 11 negative contexts have been found to be categorical, in that only IF is selected in such contexts. In the present study, both IF and PF were found in negative contexts – i.e. there was no categorical effect – and negative contexts are therefore included in the analysis. Certain phrases with avoir, however, were revealed to be categorical (no such phrases were identified for être). Utterances which contained the phrase avoir besoin (‘to need’; n=16) occurred only with IF, in both sub-corpora. Utterances with the phrase avoir … ans (‘to be … years old’; n=25) occurred only with PF, with one exception. For other phrases (e.g. avoir du mal, ‘to have trouble’, avoir envie de ‘to want/wish to’, avoir lieu ‘to happen’), the number of tokens was too low to establish categoricity. Tokens were also coded for whether they occurred within the phrase il y a (‘there is/there are’), in its various permutations (e.g. il y aura, il va y avoir, with pronouns y or en etc.). No categorical pattern was discernible here, and behaviour with il y a did not seem exceptional (i.e. proportions of IF/PF were similar to those in corpus as a whole).Footnote 12 Contexts which did not admit variation (total n=41) were excluded from the subsequent analysis, leaving 608 tokens.
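The token arithmetic in this second step can be laid out explicitly. The following is a small sketch using only the counts stated above; the variable names are illustrative.

```python
# Tokens remaining after the first step (non-future uses removed).
tokens_after_first_step = 649

# Phrases treated as invariant contexts, with the counts given in the text.
invariant_contexts = {
    "avoir besoin": 16,   # occurred only with IF
    "avoir ... ans": 25,  # occurred only with PF, with one exception
}

excluded = sum(invariant_contexts.values())
tokens_for_analysis = tokens_after_first_step - excluded

print(f"Excluded as invariant: {excluded}")
print(f"Tokens entering the analysis: {tokens_for_analysis}")
```

The two phrase counts together account for the total of 41 excluded tokens, which leaves the 608 tokens analysed in Section 4.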
3.3. Coding
The next step is to code the data for factors of potential interest. One of the aims here was to establish whether any of the factors commonly investigated in connection with the topic have an unusual effect on distributional patterns with avoir and être. Factors identified as of interest in previous literature were therefore selected for testing here (see references below for each factor; findings vary across studies as to whether the factor was significant, but if it was identified as being of potential interest, it was included here). Can any differences in the effect of factors on these two verbs help to explain their exceptionality? The factors investigated and brief details of coding procedure are as follows:
Linguistic factors
Corpus (i.e. date of recording): tokens were coded as either 1968 or 2008. (See e.g. Poplack and Dion, 2009).
Verb: avoir or être. (See e.g. Söll, 1983; Sundell, 1991; Poplack and Turpin, 1999; Roberts, 2012).
Proximity: Coding for this factor was into five categories, for events occurring within the hour, day, week, longer than a week, or which were ‘continual’ – i.e. existed before the time of the utterance and would continue to exist in the future envisaged in the utterance. To avoid circularity and subjectivity, coding was done on the basis of adverbs or other time-indicating phrases in the immediate or wider context. (See e.g. Poplack and Turpin, 1999; Poplack and Dion, 2009; Grimm and Nadasdi, 2011; Roberts, 2012).
Sentential polarity: Tokens were coded as positive or negative. (See e.g. Emirkanian and Sankoff, 1986; Poplack and Turpin, 1999; Poplack and Dion, 2009; Wagner and Sankoff, 2011; Roberts, 2012, inter alia). Presence/absence of ne and the negative item (e.g. pas, plus, jamais etc.) were also noted.
Contingency: Events were coded as either ‘contingent’, where the future eventuality was dependent on some other condition being fulfilled, or ‘assumed’, where there was no such condition and the event was assumed to be valid. This is the approach used in Poplack and Turpin (1999: 153), which itself is based on Fleischman (1982). (See e.g. Poplack and Turpin, 1999; Grimm and Nadasdi, 2011).
Adverbial modification: Tokens were coded for adverbial modification in three categories: absence of adverbial specification; presence of non-specific adverbial; and presence of specific adverbial, for example ‘lundi prochain’ or ‘en février’. (See e.g. Emirkanian and Sankoff, 1986; Poplack and Turpin, 1999; Poplack and Dion, 2009; Grimm and Nadasdi, 2011).
Presence of quand ‘when’: Tokens in the present study were coded for presence or absence of quand. (See e.g. Emirkanian and Sankoff, 1986; Grimm and Nadasdi, 2011).
Presence of certain phrases: During initial inspection of the data, it was noted that a number of phrases occurred quite frequently, e.g. avoir … ans (‘to be … years old’) and avoir besoin (‘to need’). These and other regularly-occurring phrases were coded for, as there seemed to be tendencies for a particular phrase to appear with either IF or PF exclusively.
Sociolinguistic factors
Age: This was initially coded into seven categories (10–15 years old, 15–25, 25–35, 35–45, 45–55, 55–65, 65+, plus ‘Not stated’). These were subsequently collapsed into three main groups (15–35, 35–55, 55+) to correspond to major life stages of young adulthood, middle age, and older/retirement age, as in some cases an age band was empty or contained few speakers.Footnote 13 (See e.g. Emirkanian and Sankoff, 1986; Poplack and Turpin, 1999).
Gender: This was coded as either male or female, following the labels used in the original corpus. (See e.g. Villeneuve and Comeau, 2016).
Socio-economic status: The categorization schema for SES in the ESLO corpus (useful for fine-grained distinctions but less of a concern here) was simplified into two categories, Working Class (WC) and Middle Class (MC), on the basis of the INSEE professions et catégories socioprofessionnelles categorization schema.Footnote 14 (See e.g. Emirkanian and Sankoff, 1986).
Education level: The ESLO education levels were also simplified into a three-way distinction: Bac- (no qualifications beyond the end of compulsory schooling), Bac (baccalauréat), and Bac+ (university-level qualifications). (See e.g. Roberts, 2012; Villeneuve and Comeau, 2016).
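The coding scheme above can be pictured as one record per token, with one field per factor group. The sketch below is purely illustrative: the class, field names and value labels are assumptions made for exposition, not the format actually used in the study. The treatment of the 10–15 band and of ‘Not stated’, which the three collapsed groups do not cover, is likewise an assumption (they are left unassigned here).

```python
from dataclasses import dataclass
from typing import Optional

# Collapsing of the fine-grained age bands into the three groups named above.
# Assumption: bands outside the three groups ("10-15", "Not stated") map to None.
AGE_GROUPS = {
    "15-25": "15-35", "25-35": "15-35",
    "35-45": "35-55", "45-55": "35-55",
    "55-65": "55+", "65+": "55+",
}

def collapse_age(band: str) -> Optional[str]:
    """Return the collapsed age group, or None if the band is not covered."""
    return AGE_GROUPS.get(band)

@dataclass
class Token:
    """One coded occurrence of a future variant (hypothetical layout)."""
    variant: str      # "IF" or "PF"
    corpus: str       # "1968" or "2008"
    verb: str         # "avoir" or "être"
    proximity: str    # "hour", "day", "week", "longer", "continual"
    polarity: str     # "positive" or "negative"
    contingency: str  # "contingent" or "assumed"
    adverbial: str    # "none", "non-specific", "specific"
    quand: bool       # presence of quand
    age_band: str     # one of the fine-grained bands
    gender: str       # "male" or "female"
    ses: str          # "WC" or "MC"
    education: str    # "Bac-", "Bac", "Bac+"

# An invented token, for illustration only.
example = Token("IF", "2008", "être", "continual", "negative", "assumed",
                "none", False, "25-35", "female", "MC", "Bac+")
print(example.variant, collapse_age(example.age_band))
```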
In the next section, we first present a distributional analysis of the two variants in the corpus to examine the general tendencies and effect of the different factors, with chi-square tests for significance where appropriate. Second, we present the results of a multiple logistic regression which examines the relative weight of the different factors. Factors which were not significant or for which insufficient numbers of tokens were obtained are not discussed any further in the quantitative analysis. These were: presence of quand (insufficient tokens); contingency, age, gender, education level and socio-economic status (not significant). Social factors are discussed briefly in a qualitative analysis of tokens of PF found in negative contexts (end of section 4.3).
4. RESULTS
4.1. Overall
The overall distribution of IF and PF in the whole dataset reveals a marked preference for IF with avoir and être (69.1%; Table 2), in contrast to the findings of most previous studies looking at all verbs together, in which a preference for PF is found (with the exception of some Acadian varieties, as discussed above). The proportions seen here more closely resemble those reported in the older quantitative studies of spoken French discussed above (e.g. Kahn 1954; François 1974; Lorenz 1989; Gougenheim et al. 1964) and those on written French (e.g. Sundell 1991 as discussed above; see also Lesage and Gagnon 1992). Thus, the behaviour of these two verbs looks highly conservative when compared to overall patterns of IF/PF usage with all verbs.Footnote 15
Table 2. Overall distribution of future variants

When the overall data are split by verb (Figure 1), it is evident that avoir has a much stronger preference for IF than être, with almost 80% IF and 20% PF compared to around 60% IF and 40% PF for être.

Figure 1. Distribution of variants by verb.
4.2. Change
Given the conservative behaviour of avoir and être overall, it is all the more interesting to observe that there is nonetheless evidence of diachronic change taking place: the proportion of PF has increased in the later period, at the expense of IF. Figure 2 shows that the two verbs behave similarly in the earlier period, with both strongly favouring IF: avoir 87.5%, être 72.7%. In the later period, this effect has weakened, especially for être, for which an almost 50/50 distribution can be observed. Avoir, meanwhile, still strongly prefers IF (70.0%). These results were significant in a chi-square test (avoir: χ²=10.676, df=1, p=0.001; être: χ²=19.215, df=1, p<0.0001).

Figure 2. Distribution of future variants by verb and corpus.
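As a rough check on the chi-square results reported above, the p-values can be recovered from the test statistics alone. The sketch below assumes only the reported values (χ² and df=1); the function name is illustrative and is not part of the original analysis. It relies on the fact that, with one degree of freedom, the chi-square statistic is the square of a standard normal variable, so the upper-tail probability equals erfc(√(χ²/2)).

```python
import math


def chi2_p_value_df1(statistic: float) -> float:
    """Upper-tail p-value for a chi-square statistic with 1 degree of freedom.

    With df = 1 the statistic is the square of a standard normal variable,
    so P(X > x) = erfc(sqrt(x / 2)).
    """
    if statistic < 0:
        raise ValueError("chi-square statistic must be non-negative")
    return math.erfc(math.sqrt(statistic / 2.0))


# Test statistics as reported in the text (df = 1 for both verbs).
reported = {"avoir": 10.676, "être": 19.215}
p_values = {verb: chi2_p_value_df1(x) for verb, x in reported.items()}

for verb, p in p_values.items():
    print(f"{verb}: p = {p:.6f}")
```

Both values fall below the conventional 0.05 threshold, in line with the figures given in the text.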
4.3. Multivariate analyses
A mixed-effects multiple logistic regression analysis was carried out on these data using Rbrul (Johnson, Reference Johnson2009).Footnote 16 An advantage of this type of analysis is that it allows all factor groups to be included simultaneously in one statistical model and can give an indication of their relative strength. Mixed-effects (as opposed to fixed-effects) models can also take into account potential random effects such as speaker and word-level variation, and factors are only selected as statistically significant when their effect rises above inter-speaker variation (Johnson, Reference Johnson2009: 365). The results of such an analysis provide ‘three lines of evidence’ (Tagliamonte, Reference Tagliamonte2012: 122; Poplack and Tagliamonte, Reference Poplack and Tagliamonte2001: 92; Tagliamonte, Reference Tagliamonte, Chambers, Trudgill and Schilling-Estes2002: 731) with which to understand and explain the variation in question. These are: (i) statistical significance of the effect (at the p=0.05 level); (ii) magnitude of the effect, evident from the range between the highest and lowest factor weight in a factor group; and (iii) direction of effect, shown by the hierarchy of factor weights within a factor group (Poplack and Dion, Reference Poplack and Dion2009: 572).
A separate analysis was carried out for each verb and the results are shown in Table 3 (avoir) and Table 4 (être). The tables show the factor groups which were significant for each verb. IF is selected as the application value in order to allow comparability with previous studies. Regression coefficients are expressed as both a log-odds value and a factor weight (weighted probability). A positive log-odds value indicates that the factor favours the application value; a negative log-odds value indicates a disfavouring effect, while a value of 0 is neutral. The log-odds value gives an indication of the magnitude of the effect. A factor weight greater than 0.5 indicates that the factor favours the application value (i.e. IF), while a factor weight of less than 0.5 indicates that it disfavours it.
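The relationship between the two scales described above can be sketched as follows, on the assumption that a factor weight is the inverse logit of the corresponding log-odds value (so that a log-odds of 0 maps to a weight of 0.5). The function names and the example coefficients are hypothetical and purely illustrative; they are not taken from Table 3 or Table 4.

```python
import math


def log_odds_to_factor_weight(log_odds: float) -> float:
    """Map a log-odds coefficient onto the 0-1 factor-weight scale (inverse logit)."""
    return 1.0 / (1.0 + math.exp(-log_odds))


def factor_weight_to_log_odds(weight: float) -> float:
    """Map a factor weight (strictly between 0 and 1) back to log-odds (logit)."""
    if not 0.0 < weight < 1.0:
        raise ValueError("factor weight must lie strictly between 0 and 1")
    return math.log(weight / (1.0 - weight))


def factor_group_range(weights: list[float]) -> float:
    """Magnitude of effect: highest minus lowest factor weight in a factor group."""
    return max(weights) - min(weights)


# Hypothetical coefficients for a two-level factor group (NOT from the tables).
example_log_odds = {"level_a": 1.0, "level_b": -1.0}
example_weights = {k: log_odds_to_factor_weight(v) for k, v in example_log_odds.items()}

for level, weight in example_weights.items():
    direction = "favours" if weight > 0.5 else "disfavours"
    print(f"{level}: weight = {weight:.3f} ({direction} the application value)")
print(f"range = {factor_group_range(list(example_weights.values())):.3f}")
```

On this reading, the sign of the log-odds value and the position of the factor weight relative to 0.5 always agree on the direction of the effect, while the range summarizes its magnitude within a factor group.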
Table 3. Multivariate analysis results for avoir

Application value = IF
n = 228, df = 8, log likelihood = −104.199, overall proportion = 0.798, centred input probability = 0.897
Factors not selected as significant: Contingency; Adverbial modification; Age; Sex; SES; Education
Table 4. Multivariate analysis results for être

Application value = IF
n = 380, df = 10, log likelihood = −223.028, overall proportion = 0.626, centred input probability = 0.58
Factors not selected as significant: Contingency; Age; Sex; SES; Education
The multivariate analyses indicate that of all the linguistic factors examined, only three had a significant effect on the selection of future variants with avoir: sentential polarity (p=0.00157), corpus (i.e. date of recording; p=0.00208) and proximity (p=0.0377). For être, these three factors were also significant (proximity p=0.0000129; corpus p=0.0000248; polarity p=0.0298), as well as adverbial modification (p=0.0279). Of the social factors examined, none was significant for either verb – indicating a lack of social conditioning for this variable, which has also been the case in many previous studies (see for example those cited above), though a larger sample would clarify this result.
The results for ‘corpus’ confirm what we have already seen above: that these two verbs are participating in the more general change that is taking place, with PF becoming more common for both verbs (though for avoir, in the 2008 corpus, IF is still the majority variant).
The results for proximity indicate that the effect of this factor group was broadly similar for both verbs: for events longer than a week away and those which are ‘continual’, IF is preferred. Events which are set to occur within the day also prefer IF, though this effect is weaker, and in the case of être, very slight indeed (i.e. factor weight very close to 0.5). For both verbs, events within an hour and within a week disfavour IF – i.e. are more likely to be found with PF. However, due to the low token numbers in the hour, day and week categories (most acute for avoir), caution must be exercised when interpreting the ordering of the levels within this factor group. We can perhaps be most confident about the result for the ‘longer than week’ category, since this has the largest number of tokens. This factor group has the largest range for both verbs, and for être, the lowest p-value (though, since p-values are sensitive to sample size, we use them here with caution). These results are therefore broadly in line with previous studies where proximity was found to be a significant factor: events which are further away in time favour IF.
Adverbial modification was significant only for être. Contexts where there was a specific adverbial favoured IF; contexts where there was a non-specific adverb, or no adverbial modification, favoured PF, though for non-specific contexts this effect was very weak. This is broadly in line with previous studies, where the presence of an adverb (specific or non-specific) has been found to favour IF. It is somewhat difficult to explain why this factor should be significant for être only; it could possibly be an artefact of sample size (the number of tokens with être being larger).
Polarity was a significant factor for both verbs; more so for avoir, with a larger range and lower p-value. In negative contexts, IF was strongly favoured; in positive contexts, IF was less strongly favoured, though it is still the majority variant for both verbs (Figure 3 and Figure 4). Nevertheless, we do find some examples of PF in negative contexts (see below).

Figure 3. Distribution of variants with avoir in positive and negative contexts in the two corpora.

Figure 4. Distribution of variants with être in positive and negative contexts in the two corpora.
Given that in some previous studies PF is entirely absent from negative contexts, it is interesting to examine more closely the tokens of PF that we find in negative contexts in this study. There are eight tokens of PF which occur in negative contexts. The majority of these are with être, as shown in Table 5 and examples (6) to (13).
Table 5. Negative tokens with PF

Tokens of PF in negative contexts with être
(6) alors le CES c’est un enseignement court qui se qui conduit l’enfant à arrêter ses études ou entendez non c’est le CEG c’est je me suis trompée voulez-vous faire une correction alors je parle des C bon oui ça va ça va pas être très brillant le CEG euh conduit l’enfant euh à un enseignement court qui s’arrêtera à la fin de la scolarité (1968-IG298)
(7) vous savez ça ça ça ça vient de soi hein c’est toujours pareil alors moi j’ai jamais lu de livre comme ça ça peut être utile comme ça va pas être utile ça (1968-MH539)
(8) ça ça va pas être possible hein (2008-BI58)
(9) si on va sur un autre chantier on va être de trop on va se gêner ça va pas être utile (2008-BV1)
(10) alors le Tibet on va laisser tomber parce que je suis objective euh ça va pas être euh demain qu’on va pouvoir remettre les pieds au Tibet facilement (2008-JR18)
(11) [il] va continuer à faire des petits articles en tant que pigiste mais il ne va plus être euh rédacteur en chef (2008-UZ57)
Tokens of PF in negative contexts with avoir
(12) il faut faire un même procédé parce qu’on a une usine d’entretien qui est à la Source on va pas avoir un deuxième (2008-HV753)
(13) je j’ai su y a deux jours que en fait le transport euh on on va pas avoir de transports pendant les deux mois de vacances (2008-MX953FEM)
Following Poplack and Dion (Reference Poplack and Dion2009), we could view the tokens of PF in negative contexts as the residue of an incomplete change involving the embedding of IF as the only acceptable form in these contexts. However, this interpretation is not strongly supported by the evidence available for the French of France (see above), especially since there are actually more tokens of PF in negative contexts in the later corpus.
A qualitative analysis of the linguistic context of these examples, in an attempt to characterize the conditioning of such examples (cf. Poplack and Dion Reference Poplack and Dion2009: 575, fn13), does not reveal any strong patterns, except perhaps that most are found with an impersonal subject, either ‘ça’ (with être) or ‘on’ (with avoir) (cf. discussion of Sundell, Reference Sundell1991 above). The only personal subject is il in (11). The following context features a variety of different word types: mostly adjective, but also noun, adverb. Several are preceded by periphrastic futures with other verbs (e.g. (9), (10), (11)) which could prime for PF in the tokens examined. The negative item in the majority of cases is pas, and ne is usually not present (presence of ne has been shown in other studies to inhibit PF; Roberts, Reference Roberts2012). One token, (11), stands out as different from the others. This token has personal il as the subject, ne is present, and the negative item is ‘plus’. Despite these rather exceptional features, some of which mark a more formal context (IF having been associated with formal contexts – see e.g. Roberts, Reference Roberts2012), PF is still selected.
As far as the social characteristics of the speakers are concerned, amongst the speakers who produced a token with PF are speakers from all age groups, both sexes, all SES groups and all education levels (Table 6). They all produce IFs elsewhere (with avoir, être or other verbs), and are therefore not categorical PF users (cf. Wagner and Sankoff, Reference Wagner and Sankoff2011: 303). Thus it seems that these examples cannot be explained away as somehow ‘exceptional’ (cf. Wagner and Sankoff, Reference Wagner and Sankoff2011 – the only negative tokens with PF found in their study were hesitations or reformulations), and must be viewed as part of the normal workings of the variable grammar underlying alternation between IF and PF in this variety of French.
Table 6. Profiles of speakers who produced a negative token with PF

Following Roberts (Reference Roberts2012: 102), a further analysis was conducted on the results for sentential polarity. Table 7 confirms Roberts’ findings, i.e. that the incidence of IF increases as formality increases (the presence of ne being a well-known marker of formality in spoken Metropolitan French; Ashby, Reference Ashby1981; Coveney, Reference Coveney1996; Armstrong, Reference Armstrong2001).
Table 7. Distribution of variants by sentential polarity, three-way analysis

5. DISCUSSION
The results here show that these two verbs have a much stronger preference for IF than most other verbs, when compared to the results of previous studies, which have looked at patterns for all verbs together.Footnote 17 However, the effect of the factors involved in determining the distribution of IF and PF is much the same here as has been found in other studies. In the light of this, what can explain the exceptionally high rate of IF use with avoir and être?
These two verbs are highly frequent and morphologically irregular.Footnote 18 We know from Bybee’s work (Bybee and Thompson, Reference Bybee and Thompson1997; Bybee, Reference Bybee, Joseph and Janda2003) that while for phonology, high frequency promotes (reductive) change, it may – paradoxically – help to preserve conservative morphosyntax.Footnote 19 Bybee (Reference Bybee, Joseph and Janda2003: 621) notes that ‘repetition affects morphosyntax by ensuring the retention of older characteristics’. In this case, avoir and être favour an older morphosyntactic variant, IF, over a newer one, PF.
Table 8. ‘Emplois stéréotypes’ noted in Bilger (Reference Bilger2001: 188)

Other studies of future temporal reference which have noted this tendency include Blondeau and Labeau (Reference Blondeau and Labeau2016), who examine FTR in French television weather bulletins, and found that the only significant linguistic factor constraining variation in their data was ‘type of verb’. Irregular verbs (which tend to be the most highly frequent) favoured IF, while regular verbs disfavoured it. The authors note that irregular forms tend to be more conservative, but do not pursue this further. They do however note that studies of other sociolinguistic variables have found similar tendencies: Poplack (Reference Poplack, Bybee and Hopper2001) in relation to the subjunctive, and Labeau (Reference Labeau2015) in relation to the passé simple. In other areas of morphosyntactic variation, there are similar trends: Tristram (Reference Tristram2014), examining the case of verbal agreement with collective nouns in French in both oral and written French, shows that more conservative patterns are observed with the most frequent items (e.g. majorité). Lindqvist’s (Reference Lindqvist1979) study of the imperfect subjunctive in written French observes that the choice between present or imperfect subjunctive after a main clause past tense verb is conditioned by a number of factors, one of which is the identity of the verb taking the subjunctive. When this is avoir or être, the conservative form, imperfect subjunctive, occurs at a rate of 65%, dropping to 48% for all other verbs.
Table 9. Frequency calculations for avoir and être

But an explanation along the lines of mere frequency does not suffice. It is helpful to look to other areas of morphosyntactic variation for insights. Waugh and Monville-Burston’s work on past tenses in data from newspaper usage, while not directly comparable as it concerns written language, nonetheless provides support for the hypothesis that highly frequent items tend to associate with older morphosyntactic forms. In relation to the competition between the newer compound past (passé composé) and older, simple past (passé simple), they note that:
The result of the competition between SP [sc. simple past] and CP [sc. compound past] has been […] that the newer CP has expanded its terrain, taking on the more general meaning (present and past perfect). Correspondingly, the older SP has become more specialized: it is lower in frequency, restricted to specific kinds of texts, associated with certain verbs of high frequency (such as être), and characterized by […] semantic density […]. (Waugh and Monville-Burston, Reference Waugh and Monville-Burston1986: 874)Footnote 20
Engel (Reference Engel1989) also found in her study of past tenses with être that a number of other factors were at play in the choice between passé simple and passé composé, i.e. fut and a été. While Engel’s study is also principally concerned with written language (journalistic texts), she also carries out cloze tests with francophone participants, and the results can help shed light on what may be happening in the present study. In Engel’s results, être was the only verb to be found with more tokens of the passé simple (PS) than the passé composé (PC) in the journalistic passages. In the cloze tests, Engel found that fut seemed to be used in specific, well-defined contexts, whereas a été could often be replaced by other tense forms. The factors that seemed to be involved in the preference for être in the PS included phonological (e.g. number of syllables, phonological quality), stylistic (word order, inversion) and syntactic factors (e.g. ce + être, passive constructions, PC with être as auxiliary).Footnote 21 The most relevant for our purposes are the first group, phonological factors. Relative number of syllables was a highly significant factor, in that shorter forms were strongly preferred – which in the case of être meant the PS form, fut. Engel notes that ‘les effets de contraste […] ressortissent plus au style des auteurs qu’un emploi automatique chez la plupart des francophones instruits’ (Reference Engel1989: 6); i.e. it is not necessarily a deciding factor for the average speaker, though in her results these effects did play a role. In relation to our results here, for future temporal reference, the IF forms sera and aura are shorter than their PF equivalents va être and va avoir.Footnote 22
Another factor investigated by Engel is qualité phonologique, phonological quality, which includes assonance, alliteration and avoidance of cacophony (by ‘l’interruption d’une séquence de mots à voyelle initiale’; Reference Engel1989: 7). This last may be relevant here: more ‘euphonic’ forms (i.e. those avoiding cacophony) were preferred in Engel’s study, and for être, this was again the PS form fut. Here, the IF forms have an advantage over the PF forms, in that they are phonologically more euphonic, and avoid hiatus, i.e. two vowels coming together in va être and va avoir.
Further investigation is needed to confirm number of syllables and phonological quality (cacophony/avoidance of vowels in hiatus) as factors in the choice of FTR forms with avoir and être, but Bilger’s (Reference Bilger2001) study of FTR in oral French contains some insights which lend weight to the hypothesis that these might be significant. Bilger found that IF was highly likely to appear with the verbs être, avoir, pouvoir, devoir, falloir and faire – i.e. the forms sera, aura, pourra, devra, faudra, fera. Bilger’s explanation is more concerned with verb type and lexical semantics. She notes:
D’une part, la valeur stative ou non-stative des verbes semble effectivement sélectionner l’emploi du futur périphrastique ou du futur simple ; d’autre part, au regard de leur fréquence, le sémantisme des verbes modaux semble être en parfaite adéquation avec la « tension modale » du futur simple et du conditionnel. (2001: 185)
However, it is surely of note that all the preferred forms here are shorter and more easily retrieved due to their high frequency. Bilger hints at this when she notes (in relation to the second part of her study, based on the GARS corpus): ‘le futur périphrastique utilise une variété plus grande de lexique verbal que le futur simple qui concentre ses emplois sur une petite série de verbes, et notamment dans les deux cas sur être et avoir ainsi que sur les verbes modaux pouvoir et falloir’ (Reference Bilger2001: 187). Bilger (Reference Bilger2001: 188) describes certain uses as ‘emplois stéréotypes’, noting the following strong tendencies set out in Table 8.
Additional support for the hypothesis that these verbs’ IF forms are in some sense stereotypical or formulaic comes from B. Lorenz’s (Reference Lorenz1989) oral corpus study. Lorenz finds that while PF tends to combine with dynamic, lengthy and reflexive verbs, IF tends to be found with stative and short verbs. Thus there is a small group of highly frequent IF forms which are also shorter and more euphonic than their PF counterparts. They seem to be more easily accessed than PF forms for certain verbs, including avoir and être. These factors all contribute to a higher rate of IF selection.
Two recent studies offer further evidence to support this hypothesis: Côté (Reference Côté2018) and Tremblay et al. (Reference Tremblay, Blondeau and Labeau2019). Côté (Reference Côté2018) looks at FTR in L2 French. The authors hypothesize that, given the analytical and (they argue) cognitively simpler construction of the periphrastic future, this will be the preferred form among L2 speakers in their study, especially for irregular verbs, whose more complex morphology, it is predicted, will discourage the use of IF. This is indeed what is found overall. However, avoir and être did not conform to this pattern: participants used IF with these two verbs more frequently than with any other verb, regular or irregular (2018: 540). The authors attribute this to learners’ earlier exposure to, and thus greater familiarity with, these two verbs, as well as to their high frequency overall. In their study of FTR in text messages, Tremblay et al. (Reference Tremblay, Blondeau and Labeau2019) also found that verb type was a significant factor, with avoir, être and modal verbs being much more likely to appear with IF.Footnote 23 They argue that this is related to morphological complexity: the more irregular the paradigm, the more likely the use of IF. The authors suggest that this effect may be due to accessibility of forms in the mental lexicon, with highly frequent irregular forms being more easily retrievable.
6. CONCLUSION
This study set out to investigate the distribution of FTR forms with two verbs in particular, avoir and être, because in some previous studies, these (and a handful of other verbs) have been identified as exhibiting different distributional patterns to the majority of verbs. The results showed that avoir and être strongly prefer IF, though this tendency has weakened somewhat for être in the more recent data. It is of course possible to take lexical differences into account when using mixed-effects statistical models to evaluate the effect of different factors on selection of variants. Nevertheless, to do this does not further our understanding of why this might be the case; it merely ensures that the statistical model is as accurate as possible regarding the effect of the independent factor groups investigated (see Johnson, Reference Johnson2009, for discussion of the use of mixed effects in variable rule analysis of linguistic data).
Complementing previous studies of FTR in the variationist paradigm, and others which take a different approach (e.g. Abouda and Skrovec, Reference Abouda and Skrovec2015, Reference Abouda and Skrovec2017), this study has shed some light on what lies behind the somewhat exceptional behaviour of avoir and être. In doing so, we have underlined the importance not only of taking lexical differences into account when conducting analyses which throw all verbs (or other word categories) in together, but also of using a combination of quantitative and qualitative analysis to explore this variation further.
APPENDIX Frequency calculations for avoir and être
The frequency of the infinitive form of each verb was calculated for the ESLO corpus as a rough guide (it is not possible in ESLO to construct a simple search query which will capture all inflected forms). As a comparison, frequency in the FRANTEXT corpus of written French was also calculated, both for the infinitive alone, and for the infinitive plus inflected forms (the FRANTEXT interface allows for simpler searching for inflected forms).Footnote 24