1. Introduction
In past decades, the causative–anticausative alternation (CAA), as in (1), has been the focus of many studies in various subfields of linguistics, such as linguistic typology (see Nedyalkov & Silnitsky 1973; Haspelmath 1987, 1993; Kulikov 2003; Nichols, Peterson & Barnes 2004; Comrie 2006; Haspelmath et al. 2014), theoretical linguistics (see Labelle 1992; Levin & Rappaport Hovav 1995; Piñón 2001; Härtl 2003; Alexiadou, Anagnostopoulou & Schäfer 2006; Kallulli 2006; Koontz-Garboden 2007, 2009; Schäfer 2008; Alexiadou 2010, 2014; Labelle & Doron 2010), and language-specific descriptive work (see for French e.g. Rothemberg 1974; Zribi-Hertz 1987; Labelle 1992; Ben Salah-Tlili 2007; Kupferman 2008; Heidinger 2010, 2012; Legendre & Smolensky 2010; Kailuweit 2011, 2012; Martin & Schäfer 2014).
On the basis of their syntactic and semantic properties, the two parts of the alternation, the causative alternant and the anticausative alternant, can be characterized as follows: The causative alternant describes a change of state and both the actor that brings about and the undergoer that undergoes the change of state are expressed as arguments; the actor is expressed as a subject, while the undergoer is expressed as a direct object. The anticausative alternant also describes a change of state, but does not express or semantically imply an actor that brings about the event; the sole argument, namely the undergoer, is expressed in subject position.Footnote [2] These relations are illustrated in (2).
The terms encoding or encoding of the alternation, as in the title of the present article, refer to the form of the alternating verb. In the English examples in (1), the change between the causative and the anticausative alternant does not result in a formal change in the verb. Cross-linguistically, however, the CAA is often encoded in ways that involve a formal change in the alternating verb. In his typological work, Haspelmath (1993) distinguishes five types of encoding of the CAA: (i) the causative type, where the causative alternant is formally marked compared to the anticausative alternant (exemplified by Georgian in Table 1); (ii) the anticausative type, where the anticausative alternant is formally marked compared to the causative (as in the Polish example); (iii) the labile type, where no formal change in the verb occurs (as in (1) above); (iv) the equipollent type, where both the causative and the anticausative alternant bear special morphology that is attached to a common stem (as in the Japanese example); and (v) the suppletive type, where the causative and the anticausative alternant are expressed by verbs which are formally not related (as in the Russian example).
Table 1 Encoding types of the causative–anticausative alternation (Haspelmath 1993, adapted).

Besides cross-linguistic variation, the CAA also often involves variation within a single language, as is the case for French and Spanish. In both languages, the causative and the anticausative alternant come in two variants: a formally marked and a formally unmarked variant (see Table 2).Footnote [3]
Table 2 Encoding of the causative–anticausative alternation in French and Spanish.

As for the causative alternant, the unmarked variant is formed with a plain transitive verb, as in (3a), while the marked variant is formed with the lexical verb and a causative auxiliary (French faire ‘make’, as in (3b), and Spanish hacer ‘make’).
In the case of the anticausative, the unmarked variant is formed with a plain intransitive verb (as in (4a)) and the marked variant is formed in both French and Spanish with the lexical verb and the reflexive clitic se (as in (4b)).
The existence of these four types of encoding within a single language raises several research questions about the distribution of verbs in the four types and about the semantic and syntactic differences between the marked and the unmarked variant of an alternant (see Schäfer 2009 for a recent survey of the literature and the issues treated there).Footnote [5] The specific research question that will be answered in the present paper is whether the causalness of the verb, i.e. the quantitative relation between the causative and the anticausative use, is a factor in the encoding and whether French and Spanish differ in this respect. Thereby the paper contributes in at least the following two ways to the study of the causative–anticausative alternation:
The choice of these two languages is motivated by several reasons. First, French and Spanish are languages with a variation in the encoding of the alternants (marked vs. unmarked variants, see Table 2 above), which is a prerequisite for verifying a prediction on the relation between causalness and encoding. Second, while the alternation in these two languages has received considerable attention in the linguistic literature, the inclusion of the factor causalness is an innovation. Finally, French and Spanish are closely related languages and show the same encoding types (marked anticausatives with se; marked causatives with a causative auxiliary); it is thus interesting to investigate whether the similarities between the two languages include the relation between causalness and encoding.
The paper is structured as follows. In Section 2, the concept of causalness and its presumed relevance for the encoding of the alternation is introduced, ways in which the notion can be fruitfully used in the analysis of the alternation are discussed and a specific prediction concerning the relation between causalness and the encoding of the CAA is formulated. Section 3 is the empirical core of this paper and devoted to a corpus study of alternating verbs in which the prediction is tested against French and Spanish data. The main outcome is that there is in fact a strong correlation between causalness and encoding in both languages.
2. Causalness and encoding
Alternating verbs may differ with respect to how often they are used as a causative and as an anticausative. The French alternating verb améliorer ‘improve’, for example, is used much more often as a causative than as an anticausative, while the verb grandir ‘make/become big’ is used much more often as an anticausative than as a causative (see Table 3).
Table 3 Frequency of causative and anticausative use for French améliorer ‘improve’ and grandir ‘make/become big’ (corpus source: see Section 3.1).

Following Haspelmath et al. (2014), I use the term causalness to refer to the dimension that distinguishes améliorer ‘improve’ and grandir ‘make/become big’. The degree of causalness of an alternating verb is calculated as in (5).
The number of the causative uses of a verb multiplied by 100 is divided by the sum of its causative and anticausative uses. The number of causative uses of a verb is the sum of its unmarked and its marked causative uses; the number of its anticausative uses is the sum of its unmarked and its marked anticausative uses.Footnote [6] Verbs that are used more often as causatives (compared to anticausatives) have a high degree of causalness, while verbs that are used more often as anticausatives (compared to causatives) have a low degree of causalness. The causalness value, which is a value between 0 and 100, is 79.66 for the verb améliorer ‘improve’ and 5.39 for the verb grandir ‘make/become big’.
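The calculation described in this paragraph can be sketched in a few lines of Python. This is only an illustration of the verbal definition given here: the function name and the example counts are hypothetical and are not taken from the corpus data.

```python
def causalness(unmarked_c, marked_c, unmarked_ac, marked_ac):
    """Causalness as described in the text:
    (causative uses * 100) / (causative uses + anticausative uses)."""
    causative = unmarked_c + marked_c          # all causative uses
    anticausative = unmarked_ac + marked_ac    # all anticausative uses
    total = causative + anticausative
    if total == 0:
        raise ValueError("verb has no causative or anticausative uses")
    return 100 * causative / total


# Hypothetical counts for illustration only (not corpus data):
# 75 causative uses and 25 anticausative uses.
example = causalness(unmarked_c=70, marked_c=5, unmarked_ac=10, marked_ac=15)
print(round(example, 2))  # 75.0
```

A verb that never occurs as a causative receives the value 0 and a verb that never occurs as an anticausative receives the value 100, which matches the range stated above.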
In this paper, the prediction in (6) on the relation between causalness and the encoding of the causative alternant, and between causalness and the encoding of the anticausative alternant, is tested.
The basic expectation is a covariation between the variables causalness and encoding. It is expected that the encodings of the two alternants covary with causalness in two different ways: Firstly, for the encoding of the anticausative, a positive correlation between causalness and marked anticausatives is predicted; secondly, for the encoding of the causative alternant, a negative correlation between causalness and marked causatives is predicted.Footnote [7] According to this prediction, a high degree of causalness increases the likelihood that the anticausative is marked and the causative is unmarked, and a low degree of causalness increases the likelihood that the anticausative is unmarked and the causative is marked.Footnote [8]
To illustrate the prediction, I use again the French verbs améliorer ‘improve’ and grandir ‘make/become big’. Recall that améliorer has a higher degree of causalness than grandir (79.66 vs. 5.39). Thus, améliorer should mark the anticausative equally or more often and the causative equally or less often than grandir. As the encodings of the two verbs in Tables 4 and 5 show, the prediction is borne out: Améliorer marks the anticausative more often than grandir (100% > 0%) and améliorer marks the causative less often than grandir (8.93% < 18.75%).
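The pairwise comparison just described can be written out as a small check. The sketch below is one possible rendering of the prediction for a pair of verbs, using only the values quoted in this paragraph; the class and function names are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class VerbProfile:
    lemma: str
    causalness: float      # value between 0 and 100
    pct_marked_ac: float   # percentage of marked anticausatives
    pct_marked_c: float    # percentage of marked causatives


def pair_conforms(a, b):
    """True if the verb with the higher causalness marks the anticausative
    equally or more often and the causative equally or less often."""
    hi, lo = (a, b) if a.causalness >= b.causalness else (b, a)
    return hi.pct_marked_ac >= lo.pct_marked_ac and hi.pct_marked_c <= lo.pct_marked_c


# Values as quoted in the text for the two French verbs.
ameliorer = VerbProfile("améliorer", 79.66, 100.0, 8.93)
grandir = VerbProfile("grandir", 5.39, 0.0, 18.75)

print(pair_conforms(ameliorer, grandir))  # True
```

Such a pairwise check only illustrates the logic of the prediction for two verbs; it is not the statistical test applied to the full verb sets.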
Table 4 Encoding of améliorer ‘improve’ (corpus source: see Section 3.1).

Table 5 Encoding of grandir ‘make/become big’ (corpus source: see Section 3.1).

In order to test the prediction against a larger set of verbs, a statistical method is needed. To determine the extent to which the prediction is fulfilled by the French and Spanish data, Spearman’s rank correlation coefficient is calculated for the encoding of the causative and the anticausative in both languages. In the case of the encoding of the anticausative, the degree of causalness is correlated with the percentage of marked anticausatives (predicting a positive correlation). For the encoding of the causative, the degree of causalness is correlated with the percentage of marked causatives (predicting a negative correlation).
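For readers unfamiliar with the statistic, the following sketch shows one standard way of computing Spearman’s rank correlation coefficient: both variables are converted to ranks (tied values share the mean of their ranks) and the Pearson correlation of the ranks is taken. The function names and the toy data are illustrative assumptions, not the software or the data used in the study; the same statistic is also available as `scipy.stats.spearmanr`.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    if sxx == 0 or syy == 0:
        raise ValueError("rho is undefined for a constant variable")
    return cov / math.sqrt(sxx * syy)


# Toy data for illustration only (not corpus data):
toy_causalness = [5, 20, 40, 60, 80]
toy_pct_marked_ac = [0, 0, 50, 100, 100]
print(round(spearman_rho(toy_causalness, toy_pct_marked_ac), 3))  # 0.949
```

A rank-based coefficient is a natural choice here because many verbs have percentages of exactly 0 or 100, so the relation between the two variables is monotonic rather than linear.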
Given the above definition of causalness, the prediction only relates a verb’s frequency of use in the causative and the anticausative alternant to the encoding of the alternants. It does not relate (lexical) semantic properties of the verbs to the encoding of these verbs in the causative–anticausative alternation. Note, however, that several statements of the latter type have been made in the literature. More precisely, the spontaneity of a verb (i.e. the probability of the event denoted by the verb occurring with or without an external force) has been considered as a factor for the encoding of the alternation. Nedjalkov (1969) argues that the more spontaneous an event is, the more probable is its expression with a marked causative (Nedjalkov 1969, cited after Letuchiy 2010: 239). The same basic idea, but from a different perspective, can be found in Croft (1990) and Haspelmath (1993):
The difficulty that these approaches face is that a verb’s spontaneity is hardly accessible; in this respect spontaneity differs from other lexical semantic properties of verbs such as aktionsart, which can be determined on the basis of a number of well-established diagnostics. The prediction to be tested in the present paper (see (6) above) is thus more modest in that it does not refer to the verb’s semantics, but it is also less speculative since it focuses on the directly observable facts.
The concept of causalness has recently been applied in two empirical works on the causative–anticausative alternation: Haspelmath et al. (2014) and Samardžić & Merlo (2012). Haspelmath et al. (2014) test several predictions on the relation between causalness and causative prominence (i.e. the tendency to formally mark the causative alternant and leave the anticausative alternant unmarked) for 20 verb meanings in seven languages.Footnote [9] One of the main results of their corpus-based study is that there is a strong negative correlation between causalness and causative prominence: Verb meanings with a high degree of causative prominence (i.e. they cross-linguistically tend to be encoded with marked causatives) tend to have a low degree of causalness and verb meanings with a low degree of causative prominence tend to have a high degree of causalness. For example, in Haspelmath et al.’s (2014) sample of 20 verb meanings, ‘sink’ is one of the verb meanings with the highest degree of causative prominence and has a causalness value of 17 (across all seven languages), while ‘close’ has the second lowest degree of causative prominence and a causalness value of 80 (see Haspelmath et al. 2014: 611). The same method as in Haspelmath et al. (2014) has already been applied in Samardžić & Merlo (2012), with the important difference that Samardžić & Merlo (2012) use all 31 verb meanings from Haspelmath (1993), but only apply it to English.
Samardžić & Merlo (2012) show that a strong negative correlation (r = .84, p < .01; with one outlier removed) exists between the causalness of the English verbs and the causative prominence of the verb meanings (based on Haspelmath’s 1993 data). English verbs with meanings which cross-linguistically tend to be encoded with a marked causative tend to have a lower degree of causalness than English verbs with meanings which cross-linguistically tend not to be encoded with a marked causative.Footnote [10]
In the present paper, the same notion of causalness as in Samardžić & Merlo (2012) and Haspelmath et al. (2014) is applied: Causalness is reflected in the frequency of the causative and the anticausative use of a verb. Unlike in those two studies, however, causalness is here related to the frequency of formal encodings of individual verbs in individual languages. One novelty of the present contribution is thus that it relates causalness to the frequency of marked and unmarked variants of the alternants. Another novelty is that French and Spanish are two languages which so far have not been investigated with respect to the relation between causalness and encoding. Finally, the presentation of the results in Section 3 includes a detailed comparison of the two languages with respect to the relation between causalness and encoding; such a comparison is neither part of Samardžić & Merlo’s (2012) analysis of English (since only one language is investigated) nor of Haspelmath et al.’s (2014) study (probably due to the large number of investigated languages).
To sum up, causalness has been defined in this section as a property of verbs which is based on how often a verb appears in the causative and the anticausative part of the alternation. Further, a specific prediction on the relation between causalness and the encoding of the alternation has been formulated. Finally, two recent applications of the notion of causalness have been briefly presented and the difference between these existing applications and that in the present contribution has been described.
3. Empirical study
3.1 Material and method
To test the prediction that there is a correlation between the causalness and the encoding of alternating verbs (as defined in (6), Section 2 above), the following 20 alternating French and 20 alternating Spanish verbs were analyzed:
In the preparation for this study, much attention has been paid to the compilation of these sets. As already stated in Section 2, the main goal of this study is to investigate the relation between causalness and encoding. Therefore, it is desirable to have sets composed of verbs which vary in both causalness and encoding. Since causalness has not yet been investigated for the two languages, it could not be used for verb selection. Thus, the main idea behind the compilation of the two sets is that they show variation with respect to the encoding of the alternation. Following statements from the literature, verbs which presumably differ with respect to their encoding have been selected. For example, the Spanish set includes verbs such as crecer ‘grow’, which forms unmarked anticausatives, but not unmarked causatives (see Mendikoetxea 1999: 1597–1598). The set also includes verbs such as ablandar ‘make/become soft’, which forms marked anticausatives and unmarked causatives (see Mendikoetxea 1999: 1589–1590).
The sets are also intended to cover the range of alternating verbs with respect to criteria such as morphological form (derived and underived verbs) or aktionsart (punctual and durative, telic and atelic verbs). Although the sets are too small to control systematically for morphological form and aktionsart as factors, i.e. to detect their impact on the encoding, it is nevertheless desirable to have variation in the sets with respect to these dimensions. A further aim governing the selection of the verbs was to achieve concordance between the French and the Spanish set. In fact, 14 of the 20 verbs have a counterpart in the other language’s set. These ‘corresponding’ verbs are listed in Table 6.
Table 6 Corresponding verbs.

The main challenge in the actual compilation of the sets was that individual verbs always combine several of these criteria, e.g. French casser ‘break’ is both telic and underived; assécher ‘dry up’ is both telic and derived. The addition of a given verb to the sets does not only have consequences for one, but for several properties of the sets. Therefore, the overall balance within the sets always had to be considered during the compilation of the sets. As a consequence it is not possible to give one decisive reason why a given verb has been added to a set.
The limitation to 20 verbs in each language has practical reasons, namely the fact that all relevant data had to be analyzed manually; in addition to the relevant data, much irrelevant data had to be examined in order to single out the relevant data. Details on coding decisions and on how irrelevant data has been singled out are given in Appendix A.
The French data, which has partly been used in Heidinger (2012), comes from the French text corpus Frantext, a corpus consisting mainly of literary texts from the 16th century onwards; only data from 1950 to 2000 was considered. Where the corpus queries for a verb returned too many hits, only a selection of the hits was considered (the selection was randomized in the sense that the selection criteria did not involve any of the factors to be analyzed in the study). The corpus queries were lemma-based, i.e. they were not specified for any grammatical form of the verb. For French, a total of 3946 examples were analyzed. This number does not include the irrelevant examples that had to be sorted out manually. Only marked and unmarked causatives and anticausatives (as in (3) and (4) above) were considered, while, for example, stative and eventive passives, marked causatives of marked anticausatives and absolute uses of transitive verbs were discarded.
The Spanish data comes from the text corpus Corpus de Referencia del Español Actual (CREA) which is a pan-Hispanic text corpus with texts from 1975 onwards. In this study I did not search the whole corpus but only the subpart with novels from Spain (thus the Spanish data is limited to Iberian Spanish). Since the corpus does not allow for lemmatized searches, the queries had to be based on verb forms.Footnote [11] For each verb, I searched the forms for all three persons, for singular and plural and for three different tenses in the indicative mood (present, simple perfective past (indefinido), simple imperfective past (imperfecto)). For Spanish, a total of 1859 examples were analyzed using the same criteria as in the selection of the French data.
Table 7 Causalness and encoding in French.

%mAC = percentage of marked anticausatives of total of anticausatives
%mC = percentage of marked causatives of total of causatives
3.2 Results
Tables 7 and 8 present the results of the corpus study of French and Spanish data, respectively. For each of the investigated verbs the tables indicate the causalness, the encoding of the anticausative alternant and the encoding of the causative alternant. The verbs are ordered with increasing causalness. To indicate the encoding of the two alternants, the percentage of the marked variant (as opposed to the unmarked variant) is given: %mAC is the percentage of marked anticausatives on the total of anticausatives and %mC is the percentage of marked causatives on the total of causatives. Taking an item from Table 7 as an example, the tables read as follows: The French verb grandir ‘make/become big’ has a degree of causalness of 5.39; 0% of its anticausative uses are marked (and 100% are unmarked), and 18.75% of its causative uses are marked (and 81.25% are unmarked). Tables with the underlying absolute frequencies are given in Appendix B.
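The two percentages reported in the tables follow directly from the absolute frequencies of the marked and unmarked variants. A minimal sketch, assuming hypothetical counts (the actual frequencies are those given in Appendix B), is the following; the function name is an illustrative choice.

```python
def pct_marked(marked, unmarked):
    """Percentage of marked uses out of all uses of one alternant.
    Returns None if the alternant is not attested at all."""
    total = marked + unmarked
    if total == 0:
        return None
    return 100 * marked / total


# Hypothetical counts for illustration only: 3 marked and 13 unmarked
# causatives yield the kind of value reported as %mC in the tables.
print(pct_marked(marked=3, unmarked=13))  # 18.75
# No marked and 20 unmarked anticausatives yield a %mAC of 0.
print(pct_marked(marked=0, unmarked=20))  # 0.0
```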
Table 8 Causalness and encoding in Spanish.

%mAC = percentage of marked anticausatives of total of anticausatives
%mC = percentage of marked causatives of total of causatives
3.2.1 Encoding of the anticausative alternant
Beginning with the encoding of the anticausative in French, Table 7 presents for each of the 20 French verbs the causalness and the percentage of the marked anticausative (as opposed to the unmarked anticausative). The value in the second column of Table 7 indicates the causalness of the respective verb and the verbs are ordered with increasing causalness. The third column indicates the encoding of the anticausative alternant for that verb. Figure 1 represents the same data; each of the data points stands for one of the 20 verbs.
Taking all 20 French verbs together, the overall impression is that the encoding of the anticausative correlates with the causalness: Verbs with a low degree of causalness (located towards the left end of the X-axis) tend to have unmarked anticausatives, while verbs with a higher degree of causalness (located towards the right end) tend to have marked anticausatives. In order to analyze the overall correspondence of the French data with the prediction that causalness and the percentage of marked anticausatives correlate, Spearman’s rank correlation coefficient has been calculated. The coefficient amounts to 0.675 (level of significance = .01 (one-sided)), which indicates a strong correlation between causalness and the encoding of the anticausative in French.
A further result that becomes immediately apparent by looking at Figure 1 is the strong preference of the verbs for only one of the two encoding variants: French alternating verbs typically encode the anticausative either with the marked or with the unmarked variant.Footnote [12] Examples for the first type of verb are ouvrir ‘open’, assécher ‘dry’, endurcir ‘make/become hard’ and intensifier ‘intensify’; examples for the second type are grandir ‘make/become big’, maigrir ‘make/become thin’ and jaunir ‘make/become yellow’. Note that only three out of 20 verbs show variable behavior in the sense that both variants have at least a percentage of 10%. These variable verbs are refroidir ‘cool down’ (42.11% marked and 57.89% unmarked) (see (9)), gonfler ‘inflate/swell’ (69.70% marked and 30.30% unmarked) (see (10)) and casser ‘break’ (58.49% marked and 41.51% unmarked) (see (4) above).
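The criterion for variable behavior used here (both variants account for at least 10% of a verb’s anticausatives) can be stated as a simple test. The sketch below applies it to the percentages of marked anticausatives quoted in the text for five French verbs; the function name and the threshold parameter are illustrative assumptions.

```python
def is_variable(pct_marked, threshold=10.0):
    """True if both the marked and the unmarked variant reach `threshold` percent."""
    return pct_marked >= threshold and (100.0 - pct_marked) >= threshold


# Percentages of marked anticausatives as quoted in the text.
french_pct_marked_ac = {
    "refroidir": 42.11,
    "gonfler": 69.70,
    "casser": 58.49,
    "grandir": 0.0,
    "améliorer": 100.0,
}

variable = sorted(v for v, p in french_pct_marked_ac.items() if is_variable(p))
print(variable)  # ['casser', 'gonfler', 'refroidir']
```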
Turning to the encoding of the anticausative in Spanish, the same values as for French are correlated: the causalness and the percentage of marked anticausatives. The respective values for the 20 Spanish verbs are given in Table 8 above and represented in Figure 2.

Figure 1 Causalness and the encoding of the anticausative alternant in French.

Figure 2 Causalness and the encoding of the anticausative alternant in Spanish.
As in French, we observe a correlation between causalness and the encoding of the anticausative. As shown in Figure 2, the percentage of marked anticausatives tends to increase with the degree of causalness. The correlation coefficient which amounts to 0.540 (level of significance = .01 (one-sided)) confirms this impression and indicates a robust correlation between causalness and the encoding of the anticausative in Spanish. This correlation is due to the fact that verbs with a causalness value of 40 or higher tend to mark the anticausative and that verbs that do not mark the anticausative typically have a causalness value of below 40.
A further commonality with French, besides the robust correlation between causalness and encoding, is that Spanish alternating verbs also show a strong preference for only one of the encoding variants. Verbs that formally mark the anticausative are, for example, congelar ‘freeze’, agrandar ‘make/become big’ and cerrar ‘close’; verbs that do not mark the anticausative are crecer ‘grow’, amarillear ‘make/become yellow (pale)’ and hervir ‘boil’. Crucially, adelgazar ‘make/become thin’ is the only verb in the Spanish set where both variants have at least a percentage of 10% (with 45% marked and 55% unmarked).
Examples of marked and unmarked anticausatives with adelgazar ‘make/become thin’ are given in (11) and (12).
In our data, a clear division of labor between the marked and the unmarked variant can be observed: While the unmarked variant is used for the expression of the meaning ‘to lose weight’ as in (11), i.e. the physical process of becoming thinner of an animate being, the marked anticausative is used in other types of events, which can be conceptualized as the becoming thinner of an entity: the diminishing of a whisper or a voice or the narrowing of a road into a trail (see (12)).
In summary, both languages show a strong correlation between causalness and the encoding of the anticausative in terms of our prediction. Further, in both languages, verbs tend to encode the anticausative alternant either with the marked or with the unmarked variant.
3.2.2 Encoding of the causative alternant
In order to analyze the encoding of the causative, the causalness and the percentage of marked causatives are set in relation. As will become evident in the following, the encoding of the causative differs substantially in both languages from the encoding of the anticausative. Beginning with the situation in French, the causalness and the percentage of marked causatives given in Table 7 above are represented in Figure 3.
As Figure 3 shows, the encoding of the causative differs from the encoding of the anticausative in that the marked variant appears less often than in the case of the anticausative (as seen in Figure 1 above). Even the verbs with a rather low degree of causalness (located towards the left end of the X-axis) rarely formally mark the causative (the verb maigrir ‘make/become thin’ being the exception). Despite the low frequency of marked causatives, Figure 3 shows at first sight that there is a correlation between causalness and the encoding of the causative. Crucially, there is a clear tendency for the percentage of marked causatives to increase with the decrease in causalness. This impression is confirmed by the calculated correlation coefficient which amounts to –0.607 (level of significance = .01 (one-sided)).
French verbs that show variation in the encoding of the causative alternant are, for example, grossir ‘make/become big’ (see (3) above), augmenter ‘increase’ (see (13)) and refroidir ‘cool down’ (see (14)).

Figure 3 Causalness and the encoding of the causative alternant in French.
The relevant values for the encoding of the causative in Spanish are given in Table 8 above and illustrated in Figure 4. As in the case of French, marked causatives are less frequent than marked anticausatives. Only one of the verbs with a low degree of causalness, namely crecer ‘grow’, is formally marked in all causative uses. In the case of the other verbs with a low degree of causalness, the percentage of marked causatives is rather low. But despite the low frequency of marked causatives, we find, as in French, a robust correlation between causalness and the encoding of the causative alternant. The correlation coefficient is slightly lower than in French and amounts to –0.470 (level of significance = .05 (one-sided)).

Figure 4 Causalness and the encoding of the causative alternant in Spanish.
Spanish verbs that show variation in the encoding of the causative alternant are, for example, aumentar ‘increase’ (see (15)), enrojecer ‘make/become red’ (see (16)), adelgazar ‘make/become thin’ and engordar ‘make/become fat’.
To sum up, the encoding of the causative alternant is in correspondence with our prediction in both languages: Verbs with a low degree of causalness tend to mark the causative alternant more often than verbs with a high degree of causalness. In addition to that we have seen that the encoding of the causative alternant differs in both languages from the anticausative: Marked causatives are considerably less frequent than marked anticausatives.
3.3 Comparisons
In this section we look at the relation between causalness and encoding from a comparative perspective. We begin by taking another look at the correlation coefficients for both languages and for both parts of the alternation. Table 9 shows that in all four cases we observe a strong correlation between causalness and encoding.
Table 9 Correlation coefficients (Spearman’s rho) for causalness and marked encodings.

As concerns the comparison between the two languages, Table 9 shows that the correlations are slightly stronger in French than in Spanish; but the differences are too small to be further interpreted. As concerns the comparison between the two parts of the alternation, the correlations are slightly stronger in the encoding of the anticausative alternant; but again the differences between the correlation coefficients are too small to be further interpreted. The conclusion with respect to the correlation coefficients is thus that no relevant differences exist between the two languages and the two alternants.
Besides these commonalities, we also detect differences in the data. The first difference is one between the causative and the anticausative alternant. Both in French and in Spanish, marked anticausatives are considerably more frequent than marked causatives. While the anticausative alternant is nearly always formally marked if the verb has a high degree of causalness, the causative is only rarely formally marked even if the verb has a very low degree of causalness; there are many verbs with a low degree of causalness where the causative alternant is predominantly unmarked (e.g. jaunir ‘make/become yellow’, refroidir ‘cool down’ and grandir ‘make/become big’ in French and derrumbar ‘collapse’, amarillear ‘make/become yellow (pale)’ and engordar ‘make/become fat’ in Spanish).Footnote [13]
The second difference concerns the encoding of the anticausative alternant in the two languages. In both languages the use of the marked and unmarked variant of the anticausative alternant depends on the degree of causalness of the respective verb. However, the cut-off point above which verbs tend to form only marked anticausatives has different locations on the causalness scale. In Spanish, verbs with a causalness value of 40 and higher tend to form marked anticausatives only, while in French this cut-off point is higher: Verbs with a causalness value of 50 and higher tend to form marked anticausatives only. This difference with respect to the cut-off point for the encoding of the anticausative is in line with statements from the literature according to which only a very small number of Spanish alternating verbs form unmarked anticausatives (see Levy 1994, Sánchez Lopez 2002), while in French the number of verbs which form unmarked anticausatives is relatively high (Rothemberg (1974) counts about 300 such verbs; see also Heidinger 2014).Footnote [14]
Table 10 Causalness and encoding for corresponding verbs.

%mAC = percentage of marked anticausatives of total of anticausatives
%mC = percentage of marked causatives of total of causatives
Numbers in parentheses indicate the causalness rank of the Spanish verbs.
Finally, I take a closer look at the verbs which have a correspondent in both languages, e.g. French maigrir and Spanish adelgazar, both meaning ‘make/become thin’ (see Table 6 above). The question is whether corresponding verbs behave similarly with respect to causalness and encoding. The relevant data for the 14 verb pairs are given in Table 10. The pairs are ordered in increasing degree of causalness of the French verb and the values in parentheses following the Spanish verbs indicate the causalness rank of the Spanish verbs. For both the French and the Spanish verbs the causalness, the encoding of the anticausative alternant (%mAC) and the encoding of the causative alternant (%mC) are indicated.
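The three measures listed per verb in Table 10 can be derived from absolute frequencies of the kind given in Appendix B. The following is a minimal sketch under two assumptions that are not spelled out in this section: the key names (`uC`, `mC`, `uAC`, `mAC`) are hypothetical labels for the four coded constructions, and causalness is taken to be the percentage of causative uses among all relevant uses of a verb. The counts are invented.

```python
def percentage(part, total):
    """Return part/total as a percentage, or None when total is zero."""
    return None if total == 0 else 100.0 * part / total


def encoding_profile(counts):
    """Derive %mC, %mAC and causalness from absolute frequencies.

    counts: dict with the hypothetical keys 'uC', 'mC', 'uAC', 'mAC'
    (unmarked/marked causative, unmarked/marked anticausative).
    """
    causatives = counts["uC"] + counts["mC"]
    anticausatives = counts["uAC"] + counts["mAC"]
    return {
        # %mC: marked causatives out of all causatives
        "pct_mC": percentage(counts["mC"], causatives),
        # %mAC: marked anticausatives out of all anticausatives
        "pct_mAC": percentage(counts["mAC"], anticausatives),
        # Assumption: causalness = share of causative uses among all relevant uses
        "causalness": percentage(causatives, causatives + anticausatives),
    }


# Invented counts for one hypothetical verb (NOT taken from Tables A1/A2).
example = encoding_profile({"uC": 30, "mC": 10, "uAC": 45, "mAC": 15})
print(example)
```

Returning `None` for an empty denominator keeps a verb with no attested causatives (or anticausatives) from being reported as 0% marked, which would be misleading.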
In order to interpret the data with respect to similarities or differences between corresponding verbs, I have categorized the 14 verb pairs on the basis of two binary features: similarity in causalness and similarity in encoding.Footnote [15] The possible combinations of the two features yield the following four categories: (i) similar causalness and similar encoding, (ii) similar causalness but different encoding, (iii) different causalness and different encoding, and (iv) different causalness but similar encoding. Table 11 indicates the number of verb pairs in each of the four categories.
Table 11 Similarity between French and Spanish corresponding verbs.

In eight of the 14 pairs, the corresponding verbs have similar causalness and similar encoding, in four pairs they have different causalness and different encoding, and in two pairs they have different causalness but similar encoding. None of the verb pairs has similar causalness, but different encoding. The respective verb pairs for the three attested categories are given in (17)–(19).
Although the total number of verb pairs that has been investigated is rather small, the distributions in Table 11 show nevertheless that corresponding verbs tend to behave similarly with respect to causalness and encoding: Verbs with similar encoding tend to have similar causalness and verbs with similar causalness tend to have similar encoding.Footnote [16] This agrees well with the above-mentioned general commonalities between French and Spanish (robust correlations between causalness and encoding, marked anticausatives are more frequent than marked causatives, verbs tend to form either unmarked or marked anticausatives).
If we only consider the verb pairs with similar causalness and verify whether they are also similar with respect to encoding, we see that all eight verb pairs with similar causalness have similar encoding (see Table 11). Conversely, if we only consider the verb pairs with similar encoding and verify whether they behave similarly with respect to causalness, we see that out of the 10 verb pairs with similar encoding eight pairs have similar and only two pairs have different causalness (see Table 11). We can thus conclude that (i) verb pairs with similar causalness tend to have similar encoding and that (ii) verb pairs with similar encoding tend to have similar causalness.
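The two conditional counts used in this argument follow directly from the four cell counts reported for Table 11 (8, 0, 4 and 2 pairs). A minimal sketch of the arithmetic, with hypothetical names for the table and the helper function:

```python
# Cell counts as reported in the text for Table 11 (14 verb pairs).
# Keys are (causalness, encoding) similarity of a French-Spanish pair.
TABLE_11 = {
    ("similar", "similar"): 8,
    ("similar", "different"): 0,
    ("different", "different"): 4,
    ("different", "similar"): 2,
}


def conditional_share(table, given, target):
    """Count pairs matching `target` among the pairs matching `given`.

    `given` and `target` are (position, value) tuples, where position 0
    refers to causalness and position 1 to encoding.
    Returns (matching pairs, size of the conditioning set).
    """
    g_pos, g_val = given
    t_pos, t_val = target
    base = sum(n for key, n in table.items() if key[g_pos] == g_val)
    hits = sum(
        n for key, n in table.items()
        if key[g_pos] == g_val and key[t_pos] == t_val
    )
    return hits, base


# Among pairs with similar causalness: how many have similar encoding?
print(conditional_share(TABLE_11, (0, "similar"), (1, "similar")))
# Among pairs with similar encoding: how many have similar causalness?
print(conditional_share(TABLE_11, (1, "similar"), (0, "similar")))
```

With only 14 pairs these are descriptive proportions rather than the basis for a significance test.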
As concerns the four verb pairs with different causalness and different encoding, it should be noted that their behavior is in line with the overall result of this study, namely that causalness and encoding correlate. It is expected that verbs which differ with respect to causalness also differ with respect to encoding, and vice versa. The fact that the verbs of these pairs differ with respect to both causalness and encoding relates to an issue other than the tested prediction, namely, whether verbs of different languages which have the same or similar meaning also have similar causalness and encoding.Footnote [17] Thus, the only verb pairs that do not fit the expectations are the two pairs which have different causalness but similar encoding.
4. Conclusions and outlook
In this paper I have investigated the encoding of French and Spanish verbs that participate in the causative–anticausative alternation. The main question was whether the encoding is related to the causalness of the verbs, i.e. their frequency of use as a causative and anticausative. On the basis of a corpus study of 20 French and 20 Spanish verbs I have shown that causalness and encoding correlate in both languages: Verbs with a high degree of causalness tend to form marked anticausatives and unmarked causatives more often than verbs with a low degree of causalness, and verbs with a low degree of causalness tend to form marked causatives and unmarked anticausatives more often than verbs with a high degree of causalness. Further commonalities between the two languages that have been identified are that marked anticausatives are more frequent than marked causatives, and that the verbs tend to form either unmarked or marked anticausatives. The only difference between the two languages that has been detected is that marked anticausatives (as opposed to unmarked anticausatives) are more frequent in Spanish than in French. The comparison of 14 French–Spanish corresponding verbs has shown that verb pairs with similar causalness tend to have similar encoding and that verb pairs with similar encoding tend to have similar causalness.
The main empirical result of the study, namely the correlation between causalness and encoding, raises the question of whether the observed correlation corresponds to a causal relation. This question is relevant because a correlation between two variables does not imply a causal relation between them.
In several recent publications, Haspelmath (2006, 2008) and Haspelmath et al. (2014) have argued that causalness is causally related to the encoding of the alternation via predictability. Haspelmath (2008: 5) assumes that ‘[t]he more predictable a sign is, the shorter it is’, and since frequency implies predictability he concludes that ‘[t]he more frequent a sign is, the shorter it is’. The consequence for the encoding is that ‘[w]hichever member of the pair [causative or anticausative] occurs more frequently tends to be zero-coded, while the rarer (and hence less expected) member tends to be overtly coded’ (Haspelmath 2008: 13). The causal chain that results from these assumptions is represented in (20).
An alternative way to account for the correlation between causalness and encoding would be to suggest that causalness and encoding are not causally related, but instead correlate because they have a common source, namely the verbs’ spontaneity.Footnote [18] The respective causal chain is given in (21) (see also Heidinger 2012).
As concerns the relation Spontaneity $\rightarrow$ Causalness, the basic idea is that spontaneous verbs form anticausatives more often than causatives because they denote events where the external cause is not salient and the anticausative construction is one where the external cause is not expressed as an argument. Conversely, non-spontaneous verbs form causatives more often than anticausatives because they denote events where the external cause is salient and the causative construction is one where the external cause can be expressed as an argument.
Under the assumption that causalness is a reflex and thus an indicator of spontaneity (low causalness = high spontaneity), the observed correlation between causalness and encoding (see Section 3 above) is also one between spontaneity and encoding. In (21) above the relation between spontaneity and encoding is represented as a causal relation where spontaneity influences encoding (Spontaneity $\rightarrow$ Encoding). To get an idea of how spontaneity might influence encoding we must take a look at French anticausatives.
It has been argued in the literature that the two types of French anticausatives differ semantically in that the event is presented as occurring more spontaneously if it is encoded by unmarked anticausatives than if it is encoded by marked anticausatives.Footnote [19] The correlation between spontaneity and encoding could then be interpreted as the result of combinatory preferences: Spontaneous verbs combine more easily (and thus more often) with the encoding variant unmarked anticausative than non-spontaneous verbs do, because this encoding is semantically more spontaneous.Footnote [20]
An in-depth evaluation of this alternative to Haspelmath’s (2006, 2008) and Haspelmath et al.’s (2014) frequency-based account requires further research in at least two domains: spontaneity as a lexical semantic property of verbs and the semantic differences between the encoding variants of the two alternants. As concerns spontaneity, it is desirable that other diagnostics for a verb’s spontaneity besides causalness be identified. This would allow one to re-evaluate the relation between spontaneity and causalness. But a more direct access to a verb’s spontaneity is also necessary to evaluate the assumed causal relation between spontaneity and encoding (the first step would then be to verify whether there is indeed a correlation between spontaneity and encoding). The outlook from the empirical results presented in this paper is thus that the semantic side of the phenomenon needs to be investigated in greater detail in order to interpret the observed correlation between causalness and encoding.
APPENDIX A
Coding decisions
This appendix describes how coding decisions were made during the annotation of the corpus data. As a basic principle, all hits from the corpus searches were looked at and coded, i.e. a complete manual coding of the data has been conducted. On a first level the hits from the corpus searches fall into two classes: irrelevant vs. relevant tokens. On a second level the relevant hits belong to one of the four following subclasses: (i) unmarked causative, (ii) marked causative, (iii) unmarked anticausative, (iv) marked anticausative.
In the actual coding procedure of the corpus data, I looked at all hits from the corpus searches and decided whether a given hit is an instantiation of (A1i–iv) or not. In the positive case, the hit was coded for the respective construction (e.g. unmarked causative). In the negative case the hit was coded as irrelevant for the present study. In the following I describe the four relevant constructions (which are exemplified in (3) and (4) in Section 1 of this paper) and also make reference to those constructions which have been excluded.
As unmarked causative, I have coded hits such as (3a), in which
Regarding the subject, it is always overtly expressed in French, while Spanish is a language with null subjects; the overt expression of the subject is therefore not a criterion in the Spanish data. Hits with null subjects are part of the set of relevant Spanish data.
The restriction to active sentences excludes passives from the set of relevant data. This exclusion covers both stative and eventive passives, and all formal types of passives which can be encountered in the two languages: periphrastic passives with auxiliaries, reflexive passives, and impersonal passives. Further, all types of impersonal constructions have been coded as irrelevant.
As marked causative I have coded hits such as (3b), in which
Further, I have excluded formal causativizations of marked anticausatives (e.g. French faire se + verb) as well as anticausativizations of marked causatives (e.g. French se faire + verb), as both cases involve a double marking on the initial unmarked form of the verb.
As unmarked anticausatives I have coded hits such as (4a), in which
One complication in the coding of the Spanish hits comes from the possibility of null subjects in this language. Hits where the only overtly expressed argument in the sentence is an undergoer are in principle ambiguous between interpretations as an unmarked anticausative (undergoer–V) or an unmarked causative (Øactor–V–undergoer). In many cases, however, this ambiguity disappears because of a mismatch between the number inflection of the verb and the number features of the undergoer-argument. In the remaining cases the context provided enough information to decide whether the sentence includes an implicit subject or not (in the first case the hit was coded as an unmarked causative, in the second case it was coded as an unmarked anticausative); doubtful cases were discussed with Spanish native speakers.
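The decision procedure for these ambiguous Spanish hits can be summarized as a small rule. The following is only a sketch of the logic described above, not the procedure actually used during annotation: the function name, the argument names and the `"doubtful"` label are hypothetical, and the contextual judgment is represented as a value supplied by the annotator.

```python
def code_spanish_unmarked_hit(verb_number, undergoer_number,
                              context_implies_null_subject=None):
    """Code a Spanish hit without se whose only overt argument is an undergoer.

    verb_number, undergoer_number: 'sg' or 'pl'.
    context_implies_null_subject: the annotator's judgment from context
    (True/False), or None if the context does not settle the question.
    """
    if verb_number != undergoer_number:
        # Agreement mismatch: the undergoer cannot be the subject,
        # so the sentence must contain a null actor subject.
        return "unmarked causative"
    if context_implies_null_subject is None:
        # Agreement matches and context is inconclusive: flag for
        # discussion (hypothetical label).
        return "doubtful"
    if context_implies_null_subject:
        return "unmarked causative"
    return "unmarked anticausative"


# Singular verb with a plural undergoer: the undergoer must be the object.
print(code_spanish_unmarked_hit("sg", "pl"))
# Matching number; context shows that no implicit subject is present.
print(code_spanish_unmarked_hit("sg", "sg", context_implies_null_subject=False))
```

Note that matching number does not resolve the ambiguity in either direction; only a mismatch is decisive, which is why the contextual step is needed for the remaining cases.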
As marked anticausative I have coded hits such as (4b), in which
Due to the polyfunctionality of the reflexive clitic se in the two languages, the combination of all three criteria is important. The first two criteria (the clitic se and an undergoer as the sole argument) are also fulfilled by the reflexive passive. Hence the importance of the third criterion, according to which no cause should be implied: It is this criterion that distinguishes anticausatives from passives.
A final remark concerns the semantics of the verbs. Many of the verbs are attested in the corpus data both in concrete and figurative uses. Note that in the coding of the data, I have not taken this difference into account. The coding of the data as relevant or irrelevant relies solely on the above-mentioned criteria, and not on whether a hit shows a concrete or figurative use of the verb.
APPENDIX B
Absolute frequencies underlying data analysis
Table A1 Encoding of the causative and the anticausative alternant in French (absolute frequencies).

Table A2 Encoding of the causative and the anticausative alternant in Spanish (absolute frequencies).
