
Subgrouping of Coahuitlán Totonac¹

Published online by Cambridge University Press:  14 July 2016

Devin Moore*
Affiliation:
University of Alberta

Abstract

Coahuitlán Totonac is spoken in Veracruz, Mexico, and has been variously ascribed to two different branches of the Totonacan family tree. While recent work has begun to bring empirical evidence to the internal structure of this family tree, there remain several important areas of disagreement, in addition to the disputed affiliation of Coahuitlán. This article informs the family tree and demonstrates that Coahuitlán belongs to the Northern branch using shared innovations and two computational methods. The comparative method seeks sets of shared innovations for evidence of subgrouping. This article presents proposed shared innovations in phonology, morphology, and lexicon, which fall into two sets, one belonging to the Sierra and Lowland branches, and the other belonging to the Northern. Coahuitlán Totonac overwhelmingly shares innovations found in Northern languages and lacks innovations found in Sierra. Two quantitative methods are also used to show that Coahuitlán groups closely with other Northern languages.

Résumé

Coahuitlán Totonac est parlé à Veracruz au Mexique et s'est vu assigné à deux branches différentes de l'arbre familial Totonacan. Malgré les travaux récents qui portent de nouveaux faits empiriques concernant la structure interne de cet arbre familial, plusieurs sujets inspirent encore la controverse. À l'aide d'innovations communes et de deux méthodes computationnelles, cet article éclaircit l'arbre familial et montre que Coahuitlán appartient à la branche du nord. La méthode comparative cherche des innovations communes pour établir des sous-groupes. Cet article présente des innovations communes aux niveaux phonologiques, morphologiques, et lexiques qui se divisent en deux groupes, l'un appartenant aux branches Sierra et Lowland, l'autre appartenant à la branche du nord. Coahuitlán Totonac présente surtout ces innovations caractéristiques des langues du nord et manque des innovations attestées en Sierra. Deux méthodes quantitatives sont également employées afin de montrer que le groupe Coahuitlán est similaire aux autres langues du nord.

Type
Articles
Copyright
© Canadian Linguistic Association/Association canadienne de linguistique 2016 

1. Introduction

Coahuitlán Totonac (Ch) is a largely undescribed language spoken in the municipality of Coahuitlán, Veracruz, Mexico, by about 3,800 speakers (SEFIPLAN 2013). Coahuitlán is located on the border of two branches of the Totonac family, Northern and Sierra (see Figure 1), and the variety has been variously described as belonging to one or the other. It is first mentioned in the literature as one of the communities marking the northern boundary of Sierra Totonac, where Aschmann (cited in Ichon 1973) describes it as part of the Sierra branch. Brown et al. (2011) tentatively place Coahuitlán (there spelled ‘Cohuahuitlán’) in the Northern branch, based on reports of higher mutual intelligibility between speakers of Coahuitlán and speakers of Upper Necaxa Totonac. My own fieldwork in Coahuitlán confirms that speakers there consider Upper Necaxa Totonac more intelligible than the varieties spoken in Coyutla or Filomeno Mata. MacKay and Trechsel (2011) also place Coahuitlán in the Northern branch, on the basis of a few morphological patterns. Ethnologue (Lewis 2015) does not treat Coahuitlán Totonac as a variety per se, instead grouping it with nearby Filomeno Mata, and claiming that “Filomeno Mata-Coahuitlán” is “linguistically between” Northern and Sierra. McFarland (2009) claims that Filomeno Mata Totonac is a highlands, or Sierra, variety, although with characteristics of both Northern and Sierra branches; and that it is distinct from Coahuitlán Totonac. Speakers of Coahuitlán Totonac do not consider Filomeno Mata Totonac to be the same variety, and report low mutual intelligibility between the two varieties. This article will demonstrate that Coahuitlán belongs to the Northern branch by proposing shared innovations in phonology, morphology, and lexicon.
Given the relative absence of systematic comparative work, the cognate sets and shared innovations presented here will also contribute to a better understanding of the larger Totonacan language family. Two quantitative analyses are also presented, which support the conclusions of the traditional method.

Figure 1: Traditional classification

Section 2 provides a brief background of Totonacan languages and their traditional and more recent classification. Section 3 presents the primary evidence for my argument in the form of proposed shared innovations. The section begins with the theory of subgrouping by shared innovations, and examines phonological (3.1), morphological (3.2), and lexical (3.3) isoglosses to posit shared innovations relevant to subgrouping. Because of the relative homogeneity of phonological and morphological systems across Totonac languages, shared lexical innovations offer the most productive source of evidence. Section 4 presents additional evidence based on computational methods: first, lexical similarity evidence using ASJP algorithms (4.1), and second, a phylogenetic network of Totonacan languages (4.2). Conclusions are discussed in Section 5. Appendix A provides a complete list of the languages and their abbreviations as used in this article, and the primary data sources.

2. Background

Totonacan languages are spoken in eastern-central Mexico in the states of Puebla, Veracruz, and the eastern edge of Hidalgo, in the Sierra Madre Oriental and along the Gulf Coast. Totonacan is a well-established language family, but there has been little systematic work on its internal subgrouping. The traditional classification, represented in Figure 1, is based on various hypotheses put forward by early fieldworkers (McQuown 1940, Arana Osnaya 1953, García Rojas 1978, Ichon 1973, Levy 1987, and MacKay 1994).

The traditional classification divides the family into two main branches, Tepehua and Totonac. There are three Tepehua languages – Huehuetla, Pisaflores, and Tlachichilco. The exact number of Totonac languages is unknown, though estimates range from three or four (MacKay 1999) to between 14 and 20 (Brown et al. 2011). These are grouped into four branches: Misantla, Northern, Lowland (or Papantla), and Sierra. Misantla Totonac is spoken in a few communities in the area south of the major urban centre of Misantla, Veracruz. Lowland is spoken by a number of communities around Papantla, Veracruz, and along the Gulf coast. The Lowland branch is sometimes called Papantla Totonac, which also refers to the only documented language in the Lowland branch. Northern and Sierra are both spoken in many communities in the Sierra Madre, mostly in the state of Puebla. Figure 2 shows a map of some of the larger Totonacan communities.

Figure 2: Map of Totonacan communities

This classification was constructed with little empirical basis. While the general lack of documentary and comparative work in the family poses an ongoing challenge, there are recent attempts to study the classification of Totonacan languages empirically. MacKay and Trechsel (2011, 2015) have investigated morphological patterns of Totonacan languages. Brown et al. (2011) attempt to reconstruct proto-Totonacan roots for comparison with other language families. This work has begun to provide support for the basic structure of the family, but significant disagreements remain concerning the relationships within these higher-level branches, and the assignment of individual languages to specific branches, including the disputed affiliation of Coahuitlán. Brown et al. (2011) present a tentative classification, summarised in Figure 3, that is based largely on lexicostatistical analysis carried out by the ASJP consortium (Wichmann et al. 2013) instead of on informed determinations of cognacy.

Figure 3: Brown et al.'s (2011) Classification

The main features of this classification involve further subgrouping of the Totonac branches. Misantla has been set off from what they call Central Totonac, within which Northern is further set against Lowland-Sierra. They do not include data from Ch, but have grouped it with the Northern branch based on reported high mutual intelligibility with Northern varieties. MacKay and Trechsel (2015) investigate phonological, morphological, and lexical data. They present three morphological features of Sierra languages, the presence of which they consider to be necessary and sufficient to determine Sierra affiliation. They conclude that there is some support for the traditional classification generally; however, they suggest a Sierra vs. Northern-Lowland distinction at odds with Brown et al.'s proposed Northern vs. Sierra-Lowland, noting that although Sierra and Lowland share some lexical items, Lowland does not have the morphological features of Sierra languages. They include Coahuitlán in the Northern branch based on the fact that it lacks their three features of Sierra languages. While the present article tends to support a Northern vs. Sierra-Lowland split, it aligns with MacKay and Trechsel in placing Coahuitlán in the Northern branch, here by means of shared phonological, morphological, and lexical innovations, lexical similarity evidence from ASJP, and a phylogenetic network created in SplitsTree4.

Focusing on the Sierra, Lowland, and Northern languages (Central), I present two sets of shared innovations, one shared by Northern languages, and another shared by Sierra-Lowland languages. Coahuitlán Totonac overwhelmingly shares innovations found in Northern languages and lacks innovations found in Sierra.

3. Shared innovations

The primary criterion for subgrouping is shared innovation (Fox 1995, Campbell 2013). A shared innovation is a feature belonging to a subset of daughter languages that sets them off from other members of the family. The shared innovation is assumed to have occurred in an intermediate proto-language, which then diversified into the subset of languages which all share the innovated feature. Languages with the shared innovation inherited it from this intermediate proto-language, while languages without the shared innovation do not descend from that intermediate parent. However, there are reasons besides shared innovation that a subset of languages may have features in common. Three alternate scenarios² that may result in similarity but do not give evidence for shared subgrouping are 1) shared retention, 2) parallel innovation, and 3) contact.

Shared retention refers to a feature of the proto-language that has been retained in a subset of daughter languages. Retention alone does not give evidence for subgrouping because many scenarios may have led to individual languages keeping or losing some particular feature – closely related languages may differ in retention of some features, and distantly related languages may have independently retained other features. A simple example of shared retention is the presence of škaːn in Tepehua, Misantla, and Northern branches (see section 3.3.1, below). The fact that this word has been retained in each branch does not support any subgrouping of these branches into a single unit.

Parallel innovation, also called language drift (Sapir 1921), is when innovations occur independently in different branches of a set of related languages. Often these are due to typologically common processes, like the devoicing of word-final stops, which may have existed in the proto-language, and undergo independent innovation in a subset of daughter languages. However, whether or not a daughter language undergoes this innovation is not dependent on other daughter languages, and does not give evidence for subgrouping. An example in Totonacan is the case of the alveolar lateral affricate *ƛ. This undergoes the same diachronic shift in two distant branches of the family, as seen in the cognate set for ‘to walk around’, shown in Figure 4.³

Figure 4: ‘to walk around’

In Ch and M, *ƛ has changed to /t/. One might initially posit that Ch and M form a subgroup together, but this one piece of evidence is at odds with the geographic distance and great lexical dissimilarity between Ch and M. Further investigation reveals that in other languages, /ƛ/ is described as unstable. Levy's (1987) phonology of Papantla Totonac notes that, while prosodic evidence points to /ƛ/ as a phoneme, there is considerable variation by word and by utterance in Papantla, with a wide range of possible realizations including [ƛ], [ɬ], [t], [lʔ], and even an oddly metathesized [ɬt]. This variation by word and utterance is attested in other languages as well. While at first this may have been taken to be evidence of a relationship between two distant and dissimilar varieties, *ƛ was a good target for sound change, which occurred independently in M and Ch.

The last scenario is horizontal transfer, or contact. A language of one branch that has borrowed a feature innovated by another branch might appear to have shared the innovation rather than just sharing the feature. A borrowing from an unrelated language appearing in languages from different branches could likewise appear to give evidence of an innovation shared by the languages possessing it. The influence of Spanish on Mesoamerican languages is an example where contact with an outside language has resulted in numerous shared forms in multiple languages that result not from genetic descent, but from borrowing. Another example of contact is what appears to be a dialect-chain phenomenon in Totonacan. Vowel-glide-vowel sequences in some languages are reduced to one long vowel. This is most often /awa/ to /oː/ and /aya/ to /eː/, as shown in Figure 5, but can also include other vowels, as seen here with /awi/ in the case of A ‘sit down’.
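The two most frequent reductions named above can be written as a small rewrite rule. The following is only a sketch: the rule table and the function name `reduce_glides` are hypothetical, the input strings in the usage lines are schematic rather than attested Totonac words, and the /awi/ case is deliberately left out because the text does not state its outcome.

```python
import re

# Assumed rule table: only the two reductions stated in the text,
# /awa/ > /oː/ and /aya/ > /eː/. Nothing else is rewritten.
GLIDE_REDUCTIONS = {"awa": "oː", "aya": "eː"}

# One alternation pattern built from the keys of the table.
_PATTERN = re.compile("|".join(GLIDE_REDUCTIONS))


def reduce_glides(form):
    """Replace each listed vowel-glide-vowel sequence with its long vowel."""
    return _PATTERN.sub(lambda m: GLIDE_REDUCTIONS[m.group(0)], form)


# Schematic inputs, not real lexical items.
print(reduce_glides("tawa"))  # /awa/ is rewritten
print(reduce_glides("kaya"))  # /aya/ is rewritten
print(reduce_glides("tawi"))  # /awi/ is not in the table, so it is left alone
```

A table-driven rule keeps the variable reduction described for A and U easy to model: a fast-speech form is the output of `reduce_glides`, and a careful-speech form is the unmodified input.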

Figure 5: Glide reduction

The /awa/ form is well attested in other branches, but while this sound change is attested in the lexica for only these three Northern varieties, which are all geographically close, there are two reasons to believe that it does not represent a shared innovation exclusive to the Northern branch. First, the sound change is attested in both A and U, but is in fact restricted to just one of two dialects of U. While this sound change has run its course in Ch, which does not allow the unreduced forms, in A and U (Patla), the reduced and unreduced forms appear in variation conditioned largely by rate of speech, with the reduced form appearing in faster, more fluent speech and the unreduced form in slower, more emphatic, or elicited speech. However, U (Chicontla) has only the unreduced form, and this sound change does not occur (Beck, p.c.). If this were an innovation shared across the Northern branch, one would not expect such a dialectal difference in U. Second, while reduced forms do not appear in the lexica for Sierra languages, they do occur in the speech of many Sierra languages, conditioned by rate of speech (Levy, p.c.). In addition to the spread of this change reported in Sierra, there is a similar phenomenon involving glide reduction which is found in Tepehua languages, as shown in the cognate set for ‘you (sg)’, in Figure 6.

Figure 6: ‘you (sg)’

Where Totonac wiš ‘you (sg)’ begins with a glide-vowel sequence, this seems to have been reduced to a single vowel in Tepehua languages. As in Totonac, the place features of the reduced form seem to have been affected by the glide. Glide reduction seems to have some features of a dialect chain, and may be conditioned by contact between neighbouring languages.

Because of these potentially confusing factors, a comparison of shared features in a family is likely to give conflicting evidence for groupings. A single piece of evidence does not give the full picture, and the best evidence for internal relations is sets of multiple shared innovations, just as the best evidence for genetic relatedness is sets of correspondences. Ultimately, these sets of innovations allow the reconstruction of a proto-form from which the development of each language's synchronic form may be traced.

The first step to identifying shared innovations is to assemble isoglosses. The term isogloss originally referred to a line that could be drawn on a map to represent the geographical boundaries of regional linguistic variants based on the distribution of particular dialectal features. It is used here by extension to refer to the dialect features themselves. The distribution of these features can provide evidence for shared innovations. Each of the following sections includes a grammatical or lexical feature that is shared by a subset of the varieties under examination, beginning with a single phonological isogloss (section 3.1), morphological isoglosses (section 3.2), and lexical isoglosses (section 3.3). Ch patterns closely with other Northern languages, sharing all but one of six Northern innovations, and sharing only two of 18 Sierra innovations (section 3.4).
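The counts in the last sentence can be turned into proportions to make the asymmetry explicit. This is a minimal sketch using only the figures stated above; the variable names are illustrative.

```python
from fractions import Fraction

# Counts as stated in the text: Ch shares all but one of six Northern
# innovations, and two of 18 Sierra innovations.
northern_shared, northern_total = 6 - 1, 6
sierra_shared, sierra_total = 2, 18

northern_rate = Fraction(northern_shared, northern_total)
sierra_rate = Fraction(sierra_shared, sierra_total)

for label, rate in (("Northern", northern_rate), ("Sierra", sierra_rate)):
    # Print the exact fraction alongside a rounded percentage.
    print(f"{label}: {rate} ({float(rate):.0%})")
```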

3.1 Phonological isoglosses

One type of shared innovation that can be used to establish phylogenetic proximity between two languages is regular sound change. This type of innovation is especially useful because it is relatively salient and often affects large numbers of items. Totonac languages have relatively homogenous phonological systems, and nearly identical reflexes populate many cognate sets. This is perhaps due to the shallow time depth of the family, and a relative paucity of phonological innovation. It may also be due in part to obscuring factors, such as parallel development and horizontal transfer, as in the cases of *ƛ and /awa/, seen above.

There is one phonological isogloss that provides evidence for a shared Northern innovation, which is also shared by Ch. While Proto-Totonacan has been described as having a three-vowel system (/i/, /u/, and /a/), most varieties have mid vowels, /e/ and /o/, as allophones of /i/ and /u/, respectively (Brown et al. 2011). These vowels are conditioned primarily by proximity to /q/ or proto *q. Brown et al. (2011) claim that the emergence of phonemic /e/ and /o/ is a distinguishing feature of the Northern branch, supported by the appearance of /e/ and /o/ in U and A forms without a clearly identifiable conditioning environment, although they note that proximity to *x, and to a lesser extent to *y, may be a possible conditioning environment for the development of /e/ and /o/ in Northern. I do not exhaustively argue here for the phonemic status of /e/ and /o/ in Ch, but advance a number of cognate sets, shown in Figure 7, where /e/ and /o/ appear in this environment of proximity to *x and *y, but not to *q.

Figure 7: Cognates with unconditioned /e/ and /o/

In these glosses, A, U, and Ch consistently pattern together with regard to /e/ and /o/, while the other languages typically have /i/ or /a/ instead of Northern /e/, and /u/ instead of Northern /o/. Assuming that the three-vowel system traditionally reconstructed for proto-Totonacan is correct, this supports the conclusion that Northern languages – including Ch – share the innovation of phonemic /e/ and /o/.

3.2 Morphological isoglosses

Like sound changes, changes in morphology may also provide evidence of shared innovations that can be used as examples of affinity and subgrouping. I first consider three isoglosses suggested by MacKay and Trechsel (2011, 2015): marking of 2nd-person subject and 1st-person object (section 3.2.1), unselective 3rd-person plural agreement marker (section 3.2.2), and word-final [y] (section 3.2.3). Following that, I look at three isoglosses from Beck (2012): the locative (section 3.2.4), the desiderative (section 3.2.5), and the negative (section 3.2.6).

3.2.1 Marking of 2nd-person subject and 1st-person object

All Totonacan languages have agreement for the person and number of both subject and object on transitive verbs. MacKay and Trechsel (2015) observe that a distinguishing morphological feature of Sierra languages is what they refer to as “unambiguous marking” of 2nd-person subjects with 1st-person objects. Tepehua, Misantla, Northern and Lowland languages use a non-compositional pattern of affixes to express second person subject and first person object agreement when either subject, object, or both is plural, as in (1):⁶

  (1) U

    kila:musuːyá:uw

    ki–laː–musuː–ya–w

    1obj–recip–kiss–impfv–1pl.subj

    ‘you (sg.) kiss us’, ‘you (pl.) kiss me’, ‘you (pl.) kiss us’

    (Beck 2004: 34)

This combination of morphemes expresses any of three scenarios: 2nd singular subject and 1st plural object, 2nd plural subject and 1st singular object, and 2nd plural subject and 1st plural object.

Sierra languages have innovated distinct, unambiguous affixal sequences for these combinations of persons and numbers that specify the number of both subject and object, as in (2):

  (2) Ct

    a. kinkaːpaːškiːyaʔ

      kin–kaː–paːškiː–ya–ʔ

      1obj–pl.obj–love–impfv–2sg.subj

      ‘you (sg.) love us’

    b. kimpaːškiːyatín

      kin–paːškiː–ya–tin

      1obj–love–impfv–2pl.subj

      ‘you (pl.) love me’

    c. kinkaːpaːškiːyatín

      kin–kaː–paːškiː–ya–tin

      1obj–pl.obj–love–impfv–2pl.subj

      ‘you (pl.) love us’

      (McQuown 1990: 166, 169; interlinear gloss added)

These Sierra constructions are unambiguous and compositional. Ch has the non-compositional construction analogous to that in (1) above, as shown in (3).

  (3) Ch

    kila:pucayá:w

    ki–laː–puca–ya–w

    1obj–recip–search–impfv–1subj.pl

    ‘you (sg.) look for us’, ‘you (pl.) look for me’, ‘you (pl.) look for us’

As in most Totonacan languages, Ch uses 1st-person object agreement, the reciprocal marker, and 1st-person plural subject agreement to mark agreement for the three scenarios with 2nd-person subject and 1st person object when one or both are plural. Ch thus lacks the Sierra innovation.
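The contrast between the two agreement strategies can be sketched as two functions over the glosses in (1) to (3). This is an illustration only: the function names and the `STEM` placeholder are hypothetical, the sketch works on gloss labels rather than phonological forms, and it follows the label order of (1).

```python
def northern_type(subj_pl, obj_pl):
    """Gloss sequence for a 2nd-person subject acting on a 1st-person object
    in the non-compositional pattern of (1) and (3)."""
    if not (subj_pl or obj_pl):
        raise ValueError("pattern is described only when subject, object, or both are plural")
    # The same affix sequence is used whichever argument is plural.
    return ("1obj", "recip", "STEM", "impfv", "1pl.subj")


def sierra_type(subj_pl, obj_pl):
    """Gloss sequence for the same person combination in the compositional
    Sierra pattern of (2)."""
    if not (subj_pl or obj_pl):
        raise ValueError("pattern is described only when subject, object, or both are plural")
    # Object number and subject number are each marked by their own affix.
    prefixes = ("1obj", "pl.obj") if obj_pl else ("1obj",)
    subject = "2pl.subj" if subj_pl else "2sg.subj"
    return prefixes + ("STEM", "impfv", subject)


# (subject plural?, object plural?) for the three scenarios in the text.
SCENARIOS = [(False, True), (True, False), (True, True)]

for subj_pl, obj_pl in SCENARIOS:
    print(subj_pl, obj_pl, northern_type(subj_pl, obj_pl), sierra_type(subj_pl, obj_pl))
```

Run over the three scenarios, `northern_type` returns a single sequence while `sierra_type` returns three distinct ones, which is the sense in which only the Sierra marking is unambiguous.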

3.2.2 3rd unselective plural /quː/

The second feature said by MacKay and Trechsel to be typical of Sierra languages is the 3rd-person plural agreement marker, /-quː/ or /-qṵː/. This agreement marker is unselective in that it does not select for subject or object, instead marking agreement with a subject or object that is 3rd-person plural, as in (4).

  (4) Ol

    lkapáːstákqɔ́ːh

    laka=paːstak–quː–ya

    remember.X–pl–impfv

    ‘she/he/it remembers them’, ‘they remember her/him/it’, ‘they remember them’

    (MacKay and Trechsel 2015: 20)

In this example, /quː/ marks agreement with an unspecified 3rd-person plural argument, allowing the ambiguous meaning of either a 3rd-person plural subject, a 3rd-person plural object, or both.

The source of this marker in Sierra varieties is the pan-Totonac terminative or totalitive suffix, which marks “the termination of an event, and/or that all participants in an event have been affected” (Beck 2012: 593), as in (5).

  (5) U

    taa̰knuːʔo̰ːɬcá̰

    ta–a̰k–nuː–ʔo̰ː–li=cá'

    dcs–head–in–tot–pfv=now

    ‘he sank in completely’

    (Beck 2011: 61)

In this example, /ʔo̰ː/ acts as the totalitive and conveys aspectual meaning, namely, that he sank all the way, completely.

Ch does not exhibit this Sierra innovation. Ch /qoː/ is not used as an agreement marker, but instead acts as a true totalitive (6a). Different patterns mark third-person plural agreement of subject (6b) and object (6c).

  (6) Ch

    a) makaskakqoːɬ

      maka–skak–qoː–li

      hand–dry–tot–pfv

      ‘she finished drying her hands’

    b) talaqcín

      ta–laqcin

      3pl.subj–see

      ‘they see her/him/it’

    c) kaːlaqcín

      kaː–laqcin

      pl.obj–see

      ‘she/he/it sees them’

The pattern in (6b) and (6c) is the same as in Northern languages for analogous sentences. Ch lacks the Sierra innovation of /qoː/ as a person agreement marker; instead /qoː/ is used as a totalitive marker, and there are different affix patterns to mark 3rd-person plural agreement of subject and object.

3.2.3 Word-final sonorant [y]

MacKay and Trechsel's final Sierra isogloss is a palatal sonorant, [y], which occurs word-finally as the marker of the imperfective aspect for vowel-final stems, shown in (7).

  (7) Ct

    taštúy

    ta–štu–ya

    inc–outside–impfv

    ‘she/he exits’

    (MacKay and Trechsel 2015: 26)

In some varieties, this final /y/ is realized as an aspirated palatal fricative or glottal fricative, as in (8).

  (8)

    a) Oz

      ɬtatáh

      ɬtata–ya

      sleep–impfv

      ‘she/he sleeps’

    b) Ol

      ɬtatáç

      ɬtata–ya

      sleep–impfv

      ‘she/he sleeps’

      (MacKay and Trechsel 2015: 26)

The source of this sound after vowels seems to be the pan-Totonacan imperfective morpheme /-ya/, which is realized in Sierra languages at the end of verbs as [y] if there are no other morphemes following it. In non-Sierra varieties, such as U (9a), the imperfective morpheme is entirely unrealized in this position.

  (9) U

    a) taštú

      taštu–yaː

      exit–impfv

      ‘she/he exits’

    b) taštuyaːtít

      taštu–yaː–tít

      exit–impfv–2pl.subj

      ‘you (pl.) exit’

      (Beck 2004: 34; interlinear gloss added)

In (9a), the verb is imperfective, but no form of /-yaː/ surfaces, while in (9b), the imperfective marker appears between the stem and the 2nd-person plural subject marker. Beck (2004) describes this as a morphophonemically conditioned syncope, where the imperfective morpheme is realized only in cases where it is “protected” by a following morpheme.
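The "protected" condition can be sketched as a simple conditional over morpheme strings. This is a segment-level illustration under stated assumptions: the function name and its parameters are hypothetical, stress placement is not modelled (so (9a) comes out as `taštu` rather than taštú), and the [h] and [ç] realizations in (8) are not covered.

```python
def realize_imperfective(stem, following=(), impfv="yaː", final_reflex=""):
    """Concatenate a vowel-final stem with the imperfective and any following suffixes.

    following    -- suffixes after the imperfective; if any are present,
                    the imperfective is "protected" and surfaces in full
    impfv        -- underlying shape of the imperfective (assumed per variety)
    final_reflex -- what surfaces word-finally: "" for syncope as in U,
                    "y" for the Sierra pattern in (7)
    """
    if following:
        return stem + impfv + "".join(following)
    return stem + final_reflex


# U, as in (9): syncope when unprotected, full form before 2pl.subj.
print(realize_imperfective("taštu"))
print(realize_imperfective("taštu", following=["tít"]))

# Ct, as in (7): word-final [y] instead of syncope.
print(realize_imperfective("taštu", impfv="ya", final_reflex="y"))
```

Keeping the word-final reflex as a parameter isolates the single point on which the Sierra and non-Sierra varieties differ in this isogloss.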

Ch behaves like U, and does not realize the imperfective morpheme in unprotected position (10).

  (10) Ch

    a) kaci

      kaci–ya

      know–impfv

      ‘she/he knows’

    b) kaciyaːtít

      kaci–yaː–tít

      know–impfv–2pl.subj

      ‘you (pl.) know’

Sierra languages share this feature, and Ch does not pattern with them. However, if /yaː/ is the source of this feature in Sierra, it is odd for an innovation to restore phonological material that has been lost in other branches. There seems to be a familial drift towards final syncope and devoicing, which may have been arrested in the Sierra branch for some reason. Alternatively, the word-final [y] may have a different source, perhaps created by some kind of phonotactic process in final stressed vowels.

3.2.4 Locative

Totonacan languages have a locative prefix or clitic, which differs in form across the family, as shown in Figure 8.

Figure 8: Locative

In the Northern, Sierra and Papantla varieties, there are two forms: nak- as a prefix or clitic, and the prefix k-, likely a phonologically reduced form. The reduced form seems to be limited to Sierra varieties (FM, Ol, On) and P, and may represent a Sierra-Lowland innovation. U has a different reduction, in some cases reducing nak= to n(a)= or even ŋ= (Beck, p.c.); reduction of nak= to n(a)= is also reported in Apapantilla (Reid 1991: 76). Ch does not have the Sierra-Lowland innovation, as it does not allow shortening to k-.

3.2.5 Desiderative

Totonacan languages have a desiderative marker, which attaches to verbs to indicate the subject's desire for the completion of the action of the verb (Beck 2004). There are three forms for this marker: -pṵtun/-putun, pan-, and -kṵtun/-kutun, as shown in Figure 9.

Figure 9: Desiderative

The form pan- is exclusive to M, and has not been etymologically connected to the other forms. Tepehua, Sierra, and Papantla (H, Pf, T, Co, Ct, Ol, On, Z, P) have -pṵtun, while Northern varieties (A, U, Ch, FM) have -kṵtun.⁷ It seems likely that -kṵtun represents a shared Northern innovation – the replacement of /p/ by /k/ – because the Tepehua and Sierra languages both share the form -pṵtun. Ch shares this Northern innovation.

3.2.6 Negative

There is considerable formal variation across Totonacan varieties in the negative morpheme, as shown in Figure 10.

Figure 10: Negative

There seem to be three etyma, one belonging to Tepehua, one to Northern and Misantla, and one to Sierra. The reflex in CX is similar to Tepehua, as is that in A. If these forms are shared retentions, the Tepehua etymon likely represents the proto-Totonacan form, but they may also be borrowings, as A and CX are both somewhat adjacent to Tepehua (CX is a variety whose affiliation, like that of Ch, has not been definitively determined). The ɬaː/laː/xaː forms in Misantla, Northern (U, Ch, Zh), and FM seem to represent a Totonac innovation, perhaps originally *laː, as it remains in M and Ch, changing to ɬaː in FM and Zh. U xaː is possibly derived from this, a hypothesis reinforced by the presence of an archaic form ɬaː. Another hypothesis is that these forms came from *xaː or *haː, perhaps taken from the first part of haːntu, although the directionality of the sound change from /x/ to /ɬ/ and /l/ seems less plausible. The Sierra languages Co, Ct, Ol, On, Z, and Lowland P have the form niː, an apparent Sierra-Lowland innovation that Ch does not share.

3.3 Lexical isoglosses

The lexical isoglosses were drawn from a manually compiled list of some 180 cognate sets from lexica for A, Ct, Co, FM, H, M, P, Pf, T, U, and Z, and my fieldwork in Ch. Of these sets, around one third were largely homogenous across Totonac and Tepehua. Another third provide strong evidence for a Tepehua-Totonac divide. Of the remaining cognates, some are obscured by lateral transfer, like the word ‘chicken’ for which many languages use some form of Spanish pollo, and some simply don't present a clear picture, such as the 3rd-person singular pronoun ‘he or she’. While the Tepehua forms are all similar to each other, the Totonac varieties have one of two forms, one that comes from a possessive form and another that seems to come from a demonstrative form. The distribution of these two forms does not seem to correspond to any a priori grouping. However, there are 16 isoglosses that give clear and relevant evidence for the Central Totonac divisions of Lowland, Sierra, and Northern.

3.3.1 ‘water’ and ‘rain’

There are isoglosses for ‘water’ with two forms, škaːn and čučut, distributed as in Figure 11.

Figure 11: ‘water’

The form čučut ‘water’ is restricted to the Sierra-Lowland languages (Co, Ct, FM, Z, P), while škaːn is found in Tepehua, Misantla and Northern (H, Pf, T, M, A, U, Ch). This distribution suggests a Sierra-Papantla innovation, which is not shared by Ch.
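The distribution just described can be tabulated by grouping languages under the etymon they attest. This is a sketch only: the dictionary records etymon labels as listed in the paragraph above, not the individual surface forms of Figure 11, and the names `WATER` and `group_by_form` are illustrative.

```python
from collections import defaultdict

# Etymon for 'water' by language abbreviation, as listed in the text.
WATER = {
    "Co": "čučut", "Ct": "čučut", "FM": "čučut", "Z": "čučut", "P": "čučut",
    "H": "škaːn", "Pf": "škaːn", "T": "škaːn", "M": "škaːn",
    "A": "škaːn", "U": "škaːn", "Ch": "škaːn",
}


def group_by_form(forms):
    """Invert a language-to-etymon mapping into etymon-to-languages."""
    groups = defaultdict(list)
    for language, form in forms.items():
        groups[form].append(language)
    return dict(groups)


for form, languages in group_by_form(WATER).items():
    print(form, sorted(languages))
```

The same inversion could be applied to each lexical isogloss in turn, with the caveat from section 3 that a grouping produced this way reflects shared forms, which may be retentions or borrowings rather than shared innovations.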

The word škaːn has the further meaning of ‘rain (n)’ in those languages having this form. This meaning is seen most commonly in a periphrastic construction with min ‘to come’, which is found in Sierra and Lowland, Northern, and Tepehua languages, as illustrated in (11).

  (11) Pf

    škáːn kamináʔ

    škaːn ka–min–ya–ʔ

    water irr–come–impfv–fut

    ‘it will rain’

    (MacKay and Trechsel 2013: 209)

Interestingly, in those languages with čučut ‘water’, the word does not have this second meaning of ‘rain’. With the exception of FM, these languages use another form, saʔin/seːn, as in Figure 12.

Figure 12: ‘rain’

A reflex of this form is found in M, A, and U; however, in these languages the etymon has the meaning of ‘thunderstorm, downpour’, a difference in meaning signalled by italics in Figure 12. Since Misantla and Northern languages share this meaning, this is possibly the meaning of the proto-Totonac form. The Sierra and Lowland languages would thus share a lexical innovation in changing the meaning to ‘rain’.

Additionally, some of these languages have innovated a verb ‘to rain’ from the nominal form siːn, which is used instead of the min construction. Co, Ct, and Z all have a form siːnan or seːnan, ‘to rain’, derived from the nominal form by the suffix -nan, described as a detransitivizer that changes transitive verbs into intransitive, but also forms verbs “from nouns or adjectives denoting processes or activities strongly associated with the meaning of the root” (Beck 2004: 64). P uses the min construction with sḛːn. FM also uses the min construction with sayín (McFarland, p.c.).

The forms čučut ‘water’ and siːnan ‘to rain’, and the shift in meaning of siːn from ‘downpour’ to ‘rain’ all represent Sierra-Lowland innovations. Ch has the conservative min construction, and uses škaːn to mean both ‘water’ and ‘rain’. It lacks all three Sierra-Lowland innovations.

3.3.2 ‘sand’

There are two roots for ‘sand’, one of which is restricted to Sierra and Lowland. The other is found across every other branch. These are shown in Figure 13.

Figure 13: ‘sand’

The form kuku is found in all four branches, including the most divergent divisions Tepehua and Misantla, which argues for it being the proto-form. The form muncaya is found only in the Sierra and Lowland languages, and is likely a Sierra-Lowland innovation. It is not shared by Ct, nor by Ch, which instead has the conservative form found in other Northern languages.

3.3.3 ‘see’

There are two roots for ‘see’, shown in Figure 14.

Figure 14: ‘see’

Tepehua, Misantla, and Northern share the form laqciːn, while the Sierra-Lowland languages, with the exception of FM, have the innovated form akšila/a̰kšiɬa, unique to that branch. Ch has the conservative form, not the Sierra-Lowland innovation.

3.3.4 ‘ear’

There are two forms for ‘ear’, one restricted to the Sierra-Lowland varieties, the other found across the rest of the family, as shown in Figure 15.

Figure 15: Nominal form of ‘ear’

The distribution of (q)aqašoɬ across Tepehua, Misantla, and Northern suggests that it is the proto-form, while taːqéːn is a Sierra-Lowland innovation, not shared by Ch.

3.3.5 ‘big’

The words for ‘big’ are presented in Figure 16.

Figure 16: ‘big’

While the Tepehua and Misantla forms bear some resemblance to the Northern forms, there is clearly an innovated form in the Sierra languages. Because the form langa/ƛanka is restricted to Sierra-Lowland languages, it is likely a shared innovation. Ch does not share this innovation.

3.3.6 ‘chest’

The isoglosses for ‘chest’ include two roots, shown in Figure 17.

Figure 17: ‘chest’

There is a clear Tepehua-Totonac division, with two Totonac forms, both based on kuš. It is not clear which Totonac form is innovative, but in this case, Ch patterns with the Sierra and Lowland languages, not with Northern.

3.3.7 ‘liver’

There are three related forms for ‘liver’, shown in Figure 18.

Figure 18: ‘liver’

Tepehua and Northern have one form, which includes mak- ‘body’, a combining body-part prefix. Its presence in Tepehua and Northern suggests shared retention. This form is similar to the Misantla word, but M lacks the ‘body’ prefix. The Sierra languages Ct, FM, Z and Lowland P have another form, lacking the body prefix and with an extra final syllable. Ch does not share the Sierra and Lowland innovation, rather having the conservative form shared by Tepehua and Northern.

3.3.8 ‘leaf’

There is a very clear Sierra-Lowland isogloss for ‘leaf’, shown in Figure 19.

Figure 19: ‘leaf’

Tepehua and Misantla have unique forms, and U, Ch, and FM have the form pa̰ʔɬma̰/paɬma (see footnote 8). The form tuwaːn is found only in Sierra and Lowland languages and seems to be an innovation proper to these branches. Although tuwáːn is present in FM, it refers there to a specific type of leaf; paɬma is the generic word for ‘leaf’. Ch does not share this Sierra-Lowland innovation, instead using páɬma, as does U.

3.3.9 ‘fire’

The isoglosses for ‘fire’ include three distinct etyma, shown in Figure 20.

Figure 20: ‘fire’

The form ɬkúyaːt in Sierra and Lowland is likely the innovative form because Northern makskut is quite close to Misantla mukskut. Ch does not share the Sierra-Lowland innovation, though it has modified the form slightly and lost the /k/, with compensatory lengthening.

3.3.10 ‘girl’

The word for ‘girl’ has three forms, shown in Figure 21.

Figure 21: ‘girl’

Setting aside the Tepehua xac'iʔ, the forms in Northern and Sierra-Lowland are closely related. It seems that Sierra (Co, Ct, Z) and Lowland (P) have lost the penultimate syllable. FM has the unreduced form, as does Ch.

3.3.11 ‘tomorrow’

Three forms of ‘tomorrow’ are observed, all of which seem to have a common root li- or ɬi-. These forms are shown in Figure 22.

Figure 22: ‘tomorrow’

The two different Totonac forms each seem to contain a body-part prefix: laqa- ‘eye’ in M, A, U, and Ch; and ča̰ː- ‘shin’ in Ct, FM, Z, and P. As it is shared by Northern and Misantla, laqa- may represent a Totonac innovation, with the further Sierra-Lowland innovation being to use ča̰ː- instead. Ch uses the same form as other Northern languages and does not have the Sierra-Lowland innovation.

3.3.12 ‘heart’

The word ‘heart’ has four forms, shown in Figure 23.

Figure 23: ‘heart’

The Tepehua and Northern etyma may be related, while Misantla has its own form. The form nakú clearly belongs to Sierra-Lowland. In this case, Ch has the Sierra-Lowland form, and not the Northern.

3.3.13 ‘nose’

There are two Totonac forms for the body part ‘nose’, shown in Figure 24.

Figure 24: ‘nose’

The Northern languages U and A have a form distinct from that found in the Sierra-Lowland languages. Typically the nominal form of a body part is formed from the combining form with the morpheme -nḭ following consonants, or -n following vowels (Beck 2004). In the case of ‘nose’, the combining form is quite important. Tepehua, Northern, Sierra, and Lowland languages share the combining form kinka- / kanka-. This suggests that Sierra has retained the expected form kinkan / kankan, while Misantla and Northern have reduced this form. The alternative hypothesis would be that Misantla and Northern have reflexes of an irregular independent nominal (cf. ‘ear’, section 3.3.4), which has been regularized in Sierra (and Ch). However, this hypothesis would also require the derivation of kinka- / kanka- from a much smaller nominal (unlike ‘ear’, where the combining form qaqa- / aqa- can easily be derived from the longer irregular nominal qaqašqoɬ/ʔaqašqoɬ). It seems more likely that Misantla and Northern independently reduced kinkan to M kḭʔ and Northern kínḭ. In this case, the isogloss presents a Northern innovation, which Ch does not share.

3.3.14 ‘green’

The colour word ‘green’ is a clear Northern isogloss, shown in Figure 25.

Figure 25: ‘green’

Other branches have no separate word for the colour ‘green’, instead conflating green and yellow, as in Ct smukúku ‘light yellow, green’, or green and blue, as in Ct spukuku ‘blue, green’. The Northern languages have reflexes cognate with Ct for yellow and blue, but the form škayaːwa ‘green’ seems to be a Northern innovation. Ch shares this Northern innovation, as does FM.

3.3.15 ‘finger- and toenail, claw’

The words for ‘fingernail’ and ‘toenail’ are given in Figure 26.

Figure 26: ‘toe- and fingernail’

Many of these forms are created by the addition of a combining body-part prefix to a base. In Tepehua, these are the prefixes č'an- ‘foot’ and mak- or max- ‘hand’. Totonac has the prefixes tuː- ‘foot’ and related prefixes maq- (Northern) and maka- (Sierra) ‘hand’, the form maq- likely being a syncopated form of maqa-. H, Z, and P allow the stem to occur on its own, without a combining prefix, and Ct and FM do not have any form attested with a combining prefix. The stem can thus be used independently in the Sierra languages and H. Northern languages do not allow the stem to occur independently, which may represent an innovation to restrict the distribution of the stem. In addition to this behaviour, the stem has different forms: Tepehua qesiːt, Misantla -sεːh, Northern -siːn, and Sierra-Lowland siyín/sixán. Sierra-Lowland has a longer form, which may have changed independently in the other branches. The Tepehua form is only roughly similar, Misantla appears to have lost the final syllable, and Northern has lost the middle and undergone compensatory lengthening. To summarise, the Northern branch uses the prefixes tuː- and maq- with a bound stem, and has compensatory lengthening of the stem. Ch shares each of these features.

3.3.16 ‘good’

There are three forms for ‘good’, shown in Figure 27.

Figure 27: ‘good’

The form in M, qɔɬanáʔ, is particularly interesting because it appears to resemble both of the very different forms in Tepehua and Sierra-Lowland. It looks like the proto-form was something like the M form, from which Tepehua took the beginning, and Sierra-Lowland the end. Northern cex is unrelated and represents a clear Northern innovation, which is shared by Ch.

3.4 Summary

The innovations proposed fall into two groups: first, a large group of likely innovations shared by the Sierra and Lowland languages, and second, a smaller group of probable shared Northern innovations. Table 1 shows each innovation for the Central Totonac languages – A, U, Ch, Co, Ct, FM, Z, and P. The Tepehua languages have no innovations from either group, nor does M. As in the text above, the forms given do not represent reconstructed forms, and are instead used as shorthand for the innovation discussed above.

Table 1: Shared innovations in Central Totonac

The Northern languages A and U have none of the Sierra innovations, while most Sierra languages have most of the Sierra innovations. There are two languages, FM and P, in addition to Ch, which are of special interest.

FM is unique in being quite mixed, having 11 of the 20 Sierra innovations and three of the six Northern innovations. MacKay and Trechsel (2015) classify FM as Northern, as it lacks the first three innovations in Table 1, which they identify as morphological isoglosses and features of Sierra languages. While MacKay and Trechsel argue that these are necessary and sufficient criteria, I treat them as innovations shared by the Sierra branch, but not a priori more important than any of the other shared innovations identified. Considering the entire list of innovations, FM does not clearly fall into either the Sierra or the Northern branch. It is possible FM belongs to either Sierra or Northern, and the similarities to the other are due to contact. However, though FM is located close to the border between Northern and Sierra languages, other languages near this border, such as Ch and Co, clearly fall into one or the other camp. The status of FM remains uncertain, awaiting further study.

Lowland P has all of the Sierra innovations, except MacKay and Trechsel's (2011, 2015) three Sierra isoglosses. Considering only these three would lead to a clear differentiation between Sierra and Lowland, but the entire list shows that while Lowland P shares none of the Northern innovations, it has all of the other innovations shared by the Sierra languages, including other morphological innovations. While some other features differentiate Lowland, it appears to be very closely related to the Sierra branch. This suggests that Central Totonac can be further divided between Sierra-Lowland and Northern, as suggested in Brown et al. (2011).

Ch shares all but one Northern innovation, but only two of the 20 Sierra innovations. Interestingly, all three of the innovations where it does not align with Northern involve body parts, ‘chest’, ‘heart’, and ‘nose’, though ‘heart’ is one of the few body parts which does not have a combining prefix and thus does not participate in very common body-part constructions. These differences may be explained by borrowing, although the body has been identified as a semantic field resistant to borrowing (Tadmor et al. 2010). However, Totonac body parts are a formal class defined by co-occurrence with verbal roots in complex verb structures, structures that have a very high frequency (Levy 1999). This high frequency may explain why the two innovated Sierra forms may have been borrowed into Ch, but would not explain ‘heart’, which does not belong to this formal class. The Northern innovation may have been lost by borrowing back a conservative Sierra form, or may represent a further subdivision in the Northern branch.

4. Computational methods

A number of computational tools borrowed from evolutionary biology are becoming increasingly popular in historical linguistics. While promising methodological tools, they are still often used only heuristically in linguistics (Schnoebelen 2009), offering a rapid and efficient method of finding patterns in the data and proposing relationships. In the context of historical linguistics, these tools have been adapted to use measures of phonological or lexical distance, by creating matrices of distance measurements from which to run their analyses, on the assumption that high synchronic similarity correlates with genetic closeness. While similarity often does correlate with relatedness, it does not provide evidence for subgrouping as strong as that of shared innovations, because, as outlined in section 3, there are various pathways to synchronic similarity. The analyses and visualizations can nevertheless be informative and help find patterns in the data that may support the conclusions arrived at through more traditional methods.

This article presents two analyses using such distance-based measurements, calculated from different types of data. The first analysis, in section 4.1, is a tree calculated from the 100-word Swadesh lists of 12 Totonacan languages with the Neighbour-Joining algorithm used by the Automated Similarity Judgment Program (ASJP) consortium (Wichmann et al. 2013), which presents trees based on lexical similarity. The second analysis is a phylogenetic network created using SplitsTree4 and the Neighbour-Net algorithm based on binary characters coded from a list of cognate sets (section 4.2).

4.1 ASJP lexical distance

The ASJP World Tree is a massive collaborative project that calculates and graphically illustrates the lexical similarity between a large number of the world's languages (Müller et al. 2013). It currently has data, in the form of a shortened 40-word Swadesh list, for 4,401 of the 7,557 languages recognized by ISO 639–3. The 40-word list is the result of a calculation to determine the most stable lexical items on the original 200-word Swadesh list (Holman et al. 2008). The World Tree is calculated on the basis of a neighbour-joining algorithm, which uses a distance matrix generated from the word lists to optimize a tree, beginning with equal distances between each node. Each stage of clustering will minimize branch length between nodes to create a cladistic tree (Saitou and Nei 1987), with each branching representing a historical event of splitting or “speciation”. The distance matrix is based on the Levenshtein or edit distance, which is the smallest number of single-symbol insertions, deletions, or substitutions needed to turn one string into another. In this article, the ASJP algorithms have been applied to 100-word Swadesh lists drawn from 12 Totonacan languages, including Ch. Figure 28 presents the resulting tree (see footnote 9).

Figure 28: ASJP calculation of Totonacan lexical similarity
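The edit-distance step described above can be illustrated with a minimal Python sketch. It is not the ASJP implementation: ASJP works on its own transcription code and applies further normalization, whereas this sketch simply divides by the length of the longer word, which is an assumption made here for illustration. The function names and the two short word lists are hypothetical; the forms are ones quoted in section 3.3 and are not attributed to any particular variety.

```python
import unicodedata


def levenshtein(a: str, b: str) -> int:
    """Smallest number of single-symbol insertions, deletions, or
    substitutions needed to turn string a into string b."""
    # Compose characters so that a letter plus diacritic counts once where possible.
    a = unicodedata.normalize("NFC", a)
    b = unicodedata.normalize("NFC", b)
    previous = list(range(len(b) + 1))
    for i, symbol_a in enumerate(a, start=1):
        current = [i]
        for j, symbol_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (0 if symbol_a == symbol_b else 1)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return previous[-1]


def normalized_distance(a: str, b: str) -> float:
    """Edit distance divided by the length of the longer word (assumed normalization)."""
    longest = max(len(unicodedata.normalize("NFC", a)),
                  len(unicodedata.normalize("NFC", b)))
    return levenshtein(a, b) / longest if longest else 0.0


def mean_list_distance(list_a: dict, list_b: dict) -> float:
    """Average normalized distance over the glosses present in both word lists."""
    shared = [gloss for gloss in list_a if gloss in list_b]
    return sum(normalized_distance(list_a[g], list_b[g]) for g in shared) / len(shared)


# Hypothetical two-item word lists built from forms quoted in section 3.3.
variety_x = {"water": "škaːn", "sand": "kuku"}
variety_y = {"water": "čučut", "sand": "muncaya"}
print(mean_list_distance(variety_x, variety_y))
```

Computing this value for every pair of word lists yields the kind of distance matrix that a clustering algorithm takes as input. Note that symbols without a precomposed Unicode form (for example, laryngealized vowels) still count as two symbols here, which a dedicated transcription code avoids.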

The tree generated by ASJP shows important differences from the traditional classification. While there is only one Tepehua language in the ASJP database, T, it is divided from the other Totonac languages at the highest level. The Totonac languages are not split into four main groups; instead the analysis finds three: Misantla, as expected, stands apart, and the Northern languages are set off from Sierra. Lowland is not clearly split from Sierra, with P being grouped closely with Co, Z, and Ct. FM and CX are at the edge of this branch. Geographically, CX is distant from other Sierra languages, but FM is located close to Sierra communities Co and Z. Both are close to Northern languages, FM being one of the varieties located nearest to Ch. Importantly, Ch is grouped with the Northern languages. In this tree, Ch is equally distant from A and from U-Zh.
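To make the clustering step concrete, the following Python sketch carries out the first join of the neighbour-joining procedure (Saitou and Nei 1987) on a small distance matrix. The five labels and the distances are hypothetical and are not Totonacan data; the function name is likewise an assumption for illustration. The pair chosen is the one that minimizes the standard criterion Q(i, j) = (n − 2)·d(i, j) − Σk d(i, k) − Σk d(j, k).

```python
from itertools import combinations


def nj_first_join(labels, dist):
    """Return the pair joined first by neighbour joining, its Q value,
    and the branch lengths from each member of the pair to the new node."""
    n = len(labels)
    row_sums = [sum(row) for row in dist]
    best_pair, best_q = None, None
    for i, j in combinations(range(n), 2):
        q = (n - 2) * dist[i][j] - row_sums[i] - row_sums[j]
        if best_q is None or q < best_q:  # ties keep the first pair found
            best_pair, best_q = (i, j), q
    i, j = best_pair
    # Branch lengths from the two joined taxa to their new parent node.
    limb_i = dist[i][j] / 2 + (row_sums[i] - row_sums[j]) / (2 * (n - 2))
    limb_j = dist[i][j] - limb_i
    return (labels[i], labels[j]), best_q, (limb_i, limb_j)


# Hypothetical symmetric distance matrix for five unnamed varieties.
labels = ["v1", "v2", "v3", "v4", "v5"]
dist = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
pair, q_value, limbs = nj_first_join(labels, dist)
print(pair, q_value, limbs)
```

In a full run, the joined pair is replaced by its parent node, the distances are recomputed, and the step is repeated until the tree is resolved. Note that the pair with the smallest raw distance (v4 and v5 here) is not necessarily the pair joined first, because the criterion also takes into account how far each taxon is from all the others.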

4.2 Phylogenetic network

One weakness of the tree model often used to represent genetic relations is that a single dataset can often support multiple possible trees. This can be due to noise or uncertainty in the data, and/or because innovations, which could have been used to identify a branch splitting, may be borrowed between branches after a split. Another representation of phylogenetic relations is a network, which has the advantage of representing multiple possible hypotheses. Networks show something of an average over large, noisy datasets, offering a cumulative visualization of the possible trees, and visually represent the differing amount of support given to groupings in the data (L. Campbell 2013). Trees are derivable from networks, but networks give a richer representation of the data. By way of example, Figure 29 shows a network representation of Romance languages, created using cognates from the Comparative Indo-European Database (Dyen et al. 1997) and SplitsTree4.

Figure 29: Phylogenetic network of Romance languages

Phylogenetic networks like this one transparently represent the distance of the relation between two nodes by the length of the line separating them, and the uncertainty by webbing displayed on the network (Schnoebelen 2009). The length of the lines between taxa indicates their likely closeness, so the tight cluster of Vlach and Rumanian suggests these two languages are closely related. Increased webbing is the result of multiple possible trees. Vlach and Rumanian have little webbing, while Sardinian and Italian have more, reflecting greater uncertainty in the data between the nodes of Sardinian languages and Italian.
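One way to see why conflicting signal produces webbing is through the notion of split compatibility used in split networks. A split divides the taxa into two groups; two splits can appear together on a single tree only if at least one of the four pairwise intersections of their sides is empty. The Python sketch below checks this condition. The taxon labels and splits are hypothetical and do not come from the Romance or Totonacan data, and the function name is an assumption for illustration.

```python
def compatible(side_a, side_b, taxa):
    """True if two splits of the same taxon set can coexist on one tree.

    Each split is given by one of its sides; the other side is the complement.
    """
    a, b = frozenset(side_a), frozenset(side_b)
    a_rest, b_rest = taxa - a, taxa - b
    # Compatible if any of the four intersections between sides is empty.
    return any(not (x & y) for x in (a, a_rest) for y in (b, b_rest))


# Hypothetical taxa and splits, for illustration only.
taxa = frozenset({"t1", "t2", "t3", "t4", "t5"})
split_1 = {"t1", "t2"}        # t1, t2 against the rest
split_2 = {"t1", "t2", "t3"}  # nested around split_1
split_3 = {"t2", "t3"}        # groups t2 with t3 instead of with t1

print(compatible(split_1, split_2, taxa))
print(compatible(split_1, split_3, taxa))
```

When the data support two incompatible splits, no single tree can display both, and a network draws them as parallel edges that form box-like webbing.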

While networks provide rich visualizations, they ultimately rely upon the assumption that similarity corresponds with genetic closeness. In this analysis, the data come from manually compiled cognate sets (cf. section 3.3), taken from lexica for A, Ct, Co, FM, H, M, P, Pf, T, U, and Z, and my fieldwork in Ch. This list of cognates is based first on the Swadesh 100 list, with the remaining items having been gathered somewhat opportunistically, although including many culturally central items (Kaufman et al. 2004). To calculate a distance matrix, these cognate sets are first coded into characters – parameters for which languages may agree or differ. Each cognate set consists of a gloss, and a number of Totonac stems, or cognate classes. Basing a character on the gloss would result in a multistate character, with each cognate class being a possible state. Because phylogenetic networks typically rely upon binary characters, the cognate sets are made binary by coding each cognate class as a character, which has the added advantage of allowing the encoding of a language with two cognate classes for the same cognate set (Dunn 2014). A list of 218 characters is encoded from the cognate sets, and a distance matrix is calculated in SplitsTree4 (Huson and Bryant 2006) using the Neighbour-Net algorithm (Bryant and Moulton 2004). An example of this encoding is seen in Figure 30, with two characters coded from the cognate set ‘water’ (cf. Figure 11 in section 3.3.1).

Figure 30: Encoding of characters for ‘water’

In the above characters, each language having a reflex of the cognate class or stem, škaːn or čučut, is coded in that character with a 1, indicating a shared state, so that in the first character, each 1 represents the stem škaːn in that language, and in the second, each 1 represents the stem čučut. The languages with no reflex are coded with unique states, indicating that there is no shared state between them. Although termed “binary”, the coding of these characters is not represented in 1s and 0s, because if the absence of the reflex were coded as a 0, SplitsTree4 would treat this as a feature shared by those languages. As noted by Ringe et al. (2002), the absence of a feature does not give evidence for a relation, so each absence is coded as a unique state. In section 3.3.1 we saw this same cognate set, and used the distribution of the two reflexes to argue that čučut represents a Sierra-Lowland innovation. The presence of škaːn in each of the other languages is then a shared retention, and does not provide evidence that the languages having this stem form any kind of group. However, the phylogenetic analysis will use these two characters to group first the set of languages having škaːn, and second, the set having čučut, taking into account only these shared reflexes. Although this is counter to what we know, it is hoped that some sort of evening-out will occur across the greater number of data points the phylogenetic analysis is capable of computing. We rely upon the correlation between similarity and genetic closeness. With this caveat, the phylogenetic network is presented in Figure 31.

Figure 31: Phylogenetic network of Totonacan

The three Tepehua languages are represented on the right, quite distant from Totonac languages. Misantla is also quite distant from the other Totonac languages, in agreement with the ASJP results. On the top left, three Sierra varieties, Ct, Co, and Z, group quite closely with Lowland, P. While the ASJP tree puts P closest to Co, which is also the closest geographically, the phylogenetic network puts P closest to Z, which is much more distant. This would suggest that the branch spread widely before further innovation in the geographic centre, Co and Ct. P is grouped more closely to Z than Z is to Co and Ct, giving little evidence for P as a distinct branch. FM appears somewhat between the other Sierra languages and the Northern languages seen at bottom left, U and A. Ch appears quite close to these Northern languages, here closer to U than to A, while it was equidistant to U and A on the ASJP tree. As in ASJP, Ch clearly groups closely with the Northern languages.
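The character coding described in this section can be sketched in Python as follows. This is only an illustration under stated assumptions, not the procedure actually run in SplitsTree4: the function names and the labels used for the unique absence states are hypothetical. The distribution of the two stems for ‘water’ is the one given in section 3.3.1.

```python
LANGUAGES = ["H", "Pf", "T", "M", "A", "U", "Ch", "Co", "Ct", "FM", "Z", "P"]

# Cognate set 'water' (section 3.3.1): stem -> languages with a reflex.
WATER = {
    "škaːn": {"H", "Pf", "T", "M", "A", "U", "Ch"},
    "čučut": {"Co", "Ct", "FM", "Z", "P"},
}


def code_characters(cognate_set, languages):
    """Turn each cognate class into one character.

    A language with a reflex gets the shared state "1"; a language without
    one gets a state unique to that language, so absences never match.
    """
    characters = {}
    for stem, attested in cognate_set.items():
        characters[stem] = {
            lang: "1" if lang in attested else f"absent:{lang}"  # hypothetical label
            for lang in languages
        }
    return characters


def shared_states(characters, lang_a, lang_b):
    """Count the characters on which two languages have the same state."""
    return sum(1 for states in characters.values()
               if states[lang_a] == states[lang_b])


characters = code_characters(WATER, LANGUAGES)
print(shared_states(characters, "Ch", "U"))
print(shared_states(characters, "Ch", "Z"))
```

Under a plain 0/1 coding, Ch and U would agree on both characters, once for sharing škaːn and once for jointly lacking čučut. With unique absence states they agree only on the stem they actually share, which reflects the point from Ringe et al. (2002) that absence of a feature is not evidence for a relation.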

5. Conclusions

Although we have seen that quantitative measures provide less solid evidence, the two analyses do correlate with the evidence from shared innovations summarised in section 3.4. Each approach agrees with the traditional classification to separate Tepehua from Totonac, and Misantla from the other Totonac languages, in agreement with MacKay and Trechsel and with Brown et al. Three further points arising from the shared innovations are also represented in the quantitative analyses. First, while FM lacks the three features MacKay and Trechsel use to delimit the Sierra branch, it clearly shares a large number of both Sierra and Northern innovations. FM appears at the periphery of the Sierra branch in the ASJP tree, and roughly between the Sierra and Northern clusters in the phylogenetic network. The mixed distribution of shared innovations presents difficulties both for MacKay and Trechsel's Northern analysis and for Brown et al.'s Sierra-Lowland hypothesis. FM certainly seems to be a promising area for further phylogenetic research. Second, Lowland P is shown to be very close to the Sierra languages, occurring very close to Co in the ASJP tree and to Z in the phylogenetic analysis. While Co is geographically proximate, Z is at the other side of the geographic range of Sierra languages. This close relation is affirmed in the large number of innovations shared by Lowland and Sierra. This supports Brown et al.'s grouping of Sierra and Lowland together opposed to Northern. Finally, the sharing of Northern, but not Sierra, innovations in Ch is matched by the quantitative data. In the ASJP tree, Ch is shown within the Northern branch, and in the phylogenetic network Ch appears close to A and U. Ch clearly belongs in the Northern branch, as evidenced by both quantitative analysis and shared innovations.

Appendix A

Language Abbreviations and Sources

Abbreviations of names for languages used in this study are listed below in alphabetical order. The primary source for information on each language is given in parentheses. Other sources are given in the text.

A: Apapantilla Totonac (Reid and Bishop 1974)
Ch: Coahuitlán Totonac (author's fieldwork)
Co: Coyutla Totonac (lexical database prepared by H. Aschmann)
Ct: Coatepec Totonac (McQuown 1990 and unpublished dictionary by McQuown)
CX: Cerro Xinolatépetl (communication between David Beck and Gerry Anderson)
FM: Filomeno Mata Totonac (lexical database prepared by T. McFarland)
H: Huehuetla Tepehua (Smythe Kung 2007)
M: Misantla Totonac (MacKay and Trechsel 2005)
Ol: Olintla Totonac (word list prepared by Jorge Tino)
Oz: Ozelonacaxtla Totonac (word list prepared by Gabriela Román Lobato)
P: Papantla Totonac (Aschmann 1973a, Levy 1990)
Pf: Pisaflores Tepehua (communication between David Beck and Albert Daveltshin, and James Watters; MacKay and Trechsel 2010)
T: Tlachichilco Tepehua (Watters 1988, 2007)
U: Upper Necaxa Totonac (Beck 2011)
Z: Zapotitlán Totonac (Aschmann 1973b)
Zh: Zihuateutla Totonac (Michelle García-Vega, p.c.)

Footnotes

1 Funding for this project was provided by a Social Sciences and Humanities Research Council of Canada grant to David Beck. I would like to thank David Beck for reading many drafts of this article and for extensive critique. Many thanks also to Eric Campbell, Paulette Levy, and anonymous reviewers for their critique and thoughtful comments, and to Søren Wichmann for running the ASJP algorithms on Totonacan wordlists. While these contributions made this a better article, any remaining faults are my own. To my wonderful field consultants, paškát ka̰cíːnaɬ.

2 Chance, sound-symbolism, and so-called “nursery forms” are other possible sources of similarity.

3 A full list of language abbreviations is found in Appendix A. This article uses an Americanist form of IPA commonly used by Totonacists, with the following notable differences from IPA: c = voiceless alveolar affricate, ƛ = voiceless lateral affricate, y = palatal approximant, ː after a vowel indicates length, ˷ under a vowel indicates laryngealization, ’ after a consonant indicates ejectivization, ´ above a vowel indicates stress. I have tried to make the transcriptions from different sources uniform, according to IPA and these differences.

4 In U, *ƛ has undergone merger with /ɬ/.

5 U has lost uvular stops and *q becomes /ʔ/.

6 The abbreviations used in the interlinear glosses are as follows: 1, 2, 3 = first-, second-, third-person; dcs = decausative; fut = future; impfv = imperfective; inc = inchoative; irr = irrealis; obj = object; pfv = perfective; pl = plural; recip = reciprocal; sg = singular; subj = subject; tot = totalitive.

7 Totonac forms, such as -pṵtun and -kṵtun, which are given in the prose to refer to different glosses present in groupings of the varieties, are not intended to represent reconstructed forms. Rather, they should be read as convenient shorthand referring to groups with the same or very similar forms.

8 The form in U, FM, and Ch pa̰ʔɬma̰/paɬma bears some resemblance to Spanish palma ‘palm tree’. However, it seems to be derived from a Totonac verb pa̰ʔɬ- ‘to bloom, flower, sprout’ with suffix -ma ‘by-product’ (Beck, p.c.).

9 The ASJP database has different labels for certain varieties, notably “Highland” for Z, and “Xicotepec” for A; the abbreviations used here are found to the right, beside the ASJP labels.

References

Aranya, Osnaya, 1953. Reconstrucción del protototonaco: Huastecos, totonacos y sus vecinos, ed. Bernal, Ignacio. Revista Mexicana de Estudios Antropológicos 23: 123130.Google Scholar
Aschmann, Herman P. 1973a. Diccionario totonaco de Papantla. México, DF: Instituto Lingüístico de Verano.Google Scholar
Aschmann, Herman P. 1973b. Vocabulario totonaco de la Sierra. México, DF: Instituto Lingüístico de Verano.Google Scholar
Beck, David. 2004. A Grammatical Sketch of Upper Necaxa Totonac. München: LINCOM Europa.Google Scholar
Beck, David. 2011. Upper Necaxa Totonac Dictionary. Berlin: Mouton de Gruyter.Google Scholar
Beck, David. 2012. Apéndice: Tablas de morfología comparativa. In Las lenguas totonacas y tepehuas: Materias para su estudio, eds. Levy, Paulette and Beck, David, 587596. Mexico City: UNAM Press.Google Scholar
Brown, Cecil H., Beck, David, Kondrak, Grzegorz, Watters, James K., and Wichmann, Søren. 2011. Totozoquean. International Journal of American Linguistics 77: 323372.Google Scholar
Bryant, David and Moulton, Vincent. 2004. NeighborNet: An agglomerative method for the construction of phylogenetic networks. Molecular Biology and Evolution 21: 255265.Google Scholar
Campbell, Eric. 2013. The internal diversification and subgrouping of Chatino. International Journal of American Linguistics, 79(3): 395420.Google Scholar
Dunn, Michael. 2014. Language phylogenies. In The Routledge Handbook of Historical Linguistics, ed. Bowern, Claire and Evans, Bethwyn, 190211. London: Routledge.Google Scholar
Dyen, Isidore, Kruskal, Joseph B., and Black, Paul. 1997. Comparative IndoEuropean database collected by Isidore Dyen.Google Scholar
Fox, Anthony. 1995. Linguistic Reconstruction: An Introduction to Theory and Method. Oxford: Oxford University Press.Google Scholar
García Rojas, Blanca. 1978. Dialectología de la zona totonaco-tepehua. Honors thesis, Escuela Nacional de Antropología e Historia, México.Google Scholar
Holman, Eric W., Wichmann, Søren, Brown, Cecil H., Velupillai, Viveka, Müller, André, Brown, Pamela, and Bakker, Dik. 2008. Explorations in automated language comparison. Folia Linguistica, 42: 331354.Google Scholar
Huson, Daniel H. and Bryant, David. 2006. Application of phylogenetic networks in evolutionary studies. Molecular Biology and Evolution, 23(2): 254267. Software available from www.splitstree.org.Google Scholar
Ichon, Alain. 1973. La religion de los totonacas de la sierra. México: Instituto Nacional Indigenista.Google Scholar
Kaufman, Terrance, MacKay, Carolyn, and Trechsel, Frank. 2004. Cuestionario lingüístico para la investigación de las variaciones dialectales de la lengua totonaca.
Levy, Paulette. 1987. Fonología del totonaco de Papantla, Veracruz. México, DF: Universidad Nacional Autónoma de México.
Levy, Paulette. 1990. Totonaco de Papantla, Veracruz. México, DF: Colegio de México.
Levy, Paulette. 1999. From ‘Part’ to ‘Shape’: Incorporation in Totonac and the issue of classification by verbs. International Journal of American Linguistics 65: 127–175.
Lewis, M. Paul, Simons, Gary F., and Fennig, Charles D. eds. 2015. Ethnologue: Languages of the World. 18th ed. Dallas, Texas: SIL International. Online version at http://www.ethnologue.com.
MacKay, Carolyn J. 1994. A sketch of Misantla Totonac phonology. International Journal of American Linguistics 60: 369–419.
MacKay, Carolyn J. 1999. A Grammar of Misantla Totonac. Salt Lake City: University of Utah Press.
MacKay, Carolyn J. and Trechsel, Frank R. 2005. Totonaco de Misantla, Veracruz. México, DF: Colegio de México.
MacKay, Carolyn J. and Trechsel, Frank R. 2010. Tepehua de Pisaflores, Veracruz. México, DF: El Colegio de México.
MacKay, Carolyn J. and Trechsel, Frank. 2011. Relaciones internas de las lenguas totonaco-tepehuas. Memorias del V Congreso de Idiomas Indígenas de Latinoamérica.
MacKay, Carolyn J. and Trechsel, Frank. 2013. A sketch of Pisaflores Tepehua phonology. International Journal of American Linguistics 79: 189–218.
MacKay, Carolyn J. and Trechsel, Frank R. 2015. Totonac-Tepehua genetic relationships. Amerindia 37(2): 121–158.
McFarland, Teresa. 2009. The phonology and morphology of Filomeno Mata Totonac. Doctoral dissertation, University of California, Berkeley.
McQuown, Norman A. 1940. A Totonac Grammar. Doctoral dissertation, Yale University.
McQuown, Norman A. 1990. Gramática de la lengua totonaca (Coatepec, Sierra Norte de Puebla). México, DF: Universidad Nacional Autónoma de México.
Müller, André, Velupillai, Viveka, Wichmann, Søren, Brown, Cecil H., Holman, Eric W., Sauppe, Sebastian, Brown, Pamela, Hammarström, Harald, Belyaev, Oleg, List, Johann-Mattis, Bakker, Dik, Egorov, Dmitri, Urban, Matthias, Mailhammer, Robert, Dryer, Matthew S., Korovina, Evgenia, Beck, David, Geyer, Helen, Epps, Pattie, Grant, Anthony, and Valenzuela, Pilar. 2013. ASJP World Language Trees of Lexical Similarity: Version 4 (October 2013), available at http://asjp.clld.org.
Reid, Aileen A. 1991. Gramática totonaca de Xicotepec de Juárez, Puebla. México, DF: Instituto Lingüístico de Verano.
Reid, Aileen A. and Bishop, Ruth G. 1974. Diccionario totonaco de Xicotepec de Juárez, Puebla. México, DF: Instituto Lingüístico de Verano.
Ringe, Don, Warnow, Tandy, and Taylor, Ann. 2002. Indo-European and computational cladistics. Transactions of the Philological Society 100(1): 59–129.
Saitou, Naruya and Nei, Masatoshi. 1987. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Molecular Biology and Evolution 4(4): 406–425.
Sapir, Edward. 1921. Language: An Introduction to the Study of Speech. New York: Harcourt, Brace.
Schnoebelen, T. 2009. A how-to guide for using phylogenetic tools on linguistic data (SplitsTree, MrBayes). Ms., Stanford University. Revised on April 23, 2009.
Secretaría de finanzas y planeación del estado de Veracruz (SEFIPLAN). 2013. Cuadernillos Municipales: Coahuitlán, Mexico.
Smythe Kung, Susan. 2007. A Descriptive Grammar of Huehuetla Tepehua. Ph.D. dissertation, University of Texas, Austin.
Tadmor, Uri, Haspelmath, Martin, and Taylor, Bradley. 2010. Borrowability and the notion of basic vocabulary. In Quantitative approaches to linguistic diversity: Commemorating the centenary of the birth of Morris Swadesh, ed. Wichmann, Søren and Grant, Anthony P. Diachronica 27(2): 226–246.
Watters, James K. 1988. Topics in Tepehua Grammar. Doctoral dissertation, University of California, Berkeley.
Watters, James K. 2007. Diccionario tepehua de Tlachichilco–español. Ms. in possession of David Beck, University of Alberta.
Wichmann, Søren, Müller, André, Wett, Annkathrin, Velupillai, Viveka, Bischoffberger, Julia, Brown, Cecil H., Holman, Eric W., Sauppe, Sebastian, Molochieva, Zarina, Brown, Pamela, Hammarström, Harald, Belyaev, Oleg, List, Johann-Mattis, Bakker, Dik, Egorov, Dmitry, Urban, Matthias, Mailhammer, Robert, Carrizo, Agustina, Dryer, Matthew S., Korovina, Evgenia, Beck, David, Geyer, Helen, Epps, Pattie, Grant, Anthony, and Valenzuela, Pilar. 2013. The ASJP Database (version 16), available at http://asjp.clld.org.
Figure 1: Traditional classification

Figure 2: Map of Totonacan communities

Figure 3: Brown et al.'s (2011) Classification

Figure 4: ‘to walk around’

Figure 5: Glide reduction

Figure 6: ‘you (sg)’

Figure 7: Cognates with unconditioned /e/ and /o/

Figure 8: Locative

Figure 9: Desiderative

Figure 10: Negative

Figure 11: ‘water’

Figure 12: ‘rain’

Figure 13: ‘sand’

Figure 14: ‘see’

Figure 15: Nominal form of ‘ear’

Figure 16: ‘big’

Figure 17: ‘chest’

Figure 18: ‘liver’

Figure 19: ‘leaf’

Figure 20: ‘fire’

Figure 21: ‘girl’

Figure 22: ‘tomorrow’

Figure 23: ‘heart’

Figure 24: ‘nose’

Figure 25: ‘green’

Figure 26: ‘toe- and fingernail’

Figure 27: ‘good’

Table 1: Shared innovations in Central Totonac

Figure 28: ASJP calculation of Totonacan lexical similarity

Figure 29: Phylogenetic network of Romance languages

Figure 30: Encoding of characters for ‘water’

Figure 31: Phylogenetic network of Totonacan