1 Introduction
This paper demonstrates that two models of tonal representation – those proposed by Yip (1989) and Bao (1990) – cannot be regarded as distinct. Previous work (Bao 1990, Chen 2000, Yip 2002) has claimed that the proposals differ in their empirical coverage of assimilatory tone-sandhi processes in Chinese dialects, and thus constitute distinct representational theories. Arguments in favour of this distinction are situated within a derivational perspective on tonal processes, and assume two basic mechanisms: spreading and delinking. As well as being tied to a specific grammatical formalism, these mechanisms are inadequate for capturing the full range of assimilatory tone sandhi, crucially the cases which are claimed to distinguish the two proposals in question. An additional mechanism is necessary for full coverage of attested tone sandhi.
In this paper, I employ a computational framework to examine earlier claims that the Yip and Bao tonal representations are distinct. The computational perspective advocated here focuses on the properties of the input–output mappings which describe assimilatory tone-sandhi processes (rather than theory-specific mechanisms), thus providing a more direct approach to evaluating the models’ empirical predictions. I show that the two representations handle the relevant assimilatory tone-sandhi patterns equally well when described as input–output mappings. The computational analysis preserves the basic character of the original representational theories in the sense that it reproduces the same basic and necessary mechanisms as traditional accounts: spreading, delinking and copying. This paper demonstrates that, contrary to previous claims, the Yip and Bao models do not differ in their empirical consequences.
Additionally, I apply the same approach to structural comparisons of the representational theories through examination of the properties of mappings between representations. The paper capitalises on structural similarities apparent in the Yip and Bao models to show that one can be freely translated into another, and vice versa. Such a translation does not result in any loss of the contrasts expressible by either theory. Given these two results, the main claim of the paper is that the two representational proposals do not constitute distinct theories, but are notationally equivalent.
The paper is organised as follows. §2 frames the broad definition of ‘notational equivalence’ between representational theories which I adopt in the paper, and highlights some metatheoretical issues regarding the representation of assimilatory tone sandhi. §3 introduces the Yip and Bao proposals in detail, and summarises previous arguments distinguishing them on empirical grounds. In §4, I establish the computational framework in which the issue of notational equivalence is addressed. This section then presents case studies of two assimilatory patterns, and shows that both models capture these patterns as input–output mappings. §5 presents additional evidence for notational equivalence from a structural perspective. §6 discusses these results in broad terms, and addresses metatheoretical issues. §7 concludes.
2 A notion of notational equivalence
There is a tacit assumption in linguistic theory that new theories should improve on older ones by increasing both the expressivity and the restrictiveness of their predecessors. Models of grammar seek to explain the widest possible scope of attested phenomena, at the same time limiting their predictive power, such that the models do not predict unnatural, impossible or unattested patterns. It follows that new theories should not simply rehash older ones; that is, the new contribution should be distinct from earlier iterations. A reasonable expectation of such contributions is that they can predict alternations attested in human language that earlier theories failed to predict. New contributions to linguistic theory should also mitigate problems of overexpressivity apparent in previous theories, by reining in their predictive power. Ideally, they do both simultaneously.
However, if a proposed theory merely restates the generalisations of older theories, or if the former differs from the latter in superficial ways – such that no demonstrable improvement in expressivity and/or restrictiveness obtains – we may argue that the two are not distinct; rather, they are notational equivalents. Chomsky (1972: 69) makes this point in the following way:
Given alternative formulations of a theory of grammar, one must first seek to determine how they differ in empirical consequences, and then try to find ways to compare them in the area of difference. It is easy to be misled into assuming that differently formulated theories actually do differ in empirical consequences, when in fact they are intertranslatable – in a sense, mere notational variants.
A more recent definition (Fromkin 2000) establishes two criteria by which alternative models may be considered notationally equivalent. Not only must the models share the same empirical coverage, they must also represent the same set of basic, abstract properties, and differ only superficially in terms of that representation. Two models are thus notationally equivalent if they satisfy the conditions in (1).
(1)
In this paper, I test the conditions in (1) against two competing models of tonal representation: those proposed by Yip (1989) and Bao (1990), and summarised in (2).
(2)
Both theories model lexical tonal contrasts and a variety of tonal processes, specifically tone-sandhi processes in Chinese dialects. At first glance, they have similar sets of properties – in particular a root node which associates to a tone-bearing unit (TBU), a binary register feature which bisects the vocal range into upper and lower registers, and binary terminal tonal features (‘high’ and ‘low’ tones within a register). They also make the assumption that contour tones are sequences of level tones dominated by a single structural node, and therefore form a constituent. However, there are differences in how these basic properties relate to one another structurally. A key difference is that Yip's root node is specified for a register feature and dominates terminals directly, while Bao's root (represented in (2) as ‘T’) is unspecified, and branches into separate register and contour nodes, the latter of which (represented as ‘c’) dominates tonal terminal nodes.
This difference has been claimed to manifest itself in the way in which the models formalise assimilatory tone-sandhi processes as spreading (Bao 1990, Chen 2000, Yip 2002). Structural separation of register and contour in Bao's model allows the two to spread independently. In Yip's model, since the register node directly dominates contour, there is no distinction between register spread and contour spread; register spread entails contour spread, and vice versa. As such, assimilatory processes modelled as ‘register spread’ or ‘contour spread’ are not predicted by this representation.
However, this observation alone does not guarantee that the two models differ in their empirical predictions, because the set of processes a given model is said to predict is inevitably tied to the full context of the formalism in which they are defined. If the process is couched in derivational terms (as in Chomsky & Halle 1968), it may be defined as a set of crucially ordered rewrite rules. In an optimisation-based framework like Optimality Theory, specific constraints interact in an evaluation over some set of candidates to select the output form. Therefore, the empirical predictions of a model's representation – its capacity to capture a certain process – must be considered within the full context of a particular framework.
Feature-geometric models of tone formalise sandhi processes in an autosegmental phonological framework (Goldsmith 1976, McCarthy 1988). The basic mechanisms of this theory are spreading (addition of a single association line between elements of structure) and delinking (deletion of an association line). Simple spreading and delinking, however, are insufficient to model assimilatory tone-sandhi processes using these representations. This is because contour spread requires the extra assumption of tier conflation (Younes 1983, McCarthy 1986, Yip 1989), a process by which contour nodes and the terminal tonal nodes they dominate are copied to guarantee that separate contours are realised on each root (not as a single contour over multiple roots; see §3.2 and especially §6 for discussion). This mechanism is shown in (3), where a dashed line indicates spreading.
(3)
A feature-geometric theory of assimilatory tone sandhi thus extends the traditional set of basic operations (spreading and delinking) to include a copying mechanism. This extension, however, also permits alternatives to a traditional spreading analysis. For example, it is unclear why a spreading analysis with tier conflation is preferable to – or even differs from – one which first copies pieces of structure and then reassociates them (by adding a single association line). Using the same basic mechanisms of the theory, this yields an identical structure, as in (4).
(4)
Relatedly, the necessity of tier conflation bears on the accuracy of spreadability – i.e. spreading without copying – as a metric of empirical coverage. If spreading and delinking fail to capture the full range of attested tone sandhi, how reliable can such a test be in distinguishing empirical predictions of different models?
The difficulty in answering such questions is compounded by the fact that traditional analyses of tone sandhi using these representations are inherently derivational. Spreading and tier conflation are crucially ordered with respect to one another, as the application of the latter is dependent on the former. Additionally, the ‘spreading’ and ‘copying’ analyses above differ only in the order of application of basic mechanisms. In a non-serial formalism like parallel OT, for example, such an ordering would be irrelevant. There is no guarantee that the difference between the analyses is not merely a vestige of their formalisation within a derivational paradigm. A more direct approach to evaluating models’ empirical predictions involves the examination of the properties of input–output mappings themselves, rather than theory-specific mechanisms. Ideally, such an approach can also be applied to the structural comparisons of representational models (cf. (1b)), through examination of mappings between representations.
This paper pursues a computational characterisation of tonal representation to explore the question of notational equivalence and address the conditions in (1) as they apply to the Yip and Bao tonal models. Within the empirical domain, I narrow the focus to the set of processes which earlier work claims distinguishes the models: register assimilation and contour assimilation. I abstract away from theory-specific considerations – and in particular the derivational nature of earlier spreading analyses – focusing instead on the nature of input–output mappings which describe assimilation (Chandlee & Heinz 2018, Heinz 2018). To do this, I employ a model-theoretic framework (Courcelle 1994, Enderton 2001, Libkin 2004). The Yip and Bao representations are given rigorous definitions as graph structures, and assimilatory tone mappings between these structures are defined using logical transduction. By fixing the complexity of the logical language necessary to define these mappings, we can compare the structures’ empirical predictions in a principled way. It will be shown that the models do not differ in their empirical consequences (and thus satisfy condition (1a)), because the processes in question can be defined over both models, using a restricted, quantifier-free (QF) logic.
The quantifier-free nature of this logic captures the important intuition that assimilation in tone sandhi is inherently local, an insight which earlier approaches overlook, but which is well attested in computational characterisations of a wide range of phonological processes (Chandlee 2014). Importantly, non-size-preserving QF logic is necessary to model assimilatory patterns over both models. Such transductions define mappings over a finite number of copies of output structure (see further discussion in §4). In other words, non-size-preserving QF logic describes mappings which allow a copying mechanism. While sentences in this logic are more powerful than size-preserving QF logical statements (i.e. those which prohibit copying), the type of copying these transductions permit is restricted to local bounded environments, and thus does not overextend the intention or spirit of the original theory (see §6.1). The intuitions behind these mappings are described in terms of local connected substructures; full logical definitions can be found in Appendix A.Footnote 1 Mappings of assimilatory processes are presented first, to demonstrate that the models do not differ in their empirical consequences. §6 discusses how the ‘spreading’ and ‘copying’ analyses described above are formally indistinguishable from the computational perspective, because they represent a single QF-definable map. While the scope of this result is limited to two assimilatory tone-sandhi processes, it provides a proof of concept that may be applied to spreading and copying more generally (see §6.2).
The computational approach affords the same formal rigour for the exploration of the structure of representational theories. It allows us to reason over both aspects of notational equivalence in (1), using the same formal language. Again using the formalism of QF-definable transductions, I show further that the two models satisfy the second condition in (1) – i.e. that any structural differences between the models must be only superficial – by demonstrating their bi-interpretability (Friedman & Visser 2014).Footnote 2 Bi-interpretability provides a restrictive and provable formal notion of ‘superficial’ differences between representational theories, and is divided into two components. First, I define transductions which translate from any structure in Bao's representation directly to an equivalent structure in Yip's representation, and vice versa. These translations capitalise on various structural similarities apparent in elements of both models (in particular the constituency of a tonal contour under a single structural node), such that translating from Bao's structure to Yip's represents a fusion of three separate nodes into a single node, and translating from Yip's structure to Bao's an expansion of one node into three separate nodes. These intuitions are shown in (5).Footnote 3
(5)
The second component of bi-interpretability is a guarantee that these translations are contrast-preserving; that is, no contrasts present in one model are lost in the process of translation into the other. I show that this is the case for the models in question, by demonstrating that the two translations are inverses of one another; in other words, applying both translations to one model (i.e. through composition; see §5 and Appendix B) is the same as mapping that model to itself.
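The fusion and expansion intuitions, and the requirement that the two translations be inverses of one another, can be illustrated with a deliberately simplified sketch. It is not the QF transduction of Appendix B: tones are flattened into records rather than graphs, and the names `Bundled`, `Separated`, `to_bundled` and `to_separated` are hypothetical, introduced only for illustration. The register feature is written with an ASCII hyphen (`-u`).

```python
from itertools import product
from typing import NamedTuple, Tuple

class Bundled(NamedTuple):
    register: str               # register feature carried by the root itself
    terminals: Tuple[str, ...]  # terminals dominated directly by the root

class Separated(NamedTuple):
    root: str                   # unspecified root node 'T'
    register: str               # separate register node 'r'
    contour: Tuple[str, ...]    # terminals dominated by the contour node 'c'

def to_bundled(t: Separated) -> Bundled:
    """Fuse T, r and c into a single register-bearing root."""
    return Bundled(t.register, t.contour)

def to_separated(t: Bundled) -> Separated:
    """Expand a register-bearing root into T, r and c."""
    return Separated("T", t.register, t.terminals)

REGISTERS = ("+u", "-u")
MELODIES = (("h",), ("l",), ("l", "h"), ("h", "l"))

bundled_tones = [Bundled(r, m) for r, m in product(REGISTERS, MELODIES)]
separated_tones = [Separated("T", r, m) for r, m in product(REGISTERS, MELODIES)]

# Composing the two translations in either order returns the original tone,
# so no contrast is lost in translation.
round_trip_bundled = [to_bundled(to_separated(t)) for t in bundled_tones]
round_trip_separated = [to_separated(to_bundled(t)) for t in separated_tones]
print(round_trip_bundled == bundled_tones, round_trip_separated == separated_tones)
```

Under these simplifying assumptions the round trip is the identity on every tone, which is the record-level analogue of composing the two translations and obtaining a map from a model to itself.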
The bi-interpretability result demonstrated here builds on recent studies exploring notational equivalence in syllabic representation (Strother-Garcia & Heinz 2017) and autosegmental representation (Danis & Jardine 2019) from a formal language perspective. Importantly, this paper adopts a more restrictive definition of bi-interpretability than previous studies, and thus puts forward a stronger hypothesis about notational equivalence. As the Bao and Yip models satisfy both conditions in (1), I conclude that they are notationally equivalent.
This paper does not claim that structural differences between feature-geometric configurations are in principle superficial, or that feature geometry is irrelevant. It does not assume equivalence between these models and other representational theories, for example those which do not assume constituency of contour tones (e.g. Duanmu 1990, 1994; see §5.4 below). Rather, it advocates a rigorous formal analysis of claims that any two representations differ, and motivates analyses which are independent of the assumptions of a particular grammatical formalism. While the results of the current study are narrow in scope, they serve as a proof of concept for subsequent analyses of other representational models. The two representational theories examined in this study are introduced in §3.
3 The Yip and Bao models
This section offers a basic introduction to the tonal models proposed by Yip (1989) and Bao (1990). Yip's (2002) design criteria for tonal feature systems state that the purpose of such representational theories is (i) to characterise attested lexical tonal contrasts (both level and contour), and (ii) to model common tonal processes.Footnote 4 The section presents each model in turn, with a focus on previous arguments in the literature which have been used to distinguish these models in terms of the latter criterion – i.e. claims that they differ in their empirical predictions.
3.1 Tonal geometry in the Yip and Bao models
Table I summarises the representation of level and contour lexical tonal contrasts in the two models. Exhausting the permutations of the binary register features with level and contour tones yields eight distinct tonal structures (two of which overlap for M or mid level). That is, these models represent the same set of lexical tonal contrasts (§5.2 explores the formal expression of this notion). They do so, however, with different structural configurations. Intuitively, this discrepancy can be described as follows: Bao's model splits Yip's [±u] node (which I also represent in this paper as ‘r’) into three separate nodes: a root node (represented here as ‘T’, following Chen 2000), a register node (‘±u’ or ‘r’) and a contour node (‘c’). For the remainder of the paper, I refer to Bao's model as the separated model, and Yip's model as the bundled model. This structural difference is crucial, as it has been argued that it distinguishes the models’ empirical predictions, which are examined in the next section.
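The count of eight structures can be checked with a short enumeration. This is a minimal sketch under stated assumptions: the pairing of register and terminal features with pitch labels (`PITCH`) is inferred from the prose descriptions of LM, MH and M in this paper rather than copied from Table I, and the names `REGISTERS`, `MELODIES` and `tone_label` are hypothetical. The register feature is written with an ASCII hyphen (`-u`).

```python
from itertools import product

# Two register values; four terminal melodies (two level, two contour).
REGISTERS = ("+u", "-u")
MELODIES = (("h",), ("l",), ("l", "h"), ("h", "l"))

# Assumed pitch label for each (register, terminal) pair; M arises in both registers.
PITCH = {("+u", "h"): "H", ("+u", "l"): "M", ("-u", "h"): "M", ("-u", "l"): "L"}

def tone_label(register, melody):
    """Spell out a tone as a string of pitch levels, e.g. ('-u', ('l', 'h')) gives 'LM'."""
    return "".join(PITCH[(register, terminal)] for terminal in melody)

structures = list(product(REGISTERS, MELODIES))
labels = [tone_label(register, melody) for register, melody in structures]

for (register, melody), label in zip(structures, labels):
    print(f"[{register}] {' '.join(melody):<4} {label}")
```

Eight structures are generated, but only seven distinct pitch labels result, because the low terminal of the upper register and the high terminal of the lower register are both spelled M.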
Table I Level and contour tonal contrasts in Yip (1989) and Bao (1990).

3.2 Reported empirical predictions
Feature-geometric representations model not only lexical tonal contrasts, but also attested tone-sandhi patterns. Assimilatory tone sandhi is formalised over these structures using spreading and delinking, the two basic mechanisms of autosegmental theory. In this framework, assimilation is the addition of a single association line – i.e. spreading – between elements in a structure, followed by the subtraction of an existing association line – i.e. delinking. A hypothetical register-assimilation pattern between two adjacent syllables is illustrated in (6), using a simplified separated representation, in which contours are not shown. A double-barred line indicates delinking.
(6)
The −u feature on the first syllable spreads to the T root node on the adjacent syllable, and the existing association line between that node and the +u feature then delinks. This models progressive register assimilation, with the result being a sequence of two low-register −u tones.
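The two steps just described can be sketched as operations on a set of association lines. This is only an illustration of spreading and delinking as defined in the text; the node names (`r1`, `r2`, `T1`, `T2`) and the helper `registers_of` are hypothetical, contours are omitted as in (6), and the register feature is written with an ASCII hyphen (`-u`).

```python
def spread(lines, node, target):
    """Spreading: add a single association line from `node` to `target`."""
    return lines | {(node, target)}

def delink(lines, node, target):
    """Delinking: remove the existing association line from `node` to `target`."""
    return lines - {(node, target)}

# Hypothetical node names: r1 is the -u register node of the first syllable,
# r2 the +u register node of the second; T1 and T2 are the two root nodes.
features = {"r1": "-u", "r2": "+u"}
lines = frozenset({("r1", "T1"), ("r2", "T2")})

step1 = spread(lines, "r1", "T2")  # -u spreads rightwards to the second root
step2 = delink(step1, "r2", "T2")  # the original +u line on the second root is removed

def registers_of(lines, root):
    """Register features currently associated to a root node."""
    return sorted(features[node] for node, target in lines if target == root)

print(registers_of(step2, "T1"), registers_of(step2, "T2"))
```

After spreading alone, the second root is associated to both register features; only after delinking does the sequence consist of two low-register tones, which is why the two operations are ordered in the derivational account.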
To model assimilatory tone sandhi attested in the literature, previous work has measured the empirical coverage of a representational theory based on the ability of specific structural positions within the representation to spread independently of others. For example, the empirical predictions of a given theory are evaluated by whether it can spread a register node independently from all other nodes: contour, root, terminal, etc. It is along this dimension that the separated and bundled models are argued to differ; the separated model claims wider empirical coverage than the bundled model. Table II, adapted from Chen (2000), summarises the claims about the models’ respective empirical predictions using this spreadability metric.
Table II Empirical predictions regarding spread (adapted from Chen 2000: 73).

The separated model makes explicit the structural independence of contour and register. Motivation for this division is empirical; assimilatory tone-sandhi processes attested in Chinese dialects require the contour node to spread independently of register, and vice versa. Consider the data in (7) from Pingyao (Hou 1980, Bao 1990, Chen 2000). High and low register on penultimate contour tones assimilate to an adjacent final tone; a low rising contour (LM) becomes high rising (MH) before a high tone, while a high rising contour becomes low rising before a low tone. The shape of the contour – i.e. rising – does not change.
(7)
Bao (1990) analyses this pattern as assimilation via regressive spreading of a register node, and therefore as crucial evidence for the separation of register and contour in the representation. Bao (1990: 93) proposes (8) to derive the Pingyao assimilation pattern: when a rising contour tone appears in non-final position, the register node (r in (8) below) on the adjacent tone first spreads to the non-final root node (T), and the underlying register node delinks from the non-final T node. As the contour node is independent of register, and given that neither operation in the rule targets c nodes, contour is predicted to be unaffected. The result of the rule is that the penultimate rising contour surfaces with the same register as the adjacent tone. Thus the rule correctly derives the sandhi forms in (7). The rule is shown in (8a), with a derivation of /LM.HM/ in (8b).
(8)
Assimilatory patterns such as that of Pingyao can only be derived as spreading using representations which separate register from contour. As Bao (1990: 66) and Chen (2000: 73) argue, Yip's bundled tonal model fails to account for such sandhi processes. Because the register node immediately dominates terminal tonal nodes, register spread will necessarily entail spreading of the terminal nodes. The derivation in (9) shows the same spreading and delinking operations in the bundled model.
(9)
When a high register node spreads to the preceding syllable, the falling contour it dominates spreads as well. Similarly, delinking the register node necessarily delinks its daughters, i.e. the rising contour. This results in the unattested output *[MH.MH].
Similar arguments have been put forward with respect to sandhi alternations where contour spreads independently of register. A relevant example comes from another dialect, Zhenjiang (Zhang 1985, Bao 1990): when a rising or falling contour tone appears before a high level tone, it surfaces as either mid level or high level, depending on the register of the affected tone. In (10), low-register contour tones surface as mid level in this environment.
(10)
Bao (1990) proposes a regressive contour-spreading analysis, and cites Zhenjiang as providing further evidence in favour of the separated model, as sandhi does not alter register.Footnote 5 In this analysis, after spreading and delinking operations, the structure undergoes a tier-conflation operation in which the spread c node (along with the immediately dominated h node) is copied (see Bao 1990: 101–103 and §6 below for more discussion). This is illustrated in (11) for the derivation /ML.H/ → [M.H].
(11)
As with Pingyao, spreading in a bundled representation would ostensibly entail carriage of register information along with contour information, an undesired result. This is because the node immediately dominating terminal tonal nodes in the bundled model bears register features. In other words, there is no procedural difference between register spread, contour spread and whole tone spread. As shown in (12), the correct output cannot be derived with a similar analysis in the bundled model: contour spread and tier conflation entail register spread, because the former is directly dominated by the latter, producing the unattested *[H.H].
(12)
Claims that the two models differ in their coverage of assimilatory tone-sandhi patterns hinge on the spreadability metric described above. However, spreadability arguments are fundamentally tied to a derivational perspective on tonal processes. It is not clear what this metric means for a particular theory when couched in a non-derivational formalism, or whether it distinguishes one theory from another in such cases. This issue is treated in more detail in §6, but I first present an alternative computational perspective on the question of the empirical predictions of the two models. Instead of a potentially theory-specific notion of spreadability, this perspective establishes an explicit upper bound on the computational complexity of a set of processes, and asks whether competing representational models can capture those processes within that bound. The next section adopts a computational outlook on the separated and bundled models, and challenges previous claims about their empirical predictions.
4 Graph mappings: empirical predictions
This section addresses the condition in (1a), which states that two models are notationally equivalent if their empirical consequences do not differ. It illustrates that this condition holds for bundled and separated representations, contrary to the conclusions of previous work. The focus is on the cases of assimilatory tone sandhi that have been claimed to distinguish the two models – register assimilation and contour assimilation.
To achieve this, I present a computational characterisation of assimilatory tone sandhi. This framework focuses on the nature of the input–output mappings which describe phonological processes (Chandlee & Heinz 2018, Heinz 2018), and thus abstracts away from assumptions specific to any one grammatical formalism. In particular, I consider logical characterisations of tone-sandhi mappings in a model-theoretic framework (Courcelle 1994, Enderton 2001, Libkin 2004). I begin by offering rigorous definitions of separated and bundled representations as graphs in §4.1. Logical transductions are then defined using these graph structures. Transductions map input graphs to output graphs, and can therefore be used to model phonological processes such as assimilatory tone sandhi. A benefit of this approach is that we may fix the complexity of logic used to define transductions, and determine whether mappings using either representation are possible with the same complexity threshold.
I demonstrate further that the separated and bundled models do not differ in their empirical predictions, by showing that transductions modelling register assimilation (§4.2) and contour assimilation (§4.3) are definable for both representations using quantifier-free first-order (QF) logical statements. QF logic is restrictive and computationally simple, and has been shown to be equivalent to the Input Strictly Local class of functions (Chandlee & Jardine 2019b, Chandlee & Lindell to appear); these functions are sufficient to model a wide range of local phonological processes, both segmental and autosegmental (Chandlee 2014, Strother-Garcia 2018, Chandlee & Jardine 2019a), despite their restrictiveness. Statements using QF logic determine output structure based solely on information about the corresponding input position and about input positions within a fixed window (i.e. a local window) around it. Crucially, the information is not global in the sense that a quantifier is required to scan the entire input structure. QF thus provides an appropriate and well-motivated upper bound on the complexity of the processes formalised here. Full transductions are defined in Appendix A; in the main text I provide an intuitive graphical characterisation in terms of local substructures.
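The locality restriction can be made concrete with a small sketch. It is a list-based simplification, not the graph transduction of Appendix A: each tone is a (register, melody) pair, and the output register at a position is computed from that position and its successor alone. The function names are hypothetical, and applying the rule to every non-final rising tone is a simplifying assumption that goes beyond the disyllabic Pingyao data in (7). The register feature is written with an ASCII hyphen (`-u`).

```python
def successor(i, n):
    """Successor on a tier of length n; the final position is its own successor."""
    return min(i + 1, n - 1)

def output_register(tones, i):
    """Output register at position i, using only positions i and successor(i)."""
    register, melody = tones[i]
    j = successor(i, len(tones))
    if j != i and melody == ("l", "h"):  # a non-final rising contour
        return tones[j][0]               # takes the register of the following tone
    return register

def register_assimilation(tones):
    """Map each input tone to an output tone with the same melody."""
    return [(output_register(tones, i), melody) for i, (_, melody) in enumerate(tones)]

lm_hm = [("-u", ("l", "h")), ("+u", ("h", "l"))]  # /LM.HM/
mh_ml = [("+u", ("l", "h")), ("-u", ("h", "l"))]  # /MH.ML/
print(register_assimilation(lm_hm))
print(register_assimilation(mh_ml))
```

No step inspects more than a two-position window, so nothing corresponding to a quantifier over the whole input is needed. The melody is carried over unchanged, which mirrors the observation that the shape of the contour is unaffected by register assimilation.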
This section defines register- and contour-assimilation mappings in both representational models, and explores two hypotheses regarding the restrictiveness of QF. The first, more restrictive, hypothesis limits mappings to size-preserving QF transductions, for which the size of the input and output structures remains constant. The second, less restrictive, hypothesis permits non-size-preserving QF transductions; these map input structures to a finite number of output copies.Footnote 6 I will show that the latter hypothesis is necessary to capture register and contour assimilation in both models.
4.1 Tonal models as graphs and processes as graph mappings
Geometric tonal models can be explicitly represented as graphs, which are finite sets of points or nodes connected by edges. Each node is labelled with at most one feature: syllable, tonal root, ±u register feature, etc. Edges between nodes represent the internal structure of a tone: association between syllable and root node, dominance between internal nodes and linear order between nodes of the same type. A relational model ℳ is a mathematical object defining such a graph structure. It comprises a set or domain 𝒟 of structural positions (nodes) defined over an alphabet Σ of feature symbols. A set of unary relations (denoted P for each symbol in an alphabet Σ) determines the labelling of nodes with a particular feature – that is, the property of being a syllable, register node, etc. Unary relations for the bundled and separated models and the labels that each relation imparts are shown in Table III.
Table III Unary relations for the bundled and separated models and their labels.

The models contain the same set of unary relations, except that the separated model contains two extra relations labelling T and c nodes.
A set of unary functions defines the edges between nodes that represent internal structure and linear order. I use the same set of functions for both representations, and define them as follows. A function α defines an edge between a node labelled as a syllable and a node labelled as a root, and represents association. A function δ defines an edge between nodes that represents immediate dominance. A successor function s defines an edge between a node and its immediate successor. This function thus establishes a linear order over elements in the representation. Crucially, the order obtains only between nodes of the same type (i.e. that are on the same tier) – for example, in the separated model, register nodes are ordered with respect to one another, but not with respect to c nodes, which have their own order. The function is defined such that the final element in a tier is its own successor.
With this set of relations and functions, we can explicitly represent models of bundled and separated graph structures. (13) shows the disyllabic sequence [L.MH], a low level tone followed by a high rising tone, defined over a bundled graph and its corresponding model. Structural positions in the domain are denoted with numbers. A node – a single structural position in a graph – is represented as a circle containing its label. Edges are denoted with arrows, and are labelled with the corresponding association α, dominance δ and successor s functions. In the model, a relation is defined as the set of positions which are in that relation, i.e. which bear that label. Functions are denoted as sets of ordered pairs of positions which define edges.
(13)
The same tone in the separated representation is defined as a graph model in (14).
(14)
The structural elements of any sequence of tones describable with a bundled or separated model can be defined in this way.
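As an informal illustration, a relational model of this kind can be written down directly as data. The sketch below encodes a bundled model of [L.MH] in Python and reads the tones back off the graph. The position numbers, the dictionary layout, the edge directions (α from syllable to root, δ from a node to its immediate dominator) and the helper names `label_of`, `rank` and `read_tones` are all assumptions made for the example; they need not match the numbering in (13).

```python
# Bundled model of [L.MH]. Position numbers are illustrative only.
bundled = {
    "domain": {1, 2, 3, 4, 5, 6, 7},
    # Unary relations: label -> set of positions carrying that label.
    "labels": {"syl": {1, 2}, "-u": {3}, "+u": {4}, "l": {5, 6}, "h": {7}},
    # Functions as sets of ordered pairs (source, target).
    "alpha": {(1, 3), (2, 4)},          # syllable -> tonal root (assumed direction)
    "delta": {(5, 3), (6, 4), (7, 4)},  # terminal -> immediate dominator (assumed)
    # Successor on each tier; the last node of a tier loops to itself.
    "succ": {(1, 2), (2, 2), (3, 4), (4, 4), (5, 6), (6, 7), (7, 7)},
}


def label_of(model, x):
    """Return the label of position x, or None; a node has at most one label."""
    found = [name for name, nodes in model["labels"].items() if x in nodes]
    assert len(found) <= 1
    return found[0] if found else None


def rank(model, x):
    """Number of successor steps from x to the end of its tier."""
    succ = dict(model["succ"])
    steps = 0
    while succ[x] != x:
        x, steps = succ[x], steps + 1
    return steps


def read_tones(model):
    """List (register, contour) for each syllable, in linear order."""
    tones = []
    for syl in sorted(model["labels"]["syl"], key=lambda n: -rank(model, n)):
        (root,) = [r for (s, r) in model["alpha"] if s == syl]
        terminals = [t for (t, r) in model["delta"] if r == root]
        terminals.sort(key=lambda n: -rank(model, n))
        contour = "".join(label_of(model, t) for t in terminals)
        tones.append((label_of(model, root), contour))
    return tones


print(read_tones(bundled))  # [('-u', 'l'), ('+u', 'lh')]
```

The low level tone is read as a −u register dominating a single l terminal, and the high rising tone as a +u register dominating l followed by h.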
Assimilatory tone-sandhi processes are represented in a model-theoretic framework with transductions. Transductions map an input graph structure to a corresponding output graph, and a QF logic is fixed to define them. Here, I provide an intuitive discussion of these transductions with graph mappings, with the restriction being that outputs can only be defined by referring to local connected substructures in the input – i.e. they reference input nodes connected by edges. This is described in detail below.
Graph models comprise relations (which label nodes) and functions (which define edges between nodes). Similarly, graph transductions determine node labels and edges over an output graph by referring to input structure. Labels and edges are treated separately in a transduction, but refer to the same local input substructures. Importantly, a transduction defines a mapping over a class of graph structures, not over an individual graph. Thus transductions are definitions satisfied by a potentially infinite set of graph mappings. Consider the simplified example of a regressive spreading-type map in (15). The map is defined over a class of graph structures with two separate tiers of ordered nodes: one tier contains nodes labelled a and another contains nodes labelled either b or c. Nodes on these tiers relate one-to-one via edges marked δ. Regressive spread is the addition of a δ edge between the final node on the b/c tier and the penultimate node on the a tier. It also entails deletion of the input δ edge between penultimate nodes on these tiers, and thus ‘deletion’ of the penultimate node on the b/c tier (denoted with a dashed circle; more explanation is given below). The mapping in (15) is over a graph structure with three nodes on each tier, where ↦ denotes ‘maps to’ and δ¹,¹ indicates output δ labels. All mappings discussed in this paper are order-preserving, i.e. order relations are preserved from input to output structures (see Filiot Reference Filiot, Banerjee and Krishna2015 and Chandlee & Jardine Reference Chandlee, Jardine, de Groote, Drewes and Penn2019b for discussion). These edges (i.e. the successor s function) are therefore omitted, for clarity, but I assume a total order over each tier as in (13) and (14).
(15)
Defining output node labels via transduction is achieved through reference to local connected substructures; i.e. a given output node is determined by referring only to the corresponding input node and other nodes connected by edges in the input structure. These definitions thus determine output structure using only a fixed window in the input. It is possible, for example, to isolate structural elements such as the final a node and the penultimate b node, as illustrated in (16). This is because both constitute connected substructures in the input. The former is an a node with a looping s edge and the latter is a b node which shares an s edge with some final element – that is, an element with a looping s edge. The definition may preserve labels in the output for elements which map directly (an identity mapping) such as the final a node in (16a). Alternatively, it may map to an empty label, as in the penultimate b node in (16b). Such a definition ‘deletes’ the label from the output structure, and is comparable to deletion of structural material, for instance, after delinking in an autosegmental analysis. Importantly, however, it is still size-preserving; though the label does not appear in the output, the structural node itself is preserved, and thus does not alter the size of the input structure as a whole. The examples below demonstrate how the transduction defines output labels in terms of local substructures. Relevant input ordering edges are shown with dotted lines; x denotes the relevant input node and x′ the corresponding output node.
(16)
The size-preserving node-labelling definitions in (16) can be contrasted with strictly less restrictive, non-size-preserving transductions. Transductions of this complexity are defined over a finite set of multiple output copies (a copy set), and thus permit an output which is of a greater size than the input. Intuitively, non-size-preserving transductions are those which model processes with a copying mechanism. An example would be a final a node (as in (16a)) mapped to two copies of itself, as in (17), where x″ denotes a second copy of the input structure.
(17)
Transductions of this type are presented in the following sections, but it is worth noting here that the copying they permit is crucially restricted by the local substructure (i.e. QF-definability) requirement. That is, in addition to the finiteness of the copy-set size, the number of nodes definable within a given copy set is bounded by the size and structure of the input. In other words, despite permitting a copying mechanism, these transductions are still part of a restrictive and computationally simple class of maps.
Recall that maps define both output nodes and output edges. Determining output edges proceeds in a similar manner as with nodes, and is subject to the same restrictions. The regressive spreading-type map in (15) is possible because it can be defined in terms of local substructures. This is summarised in (18a) and (b). First, the final nodes on the a and b/c tiers are related via an input δ edge, and so the output dominance relation between them is simply an identity mapping; the input edge between two input nodes (x and y in (18)) is preserved in the output (x′ and y′; note that the δ edge on the first nodes in each tier may be preserved in the same way). As in (15), this output edge is denoted δ¹,¹, i.e. an edge from a node in the first output copy to a node in the first output copy.Footnote 7
(18)
The final c and penultimate a relate via the same δ edge in addition to a successor s edge between the penultimate and final a nodes, thus forming a local input substructure, as in (18b). Combined in a single transduction, these edges model a regressive-spread type pattern in (18c).
Graph mappings which refer only to local connected substructures model the space of maps definable with QF logical transductions. These maps are restrictive and formally rigorous, and align well with the complexity necessary to formalise local phonological processes (Chandlee & Jardine Reference Chandlee, Jardine, de Groote, Drewes and Penn2019b, Chandlee & Lindell Reference Chandlee, Lindell and Heinz to appear).
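The regressive spreading-type map discussed above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the formal definition: δ is oriented from each a node to its partner on the b/c tier, nodes are integers, and the names `is_final`, `is_penult` and `regressive_spread` are hypothetical. Each output label and edge is computed by inspecting only the input node and nodes reachable from it by a bounded number of s and δ edges.

```python
def is_final(succ, x):
    """A tier-final node is its own successor (a looping s edge)."""
    return succ[x] == x


def is_penult(succ, x):
    """A penultimate node shares an s edge with a distinct final node."""
    return succ[x] != x and is_final(succ, succ[x])


def regressive_spread(labels, succ, delta):
    """labels: node -> 'a' | 'b' | 'c'; delta: a node -> its b/c partner."""
    # Labels: the penultimate b/c node keeps its position but loses its label.
    out_labels = {
        x: (None if lab in ("b", "c") and is_penult(succ, x) else lab)
        for x, lab in labels.items()
    }
    # Edges: the penultimate a node takes over the delta target of the final a node;
    # every other delta edge is an identity mapping.
    out_delta = {
        x: (delta[succ[x]] if is_penult(succ, x) else y)
        for x, y in delta.items()
    }
    return out_labels, out_delta


# Three nodes per tier: a a a over b b c.
labels = {1: "a", 2: "a", 3: "a", 4: "b", 5: "b", 6: "c"}
succ = {1: 2, 2: 3, 3: 3, 4: 5, 5: 6, 6: 6}
delta = {1: 4, 2: 5, 3: 6}

out_labels, out_delta = regressive_spread(labels, succ, delta)
print(out_labels[5], out_delta)  # None {1: 4, 2: 6, 3: 6}
```

Because the definition mentions only `is_final` and `is_penult`, the same function applies unchanged to tiers of any length, which is the sense in which a transduction defines a mapping over a class of graphs rather than over one graph.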
4.2 Register assimilation: Pingyao
Having defined separated and bundled theories as graphs, I now define transductions over these representations to model attested tone-sandhi patterns. Using this formalism, this subsection and the next show that Pingyao register-assimilation patterns and Zhenjiang contour-assimilation patterns are definable as QF transductions over both separated and bundled representations. Formalising register assimilation in this framework uncovers a discrepancy in the QF logical power required by the two representations: size-preserving QF logic is sufficient for the separated model, whereas non-size-preserving QF logic is necessary for the bundled model. While this may appear to signal a more general difference in the models’ capacity to capture assimilatory tone-sandhi processes, the distinction vanishes in the analysis of contour assimilation. When register and contour assimilation are taken together, the more powerful non-size-preserving QF transductions are necessary for both representations. Within the computational perspective advocated here, this indicates that the separated and bundled models do not differ in their empirical consequences, contrary to previous claims.
4.2.1 The separated model
The first transduction defines mappings between input and output separated model graphs, and models register assimilation in Pingyao. Consider the Pingyao input /LM.HM/ as a separated graph structure, as in (19). (For clarity, successor function edges are omitted.)
(19)
The transduction is defined over a single copy of output nodes as follows. A penultimate −u register node is first mapped to an empty label. Such a node is definable, as is shown by the connected substructure in (20): a −u node sharing an s edge with some final element (one with a looping s edge).
(20)
Other output labels map directly from corresponding inputs.
Output edges are also preserved from inputs, with one exception.Footnote 8 The δ edge between the final register and the final T node is preserved, but an additional edge is defined between the final register node and the penultimate T node. Again, this definition is possible because these nodes comprise a local input substructure, as in (21).
(21)
Applied to the graph in (19), this transduction maps to the correct form [MH.HM] as an output graph in (22), where primes denote output positions. Note that this graph is also consistent with the sandhi form in (8). Thick arrows here and below denote changes in the output.
(22)
Importantly, the graph mapping as defined generalises beyond this case to any sequence with a penultimate rising tone. The restriction of local substructures (and thus QF) is therefore sufficient to model this pattern in the separated representation. Next, I show that the same holds for the bundled representation, though a more powerful QF logic is required.
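The separated-model mapping just described can be sketched in Python. The sketch is an illustration only: the position numbers, the orientation of δ (from a node to its immediate dominator), the decision to ignore unlabelled nodes when reading tones, and the `LETTER` table converting register and terminal features to the tone letters H, M and L are assumptions chosen to agree with the transcriptions used in the text.

```python
# Assumed conversion from (register, terminal) to a tone letter.
LETTER = {"+u": {"h": "H", "l": "M"}, "-u": {"h": "M", "l": "L"}}

# Separated graph for /LM.HM/; position numbers are illustrative.
labels = {1: "syl", 2: "syl", 3: "T", 4: "T", 5: "-u", 6: "+u",
          7: "c", 8: "c", 9: "l", 10: "h", 11: "h", 12: "l"}
succ = {1: 2, 2: 2, 3: 4, 4: 4, 5: 6, 6: 6, 7: 8, 8: 8,
        9: 10, 10: 11, 11: 12, 12: 12}
alpha = {(1, 3), (2, 4)}
delta = {(5, 3), (7, 3), (6, 4), (8, 4), (9, 7), (10, 7), (11, 8), (12, 8)}


def is_penult(x):
    """x shares an s edge with a distinct node that loops to itself."""
    return succ[x] != x and succ[succ[x]] == succ[x]


def register_assimilation(labels, delta):
    """Size-preserving map: one output position per input position."""
    targets = {x for x, lab in labels.items() if lab == "-u" and is_penult(x)}
    out_labels = {x: (None if x in targets else lab) for x, lab in labels.items()}
    # Keep every input edge; add one from the final register to the penultimate T.
    added = {(succ[x], t) for x in targets for (r, t) in delta if r == x}
    return out_labels, delta | added


def rank(x):
    """Number of successor steps from x to the end of its tier."""
    steps = 0
    while succ[x] != x:
        x, steps = succ[x], steps + 1
    return steps


def read(labels, delta):
    """Spell out each syllable's tone, skipping unlabelled nodes."""
    tones = []
    syls = sorted((x for x in labels if labels[x] == "syl"), key=lambda n: -rank(n))
    for syl in syls:
        (root,) = [t for (s, t) in alpha if s == syl]
        (reg,) = [labels[x] for (x, p) in delta if p == root and labels[x] in LETTER]
        (c,) = [x for (x, p) in delta if p == root and labels[x] == "c"]
        terms = sorted((x for (x, p) in delta if p == c), key=lambda n: -rank(n))
        tones.append("".join(LETTER[reg][labels[x]] for x in terms))
    return ".".join(tones)


out_labels, out_delta = register_assimilation(labels, delta)
print(read(labels, delta), "->", read(out_labels, out_delta))  # LM.HM -> MH.HM
```

The output has exactly as many positions as the input, so no copy set is needed for this representation.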
4.2.2 The bundled model
A different transduction defines mappings between input and output bundled model graphs, and models register assimilation in Pingyao. We can consider the Pingyao input /LM.HM/ to be a bundled graph structure, as in (23).
(23)
Unlike its separated model counterpart, this transduction is defined over two copies of output nodes (and thus uses non-size-preserving QF logic), as follows. The transduction maps a penultimate −u register node to an empty label (24a), and generates two copies of a final +u node (24b). Other node labels from the input are preserved.
(24)
Output edges are then redefined such that the α and δ edges on the penultimate syllable terminate on the second copy of the final +u node, with other edges held constant. Again, this is possible because these nodes form a connected substructure of α, δ and s edges in the input. This is shown in (25), where α¹,² and δ¹,² denote edges defined from nodes in the first output copy to nodes in the second output copy.
(25)
Applied to the graph in (23), this transduction maps to the attested form [MH.HM] as an output bundled graph, shown in (26).
(26)
Therefore, despite ‘failing’ the spreadability test (see §3.2), the bundled representation can model register assimilation in Pingyao as a logical transduction over bundled graph structures. Such a transduction is definable using a restrictive QF logic (or in our terms, by referring to local input substructures).
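A Python sketch of the bundled-model mapping makes the role of the copy set concrete. Output positions are written as pairs (input position, copy index). As before, the position numbers, edge directions, the `LETTER` table and all function names are assumptions made for illustration; in addition, the sketch only triggers when the final register is +u, and it does not model where the second copy sits in the linear order.

```python
# Assumed conversion from (register, terminal) to a tone letter.
LETTER = {"+u": {"h": "H", "l": "M"}, "-u": {"h": "M", "l": "L"}}

# Bundled graph for /LM.HM/; position numbers are illustrative.
labels = {1: "syl", 2: "syl", 3: "-u", 4: "+u", 5: "l", 6: "h", 7: "h", 8: "l"}
succ = {1: 2, 2: 2, 3: 4, 4: 4, 5: 6, 6: 7, 7: 8, 8: 8}
alpha = {(1, 3), (2, 4)}
delta = {(5, 3), (6, 3), (7, 4), (8, 4)}


def is_final(x):
    return succ[x] == x


def is_penult(x):
    return succ[x] != x and is_final(succ[x])


def register_assimilation(labels, alpha, delta):
    """Copy set of size two: output positions are (position, copy) pairs."""
    lost = {x for x, lab in labels.items()
            if lab == "-u" and is_penult(x) and labels[succ[x]] == "+u"}
    out_labels = {(x, 1): (None if x in lost else lab) for x, lab in labels.items()}
    # Second copy of a final +u register node.
    for x, lab in labels.items():
        if lab == "+u" and is_final(x):
            out_labels[(x, 2)] = "+u"

    def redirect(edges):
        # Edges that ended on the lost register now end on copy 2 of its successor.
        return {((x, 1), ((succ[y], 2) if y in lost else (y, 1))) for (x, y) in edges}

    return out_labels, redirect(alpha), redirect(delta)


def rank(node):
    """Successor steps from the underlying input position to its tier end."""
    x, steps = node[0], 0
    while succ[x] != x:
        x, steps = succ[x], steps + 1
    return steps


def read(out_labels, out_alpha, out_delta):
    tones = []
    syls = sorted((n for n, lab in out_labels.items() if lab == "syl"),
                  key=lambda n: -rank(n))
    for syl in syls:
        (root,) = [r for (s, r) in out_alpha if s == syl]
        terms = sorted((t for (t, r) in out_delta if r == root), key=lambda n: -rank(n))
        tones.append("".join(LETTER[out_labels[root]][out_labels[t]] for t in terms))
    return ".".join(tones)


out = register_assimilation(labels, alpha, delta)
print(read(*out), len(labels), "->", len(out[0]))  # MH.HM 8 -> 9
```

The output contains one more position than the input, which is what places this mapping outside the size-preserving class.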
4.3 Contour assimilation: Zhenjiang
The difference in the class of logic needed to represent assimilatory processes between models vanishes in the case of contour assimilation. Examining both types of assimilation reveals that size-preserving QF is in fact too restrictive for both bundled and separated models, because we require copying, and thus copy sets, to capture contour assimilation in both representations. Despite this fact, transductions modelling contour assimilation are definable over both models, using non-size-preserving QF logic.
4.3.1 The separated model
This section defines a transduction to model Zhenjiang contour assimilation over a separated representation. The goal is a transduction which maps input graphs to output structures like those in (11); that is, ones which have undergone tier conflation. One desired output of such a transduction would be the graph mapping of /ML.H/ to [M.H] in (27). Here, the output contains two copies of the input's high contour.
(27)
This transduction is necessarily defined over a copy set of size two, as follows. The penultimate contour (c node and any terminal h or l, denoted ‘t’) maps to unlabelled nodes, as in (28).
(28)
Other labels are preserved, with the exception of the final high contour. To produce output graphs consistent with Bao's (Reference Bao1990) analysis, two copies of this connected input substructure are generated, as in (29).
(29)
The first output copy of the high contour node preserves its input δ edge with the corresponding input T node in (30a). Another edge is defined between the second copy of the high contour node and the penultimate T node in (30b).
(30)
Defined in this way, the transduction yields maps consistent with Bao's analysis of contour assimilation, including the mapping in (27) above. Importantly, though, non-size-preserving QF logic is necessary to generate these structures. Taken as a whole, then, size-preserving QF is too restrictive to capture both register and contour assimilation for the separated model.
4.3.2 The bundled model
A transduction of the same process over the bundled model is also defined over a copy set of size two. A Zhenjiang input /ML.H/ is presented as a bundled graph structure in (31).
(31)
Mapping the penultimate contour to unlabelled nodes is similar to the procedure used for the separated model. The difference lies in the fact that we want to preserve the immediate dominator of terminal nodes on the penult (i.e. preserve its input label), because this node carries register features. Since labels and edges are defined separately, terminal nodes on the penult can be isolated by referring to the structural position with which both nodes share a δ edge in the input. This definition thus applies to falling contours (as in the example above), but also equally well to rising contours and high and low level tones; any tonal terminal nodes which share a δ edge with some penultimate node satisfy the definition. In the bundled representation, this position is labelled with a register feature, while in the separated representation, it is labelled c (see §5). It is therefore possible – using connected substructures – to map the penultimate contour to unlabelled nodes (32a), while also preserving the input label on the penultimate register node (32b).
(32)
The transduction also generates two copies of the final h node (but not two copies of the final register node) in the same way, as in (33).
(33)
All input edges are preserved in the output, with the exception of the second copy of the final h node. An output δ edge is defined between that node and the penultimate register node in the first copy set. This is shown in (34), where y′ denotes the first (and only) output copy of the penultimate register node and x″ denotes the second copy of the final h terminal node.
(34)
Applied to the graph in (31), this transduction maps to the correct output [M.H] bundled graph, as in (35).
(35)
Zhenjiang contour assimilation is thus formalisable over a bundled representation of tone using non-size-preserving QF logical transductions.
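The bundled contour-assimilation mapping can likewise be sketched in Python. All input edges are kept, the penultimate terminals lose their labels, and a second copy of each terminal under the final register is linked to the penultimate register. The position numbers, edge directions, the `LETTER` table, the function names and the choice to copy every terminal under the final register (here just the single h) are assumptions made for the example.

```python
# Assumed conversion from (register, terminal) to a tone letter.
LETTER = {"+u": {"h": "H", "l": "M"}, "-u": {"h": "M", "l": "L"}}

# Bundled graph for /ML.H/; position numbers are illustrative.
labels = {1: "syl", 2: "syl", 3: "-u", 4: "+u", 5: "h", 6: "l", 7: "h"}
succ = {1: 2, 2: 2, 3: 4, 4: 4, 5: 6, 6: 7, 7: 7}
alpha = {(1, 3), (2, 4)}
delta = {(5, 3), (6, 3), (7, 4)}


def is_final(x):
    return succ[x] == x


def is_penult(x):
    return succ[x] != x and is_final(succ[x])


def contour_assimilation(labels, alpha, delta):
    """Copy set of size two: output positions are (position, copy) pairs."""
    registers = {x for x, lab in labels.items() if lab in LETTER}
    penult_regs = {r for r in registers if is_penult(r)}
    final_regs = {r for r in registers if is_final(r)}
    out_labels = {(x, 1): lab for x, lab in labels.items()}
    out_alpha = {((s, 1), (r, 1)) for (s, r) in alpha}
    out_delta = {((t, 1), (r, 1)) for (t, r) in delta}  # input edges preserved
    for (t, r) in delta:
        if r in penult_regs:
            out_labels[(t, 1)] = None        # penultimate contour loses its labels
        if r in final_regs:
            out_labels[(t, 2)] = labels[t]   # second copy of a final terminal
            out_delta |= {((t, 2), (y, 1)) for y in penult_regs if succ[y] == r}
    return out_labels, out_alpha, out_delta


def rank(node):
    x, steps = node[0], 0
    while succ[x] != x:
        x, steps = succ[x], steps + 1
    return steps


def read(out_labels, out_alpha, out_delta):
    """Spell out each syllable's tone, skipping unlabelled terminals."""
    tones = []
    syls = sorted((n for n, lab in out_labels.items() if lab == "syl"),
                  key=lambda n: -rank(n))
    for syl in syls:
        (root,) = [r for (s, r) in out_alpha if s == syl]
        terms = [t for (t, r) in out_delta if r == root and out_labels[t] is not None]
        terms.sort(key=lambda n: -rank(n))
        tones.append("".join(LETTER[out_labels[root]][out_labels[t]] for t in terms))
    return ".".join(tones)


out = contour_assimilation(labels, alpha, delta)
print(read(*out))  # M.H
```

The penultimate register keeps its −u label, so the copied h terminal is realised as M on the first syllable and as H on the second.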
5 Graph mappings: structural differences and bi-interpretability
I now turn to condition (1b), which states that two models are notationally equivalent when they represent the same set of abstract properties, and differ only superficially. This condition can be satisfied in a model-theoretic framework by demonstrating the bi-interpretability of the bundled and separated models. This provides a rigorous formal expression of models differing only ‘superficially’. A definition of model bi-interpretability is given by Friedman & Visser (Reference Friedman and Visser2014: 2):
We note that an interpretation K:U → V gives us a construction of an internal model K̃ (ℳ) of U from a model ℳ of V. We find that U and V are bi-interpretable iff, there are interpretations K:U → V and M:V → U and formulas F and G such that, for all models ℳ of V, the formula F defines an isomorphism between ℳ and M̃ K̃ (ℳ), and, for all models 𝒩 of U, the formula G defines an isomorphism between 𝒩 and K̃ M̃ (𝒩).
Intuitively, this means that one model can be translated into the other (and vice versa), and that all contrasts are preserved through translation. The formal details of separated/bundled model bi-interpretability are spelled out in Appendix B; here I present an intuitive discussion. This section divides the definition above into two main components. The first (§5.1) establishes interpretations between models. An interpretation is a specific kind of map from one structure to another. For example, an interpretation Γbs maps bundled structures to separated structures by providing a model of bundled structures using the logical language of separated structures. Similarly, an interpretation Γsb maps separated structures to bundled structures by providing a model of separated structures using the logical language of bundled structures. The existence of both interpretations corresponds to the notion that the models are intertranslatable. In conceptual terms, the map defined by Γsb represents a fusion of T, c and ±u nodes into a single register node (holding all other nodes constant). The map defined by Γbs represents an expansion of a single register node to separate T, c and ±u nodes (again holding all other nodes constant). These intuitions were provided in (5) above, where dashed arrows represent fusion and expansion respectively.
The second component of bi-interpretability (§5.2) requires that the following conditions hold. First, combining Γsb and Γbs through composition – mapping bundled structures into separated structures and back into bundled structures – produces the same mapping as (i.e. is isomorphic to) the identity map that maps every bundled structure to itself. Similarly, composing Γbs with Γsb is isomorphic to the identity map that maps every separated structure to itself. In intuitive terms, this component demands that the two interpretations be inverses or mirror images of one another; as such, the translations they achieve preserve all contrasts present in the original representation. I demonstrate this by showing that for any tonal structure describable by bundled and separated representations, the output of Γsb is structurally identical to the input of Γbs, and vice versa.
5.1 Intertranslatability
5.1.1 Separated to bundled: fusion
A transduction Γsb maps any structure in a separated representation to a corresponding structure in a bundled representation, and is thus an interpretation of the class of separated models in terms of the class of bundled models. Like process transductions, Γsb is a mapping between graph structures that takes a set of node labels and edge relations as input, and maps it to another set of output node labels and edge relations. It is defined over a single copy set.
All bundled node labels are defined as identity mappings from relevant labels in the separated model, as all features in the former model are contained in the latter. This includes register node labels. Predecessor p and successor s edges are preserved, as linear order does not change. Association α and dominance δ edges, however, must be redefined to reflect the fusion of T and c structural positions into a single register node.
In the separated model, syllables and T nodes relate via α edges (representing the association relation). Register nodes also relate to T nodes, but via δ edges. These nodes and edges constitute a local substructure. We may use this substructure to redefine α edges in a bundled model such that syllable nodes relate directly to register nodes, which is not the case in the separated model. (36) illustrates how α edges are defined in the transduction.
(36)
Although the T node is not labelled in the output structure – recall that the bundled representation does not contain T nodes – this structural position is still a part of the input structure, and can therefore be referred to in defining the output. The definition here ‘fuses’ the separated model's T node and the bundled model's register node as the structural position which shares an α edge with the syllable. More generally, this reflects the fact that the T node in the separated model and the register node in the bundled model have the same structural function, i.e. the tonal root.
Defining δ edges in the transduction utilises another local input substructure to establish edges from terminal tonal nodes directly to register nodes, a relation which does not obtain in the separated structure. It builds on the fact that in the separated model, register nodes and tonal nodes both relate to a T node; the former shares a δ edge, while the latter relates through a c contour node and two δ edges. This mapping is shown in (37).
(37)
Again, the fact that T and c nodes are unlabelled in the output does not prevent reference to these structural positions to relate terminal nodes directly to register nodes. The ‘fusion’ here is between the c node in the separated model and the register node in the bundled model, and reflects the generalisation that these nodes also have the same structural function: the immediate dominator of h and l terminal nodes. The transduction Γsb applied to a separated model structure produces an equivalent structure in a bundled representation. An example of a disyllabic sequence, [L.MH], is given in (38) (predecessor and successor edges are omitted for clarity).
(38)
Γsb produces an equivalent bundled model structure from any separated model structure. It is therefore an interpretation of the class of separated graphs in terms of the class of bundled graphs.
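The fusion translation can be sketched in Python as a single-copy mapping. The function name `gamma_sb`, the position numbers and the orientation of α and δ are assumptions made for the example. T and c positions remain in the output but carry no label, while α and δ are redefined by following the local paths described above.

```python
REGISTERS = ("+u", "-u")

# Separated graph for [L.MH]; position numbers are illustrative.
labels = {1: "syl", 2: "syl", 3: "T", 4: "T", 5: "-u", 6: "+u",
          7: "c", 8: "c", 9: "l", 10: "l", 11: "h"}
succ = {1: 2, 2: 2, 3: 4, 4: 4, 5: 6, 6: 6, 7: 8, 8: 8, 9: 10, 10: 11, 11: 11}
alpha = {(1, 3), (2, 4)}
delta = {(5, 3), (7, 3), (6, 4), (8, 4), (9, 7), (10, 8), (11, 8)}


def gamma_sb(labels, succ, alpha, delta):
    """Separated -> bundled over a single copy set (same positions in and out)."""
    # Labels shared by both models map directly; T and c become unlabelled.
    out_labels = {x: (None if lab in ("T", "c") else lab) for x, lab in labels.items()}
    # For each T node, the register node it dominates.
    reg_of_T = {t: r for (r, t) in delta if labels[r] in REGISTERS}
    # Immediate dominator of every non-register node (terminal -> c, c -> T).
    parent = {x: p for (x, p) in delta if labels[x] not in REGISTERS}
    # alpha': syllable -> register that shares the syllable's T node.
    out_alpha = {(s, reg_of_T[t]) for (s, t) in alpha}
    # delta': terminal -> register, found via terminal -> c -> T <- register.
    out_delta = {(x, reg_of_T[parent[c]]) for (x, c) in delta
                 if labels[x] in ("h", "l")}
    return out_labels, succ, out_alpha, out_delta


b_labels, b_succ, b_alpha, b_delta = gamma_sb(labels, succ, alpha, delta)
print(sorted(b_alpha), sorted(b_delta))
# [(1, 5), (2, 6)] [(9, 5), (10, 6), (11, 6)]
```

Syllables now associate directly to register nodes, and terminals are immediately dominated by register nodes, as in the bundled representation.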
5.1.2 Bundled to separated: expansion
Similarly, a transduction Γbs maps any structure in a bundled representation to a corresponding structure in a separated representation, and is thus an interpretation of the class of bundled models in terms of the class of separated models.
The number of node-label types in the separated model is greater than in the bundled model – the former contains T and c labels, which are not present in the latter. A copy set of size greater than one is therefore necessary. The transduction is defined over a copy set of size three, where each copy set will represent one ‘expansion’ of the bundled model's register node: T nodes in the first copy set, register nodes in the second copy set and c nodes in the third copy set. These nodes relate via a one-to-one identity mapping. This is shown in (39), where x′, x″ and x‴ indicate nodes in the first, second and third copy sets respectively.
(39)
Syllable nodes are labelled in the first copy set, while h/l tonal nodes are labelled in the third copy set. This allows for preservation of α and δ edges from the input bundled structure, given in (40a) and (b) respectively. Again, this reflects the fact that these nodes share structural functions across models: register and T nodes relate to syllables via association, and register and c nodes to h/l tonal nodes via dominance.Footnote 9
(40)
The transduction defines the internal structure of the separated model – i.e. δ edges between register/c and T nodes – in terms of the bundled model in the following way. Given that T, register and c labels in the separated structure map directly from a single register node in the bundled structure, δ edges between these nodes can also be defined in terms of the same register node, keeping in mind that a single node constitutes a connected substructure. Dominance from the register node to the T node is a δ edge from a node in the second copy set to an identical node in the first copy set, and dominance from the c node to the T node is a δ edge from a node in the third copy set to an identical node in the first copy set, as shown in (41).
(41)
The edges that the transduction defines between and within copy sets are summarised in (42).
(42)
Combining these definitions, the transduction Γbs applied to a bundled structure produces an equivalent structure in the separated model. An example of a disyllabic sequence [L.MH] translated from a bundled to a separated structure is given in (43).
(43)
Again, this is true for any bundled representation graph. Therefore, Γbs is an interpretation of the class of bundled graphs in terms of the class of separated graphs.
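The expansion translation can be sketched in Python in the same style. Output positions are pairs (input position, copy index) over a copy set of size three. The function name `gamma_bs`, the position numbers and the edge directions are assumptions; only labelled output positions are materialised, whereas a formal transduction would also contain the unlabelled copies.

```python
REGISTERS = ("+u", "-u")

# Bundled graph for [L.MH]; position numbers are illustrative.
labels = {1: "syl", 2: "syl", 3: "-u", 4: "+u", 5: "l", 6: "l", 7: "h"}
succ = {1: 2, 2: 2, 3: 4, 4: 4, 5: 6, 6: 7, 7: 7}
alpha = {(1, 3), (2, 4)}
delta = {(5, 3), (6, 4), (7, 4)}


def gamma_bs(labels, succ, alpha, delta):
    """Bundled -> separated over a copy set of size three."""
    out_labels = {}
    out_delta = set()
    for x, lab in labels.items():
        if lab in REGISTERS:
            out_labels[(x, 1)] = "T"   # first copy: tonal root
            out_labels[(x, 2)] = lab   # second copy: register
            out_labels[(x, 3)] = "c"   # third copy: contour
            # Internal structure: register and c are both dominated by T.
            out_delta.add(((x, 2), (x, 1)))
            out_delta.add(((x, 3), (x, 1)))
        elif lab == "syl":
            out_labels[(x, 1)] = lab   # syllables live in the first copy
        else:
            out_labels[(x, 3)] = lab   # h/l terminals live in the third copy
    # Input alpha and delta edges are preserved within their copies.
    out_alpha = {((s, 1), (r, 1)) for (s, r) in alpha}
    out_delta |= {((t, 3), (r, 3)) for (t, r) in delta}
    # Linear order is preserved inside each copy.
    out_succ = {(x, k): (succ[x], k) for (x, k) in out_labels}
    return out_labels, out_succ, out_alpha, out_delta


sep_labels, sep_succ, sep_alpha, sep_delta = gamma_bs(labels, succ, alpha, delta)
print(len(labels), "->", len(sep_labels))  # 7 -> 11
```

Each bundled register node yields three separated nodes, which accounts for the four extra positions in the output for this disyllabic input.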
5.2 Contrast preservation
The second main component of the bi-interpretability definition requires translations between models to be contrast-preserving. Appendix B demonstrates this in detail by examining the composition of the transductions Γsb and Γbs and showing that it is isomorphic to the identity map. Here I show that the translations described in the previous section preserve structural elements and their relations from input models, and that bundled and separated representations therefore meet this necessary criterion for bi-interpretability. The example translations in (38) and (43) illustrate this.
Given two mappings ℳs ↦ ℳ′b via Γsb (translating a separated model to an equivalent bundled model) and ℳb ↦ ℳ′s via Γbs (translating a bundled model to an equivalent separated model) of the same tonal structure, the following holds of these graph structures: ℳ′b is structurally identical to ℳb, and ℳ′s is structurally identical to ℳs. Recall the example translations of disyllabic [L.MH] in (38) and (43). The output of (38) contains the same structural elements and relations between those elements as the input of (43). This is illustrated in (44), where ℳ′b denotes the former and ℳb the latter.
(44)
The same is true of the input of (38) and the output of (43). Separated representation components are present in both, and structural elements relate to one another via the same edges, as in (45).
(45)
The above illustration generalises beyond the [L.MH] sequence to any tone or sequence of tones representable by either model. This reflects the generalisation implicit in Table I in §3.1 that bundled and separated models represent the same set of lexical tonal contrasts. Translation between the models maximally preserves those contrasts.
Combining this with the results from §5.1, we can conclude that separated and bundled representations are bi-interpretable in a strict model-theoretic sense. Within the framework adopted here, the models do not differ in any non-trivial way in terms of their structure. Condition (1b) for notational equivalence is thus satisfied.
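One direction of the contrast-preservation check can be run mechanically. The Python sketch below gives compact, hypothetical implementations of the two translations (`gamma_bs` and `gamma_sb`), composes them on a bundled [L.MH] graph, and confirms that the result equals the original once positions are renamed. Treating unlabelled positions as outside the output domain, and using the projection (x, k) ↦ x as the isomorphism, are assumptions of the sketch and not the construction in Appendix B.

```python
REG = ("+u", "-u")


def gamma_bs(labels, succ, alpha, delta):
    """Bundled -> separated (expansion); positions become (position, copy)."""
    out = {}
    for x, lab in labels.items():
        if lab in REG:
            out.update({(x, 1): "T", (x, 2): lab, (x, 3): "c"})
        else:
            out[(x, 1 if lab == "syl" else 3)] = lab
    out_succ = {(x, k): (succ[x], k) for (x, k) in out}
    out_alpha = {((s, 1), (r, 1)) for (s, r) in alpha}
    out_delta = {((t, 3), (r, 3)) for (t, r) in delta}
    out_delta |= {((x, k), (x, 1))
                  for x, lab in labels.items() if lab in REG for k in (2, 3)}
    return out, out_succ, out_alpha, out_delta


def gamma_sb(labels, succ, alpha, delta):
    """Separated -> bundled (fusion); unlabelled T and c positions are dropped."""
    reg_of_T = {t: r for (r, t) in delta if labels[r] in REG}
    parent = {x: p for (x, p) in delta if labels[x] not in REG}
    out = {x: lab for x, lab in labels.items() if lab not in ("T", "c")}
    out_succ = {x: succ[x] for x in out}
    out_alpha = {(s, reg_of_T[t]) for (s, t) in alpha}
    out_delta = {(x, reg_of_T[parent[c]]) for (x, c) in delta
                 if labels[x] in ("h", "l")}
    return out, out_succ, out_alpha, out_delta


def rename(model, f):
    """Apply a renaming f of positions to every component of a model."""
    labels, succ, alpha, delta = model
    return ({f(x): lab for x, lab in labels.items()},
            {f(x): f(y) for x, y in succ.items()},
            {(f(x), f(y)) for (x, y) in alpha},
            {(f(x), f(y)) for (x, y) in delta})


# Bundled graph for [L.MH]; position numbers are illustrative.
bundled = (
    {1: "syl", 2: "syl", 3: "-u", 4: "+u", 5: "l", 6: "l", 7: "h"},
    {1: 2, 2: 2, 3: 4, 4: 4, 5: 6, 6: 7, 7: 7},
    {(1, 3), (2, 4)},
    {(5, 3), (6, 4), (7, 4)},
)

round_trip = gamma_sb(*gamma_bs(*bundled))
print(rename(round_trip, lambda n: n[0]) == bundled)  # True
```

The check covers only the bundled-to-separated-to-bundled direction on one input; the general statement, and the converse direction, rest on the argument in Appendix B.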
6 Discussion
The previous sections applied a model-theoretic approach to the question of notational equivalence between two models of tonal representation. They showed that separated and bundled models differ neither in their empirical consequences (contrary to previous arguments) nor substantially in their representation of abstract properties. I therefore conclude that they are notationally equivalent. Here, I pause to interpret these results and consider their ramifications.
6.1 ‘Letter’ vs. ‘spirit’ of the theory
In §4 I made the claim that non-size-preserving QF logic is necessary to capture register assimilation and contour assimilation across both models; size-preserving QF is too restrictive in these cases. As it applies to feature-geometric tonal representation, the fundamental difference between these logics is that the latter allows only spreading, while the former permits both spreading and copying. This result diverges crucially from previous analyses, in that register and contour assimilation become possible for the bundled model. A reasonable question to ask is whether this is appropriate, and does not unreasonably coerce the separated and bundled models beyond their original intentions. In other words, is a non-size-preserving QF analysis in the spirit of these theories?
The answer to this question is yes, and stems from the observation that spreading and delinking mechanisms are, on their own, insufficient to capture the full range of assimilatory processes in Chinese dialects. Any representational theory of these patterns requires an additional copying mechanism in the form of tier conflation (Younes Reference Younes1983, McCarthy Reference McCarthy1986), a procedure borrowed from segmental representation and templatic morphology. Yip (Reference Yip1989: 160) describes the process, which ‘automatically copies non-adjacent multiply linked roots so as to allow interpolation of the vocal root’. Applied to an edge-in association and contour-spread pattern in Danyang (Lü Reference Lü1980, Yip Reference Yip1989, Chan Reference Chan1991), for example, tier conflation copies tonal information from a root node which has spread two syllables to the right, as in (46).
(46)
The extra derivational step ensures the surface form [HL.HL.HL.LH] (three falling contours followed by a rising contour), rather than a single falling contour realised gradually over three syllables, *[H.M.L.LH]. The same generalisation can be applied to local contexts as well, and is in fact necessary whenever a contour spreads as a unit, for the same reasons. Consider the hypothetical Danyang-like pattern in (47), formalised in a separated model, with progressive contour spreading.
(47)
With a single contour associated to two roots (as above), there is no guarantee that the observed [LH.LH] will obtain, rather than any of the logically possible *[L.H], *[L.LH] or *[LH.H]. The copying mechanism of tier conflation ensures this, and is thus necessary whenever a contour spreads as a unit. This means that a theory of tone-sandhi assimilation over these representations comprises three basic mechanisms: addition of association lines (spreading), deletion of association lines (delinking) and copying of structural nodes. Therefore, while it is true that the ‘spreadability’ metric – using only the spreading and delinking mechanisms – distinguishes separated and bundled models in cases like register assimilation, it ultimately provides an incomplete picture, because it neglects a basic operation of the theory.
Given a theory with three basic mechanisms, it is unclear how a spreading analysis with tier conflation is different from an analysis which simply copies the contour nodes and reassociates them. The Zhenjiang derivation in (11) is repeated in (48a) as illustration.
(48)
Spreading with tier conflation is identical to an alternative copying analysis, which proceeds as in (48b). The assimilating contour node is first copied (indicated with an index), then reassociated to the preceding tone after delinking. A copying analysis correctly predicts the observed [M.H] for an input string /ML.H/, as in the spreading analysis. Outputs for both analyses are not only surface-form identical, but also structurally identical, and are achieved using the theory's basic operations. They differ only in the relative order of these procedural mechanisms.
If a copying analysis is permitted in this framework, the bundled representation can in fact model register- and contour-assimilation processes, contrary to previous claims. In Pingyao register assimilation, for example, a copying analysis would generate a copy of the final register node, then reassociate the syllable and terminal nodes to that copy. This correctly predicts the output [MH.HM] from /LM.HM/, as in (49). Note that the resulting structure is identical to the output of the non-size-preserving QF mapping defined over bundled graph structures in (26).
(49)
Similarly, a copying analysis for Zhenjiang contour assimilation would generate a copy of the terminal h node, then reassociate it to the preceding register node. As with the separated model, the analysis generates attested forms for the bundled model. This is shown in (50) for /ML.H/ → [M.H]. As before, the resulting form is identical to the output of the bundled graph mapping in (35), i.e. a mapping describable by a non-size-preserving QF transduction.
(50)
These copying analyses preserve the spirit of the original separated and bundled representational theories in the sense that they utilise the same basic – in fact necessary – mechanisms as traditional spreading accounts with tier conflation: spreading, delinking and copying. The basic mechanisms are not unrestricted in their application, however. They are limited to local environments, and do not involve any long-distance dependencies. The mappings defined in §4.2 and §4.3, then, are also in the spirit of the original theory, given their QF-definability. In spite of allowing copying, they are still restricted to local environments, and thus accord with the limitations of the original theory.
6.2 Spreading vs. copying
We may also ask how ‘spreading’ and ‘copying’ analyses differ (if at all). Answering this question, as suggested in §2, is non-trivial. This is because these mechanisms are inevitably fixed to the grammatical formalisms in which they are proposed. Spreading accounts of assimilatory sandhi in Chinese dialects are couched in derivational terms. Tier conflation, for example, is crucially ordered after spreading; its scope is dependent on the structural environment created by the application of an earlier rule. In principle, a similar argument is available in the case of copying analyses in a derivational framework. Association of a copied element in such an account is also dependent on the prior delinking of underlying associations to avoid line-crossing.
These distinctions vanish in so-called ‘one-jump’ models like Optimality Theory, but the assumptions of that formalism also obscure the picture. A Correspondence account of copying like the one described above does not translate well to spreading in autosegmental representations (although some effort has been made to conflate the two; see Kitto & de Lacy Reference Kitto and de Lacy1999). The correspondence relation ℜ is not analogous to the association relation: for instance, the former obtains between elements on the same tier, while the latter is necessarily inter-tier. Instead, autosegmental spreading analyses in OT are formulated in terms of spreading-specific constraints: markedness constraints like Share (McCarthy Reference McCarthy, Goldsmith, Hume and Wetzels2010) and Agree (Baković Reference Baković2000, Lombardi Reference Lombardi and Lombardi2001, Pulleyblank Reference Pulleyblank2002), and faithfulness constraints such as Ident-Assoc (de Lacy Reference de Lacy2002). Direct comparison of the two mechanisms within OT is thus impossible, because the theory assumes that they are regulated by separate constraints in the grammar.
One benefit of the computational approach pursued in this paper is that it allows us to compare these analyses independently of any one grammatical formalism. Instead, we can fix an upper bound on complexity, and simply ask whether the analyses can be mapped within that bound. The complexity difference between size-preserving and non-size-preserving QF logic is precisely the formal expression of the difference between theories which only permit spreading and those which allow spreading and copying. Assimilatory tone sandhi in Chinese dialects modelled with graph structures falls into the latter camp, regardless of which analysis is adopted. The spreading-with-tier-conflation analysis of Zhenjiang in (48a) and the copying analysis in (48b) are definable as non-size-preserving QF logical transductions, crucially not as size-preserving ones. In fact, they are definable using the same logical transduction. Spreading (with tier conflation) and copying are thus formally indistinguishable in such cases, because they realise the same map. This fact renders the traditional ‘spreadability’ metric ineffective as an empirical test to distinguish tonal models of representation; if ‘spreading’ and ‘copying’ analyses of assimilatory tone sandhi both require non-size-preserving QF logic, either is sufficient to show that a representational model captures a given pattern within the QF bound. The QF-definable graph mappings of register and contour assimilation defined for the bundled representation in §4.2.2 and §4.3.2 do precisely that. So while the bundled model may fail the spreadability test for these patterns, it passes the more formally rigorous test, providing evidence that its empirical coverage does not differ from that of the separated model.
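The role of the copy set in this contrast can be illustrated with a toy transduction. The sketch below is an assumption-laden illustration, not a reproduction of the mappings defined earlier in the paper: the node names, the `FINAL_REGISTER` constant and the `licence` and `out_edge` predicates are all hypothetical, and Python cannot itself enforce quantifier-freeness (the predicates are simply written so that they inspect only a node, its parent and one named constant). Output nodes are pairs of an input node and a copy index, so a copy set of size one gives each input node at most one output correspondent.

```python
# Toy bundled graph for Zhenjiang /LM.H/ (node names are hypothetical).
NODES = ["s1", "R1", "t1", "t2", "s2", "R2", "t3"]
LABELS = {"s1": "syll", "R1": "L", "t1": "l", "t2": "h",
          "s2": "syll", "R2": "H", "t3": "h"}
EDGES = {("s1", "R1"), ("R1", "t1"), ("R1", "t2"),
         ("s2", "R2"), ("R2", "t3")}
PARENT = {child: parent for parent, child in EDGES}
FINAL_REGISTER = "R2"  # assumed constant naming the trigger's register node


def is_terminal(x):
    return LABELS[x] in ("h", "l")


def is_register(x):
    return LABELS[x] in ("H", "L")


def licence(x, c):
    """Decide whether copy c of input node x appears in the output."""
    final_terminal = is_terminal(x) and PARENT[x] == FINAL_REGISTER
    if c == 1:
        # Keep everything except the terminals of the assimilating tone.
        return (not is_terminal(x)) or final_terminal
    # Copy 2 exists only for terminals of the final register.
    return final_terminal


def out_edge(p, q):
    """Decide whether output node p is associated to output node q."""
    (x, c), (y, d) = p, q
    if (c, d) == (1, 1):
        return (x, y) in EDGES  # inherited association line
    if (c, d) == (1, 2):
        # Reassociate the copied terminal to the non-final register.
        return is_register(x) and x != FINAL_REGISTER
    return False


def transduce(copy_set):
    """Build the output graph over (input node, copy index) pairs."""
    nodes = [(x, c) for c in copy_set for x in NODES if licence(x, c)]
    edges = {(p, q) for p in nodes for q in nodes if out_edge(p, q)}
    return nodes, edges


nodes_2, edges_2 = transduce((1, 2))  # two copies available
nodes_1, edges_1 = transduce((1,))    # one copy only
```

With the copy set `(1, 2)`, the input node `t3` has two output correspondents, one under each register node. With the copy set `(1,)`, this particular definition leaves the first register node without a terminal. That shows what the second copy contributes in this toy case; it is not a proof about what size-preserving definitions can or cannot express.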
This formal result aligns to a certain extent with earlier literature (e.g. Kitto & de Lacy Reference Kitto and de Lacy1999), which collapses spreading and copying into a single mechanism in a Correspondence framework. That conflation is disputed by Kawahara (Reference Kawahara2004, Reference Kawahara, Bateman, O'Keefe, Reilly and Werle2007), who motivates a clear distinction between copying and spreading. The aim of the current study, however, is not to settle this larger debate. Formal equivalence between these mechanisms is limited to cases of assimilatory tone-sandhi processes formalised over two classes of graph structures. Determining whether this generalises to other processes (including those for which other rules intervene between spreading and tier conflation) defined over these representations or others is beyond the scope of the current paper. However, as this paper has shown, the model-theoretic approach provides a solid formal foundation for addressing this question in greater detail.
Given the equivalence of spreading and copying in this particular case, we may also ask why spreading one constituent independently of another has been so crucial to theories of tonal representation in the first place. Is spreading all that matters? Recall from §3 that one design criterion for theories of tonal representation, apart from accounting for the full range of lexical tonal contrasts, is the ability to concisely model commonly attested tonal processes (Yip Reference Yip2002). Though assimilatory tone-sandhi processes are attested in Chinese dialects, Chen (Reference Chen2000) notes that the majority are dissimilatory, and that the alternations categorised as tone sandhi also include neutralisation, paradigmatic substitution and metathesis. It is therefore unclear why constituent spreading carries such substantial weight in distinguishing the empirical coverage of different models. Yet it does carry such weight for the models proposed by Yip (Reference Yip1989) and Bao (Reference Bao1990), who defend their models precisely on the basis of their ability to spread one or more constituents as a unit.
One would expect that evidence motivating structural independence of some constituent would be unambiguous, but this is also not the case. Bao (Reference Bao1990), for example, cites sandhi patterns from two dialects to motivate the independence of contour from register: Zhenjiang (Zhang Reference Zhang1985), examined in §4, and Wenzhou (Zhengzhang Reference Zhengzhang1964). Chen (Reference Chen2000: 73), however, rejects both analyses, claiming that only data from another dialect, Zhenhai (Rose Reference Rose1990), provides clear evidence of contour's independence from register. According to Chen, the correct analysis of Zhenjiang is not contour spreading, but rather contour simplification.Footnote 10 That spreading as a means of formalising attested sandhi processes is of such consequence to the representational theories proposed in the literature should cause concern, as what constitutes an unambiguous case of spreading is unclear. Zhang (Reference Zhang, James Huang, Audrey Li and Simpson2014) cites this issue as a source of stagnation in discussions of Chinese tone-sandhi representation over the last decade. The current study, then, will ideally serve to renew interest in representational questions by providing a less formalism-dependent means of evaluating a model's empirical coverage.
6.3 Bi-interpretability and mutual interpretability
The second condition for notational equivalence, as defined in (1b), is entirely separate from considerations of empirical predictions.Footnote 11 Rather, it concerns the nature of structural differences between two models, and whether those differences are superficial or substantive. This paper has adopted the notion of bi-interpretability as the formal expression of ‘superficial’ structural differences, and has demonstrated that bundled and separated models of tonal representation are bi-interpretable within the model-theoretic framework. Here, I evaluate this result in the context of earlier studies which employ the same formalism, but differ in their interpretations of structural notational equivalence via bi-interpretability.
Strother-Garcia & Heinz (Reference Strother-Garcia and Heinz2017) explore three representations of syllable structure proposed in the literature, and use model-theoretic graph transductions to demonstrate notational equivalence between all three representations. Their definition of bi-interpretability (and thus notational equivalence) is as follows. If a graph transduction definable using logic L exists from some model M₁ to some model M₂, then ‘M₂ is L-interpretable from M₁’. If the condition holds in both directions, such that ‘M₁ is L-interpretable from M₂ and vice versa, then the two are L-bi-interpretable’. Similarly, Danis & Jardine (Reference Danis and Jardine2019) address the question of notational equivalence between classical autosegmental representations (Goldsmith Reference Goldsmith1976) and Q-theory representations (Shih & Inkelas Reference Shih and Inkelas2018). Bi-interpretability is also defined as the existence of interpretations between two models defined logically. That is, for models M₁ and M₂, if there exists an interpretation of M₁ in M₂ (i.e. a logically defined transduction) and an interpretation of M₂ in M₁, then the models are bi-interpretable.
These definitions of bi-interpretability differ from the definition in Friedman & Visser (Reference Friedman and Visser2014) adopted in the current paper. While both require interpretations between models, the definition advocated here establishes the additional requirement that translations between models be contrast-preserving (§5.2). Definitions from earlier studies are more akin to mutual interpretability, a weaker notion of equivalence. Enayat & Wijksgatan (Reference Enayat and Wijksgatan2013) give the following definition of mutual interpretability:
Suppose U and V are first order theories. U is interpretable in V, written U ⊴ V, if there is an interpretation ℐ:U → V. U and V are mutually interpretable when U ⊴ V and V ⊴ U.
Previous accounts of notational equivalence among syllabic and autosegmental representations mentioned above arguably demonstrate mutual interpretability of the representational theories they examine. The existence of interpretations in both directions does not, by itself, guarantee an isomorphism between their composition and the identity map. In other words, some contrasts might be lost through translation from one representation to another. By illustrating that separated and bundled models satisfy the more restrictive definition of bi-interpretability (§5.1 and §5.2), this paper advocates a stronger hypothesis about notational equivalence from a structural perspective.
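The difference between the two notions can be illustrated with a finite round-trip check. The sketch below assumes toy encodings of the two representations as nested tuples and a small hypothetical tone inventory; `to_bundled_lossy` is an invented translation included only to show how translations can exist in both directions while contrasts are still lost. A check over a finite inventory is of course not a proof of bi-interpretability.

```python
from itertools import product

# Toy encodings (assumptions for illustration only):
#   bundled tone:   (register, terminals)
#                   e.g. ("L", ("l", "h"))
#   separated tone: ("t", ("r", register), ("c", terminals))
#                   e.g. ("t", ("r", "L"), ("c", ("l", "h")))


def to_separated(b):
    """Translate a bundled tone into the separated encoding."""
    register, terminals = b
    return ("t", ("r", register), ("c", terminals))


def to_bundled(s):
    """Translate a separated tone into the bundled encoding."""
    _, (_, register), (_, terminals) = s
    return (register, terminals)


def to_bundled_lossy(s):
    """Hypothetical translation that discards register (always 'H')."""
    _, _, (_, terminals) = s
    return ("H", terminals)


def round_trips(inventory, there, back):
    """True if translating there and back returns every item unchanged."""
    return all(back(there(x)) == x for x in inventory)


# Hypothetical inventory: two registers crossed with four contours.
CONTOURS = [("h",), ("l",), ("h", "l"), ("l", "h")]
BUNDLED = [(r, c) for r, c in product("HL", CONTOURS)]
SEPARATED = [to_separated(b) for b in BUNDLED]

print(round_trips(BUNDLED, to_separated, to_bundled))          # True
print(round_trips(SEPARATED, to_bundled, to_separated))        # True
print(round_trips(SEPARATED, to_bundled_lossy, to_separated))  # False
```

The first two checks correspond to the contrast-preserving case: each composition behaves as the identity on the inventory. The third pairs `to_separated` with the lossy translation, so translations exist in both directions, but the eight separated tones collapse to four bundled ones and the round trip fails.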
6.4 Other models of representation
Separated and bundled models of representation are shown to be notationally equivalent by the definition in (1). One benefit of the model-theoretic approach adopted in this paper is that it allows for a principled comparison of representational models’ empirical predictions and their structural differences, using the same formal framework.
This does not entail the notational equivalence of tonal geometries in general, though, nor does it make the claim that geometry is irrelevant. Insofar as feature geometry aims to determine which features behave as units phonologically (McCarthy Reference McCarthy1988), the bundled and separated models proposed by Yip (Reference Yip1989) and Bao (Reference Bao1990) are quite similar, in that they touch on the same conceptual point: contour tones behave as units in phonological processes. Furthermore, they realise this point in the same manner geometrically: contours are represented as a constituent under some other node. In the bundled model, this node is specified for a feature (register), while in the separated model it is not. This paper has shown that such a difference is not as conceptually distinct as previously argued. Importantly, our formal analysis disentangles the key notion of constituency (domination under some structural node) from the featural content of the node itself. In our terms, this is reflected in the notion that node labels and node edges refer to the same structures, but are defined separately. Since both models represent contour as a constituent, they predict it to behave as a unit in processes such as assimilatory tone sandhi. By formalising processes as logical transductions (thereby abstracting away from assumptions specific to a grammatical formalism), the current study has shown that the two models make the same predictions about such processes. It has also exploited the constituency of contour (among other structural similarities) to translate between these representations via logical transduction, providing formally sound evidence that the observed structural differences between the models are superficial.
The scope of the current study is limited to two models of tonal representation. It does not make claims about other representations, but does provide a framework for determining equivalence. Of particular interest is a comparison between models which assume contour units to be single constituents (such as the separated and bundled models) and those which do not, such as those proposed by Duanmu (Reference Duanmu1990, Reference Duanmu1994). In order to make a claim about equivalence between these models, it must be determined that they satisfy both of the conditions in (1): that is, that the models (i) do not differ in their empirical consequences, and (ii) differ only superficially in their representation of abstract properties (i.e. are bi-interpretable). If the models fail to satisfy either condition, we have rigorous formal evidence that they are not notationally equivalent.
7 Conclusion
This paper has motivated a computational analysis of the notational equivalence of tonal geometries offered by Yip (Reference Yip1989) and Bao (Reference Bao1990). It has defined tonal representations as model-theoretic graph structures, and assimilatory tone-sandhi processes as mappings between graphs, using statements in QF logic. The first result is that, contrary to previous claims, the models do not differ in their empirical predictions. Given the necessity of a tier-conflation mechanism across both representations, size-preserving QF logic is too restrictive to model the full range of tone-sandhi processes. Statements in non-size-preserving QF logic, by contrast, are sufficient to model the tone-sandhi patterns which previous work has claimed distinguish the two models: register assimilation in Pingyao and contour assimilation in Zhenjiang.
Using the same model-theoretic formalism and a rigorous definition of bi-interpretability, the second result is a proof that any structural difference between the representations is superficial. Specifically, the representations are intertranslatable, and translation between models is contrast-preserving. I thus conclude that separated and bundled models are notationally equivalent, and do not constitute distinct theories of tonal representation.
The purpose of this paper is not to propose a new tonal model or to advocate one model over another. Instead, its aim is to establish a formally rigorous procedure for determining whether two competing models constitute distinct theories of representation. Ideally, this paper serves as a proof of concept to be expanded in future work, both by widening the empirical scope beyond tone sandhi and by analysing other competing models of tonal representation which have been claimed to be distinct, but may very well be notationally equivalent.