Caroline Féry’s Intonation and Prosodic Structure is a state-of-the-art survey of the relationship between prosody, morphosyntax and information structure. The book contains highly didactic introductions to the relevant topics, so it can also serve as a textbook for graduate-level courses, and possibly for advanced undergraduate courses. Each chapter is complemented with discussion questions and refers the reader to seminal literature on the topics it treats.
The author does not discuss the literature in prosodic phonology uncritically but argues for specific assumptions about the way prosodic structure and intonation interact with other parts of grammar. These assumptions can be summarized as follows: (i) Indirect reference theory: Phonological processes do not index morphosyntactic structure directly but make reference to the prosodic hierarchy; (ii) Prosodic hierarchy hypothesis: All languages have a fixed set of phonological constituents that are built by mapping principles that refer to morphosyntactic structure; (iii) Recursive prosodic layer hypothesis: Prosodic layers (can) display recursion; (iv) Autosegmental-metrical hypothesis: Tones are assigned to a hierarchically organized metrical structure, and syllables not specified for tone receive their pitch via interpolation; and (v) Alignment theory of phonology–information structure interactions: Information-structural categories such as focus and topic relate to alignment constraints at the syntax–phonology interface rather than directly to phonetic content. These assumptions are more or less explicitly argued for in the text. In the review below, I will lay out a few additional assumptions that I think are implicit in Féry’s presentation.
Chapter 1 is an introduction to the book and clarifies its basic structure and scope. Chapter 2 introduces articulatory and acoustic phonetics as they relate to intonational phonology. It also provides an introductory discussion of computer-assisted pitch analysis, including the errors that pitch-tracking algorithms can produce.
Chapters 3 and 4 are broadly concerned with phonological categories and constituents and their relationship to morphosyntax. A historically grounded discussion of the prosodic hierarchy is provided, from the first version, in which prosodic domains were constrained by the strict layer hypothesis, to the much weaker version of the theory adopted today, which allows recursion and layer skipping. Chapter 3 deals with the relationship between moras, syllables, feet, and prosodic words. The author treats prosodic words in English and Japanese in detail. This section does not do justice to the typological variation in prosodic wordhood phenomena and the problems such phenomena might pose for some of the assumptions adopted by the author (e.g. Bickel et al. 2007, Hildebrandt 2007, Woodbury 2011, van Gijn & Zúñiga 2014, Zúñiga 2014, Tallman 2020), but one could argue that such considerations are outside the scope of the volume.
Chapter 3 also introduces the distinction between direct reference theory, which assumes that phonological processes directly index morphosyntactic structure, and indirect reference theory, which assumes that phonological processes relate to layers of a universal prosodic hierarchy that are projected by mapping rules referring to morphosyntactic structure. Without getting into too much detail, mapping rules are required because the spans over which phonological processes indexing layers of the prosodic hierarchy apply are structurally close, but not identical, to the corresponding morphosyntactic levels. Therefore, there must be some set of functions that maps a prosodic domain from a corresponding morphosyntactic constituent (ω ← X0 / morphosyntactic word; ϕ ← XP / syntactic phrase, etc.).
The mapping rules are not discussed until Chapter 4. There, the case for recursive prosodic domains is made more forcefully, as the author argues that one can only maintain that prosodic domains are not recursive insofar as one ignores prosodic phenomena under conditions of complex sentence structure. The author also mentions en passant that positing recursive structures can result in ambiguity in the assignment of layers (62), because for a set of hierarchically organized phonological domains it is unclear whether we are dealing with different layers of the prosodic hierarchy or the same layer recursed. Féry also adopts a potentially controversial assumption regarding how recursed prosodic domains relate to empirical phenomena. For Féry, successive layers of a recursed prosodic domain need not have the same empirical signal. This is illustrated in Féry’s analysis of weak affixes in English, which are conjectured to be prosodic words despite not bearing stress (see Vogel 2019 for more details and criticism). Another example comes from Japanese, where the minimal domain of ϕ-phrases is defined on the basis of pitch-accent culminativity, but the maximal domain of ϕ-phrases conditions catathesis (240–241).
Chapter 5 introduces different models of intonation, contrasting the parallel encoding and target approximation model (PENTA), the nuclear tone model, and the tone sequence model. The chapter also includes an overview of some of the tonal phenomena in African languages (Mende, Igbo, Ewe) that partially propelled the autosegmentalization of phonology in the 1970s. This provides the reader with the historical context and empirical motivation underlying the assumptions of the tone sequence model. The tone sequence model, ostensibly favored by the author, is described in much more detail than the PENTA model or the nuclear tone tradition. I found Féry’s overview of the PENTA model too superficial to justify the author’s dismissive tone towards it (132, 260). Specifically, I think the author missed an important opportunity to highlight some pitfalls of the tone sequence model with regard to how one goes about discovering where and what precisely the underlying tones are over an intonational contour (see Ladd 2008: 134–138 for discussion), a weakness that advocates of the PENTA model have highlighted and addressed on the basis of a different set of assumptions about prosodic structure (e.g. Xu et al. 2015).
Chapter 6 deals with the relation between meaning and intonation and is the most difficult chapter because of the large number of interacting variables involved. Féry shows that one has to consider at least the following variables: (i) focus type/strength; (ii) givenness type/strength; (iii) topicality; (iv) theticity; (v) position in relation to the focused constituent; (vi) sentence position; (vii) NP realization type; (viii) position in the ϕ- or ι-phrase; and (ix) presence versus absence of nuclear stress. Despite the complexity and difficulty of the problem, Chapter 6 is the best chapter of the book for two reasons. First, the information-structural categories are teased apart with a high degree of granularity such that they can be distinguished cross-linguistically without ambiguity. For example, at least six types of focus are motivated (brand new > brand new anchored > inferable > containing inferable > evoked > situationally evoked), all discussed in relation to empirically verifiable diagnostics. Second, the author reviews the accumulated knowledge in the literature such that testable hypotheses can be distilled from the discussion. For instance, Féry argues that the greater the focus strength of a given constituent, the more likely it is to be realized with a pitch accent.
Chapters 7 and 8 are organized around overarching typological classifications based on the distribution of stress and tone at different levels in the prosodic hierarchy. Chapter 7 presents a discussion of word-level prosody organized around four linguistic types that fall out from the presence or absence of lexical tone or lexical stress. Languages with neither lexical tone nor lexical stress are French, Bella Coola, Berber, Indonesian, West Greenlandic, Finnish, and Hungarian. Languages with just lexical stress are English, Danish, Dutch, German, Spanish, Russian, Greek, and Slavic. Languages with just lexical tone are Chinese and Vietnamese. A perhaps controversial aspect of the chapter is that Féry’s presentation implies that pitch-accent languages, which are defined as displaying both lexical tone and lexical stress, are a legitimate cross-linguistic type in some sense. For Féry, pitch-accent languages include Swedish, Norwegian, Danish, Japanese, Basque, and Turkish. It is not entirely clear whether these language types are pigeon-holes posited for expositional purposes or whether Féry intends the classification to be informative beyond the defining criteria themselves (presence or absence of lexical tone and stress). It is unclear, for instance, what pitch-accent languages have in common beyond the classification itself (see Hyman 2006, 2009).
Chapter 8 presents a cross-linguistic overview of intonation partially organized around the prosodic word types described in the previous chapter. The chapter reviews phrase- and intonation-level prosody for a subset of the languages discussed or mentioned in the previous chapter. Féry explicitly points out that tone languages are a ‘heterogeneous group’ (257). Lexical stress languages such as English are identified as ‘intonation languages’ (227). Chapter 8 also discusses phrase languages, which are a ‘new category’ that ‘resemble intonation languages in that their tonal specifications are mostly assigned at the level of ϕ-phrases and ι-phrases. But contrary to intonation languages, specifications at the level of the word are sparse, absent or only weakly implemented’ (270). Féry’s description of phrase languages assumes that certain layers of the prosodic hierarchy need not have any empirical signal. If it were not assumed that some layers could be latent, ‘phrase languages’ would presumably need to be interpreted as counterexamples to the prosodic hierarchy (see Schiering, Bickel & Hildebrandt 2010). Alternatively, we could reinterpret the phonological processes that Féry associates with the ϕ-level as indexes of ω-words, which happen to map over XPs in some languages (Tallman 2020). Indeed, the author provides the reader with no mapping principles for word-level categories, and thus it is unclear why such a proposal is ruled out.
Chapter 9 provides an overview of psycholinguistic studies on prosody, covering studies of speech comprehension in relation to prosodic breaks and rhythmic alternations, as well as ‘implicit prosody’ (prosodic structure that a reader imposes on written forms). The chapter ends by emphasizing the need to conduct speech processing studies on a typologically broader set of languages. Chapter 10 summarizes the contents of the book with a focus on areas that require future research.
In my view, the main weakness of the book lies in its failure to present the reader with any testable hypotheses that relate to indirect reference and the prosodic hierarchy theory. In the book, the historical trajectory of the prosodic hierarchy theory is presented as an incremental weakening of a more restrictive theory embodied in the strict layer hypothesis (see also Vogel 2019). Thus, the greater variety of arboreal structures that emerges from allowing recursion and layer skipping translates to a loosening of the predictive power of the prosodic hierarchy. I think, however, that when one considers the consequences of this structural loosening, as it is presented by the author, we are left not so much with a less restrictive theory as with an unrestricted one, i.e. a tautology. When one posits that successive layers in a recursed prosodic category, be they recursed ω or ϕ, can be indexed by distinct empirical signals, the result is that the prosodic hierarchy theory places no upper bound on the number of nonconvergent phonological processes that are causally related to its layers. Conversely, if we assume that layers of the prosodic hierarchy need not bear an empirical signal to be latently present, as in the ω of phrase languages, the result is that the theory places no lower bound on what phonological processes it seeks to explain either. Thus, the prosodic hierarchy theory has no predictive power beyond what we arrive at by positing some form of indirect reference.
However, the arguments about recursive phonological structures in relation to complex syntactic structure actually weaken the claim that indirect reference theories are uniquely positioned to capture prosodic phenomena. This is because recursive prosodic structures are usually isomorphic with the syntactic constituents over which they map, suggesting that a minor change in one’s assumptions about the syntax might obviate the need to posit any indirect mapping principles. Furthermore, while indirect reference theories are supposed to be motivated by non-isomorphy between prosodic domains and morphosyntactic constituency, scarcely any evidence for the morphosyntactic structures over which the mapping principles apply is presented. Féry does not discuss how one establishes the boundaries between X0 and XP, etc. (contrast this with the clear discussion of diagnostics for information-structural categories in Chapter 6), despite the fact that such decisions are crucial for establishing and testing mapping principles (e.g. Miller 2018: 134; Bennett & Elfner 2019: 162–163). Finally, Féry does not discuss any of the criticisms of the prosodic hierarchy (Scheer 2011: 332), nor research that suggests that prosodic structures might be better explained as emergent properties of language use and history rather than as causally related to a universal latent prosodic hierarchy (Bybee 2007; Woodbury 1992, 1998; Bickel, Hildebrandt & Schiering 2009; Schiering et al. 2010).
To conclude, the volume provides an in-depth and typologically responsible overview of prosodic phonology, which could usefully serve as a textbook for teaching current approaches to intonation and prosodic phonology. It is questionable, however, whether the indirect reference-based theories that underlie the presentation constitute necessary additions to our conceptual framework for describing and explaining morphosyntax–phonology interactions.