
“Process and perish” or multiple buffers with push-down stacks?

Published online by Cambridge University Press:  02 June 2016

Stephen C. Levinson*
Affiliation:
Max Planck Institute for Psycholinguistics, Nijmegen & Donders Institute for Brain, Cognition and Behaviour, Radboud University, 6500 AH Nijmegen, The Netherlands. stephen.levinson@mpi.nl

Abstract

This commentary raises two issues: (1) Language processing is hastened not only by internal pressures but also externally by turn-taking in language use; (2) the theory requires nested levels of processing, but linguistic levels do not fully nest; further, it would seem to require multiple memory buffers, otherwise there's no obvious treatment for discontinuous structures, or for verbatim recall.

Type
Open Peer Commentary
Copyright
Copyright © Cambridge University Press 2016 

Christiansen & Chater (C&C) have tried to convert a truism of psycholinguistics (essentially, Miller's [1956] short-term memory limitation) into a general theory of everything in language, in which representations are mere traces of processing, while hierarchy, patterns of change and the design features of language all follow from processing limitations. But like most general theories, this one seems underspecified, and it is hard to know exactly what would falsify it.

In this commentary I make two points. First, I suggest that the pressure for speed of processing comes not only from the effect of an evanescent signal on internal processing constraints, but also from outside, from facts about how language is used. Second, I would like to gently question the truism of the “process and perish” theory of linguistic signals.

Language comes, for the most part, as an acoustic signal that is delivered remarkably fast – as C&C note, faster than comparable nonlinguistic signals can be decoded. But why? One might try, speculatively, to relate this to the natural processing tempo of the auditory cortex (Hickok & Poeppel 2000) or to some general drive to efficiency. In fact, there are more obvious reasons for haste – namely, the turn-taking system of language use (Sacks et al. 1974). The turn-taking system operates with short units (usually a clause with prosodic closure), and after one speaker's such unit, any other speaker may respond, the first speaker gaining rights to that turn – thus ensuring communication proceeds apace. Turn transitions on average have a gap of only c. 200 ms, or the duration of a single syllable. Speakers are hastened on by the fact that delayed responses carry unwelcome semiotics (Kendrick & Torreira 2015). Now, the consequences of this system for language processing are severe: It takes c. 600 ms for preparation to speak a single word (Indefrey & Levelt 2004) and c. 1,500 ms to plan a single clause (Griffin & Bock 2000), so to achieve a gap of 200 ms requires that midway during an incoming turn, a responder is predicting the rest of it and planning his or her response well in advance of the end. To guard against prediction error, comprehension of the incoming turn must proceed even during preparation of the response – so guaranteeing overlap of comprehension and production processes (Levinson & Torreira 2015). This system pushes processing to the limit.
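The timing argument above rests on simple arithmetic, which can be sketched as follows. The figures are the approximate values cited in the text; the helper name `planning_lead_ms` and the assumption that planning runs uninterrupted until speech onset are illustrative simplifications, not part of the cited studies.

```python
# Illustrative arithmetic only: approximate latencies as cited in the text.
GAP_MS = 200                # typical gap between conversational turns
WORD_PLANNING_MS = 600      # preparation to speak a single word
CLAUSE_PLANNING_MS = 1500   # planning of a single clause


def planning_lead_ms(planning_ms: int, gap_ms: int = GAP_MS) -> int:
    """Return how long before the end of the incoming turn planning must start.

    Hypothetical helper: assumes planning runs without interruption and that
    the response begins exactly `gap_ms` after the incoming turn ends.
    """
    return max(0, planning_ms - gap_ms)


word_lead = planning_lead_ms(WORD_PLANNING_MS)
clause_lead = planning_lead_ms(CLAUSE_PLANNING_MS)
print(f"single word: start planning {word_lead} ms before the turn ends")
print(f"full clause: start planning {clause_lead} ms before the turn ends")
```

On these figures a one-word response must be under way about 400 ms before the incoming turn ends, and a clausal response about 1,300 ms before, which is why comprehension and production have to overlap.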

Let's now turn to the psycholinguistic “truism,” namely that, given the short-term memory bottleneck and the problems of competition for lexical access, processing for both comprehension and production must proceed in “chunks” – the “increments” of incremental processing. Miller's (1956) short-term memory bottleneck is often married to Baddeley's (1987) auditory loop with a capacity of c. 2 seconds unless refreshed, rapidly overwritten by incoming stimuli. On these or similar foundations the current theory is built.

Assuming Miller's bottleneck, and chunking as a way of mitigating it, I see at least two points in the current theory that are either problematic or need further explication:

  1. How many buffers? Chunking involves recoding longer lower-level strings into shorter, higher-level strings with “lossy” compression of the lower level. In Miller's theory, the higher-level chunks replace the lower ones, using that same short-term memory buffer. But in C&C's theory, the higher-level chunks will need to be retained in another buffer, as the next low-level increment is processed – otherwise, for example, discontinuous syntactic elements will get overwritten by new acoustic detail. Because there is a whole hierarchy of levels (acoustic, phonetic, phonological, morphological, syntactic, discourse, etc.), the “passing the buck upward” strategy will only allow calculation of coherence if there are just as many memory buffers as there are levels.

  2. Mismatching chunks across levels. C&C's theory seems to presume nesting of chunks as one proceeds upward in comprehension from acoustics to meaning. A longstanding linguistic observation is that the levels do not in fact coincide. A well-known example is the mismatch between phonological and syntactic words (Dixon & Aikhenvald 2002): Consider resyllabification, as in the pronunciation of “my bike is small” as “mai.bai.kismall” (Vroomen & de Gelder 1999) – here, the lower-level units don't match the higher ones. Similarly, syntactic structure and semantic structure do not match: “All men” looks like “Tall men” in surface structure, but has a quite different underlying semantics. Jackendoff's (2002) theory of grammar, with interface rules handling the mismatch between levels, is an attempt to handle this lack of nesting across levels.
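The worry raised in point 1 can be made concrete with a toy simulation. This is a minimal sketch under strong assumptions, not a model proposed by C&C or by Miller: buffers are fixed-capacity queues that silently drop their oldest entry, the capacity of 4 is arbitrary, and the function names and the example items (a particle verb awaiting its particle, followed by incoming syllables) are hypothetical.

```python
from collections import deque


def survives_single_buffer(pending, incoming, capacity):
    """One shared buffer: a pending higher-level chunk competes with new input."""
    buf = deque(maxlen=capacity)
    buf.append(pending)
    for item in incoming:
        buf.append(item)  # when full, the oldest entry is dropped
    return pending in buf


def survives_level_buffers(pending, incoming, capacity):
    """One buffer per level: low-level input never displaces the higher chunk."""
    low_level = deque(maxlen=capacity)
    high_level = deque(maxlen=capacity)
    high_level.append(pending)
    for item in incoming:
        low_level.append(item)  # overwriting happens only within this level
    return pending in high_level


# A verb waiting for its particle ("look ... up"), then five incoming syllables.
pending_chunk = "VERB:look[awaiting particle]"
syllables = ["the", "num", "ber", "of", "it"]

print(survives_single_buffer(pending_chunk, syllables, capacity=4))  # False
print(survives_level_buffers(pending_chunk, syllables, capacity=4))  # True
```

In the shared-buffer case the pending element is lost as soon as the intervening material exceeds the buffer's capacity; keeping one buffer per level avoids this, at the cost of needing as many buffers as there are levels.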

Another fly in the ointment is that, despite the hand-waving in sect. 6.1.2, nonlocal dependencies are not exceptional. Particle verbs, conditionals, parentheticals, wh-movement, center-embedding, topicalization, extraposition, and so forth, have been central to linguistic theorizing, and together such discontinuous constructions are frequent. Now, it is true that English – despite these constructions – generally likes to keep together the bits that belong together. But other languages (like the Australian ones) are much freer in word order – like classical Latin with c. 12% of NPs discontinuous, as in the three-way split (parts in bold, Pinkster 2005; Snijders 2012) in Figure 1.

Figure 1. (Levinson). A discontinuous noun phrase (NP) in Latin wrapped around verb and adverb.

Likewise, the preference for strictly local chunking runs into difficulties at other linguistic levels. Consider the phonological rule that, according to the grammar books, requires the French possessive pronoun ma to become mon before a vowel (ma femme vs. mon épouse, “my wife”); in fact, mon is governed by the properties of the head noun from which it may be separated, as in Marie sera soit mon soit ton épouse (“Marie will become either my or your wife”; Schlenker 2010). Morphology isn't necessarily well behaved either, some languages even randomizing affixes (Bickel et al. 2007). So we need to know how the local-processing preference fails to outlaw all of the discontinuous structures in language, and where our push-down stack capacities actually reside.

Finally, C&C's Now-or-Never bottleneck theory suggests that details of an utterance cannot be retained in memory when following material overwrites it – only the gist of what was said may persist. But the practice of “other-initiated repair” suggests otherwise – in the following excerpt Sig repeats verbatim what he earlier said, just with extra stress on shoot even though three conversational turns intervene (Schegloff 2007, p. 109):

The fact that we can rerun the phonetics (? = rising intonation, underlining = stress) of utterances shows the existence of other buffers that escape the proposed bottleneck.

References

Baddeley, A. (1987) Working memory. Oxford University Press.
Bickel, B., Banjade, G., Gaenszle, M., Lieven, E., Paudyal, N., Rai, I. P., Rai, M., Rai, N. K. & Stoll, S. (2007) Free prefix ordering in Chintang. Language 83(1):43–73.
Dixon, R. & Aikhenvald, A. (2002) Word: A typological framework. In: Word: A cross-linguistic typology, ed. Dixon, R. M. W. & Aikhenvald, A. Y., pp. 1–41. Cambridge University Press.
Griffin, Z. M. & Bock, K. (2000) What the eyes say about speaking. Psychological Science 11(4):274–79.
Hickok, G. & Poeppel, D. (2000) Towards a functional anatomy of speech perception. Trends in Cognitive Sciences 4:131–38.
Indefrey, P. & Levelt, W. J. M. (2004) The spatial and temporal signatures of word production components. Cognition 92(1–2):101–44.
Jackendoff, R. (2002) Foundations of language. Oxford University Press.
Kendrick, K. H. & Torreira, F. (2015) The timing and construction of preference: A quantitative study. Discourse Processes 52(4):255–89.
Levinson, S. C. & Torreira, F. (2015) Timing in turn-taking and its implications for processing models of language. Frontiers in Psychology 6:731. doi:10.3389/fpsyg.2015.00731.
Miller, G. A. (1956) The magical number seven, plus or minus two: Some limits on our capacity for processing information. Psychological Review 63(2):81–97.
Pinkster, H. (2005) The language of Pliny the Elder. In: The language of Latin Prose, ed. Reinhardt, T., Lapidge, M. & Adams, J. N., pp. 239–56. Oxford University Press.
Sacks, H., Schegloff, E. & Jefferson, G. (1974) A simplest systematics for the organization of turn-taking for conversation. Language 50(4):696–735.
Schegloff, E. (2007) Sequence organization in interaction. Cambridge University Press.
Schlenker, P. (2010) A phonological condition that targets discontinuous syntactic units: Ma/mon suppletion in French. Snippets 22:11–13.
Snijders, L. (2012) Issues concerning constraints on discontinuous NPs in Latin. In: Proceedings of the LFG12 Conference, Bali, Indonesia, June 28–July 1, 2012, pp. 565–81, ed. Butt, M. & King, T. H. CSLI Publications. Available at: http://web.stanford.edu/group/cslipublications/cslipublications/LFG/17/lfg12.html
Vroomen, J. & de Gelder, B. (1999) Lexical access of resyllabified words: Evidence from phoneme monitoring. Memory and Cognition 27(3):413–21.