Linguistics, cognitive psychology, and the Now-or-Never bottleneck

Ansgar D. Endress; Roni Katzir

doi:10.1017/S0140525X15000953

Published online by Cambridge University Press: 02 June 2016

Ansgar D. Endress: Department of Psychology, City University London, London EC1V 0HB, United Kingdom. ansgar.endress.1@city.ac.uk
Roni Katzir: Department of Linguistics and Sagol School of Neuroscience, Tel Aviv University, Ramat Aviv 69978, Israel. rkatzir@post.tau.ac.il

Abstract

Christiansen & Chater (C&C)'s key premise is that “if linguistic information is not processed rapidly, that information is lost for good” (sect. 1, para. 1). From this “Now-or-Never bottleneck” (NNB), C&C derive “wide-reaching and fundamental implications for language processing, acquisition and change as well as for the structure of language itself” (sect. 2, para. 10). We question both the premise and the consequentiality of its purported implications.

Type: Open Peer Commentary
Information: Behavioral and Brain Sciences, Volume 39, 2016, e71

DOI: https://doi.org/10.1017/S0140525X15000953
Copyright: © Cambridge University Press 2016

Problematic premises

Christiansen & Chater (C&C) base the Now-or-Never bottleneck (NNB) on the observation that sensory memory disappears quickly in explicit memory tasks. We note, first, that not all forms of explicit memory are short-lived. For example, children still remember, a month later, words they encountered only once (Carey & Bartlett 1978; Markson & Bloom 1997). More important, it is by no means clear that explicit memory is the (only) relevant form of memory for language processing and acquisition, nor how quickly other forms of memory decay. For example, the perceptual learning literature suggests that learning can occur even in the absence of awareness of the stimuli (Seitz & Watanabe 2003; Watanabe et al. 2001) and sometimes has long-lasting effects (Schwab et al. 1985). Similarly, visual memories that start decaying within a few seconds can be stabilized by presenting the items another time (Endress & Potter 2014). At a minimum, then, such memory traces last long enough for repeated exposure to have cumulative learning effects.

Information that is not even perceived is thus used for learning and processing, and some forms of memory do not disappear immediately. Hence, it is still an open empirical question whether poor performance in explicit recall tasks provides severe constraints on processing and learning.

We note, in passing, that even if relevant forms of memory were short-lived, this would not necessarily be a bottleneck. Mechanisms to make representations last longer, such as self-sustained activity, are well documented in many brain regions (Major & Tank 2004), and one might assume that memories can be longer-lived when this is adaptive. Short-lived memories might thus be an adaptation rather than a bottleneck (e.g., serving to reduce information load for various computations).

Problematic “implications”

C&C use the NNB to advance the following view: Language is a skill (specifically, the skill of parsing predictively); this skill is what children acquire (rather than some theory-like knowledge); and there are few if any restrictions on linguistic diversity. C&C's conclusions do not follow from the NNB and are highly problematic. Below, we discuss some of the problematic inferences regarding processing, learning, and evolution.

Regarding processing, C&C claim that the NNB implies that knowledge of language is the skill of parsing predictively. There is indeed ample evidence for a central role for prediction in parsing (e.g., Levy 2008), but this is not a consequence of the NNB: The advantages of predictive processing are orthogonal to the NNB, and, even assuming the NNB, processing might still occur element by element without predictions. C&C also claim that the NNB implies a processor with no explicit representation of syntax (other than what can be read off the parsing process as a trace). It is unclear, though, what they actually mean by this claim. First, if C&C mean that the parser does not construct full syntactic trees but rather produces a minimum that allows semantics and phonology to operate, they just echo a view discussed by Pulman (1986) and others. Although this view is an open possibility, we do not see how it follows from the NNB. Second, if C&C mean that the NNB implies that parsing does not use explicit syntactic knowledge, this view is incorrect: Many parsing algorithms (e.g., LR, Earley's algorithm, incremental CKY) respect the NNB by being incremental and not needing to refer back to raw data (they can all refer to the result of earlier processing instead) and yet make reference to explicit syntax. Finally, we note that prediction-based, parser-only models in the literature that do not incorporate explicit representations of syntactic structure (e.g., Elman 1990; McCauley & Christiansen 2011) fail to explain why we can recognize unpredictable sentences as grammatical (e.g., Evil unicorns devour xylophones).
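The point about incremental parsers can be illustrated with a minimal sketch of an Earley-style recognizer. The toy grammar, the lexicon, and the names `Item` and `IncrementalEarley` are our own illustrative assumptions, not taken from C&C or from the cited work. Each word is mapped to its category and then discarded; all later steps consult only the chart, that is, the result of earlier processing, while still referring to explicit syntactic rules.

```python
from dataclasses import dataclass

# Hypothetical toy grammar and lexicon, for illustration only.
# There are no empty productions, which keeps the completer simple.
GRAMMAR = {
    "S": [("NP", "VP")],
    "NP": [("Adj", "N"), ("N",)],
    "VP": [("V", "NP")],
}
LEXICON = {"evil": "Adj", "unicorns": "N", "devour": "V", "xylophones": "N"}


@dataclass(frozen=True)
class Item:
    """Dotted rule lhs -> rhs, dot at position `dot`, begun at chart column `start`."""
    lhs: str
    rhs: tuple
    dot: int
    start: int

    def next_symbol(self):
        return self.rhs[self.dot] if self.dot < len(self.rhs) else None

    def advanced(self):
        return Item(self.lhs, self.rhs, self.dot + 1, self.start)


class IncrementalEarley:
    def __init__(self, grammar, lexicon, start="S"):
        self.grammar, self.lexicon, self.start = grammar, lexicon, start
        self.chart = [{Item(start, rhs, 0, 0) for rhs in grammar[start]}]
        self._close(0)

    def _close(self, k):
        """Apply the predictor and completer to column k until nothing new is added."""
        agenda = list(self.chart[k])
        while agenda:
            item = agenda.pop()
            symbol = item.next_symbol()
            if symbol is None:
                # Completer: advance items that were waiting for this constituent.
                new_items = [p.advanced() for p in self.chart[item.start]
                             if p.next_symbol() == item.lhs]
            else:
                # Predictor: expand a nonterminal (preterminals have no rules here).
                new_items = [Item(symbol, rhs, 0, k)
                             for rhs in self.grammar.get(symbol, [])]
            for new in new_items:
                if new not in self.chart[k]:
                    self.chart[k].add(new)
                    agenda.append(new)

    def feed(self, word):
        """Scanner: consume one word; the word itself is not stored anywhere."""
        category = self.lexicon.get(word)
        scanned = {i.advanced() for i in self.chart[-1]
                   if category is not None and i.next_symbol() == category}
        self.chart.append(scanned)
        self._close(len(self.chart) - 1)
        return bool(scanned)  # False if the word cannot continue any analysis

    def accepted(self):
        return any(i.lhs == self.start and i.start == 0 and i.next_symbol() is None
                   for i in self.chart[-1])


parser = IncrementalEarley(GRAMMAR, LEXICON)
for word in "evil unicorns devour xylophones".split():
    parser.feed(word)
print(parser.accepted())  # True
```

Under this toy grammar, the recognizer accepts the sentence because of its structure alone, however improbable the particular word sequence may be.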

Regarding learning, C&C claim that the NNB is incompatible with approaches to learning that involve elaborate linguistic knowledge. This, however, is incorrect: The only implication of the NNB for learning is that if memory is indeed fleeting, any learning mechanism must be online rather than batch, relying only on current information. But online learning does not rule out theory-based models of language in any way (e.g., Börschinger & Johnson 2011). In fact, some have argued that online variants of theory-based models provide particularly good approximations to empirically observed patterns of learning (e.g., Frank et al. 2010).
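To make the contrast between online and batch learning concrete, consider a minimal sketch in which a learner maintains a posterior over two explicit hypotheses about word order. The hypotheses (`SVO`, `SOV`), the noise level, and all function names are illustrative assumptions of ours; the code performs exact sequential Bayesian updating over a tiny hypothesis space and is not the particle filter of Börschinger & Johnson (2011). The online learner sees each utterance once and retains only its current posterior, yet it arrives at the same result as a batch learner with access to all the data.

```python
from math import prod

NOISE = 0.1  # assumed probability that an utterance deviates from the grammar's order


def make_likelihood(order):
    """Likelihood of an observed order under a (hypothetical) grammar with that order."""
    def likelihood(observed_order):
        return 1.0 - NOISE if observed_order == order else NOISE
    return likelihood


# Two explicit, theory-like hypotheses (illustrative assumption).
HYPOTHESES = {"SVO": make_likelihood("SVO"), "SOV": make_likelihood("SOV")}


def online_posterior(stream, hypotheses, prior):
    """Update the posterior one utterance at a time; past utterances are never revisited."""
    posterior = dict(prior)
    for utterance in stream:
        unnormalized = {h: posterior[h] * hypotheses[h](utterance) for h in hypotheses}
        total = sum(unnormalized.values())
        posterior = {h: p / total for h, p in unnormalized.items()}
    return posterior


def batch_posterior(data, hypotheses, prior):
    """Compute the posterior from the full stored data set in one step."""
    unnormalized = {h: prior[h] * prod(hypotheses[h](u) for u in data)
                    for h in hypotheses}
    total = sum(unnormalized.values())
    return {h: p / total for h, p in unnormalized.items()}


prior = {"SVO": 0.5, "SOV": 0.5}
data = ["SVO", "SVO", "SOV", "SVO"]
online = online_posterior(iter(data), HYPOTHESES, prior)  # single pass over a stream
batch = batch_posterior(data, HYPOTHESES, prior)
print(round(online["SVO"], 3), round(batch["SVO"], 3))  # 0.988 0.988
```

In this simple setting the online constraint changes how the posterior is computed, not what kind of knowledge is being learned.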

Regarding the evolution of language (which they conflate with the biological evolution of language), C&C claim that it is item-based and gradual, and that linguistic diversity is the norm, with few if any true universals. However, it is unclear how these claims might follow from the NNB, and they are inconsistent with the relevant literature. For example, language change has been argued to be abrupt and nonlinear (see Niyogi & Berwick 2009), often involving what look like changes in abstract principles rather than concrete lexical items. As for linguistic diversity, C&C repeat claims from Christiansen and Chater (2008) and Evans and Levinson (2009), but those works ignore the strongest typological patterns revealed by generative linguistics. For example, no known language allows for a single conjunct to be displaced in a question (Ross 1967): We might know that Kim ate peas and something yesterday and wonder what that something is, but in no language can we use a question of the form *What did Kim eat peas and yesterday? to inquire about it. Likewise, in Why did John wonder who Bill hit?, one can only ask about the cause of the wondering, not of the hitting (see Huang 1982; Rizzi 1990). Typological data thus reveal significant restrictions on linguistic diversity.

Conclusion

Language is complex. Our efforts to comprehend it are served better by detailed analysis of the cognitive mechanisms at our disposal than by grand theoretical proposals that ignore the relevant psychological, linguistic, and computational distinctions.

ACKNOWLEDGMENTS

We thank Leon Bergen, Bob Berwick, Tova Friedman, and Tim O'Donnell.

References

Börschinger, B. & Johnson, M. (2011) A particle filter algorithm for Bayesian word segmentation. In: Proceedings of the Australasian Language Technology Association Workshop, Canberra, Australia, ed. Mollá, D. & Martinez, D., pp. 10–18.

Carey, S. & Bartlett, E. (1978) Acquiring a single word. Papers and Reports on Child Language Development 15:17–29.

Christiansen, M. H. & Chater, N. (2008) Language as shaped by the brain. Behavioral & Brain Sciences 31(05):489–58.

Elman, J. L. (1990) Finding structure in time. Cognitive Science 14(2):179–211.

Endress, A. D. & Potter, M. C. (2014b) Something from (almost) nothing: Buildup of object memory from forgettable single fixations. Attention, Perception, and Psychophysics 76:2413–23.

Evans, N. & Levinson, S. (2009) The myth of language universals: Language diversity and its importance for cognitive science. Behavioral and Brain Sciences 32:429–92.

Frank, M., Goldwater, S., Griffiths, T. & Tenenbaum, J. (2010) Modeling human performance in statistical word segmentation. Cognition 117:107–25.

Huang, C.-T. J. (1982) Logical relations in Chinese and the theory of grammar. Unpublished doctoral dissertation. Departments of Linguistics and Philosophy, Massachusetts Institute of Technology.

Levy, R. (2008) Expectation-based syntactic comprehension. Cognition 106:1126–77.

Major, G. & Tank, D. (2004) Persistent neural activity: Prevalence and mechanisms. Current Opinion in Neurobiology 14(6):675–84.

Markson, L. & Bloom, P. (1997) Evidence against a dedicated system for word learning in children. Nature 385:813–15.

McCauley, S. M. & Christiansen, M. H. (2011) Learning simple statistics for language comprehension and production: The CAPPUCCINO model. In: Proceedings of the 33rd Annual Conference of the Cognitive Science Society, Boston, MA, July 2011, ed. Carlson, L. A., Hölscher, C. & Shipley, T. F., pp. 1619–24. Cognitive Science Society.

Niyogi, P. & Berwick, R. C. (2009) The proper treatment of language acquisition and change in a population setting. Proceedings of the National Academy of Sciences 106:10124–29.

Pulman, S. G. (1986) Grammars, parsers, and memory limitations. Language and Cognitive Processes 1(3):197–225.

Rizzi, L. (1990) Relativized minimality. MIT Press.

Ross, J. R. (1967) Constraints on variables in syntax. Unpublished doctoral dissertation. Department of Linguistics, MIT.

Schwab, E. C., Nusbaum, H. C. & Pisoni, D. B. (1985) Some effects of training on the perception of synthetic speech. Human Factors 27:395–408.