Christiansen & Chater (C&C) propose that memory limitations force language comprehenders to compress perceptual data immediately, forgetting lower-level information and maintaining only higher-level categories (“chunks”). Recent data from speech perception and sentence processing, however, demonstrate that comprehenders can maintain fine-grained lower-level perceptual information for substantial durations. These results directly contradict the central idea behind the Now-or-Never bottleneck. To the extent that the framework allows them, it risks becoming so flexible that it fails to make substantive claims. On the other hand, these results are predicted by existing frameworks, such as bounded rationality, which are thus more productive frameworks for future research. We illustrate this argument with recent developments in our understanding of a classic result in speech perception: categorical perception.
Initial results in speech perception suggested that listeners are insensitive to fine-grained within-category differences in voice onset time (VOT, the most important cue distinguishing voiced and voiceless stop consonants, e.g., “b” versus “p” in bill versus pill), encoding only whether a sound is “voiced” or “voiceless” (Liberman et al. 1957). Subsequent work demonstrated sensitivity to within-category differences (Carney et al. 1977; Pisoni & Tash 1974), with some findings interpreted as evidence that this sensitivity rapidly decays (e.g., Pisoni & Lazarus 1974). Such a picture is very similar to the idea behind Chunk-and-Pass: Listeners rapidly chunk phonetic detail into a phoneme, forgetting the subcategorical information in the process.
Although perhaps intuitive, given early evidence that perceptual memory is limited (Sperling 1960), such discarding of subcategorical information would be surprising from the perspective of bounded rationality: Information critical to the successful recognition of phonetic categories often occurs downstream in the speech signal (Bard et al. 1988; Grosjean 1985). Effective language understanding thus requires maintaining graded support for different phonetic categories provided by a sound's acoustics (its subcategorical information) and integrating it with information present in the downstream signal. Indeed, more recent work suggests that comprehenders do this. For example, within-category differences in VOT are not immediately forgotten but are still available downstream at the end of a multisyllabic word (McMurray et al. 2009; see Dahan 2010, for further discussion of right-context effects).
Of particular relevance is a line of work initiated by Connine et al. (1991, Expt. 1). They manipulated VOT in the initial segment of target words (dent/tent) and embedded these words in utterances with downstream information about the word's identity (e.g., “The dent/tent in the fender” or “… forest”). They found that listeners can maintain subcategorical phonetic detail and integrate it with downstream information even beyond word boundaries.
Chunk-and-Pass does not predict these results. Recognizing this, C&C allow violations of Now-or-Never, as long as “such online ‘right-context effects’ [are] highly local, because raw perceptual input will be lost if it is not rapidly identified” (sect. 3.1, para. 7). This substantially weakens the predictive power of their proposal. On the other hand, Connine et al.'s results do seem to support this qualification. They reported that subcategorical phonetic detail (a) was maintained only 3 syllables downstream, but not 6–8, and (b) was maintained only for maximally ambiguous tokens.
Recent work, however, points to methodological issues that call both of these limitations into question (Bicknell et al. 2015). Regarding (a), Connine et al. allowed listeners to respond at any point in the sentence: On 84% of trials in the 6–8 syllable condition, listeners categorized the target word prior to hearing the relevant right-context (e.g., fender or forest). Therefore, these responses could not probe access to subcategorical information. In a replication that avoided this problem, we found that subcategorical detail decays more slowly than Connine et al.'s analysis would suggest: Subcategorical detail was maintained for at least 6–8 syllables (the longest range investigated). Regarding (b), Connine et al.'s analysis was based on proportions, rather than log-odds. Rational integration of downstream information with subcategorical information should lead to additive effects in log-odds space (which, in proportional space, then are largest around the maximally ambiguous tokens; Bicknell et al. 2015). This is indeed what we found: The effect of downstream information on the log-odds of hearing dent (or tent) was constant across the entire VOT range. In short, subcategorical information is maintained longer than previous studies suggested, not immediately discarded by chunking (see also Szostak & Pitt 2013). Moreover, maintenance is not limited to special cases; it is the default (Brown et al. 2014).
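The point about log-odds versus proportions can be illustrated with a minimal numerical sketch. It assumes a simple logistic categorization function over VOT and a fixed contribution of right-context in log-odds; the boundary, slope, and context values are hypothetical numbers chosen for illustration, not estimates from Connine et al. or Bicknell et al.

```python
import math

def logistic(x):
    """Map log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def logit(p):
    """Map a probability to log-odds."""
    return math.log(p / (1.0 - p))

def p_tent(vot_ms, context_shift=0.0, boundary_ms=40.0, slope=0.25):
    """Probability of a 'tent' response for a given VOT.

    All parameter values are hypothetical: boundary_ms is the assumed
    category boundary, slope the assumed change in log-odds per ms of VOT,
    and context_shift the assumed additive contribution (in log-odds) of
    the downstream context.
    """
    return logistic(slope * (vot_ms - boundary_ms) + context_shift)

CONTEXT = 0.5  # hypothetical: 'forest' adds +0.5 log-odds, 'fender' adds -0.5
vots = [10, 20, 30, 40, 50, 60, 70]

# Context effect measured two ways at each VOT step.
effect_prop = {}
effect_logodds = {}
for v in vots:
    p_forest = p_tent(v, +CONTEXT)
    p_fender = p_tent(v, -CONTEXT)
    effect_prop[v] = p_forest - p_fender
    effect_logodds[v] = logit(p_forest) - logit(p_fender)

for v in vots:
    print(f"VOT {v:2d} ms: proportion effect {effect_prop[v]:.3f}, "
          f"log-odds effect {effect_logodds[v]:.3f}")
```

Under these assumptions, the context effect is the same size in log-odds at every VOT step, yet in proportions it peaks at the assumed boundary and nearly vanishes at the continuum endpoints. An analysis of proportions alone would therefore suggest that context matters only for ambiguous tokens, even though context information is integrated equally across the whole range.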
Clearly, language processing is subject to cognitive limitations; many, if not most, theories of language processing acknowledge this. In its general form, the Now-or-Never bottleneck thus embodies an idea as old as the cognitive sciences: that observable behavior and the cognitive representations and mechanisms underlying this behavior are primarily driven by a priori (static/fixed) cognitive limitations. This contrasts with another view: Cognitive and neural systems have evolved efficient solutions to the computational tasks agents face (Anderson 1990). Both views have been productive, providing explanations for perception, motor control, and cognition, including language (and C&C have contributed to both views). A number of proposals have tied together these insights. These include the idea of bounded rationality, that is, rational use of limited resources given task constraints (Howes et al. 2009; Neumann et al. 2014; Simon 1982; for language: e.g., Bicknell & Levy 2010; Feldman et al. 2009; Kleinschmidt & Jaeger 2015; Kuperberg & Jaeger 2016; Lewis et al. 2013). Chunk-and-Pass is a step backward because it blurs the connection between these two principled dimensions of theory development. Consequently, it fails to predict systematic maintenance of subcategorical information, whereas bounded rationality predicts this property of language processing and offers an explanation for it.
The Now-or-Never bottleneck makes novel, testable predictions only insofar as it makes strong claims about comprehenders' (in)ability to maintain lower-level information beyond the “now.” The studies we summarized above are inconsistent with this claim. Similarly inconsistent is evidence from research on reading suggesting that lower-level information survives long enough to influence incremental parsing (Levy 2011; Levy et al. 2009). Moreover, the history of research on categorical perception provides a word of caution: Rather than focusing too much on cognitive limitations, it is essential for researchers to equally consider the computational problems of language processing and how comprehender goals can be effectively achieved.