Firestone & Scholl (F&S) wish to draw a sharp distinction between attention acting to change the inputs to perception and top-down influences on perceptual processes proper. In the extreme example of a person coming to believe that important information is coming at them from the left and consequently moving their head to the left, this distinction is compelling. The belief causes an explicit head movement that thereby changes the input to perceptual processes. The perceptual processes may operate in a standard, unvarying way without their internal workings being directly affected by the belief. The problem with generalizing this kind of account is that attention influences perceptual processes at many stages and sometimes with a high degree of specificity. Because of the wide range of attentional effects – from centrally oriented modulations of percept selection, to adjustments of gain to certain features, to the extreme case of head-turning – the boundary between peripheral attention shifts and changes to perception is entirely unclear and potentially artificial. In our view, a better strategy for understanding the role of perception in human behavior is to consider the workings of larger attention–perception–learning–goal–action loops.
As F&S acknowledge, in some cases the action of attention is far more nuanced than moving one's head or opening one's eyes. Drawing a sharp distinction between cases in which attention operates peripherally on the inputs to perception and cases in which it operates centrally on perception itself is counterproductive, inviting a fruitless debate about whether a particular process is part of “genuine” perception. The intellectual effort devoted to this debate would be more efficaciously applied to determining how different mental circuits coordinate to produce sophisticated and flexible behavior.
Attention is deployed not only to spatial regions, but also to sensory dimensions, learned dimensions, and learned complex configurations of visual components. Relegating attentional effects to a stage peripheral to perception is awkward given that these effects occur at many stages of perceptual processing. Neurophysiologically, when a particular cortical area for vision projects to a higher level, there are typically recurrent connections from that higher level back to the original area. These recurrent connections are particularly important when processing degraded inputs, and they serve to strengthen weak bottom-up signals (Wyatte et al., 2012). Treating feature-driven attention as distinct in kind from object- or location-driven attention is equally problematic, because the apparent mechanisms underlying these processes are quite similar and their effects are entirely analogous.
Consider the evidence that attention can select complex learned configurations that are relevant for a task. For example, the efficiency of search for conjunctions of visual parts gradually increases over the course of hours of training (Shiffrin & Lightfoot, 1997). The learning of these complex visual features ought to be considered perceptual, rather than generically associative, based both on its neural locus in inferotemporal (IT) brain regions specialized for visual shape representation (Logothetis et al., 1995) and on behavioral evidence indicating perceptual constraints on the acquisition of these features (Goldstone, 2000). For simpler auditory (Recanzone et al., 1993) and visual (Jehee et al., 2012) discriminations, even earlier primary sensory cortical loci have been implicated in perceptual learning. For both simple and complex perceptual learning, attention and perception are inextricably intertwined. Attention can be effectively deployed to subtle, simple discriminations and to complex configurations only because perceptual processes have been adapted to fit task demands. Segregating these mechanisms into “attention” and “perception” and demanding that they be analyzed separately ignores the key role of interacting systems in coordinating experience.
The interplay between perceptual processing and attention is even more striking in situations where perceptual learning leads people to separate dimensions that they originally treated as psychologically fused. For example, saturation and brightness are aspects of color that most people have difficulty separating. It is hard for people to classify objects on the basis of saturation while ignoring brightness differences (Burns & Shepp, 1988). However, with training, dimensions that were originally fused can become more perceptually separated (Goldstone & Steyvers, 2001), and once this occurs, it becomes possible for attention to select the separate dimensions. Perceptual changes affect what can be attended, and attention affects what is perceptually learned (Shiu & Pashler, 1992). Given the intertwined nature of perception–attention relations such as these, it is appropriate to consider human behavior on perceptual tasks to be the product of an integrated perception–attention system.
Some might argue that perceptual learning occurs but is not driven by top-down influences such as expectations and goals. People are not yet generally able to directly implement the neural changes that they would like to have, but neurosurgery would be only one way in which we could purposefully, in a goal-directed fashion, influence our perceptual systems. Athletes, musicians, coaches, doctors, and gourmets are all familiar with engaging in training methods for improving their own perceptual performance (Goldstone et al., 2015). For example, different music students will give themselves very different training depending on whether they want to master discriminations between absolute pitches (e.g., A vs. A#) or relative intervals (e.g., minor vs. major thirds). The neuroscientist Susan Barry provides a compelling case of strategically hacking one's own perceptual system: By presenting herself with colored beads at varying distances and forcing her eyes to jointly fixate on them, Barry caused her visual system to acquire the binocular stereoscopic depth-perception ability that it originally lacked (Sacks, 2006).
Maintaining that the intentions of learners change perceptual processes only indirectly reinforces a distinction that conceals the consequential interactions between intention and perception that adapt perception to specific tasks. By analogy, when a person blows out a candle flame, does he or she blow it out directly, or indirectly through a chain of air pressure differentials, displacement of the flame away from the wick, and a resulting temperature drop at the wick? Perseverating on this dubious distinction, or on F&S's distinction between intentions directly versus indirectly affecting perception, risks neglecting the perception–attention, perception–learning, and intention–perception loops that are critically important for allowing people to perceive their world in efficient and task-specific ways.