
Perception, as you make it

Published online by Cambridge University Press:  05 January 2017

David W. Vinson
Affiliation:
Cognitive and Information Sciences, University of California, Merced, Merced, CA 95340. dvinson@ucmerced.edu
Drew H. Abney
Affiliation:
Cognitive and Information Sciences, University of California, Merced, Merced, CA 95340. dabney@ucmerced.edu
Dima Amso
Affiliation:
Department of Cognitive, Linguistic and Psychological Sciences, Brown University, Providence, RI 02912. dima_amso@brown.edu
Anthony Chemero
Affiliation:
Department of Philosophy and Psychology, University of Cincinnati, Cincinnati, OH 45220. chemeray@ucmail.uc.edu
James E. Cutting
Affiliation:
Department of Psychology, Cornell University, Ithaca, NY 14850. james.cutting@cornell.edu
Rick Dale
Affiliation:
Cognitive and Information Sciences, University of California, Merced, Merced, CA 95340. rdale@ucmerced.edu
Jonathan B. Freeman
Affiliation:
Department of Psychology, New York University, New York, NY 10003. jon.freeman@nyu.edu
Laurie B. Feldman
Affiliation:
Psychology Department, University of Albany, SUNY, Albany, NY 12222. lfeldman@albany.edu
Karl J. Friston
Affiliation:
Wellcome Trust Centre for Neuroimaging, University College London, London WC1E 6BT, United Kingdom. k.friston@ucl.ac.uk
Shaun Gallagher
Affiliation:
Department of Philosophy, University of Memphis, Memphis, TN 38152. s.gallagher@memphis.edu
J. Scott Jordan
Affiliation:
Department of Psychology, Illinois State University, Normal, IL 61761. jsjorda@ilstu.edu
Liad Mudrik
Affiliation:
School of Psychological Sciences, Tel Aviv University, Tel Aviv-Yafo, Israel. mudrikli@tau.ac.il
Sasha Ondobaka
Affiliation:
Wellcome Trust Centre for Neuroimaging, University College London, London WC1E 6BT, United Kingdom. s.ondobaka@ucl.ac.uk
Daniel C. Richardson
Affiliation:
Wellcome Trust Centre for Neuroimaging, University College London, London WC1E 6BT, United Kingdom. dcr@eyethink.org
Ladan Shams
Affiliation:
Psychology Department, University of California, Los Angeles, Los Angeles, CA 90095. lshams@psych.ucla.edu
Maggie Shiffrar
Affiliation:
Office of Research and Graduate Studies, California State University, Northridge, Northridge, CA 91330. mag@csun.edu
Michael J. Spivey
Affiliation:
Cognitive and Information Sciences, University of California, Merced, Merced, CA 95340. spivey@ucmerced.edu

Abstract

The main question that Firestone & Scholl (F&S) pose is whether “what and how we see is functionally independent from what and how we think, know, desire, act, and so forth” (sect. 2, para. 1). We synthesize a collection of concerns from an interdisciplinary set of coauthors regarding F&S's assumptions and appeals to intuition, resulting in their treatment of visual perception as context-free.

Type
Open Peer Commentary
Copyright
Copyright © Cambridge University Press 2016 

No perceptual task takes place in a contextual vacuum. How do we know that an effect is one of perception qua perception that does not involve other cognitive contributions? Experimental instructions alone involve various cognitive factors that guide task performance (Roepstorff & Frith 2004). Even a request to detect simple stimulus features requires participants to understand the instructions (language, memory), keep track of them (working memory), become sensitive to them (attention), and pick up the necessary information to become appropriately sensitive (perception). These processes work in a dynamic parallelism that is required when one participates in any experiment. Any experiment with enough cognitive content to test top-down effects would seem to invoke all of these processes. From this task-level vantage point, the precise role of visual perception under strict modular assumptions seems, to us, difficult to intuit. We are, presumably, seeking theories that can also account for complex natural perceptual acts. Perception must somehow participate with cognition to help guide action in a labile world. Perception operating entirely independently, without any task-based constraints, flirts with hallucination. Additional theoretical and empirical matters elucidate even more difficulties with their thesis.

First, like Firestone & Scholl (F&S), Fodor (1983) famously used visual illusions to argue for the modularity of perceptual input systems. Cognition itself, Fodor suggested, was likely too complex to be modular. Ironically, F&S have turned Fodor's thesis on its head; they argue that perceptual input systems may interact as much as they like without violating modularity. But there are some counterexamples. In Jastrow's (1899) and Hill's (1915) ambiguous figures, one sees either a duck or a rabbit in the first, and either a young woman or an old woman in the second. Yet one can cognitively control which of these one sees. Admittedly, cognition cannot "penetrate" our perception to turn straight lines into curved ones in any arbitrary stimulus, and clearly we cannot see a young woman in Jastrow's duck-rabbit figure. Nonetheless, cognition can change our interpretation of either figure.

Perhaps more compelling are auditory demonstrations of certain impoverished speech signals called sine-wave speech (e.g., Darwin 1997; Remez et al. 2001). Most of these stimuli sound like strangely squeaking wheels until one is told that they are speech. But sometimes the listener must be told what the utterances are. Then, quite spectacularly, the phenomenology is one of listening to a particular utterance of speech. Unlike visual figures such as those from Jastrow and Hill, this is not a bistable phenomenon; once a person hears a sine-wave signal as speech, he or she cannot fully go back and hear these signals as mere squeaks. Is this not top-down?
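For readers unfamiliar with how such stimuli are built, the Python sketch below illustrates the general idea under our own simplifying assumptions (it is not the procedure of Remez and colleagues): the first few formant tracks of an utterance are replaced by time-varying sinusoids, and the rest of the spectrum is discarded. The formant tracks here are hand-made placeholders; in practice they would be estimated from a recorded utterance with a formant tracker.

```python
# A minimal sketch of sine-wave-speech-style resynthesis: sum a few
# sinusoids whose frequencies and amplitudes follow formant tracks.
import numpy as np

def sine_wave_speech(formant_freqs, formant_amps, sr=16000):
    """formant_freqs, formant_amps: arrays of shape (n_formants, n_samples)
    giving instantaneous frequency (Hz) and amplitude for each formant."""
    n_formants, n_samples = formant_freqs.shape
    signal = np.zeros(n_samples)
    for k in range(n_formants):
        # Integrate instantaneous frequency to obtain the phase of sinusoid k.
        phase = 2 * np.pi * np.cumsum(formant_freqs[k]) / sr
        signal += formant_amps[k] * np.sin(phase)
    return signal / n_formants

# Hypothetical, hand-made "formant" tracks for a half-second signal.
sr, dur = 16000, 0.5
t = np.linspace(0, dur, int(sr * dur), endpoint=False)
f1 = 500 + 200 * np.sin(2 * np.pi * 2 * t)   # gliding around 500 Hz
f2 = 1500 + 300 * np.sin(2 * np.pi * 1 * t)  # gliding around 1500 Hz
f3 = 2500 * np.ones_like(t)                  # held constant
freqs = np.stack([f1, f2, f3])
amps = np.ones_like(freqs)
audio = sine_wave_speech(freqs, amps, sr)
```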

Such phenomena – the bistability of certain visual figures and the asymmetric stability of these speechlike sounds, among many others – are not the results of confirmatory research. They are indeed the “amazing demonstrations” that F&S cry out for.

Second, visual neuroscience shows numerous examples of feedback projections to visual cortex, and of feedback influences on visual neural processing, that F&S ignore. The primary visual cortex (V1) receives descending projections from a wide range of cortical areas. Although the strongest feedback signals come from nearby visual areas V3 and V4, V1 also receives feedback signals from V5/MT, parahippocampal regions, superior temporal parietal regions, auditory cortex (Clavagnier et al. 2004), and the amygdala (Amaral et al. 2003), establishing that the brain shows pervasive top-down connectivity. The next step is to determine what perceptual function descending projections serve. F&S cite a single paper to justify ignoring a massive literature accomplishing this (sect. 2.2, para. 2).

Neurons in V1 exhibit differential responses to the same visual input under a variety of contextual modulations (e.g., David et al. 2004; Hupé et al. 1998; Kapadia et al. 1995; Motter 1993). Numerous studies with adults have established that selective attention enhances processing of information at the attended location and suppresses distraction (Gandhi et al. 1999; Kastner et al. 1999; Markant et al. 2015b; Slotnick et al. 2003). This excitation/suppression mechanism improves the quality of early vision, enhancing contrast sensitivity, acuity, d-prime, and visual processing of attended information (Anton-Erxleben & Carrasco 2013; Carrasco 2011; Lupyan & Spivey 2010; Zhang et al. 2011). This modulation of visual processing in turn supports improved encoding and recognition of attended information among adults (Rutman et al. 2010; Uncapher & Rugg 2009; Zanto & Gazzaley 2009) and infants (Markant & Amso 2013; 2016; Markant et al. 2015a). Recent data indicate that attentional biases can function at higher levels in the cognitive hierarchy (Chua & Gauthier 2015), suggesting that attention can serve as a mechanism guiding vision based on category-level biases.
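As a toy illustration of the kind of effect these studies report (our own sketch, not a model drawn from the cited work), a simple signal-detection simulation shows how a multiplicative attentional gain on stimulus-driven responses can raise d-prime at the attended location. All parameter values are arbitrary.

```python
# Toy signal-detection sketch: attention as a multiplicative gain on the
# stimulus-driven response, with fixed additive internal noise.
import numpy as np

rng = np.random.default_rng(0)

def d_prime(signal_resp, noise_resp):
    """d' = difference of means scaled by the pooled standard deviation."""
    pooled_sd = np.sqrt(0.5 * (signal_resp.var() + noise_resp.var()))
    return (signal_resp.mean() - noise_resp.mean()) / pooled_sd

def simulate(gain, contrast=0.5, internal_noise=1.0, n_trials=10000):
    # Response on stimulus-present trials = gain * stimulus drive + noise.
    signal = gain * contrast + rng.normal(0, internal_noise, n_trials)
    noise = rng.normal(0, internal_noise, n_trials)
    return d_prime(signal, noise)

print("unattended d':", simulate(gain=1.0))  # roughly 0.5
print("attended d':  ", simulate(gain=2.0))  # roughly 1.0
```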

Results like these have spurred the visual neuroscience community to develop new theories to account for how feedback projections change the receptive field properties of neurons throughout visual cortex (Dayan et al. 1995; Friston 2010; Gregory 1980; Jordan 2013; Kastner & Ungerleider 2001; Kveraga et al. 2007b; Rao & Ballard 1999; Spratling 2010). It is not clear how F&S's theory of visual perception can claim that recognition of visual input takes place without top-down influences, when the activity of neurons in the primary visual cortex is routinely modulated by contextual feedback signals from downstream cortical subsystems. The role of downstream projections is still under investigation, but theories of visual perception and experience ought to participate in understanding them rather than ignoring them.
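To make the explanatory style of these theories concrete, here is a minimal linear predictive-coding sketch in the spirit of Rao & Ballard (1999): a higher level sends a prediction of lower-level activity down the hierarchy, the lower level returns the prediction error, and the higher-level representation is updated to reduce that error. The weights and input are random placeholders, not a trained or biologically detailed model.

```python
# Minimal linear predictive-coding loop: update a higher-level
# representation r so that the top-down prediction W @ r explains the input.
import numpy as np

rng = np.random.default_rng(1)
n_input, n_latent = 16, 4
W = rng.normal(size=(n_input, n_latent))   # generative (feedback) weights
x = rng.normal(size=n_input)               # "sensory" input
r = np.zeros(n_latent)                     # higher-level representation
lr = 0.02

for step in range(500):
    prediction = W @ r          # top-down prediction of the input
    error = x - prediction      # bottom-up prediction error
    r += lr * (W.T @ error)     # adjust representation to reduce the error

# The residual stays nonzero: a 4-dimensional representation cannot fully
# explain a 16-dimensional input, so some prediction error always remains.
print("residual error norm:", np.linalg.norm(x - W @ r))
```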

F&S are incorrect when they conclude that it is "eminently plausible that there are no top-down effects of cognition on perception" (final paragraph). Indeed, F&S's argument is heavily recycled from a previous BBS contribution (Pylyshyn 1999). Despite their attempt to distinguish their contribution from that one, it suffers from very similar weaknesses identified by past commentary (e.g., Bruce et al. 1999; Bullier 1999; Cavanagh 1999, among others). F&S are correct when they state early on that "discovery of substantive top-down effects of cognition on perception would revolutionize our understanding of how the mind is organized" (abstract). Especially in the case of visual perception, that is exactly what has been happening in the field for these past few decades.

References

Amaral, D. G., Behniea, H. & Kelly, J. L. (2003) Topographic organization of projections from the amygdala to the visual cortex in the macaque monkey. Neuroscience 118(4):1099–120.
Anton-Erxleben, K. & Carrasco, M. (2013) Attentional enhancement of spatial resolution: Linking behavioural and neurophysiological evidence. Nature Reviews Neuroscience 14(3):188–200. doi:10.1038/nrn3443.
Bruce, V., Langton, S. & Hill, H. (1999) Complexities of face perception and categorisation. Behavioral and Brain Sciences 22(3):369–70.
Bullier, J. (1999) Visual perception is too fast to be impenetrable to cognition. Behavioral and Brain Sciences 22(3):370.
Carrasco, M. (2011) Visual attention: The past 25 years. Vision Research 51:1484–525.
Cavanagh, P. (1999) The cognitive penetrability of cognition. Behavioral and Brain Sciences 22(3):370–71.
Chua, K. W. & Gauthier, I. (2015) Learned attention in an object-based frame of reference. Journal of Vision 15(12):899.
Clavagnier, S., Falchier, A. & Kennedy, H. (2004) Long-distance feedback projections to area V1: Implications for multisensory integration, spatial awareness, and visual consciousness. Cognitive, Affective, and Behavioral Neuroscience 4(2):117–26.
Darwin, C. J. (1997) Auditory grouping. Trends in Cognitive Sciences 1(9):327–33.
David, S. V., Vinje, W. E. & Gallant, J. L. (2004) Natural stimulus statistics alter the receptive field structure of V1 neurons. The Journal of Neuroscience 24(31):6991–7006.
Dayan, P., Hinton, G. E., Neal, R. & Zemel, R. (1995) The Helmholtz machine. Neural Computation 7(5):889–904.
Fodor, J. A. (1983) Modularity of mind: An essay on faculty psychology. MIT Press.
Friston, K. (2010) The free-energy principle: A unified brain theory? Nature Reviews Neuroscience 11:127–38.
Gandhi, S. P., Heeger, D. J. & Boynton, G. M. (1999) Spatial attention affects brain activity in human primary visual cortex. Proceedings of the National Academy of Sciences USA 96(6):3314–19.
Gregory, R. L. (1980) Perceptions as hypotheses. Philosophical Transactions of the Royal Society B: Biological Sciences 290(1038):181–97.
Hill, W. E. (1915) My wife and my mother-in-law. Puck, Nov. 6, p. 11.
Hupé, J. M., James, A. C., Payne, B. R., Lomber, S. G., Girard, P. & Bullier, J. (1998) Cortical feedback improves discrimination between figure and background by V1, V2 and V3 neurons. Nature 394(6695):784–87.
Jastrow, J. (1899) The mind's eye. Popular Science Monthly 54:299–312.
Jordan, J. S. (2013) The wild ways of conscious will: What we do, how we do it, and why it has meaning. Frontiers in Psychology 4:574.
Kapadia, M. K., Ito, M., Gilbert, C. D. & Westheimer, G. (1995) Improvement in visual sensitivity by changes in local context: Parallel studies in human observers and in V1 of alert monkeys. Neuron 15(4):843–56.
Kastner, S., Pinsk, M. A., De Weerd, P., Desimone, R. & Ungerleider, L. G. (1999) Increased activity in human visual cortex during directed attention in the absence of visual stimulation. Neuron 22(4):751–61.
Kastner, S. & Ungerleider, L. G. (2001) The neural basis of biased competition in human visual cortex. Neuropsychologia 39(12):1263–76.
Kveraga, K., Ghuman, A. S. & Bar, M. (2007b) Top-down predictions in the cognitive brain. Brain and Cognition 65(2):145–68.
Lupyan, G. & Spivey, M. J. (2010) Making the invisible visible: Verbal but not visual cues enhance visual detection. PLoS ONE 5(7):e11452.
Markant, J. & Amso, D. (2013) Selective memories: Infants' encoding is enhanced in selection via suppression. Developmental Science 16(6):926–40.
Markant, J. & Amso, D. (2016) The development of selective attention orienting is an agent of change in learning and memory efficacy. Infancy 21(2):154–76. doi:10.1111/infa.12100.
Markant, J., Oakes, L. M. & Amso, D. (2015a) Visual selective attention biases contribute to the other-race effect among 9-month-old infants. Developmental Psychobiology 58(3):355–65. doi:10.1002/dev.21375.
Markant, J., Worden, M. S. & Amso, D. (2015b) Not all attention orienting is created equal: Recognition memory is enhanced when attention orienting involves distractor suppression. Neurobiology of Learning and Memory 120:28–40.
Motter, B. C. (1993) Focal attention produces spatially selective processing in visual cortical areas V1, V2, and V4 in the presence of competing stimuli. Journal of Neurophysiology 70(3):909–19.
Pylyshyn, Z. (1999) Is vision continuous with cognition? The case for cognitive impenetrability of visual perception. Behavioral and Brain Sciences 22(3):341–65.
Rao, R. P. & Ballard, D. H. (1999) Predictive coding in the visual cortex: A functional interpretation of some extra-classical receptive-field effects. Nature Neuroscience 2(1):79–87.
Remez, R. E., Pardo, J. S., Piorkowski, R. L. & Rubin, P. E. (2001) On the bistability of sine wave analogues of speech. Psychological Science 12(1):24–29.
Roepstorff, A. & Frith, C. (2004) What's at the top in the top-down control of action? Script-sharing and "top-top" control of action in cognitive experiments. Psychological Research 68(2–3):189–98.
Rutman, A. M., Clapp, W. C., Chadick, J. Z. & Gazzaley, A. (2010) Early top-down control of visual processing predicts working memory performance. Journal of Cognitive Neuroscience 22(6):1224–34.
Slotnick, S. D., Schwarzbach, J. & Yantis, S. (2003) Attentional inhibition of visual processing in human striate and extrastriate cortex. NeuroImage 19(4):1602–11.
Spratling, M. W. (2010) Predictive coding as a model of response properties in cortical area V1. The Journal of Neuroscience 30(9):3531–43.
Uncapher, M. R. & Rugg, M. D. (2009) Selecting for memory? The influence of selective attention on the mnemonic binding of contextual information. The Journal of Neuroscience 29(25):8270–79.
Zanto, T. P. & Gazzaley, A. (2009) Neural suppression of irrelevant information underlies optimal working memory performance. The Journal of Neuroscience 29(10):3059–66.
Zhang, P., Jamison, K., Engel, S., He, B. & He, S. (2011) Binocular rivalry requires visual attention. Neuron 71(2):362–69.