
Firestone & Scholl conflate two distinct issues

Published online by Cambridge University Press:  05 January 2017

Ryan Ogilvie
Affiliation:
Department of Philosophy, University of Maryland, College Park, MD 20742-7615. rogilvie@umd.edu, https://sites.google.com/site/ryanogilvie/
Peter Carruthers
Affiliation:
Department of Philosophy, University of Maryland, College Park, MD 20742-7615. pcarruth@umd.edu, http://faculty.philosophy.umd.edu/pcarruthers/

Abstract

Firestone & Scholl (F&S) seem to believe that the viability of a distinction between perception and cognition depends on perception being encapsulated from top-down information. We criticize this assumption and argue that top-down effects can leave the distinction between perception and cognition fully intact. Individuating the visual system is one thing; the question of encapsulation is quite another.

Type
Open Peer Commentary
Copyright
© Cambridge University Press 2016

What is at stake in the debate between those who think that vision is informationally encapsulated and those who don't? Firestone & Scholl (F&S) believe that it is the viability of the distinction between perception and cognition. They write, “the extent to which what and how we see is functionally independent from what and how we think, know, desire, act, and so forth” bears on “whether there is a salient ‘joint’ between perception and cognition” (sect. 2). They further claim that if cognition “can affect what we see…then a genuine revolution in our understanding of perception is in order” (sect. 1.1, para. 3). Thus, they seem to believe that one can draw the traditional distinction between perception and cognition only if perception is encapsulated from cognition.

Although F&S cite a number of opposing authors who agree with this sentiment, we think that it rests on a conflation between two different notions of modularity. (Roughly, the distinction is between Fodor modularity, which requires encapsulation [see Fodor 1983], and functional modularity, which does not [see Barrett & Kurzban 2006; Carruthers 2006].) In fact, one can perfectly well individuate a system in terms of what it does (e.g., what computations it performs) regardless of whether the operations of the system are sensitive to information from outside.

One can characterize the visual system as the set of brain mechanisms specialized for the analysis of signals originating from the retina. The computations these mechanisms perform are geared toward making sense of the light array landing on the retina. We may not know precisely how to identify these mechanisms or how they perform their computations. But it is surely a plausible hypothesis that there is a set of brain mechanisms that does this, and perhaps only this. (Indeed, it is widely assumed in the field that the set would include at least V1, V2, and V3.) In this we agree with F&S. But one can accept that the visual system consists of a proprietary set of mechanisms while denying that it takes only bottom-up input. For example, the existence of crossmodal effects need in no way undermine the distinction between audition and vision. Hence, we see no reason to think that the existence of top-down effects should undermine the distinction between vision and higher-level cognitive systems, either.

Of course, if there were no way to identify some set of mechanisms as proprietary to the visual system, then one might be justified in denying the traditional distinction between perception and cognition. But we see no reason for such skepticism. In fact, we think that holding fixed (or abstracting away from) top-down effects provides one effective way of individuating perceptual systems. Having established a relatively plausible model of bottom-up visual processing, one can thereafter look at how endogenous variables modulate that processing. Indeed, this appears to underlie the methodology employed by at least some cognitive neuroscientists.

Consider a study by Kok et al. (2013) in which participants implicitly learned two tone–orientation pairings. During subsequent trials, participants viewed random-dot-motion displays in which a subset of the dots moved in a coherent fashion. Participants were then asked to judge the direction of coherent motion. On trials when a tone was present, participants' orientation judgments showed a clear “attractive” bias – that is, they judged the orientation of the motion to be closer to the cued orientation than they did when there was no tone present (or a tone paired with a different orientation).
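
As a concrete illustration of what such an “attractive” bias amounts to, here is a minimal sketch of how one might quantify it. The helper functions and toy numbers are our own illustration, not the study's analysis code.

```python
import numpy as np

def circular_diff(a, b):
    """Signed angular difference a - b in degrees, wrapped to (-180, 180]."""
    return (a - b + 180.0) % 360.0 - 180.0

def attractive_bias(reported, true_dir, cued):
    """Mean judgment error, signed so that positive values mean the report
    was pulled toward the tone-cued direction."""
    error = circular_diff(reported, true_dir)            # how far each report missed
    toward_cue = np.sign(circular_diff(cued, true_dir))  # which side of the stimulus the cue lies on
    return float(np.mean(error * toward_cue))

# Toy example: each report is nudged a few degrees toward the cued direction.
true_dir = np.array([40.0, 90.0, 130.0])
cued     = np.array([60.0, 70.0, 150.0])
reported = np.array([44.0, 86.0, 134.0])
print(attractive_bias(reported, true_dir, cued))  # 4.0 => attractive bias
```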

An important note: Participants performed this task within an fMRI scanner. The investigators then used a forward-modeling approach to estimate the perceived direction of coherent motion on each trial. This essentially involved collecting fMRI data from motion-selective voxels in areas V1, V2, and V3 on each trial, and using the data from the unbiased (bottom-up) trials to create an orientation-sensitive artificial neural network. The fMRI models for the biased (top-down-influenced) trials turned out to match the participants' reports of the perceived direction better than the actual directions of motion, which suggests that the model accurately represents direction-sensitive processing in early vision. Further support for the validity of the model comes from the fact that there was a positive correlation between participants' behavioral and modeled responses. For example, if someone showed a stronger bias than others in the behavioral condition, then so did her fMRI forward model.
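
To make the forward-modeling logic concrete, here is a minimal sketch in the style of inverted encoding models, the family of methods to which this approach belongs. The function names, channel count, and tuning exponent are our illustrative assumptions, not the study's actual analysis code.

```python
import numpy as np

def channel_basis(directions_deg, n_channels=8, power=5):
    """Idealized direction-tuned channels (half-rectified cosines raised to
    a power), one column per channel -- a common encoding-model basis."""
    centers = np.arange(n_channels) * (360.0 / n_channels)
    d = np.deg2rad(np.asarray(directions_deg)[:, None] - centers[None, :])
    return np.maximum(np.cos(d / 2.0), 0.0) ** power

def fit_weights(B_train, dirs_train):
    """Step 1: estimate each voxel's channel weights from the unbiased
    (bottom-up) trials. B_train: trials x voxels; dirs_train: degrees."""
    C = channel_basis(dirs_train)                    # trials x channels
    W, *_ = np.linalg.lstsq(C, B_train, rcond=None)  # channels x voxels
    return W

def decode(B_test, W):
    """Step 2: invert the fixed model on new (e.g., tone-cued) trials and
    read out the represented direction as a population-vector average."""
    C_hat = B_test @ np.linalg.pinv(W)               # trials x channels
    n = W.shape[0]
    centers = np.deg2rad(np.arange(n) * (360.0 / n))
    vec = C_hat @ np.exp(1j * centers)               # one complex number per trial
    return np.rad2deg(np.angle(vec)) % 360.0
```

The point that matters for the argument is that the weights in step 1 are fixed using bottom-up trials only; exactly the same model is then read out on the tone-cued trials.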

The moral we want to draw from this case is as follows. Artificial neural networks have long been used to model orientation processing in a bottom-up fashion. The network consists of neurons preferentially tuned to specific orientations, which in turn exhibit differential activation patterns in the presence of coherent motion at particular orientations. Kok et al. (2013) assume that such a model will predict people's behavioral responses in the absence of a tone because the system (comprising at least V1, V2, and V3) is specialized for processing visual inputs, and it will do so relying on bottom-up information alone in the absence of top-down modulation (which is what they found). In the presence of a tone, however, the model continues to match the behavioral response, but it no longer tracks the stimulus orientation. Thus, they infer that there must be some sort of top-down signal that alters the manner in which information is processed within the visual system.
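
Schematically, the inference runs as follows. The numbers below are invented purely to display the pattern of evidence just described, not Kok et al.'s data.

```python
import numpy as np

def circular_err(a, b):
    """Absolute angular error in degrees, wrapped to [0, 180]."""
    return np.abs((a - b + 180.0) % 360.0 - 180.0)

# Toy values for three tone-cued (biased) trials: the stimulus direction,
# the participant's report, and the direction decoded from early visual cortex.
true_dir = np.array([40.0, 90.0, 130.0])
reported = np.array([44.0, 86.0, 134.0])
decoded  = np.array([45.0, 85.5, 133.0])

# If the decoded direction tracks the report more closely than the stimulus,
# early visual activity itself carries the top-down-shifted representation.
print(circular_err(decoded, reported).mean() < circular_err(decoded, true_dir).mean())  # True
```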

In short, rather than obviating any distinction between perceptual and cognitive systems, this model seems to presuppose such a distinction, all the while claiming that vision is porous to endogenous information about the statistical regularities in the environment. In fact, it is in virtue of stable bottom-up models that one can begin to understand how top-down effects modulate visual processing. It may yet turn out that there are no (interesting) top-down modulations of the visual system, of course. That is an empirical possibility. But if there are interesting top-down effects (as in fact we think there are; see Ogilvie & Carruthers 2016), we don't think that should be regarded as especially revolutionary.

References

Barrett, H. & Kurzban, R. (2006) Modularity in cognition: Framing the debate. Psychological Review 113:628–47.
Carruthers, P. (2006) The architecture of the mind: Massive modularity and the flexibility of thought. Oxford University Press.
Fodor, J. A. (1983) The modularity of mind: An essay on faculty psychology. MIT Press.
Kok, P., Brouwer, G., van Gerven, M. & de Lange, F. (2013) Prior expectations bias sensory representations in visual cortex. Journal of Neuroscience 33:16275–84.
Ogilvie, R. & Carruthers, P. (2016) Opening up vision: The case against encapsulation. Review of Philosophy and Psychology 7:1–22 [ePub ahead of print]. doi:10.1007/s13164-015-0294-8.