Attention is more than prediction precision

Howard Bowman; Marco Filetti; Brad Wyble; Christian Olivers

doi:10.1017/S0140525X12002324

Published online by Cambridge University Press: 10 May 2013

Howard Bowman: Affiliation:
Centre for Cognitive Neuroscience and Cognitive Systems, and the School of Computing, University of Kent at Canterbury, Kent CT2 7NF, United Kingdom. H.Bowman@kent.ac.uk; http://www.cs.kent.ac.uk/people/staff/hb5/
Marco Filetti: Affiliation:
Centre for Cognitive Neuroscience and Cognitive Systems, and the School of Computing, University of Kent at Canterbury, Kent CT2 7NF, United Kingdom. M.Filetti@kent.ac.uk; http://www.cs.kent.ac.uk/people/rpg/mf266/
Brad Wyble: Affiliation:
Department of Psychology, Syracuse University, Syracuse, NY 13244. bwyble@gmail.com; www.bradwyble.com
Christian Olivers: Affiliation:
Department of Cognitive Psychology, Faculty of Psychology and Education, VU University Amsterdam, 1081 BT Amsterdam, The Netherlands. c.n.l.olivers@vu.nl; http://olivers.cogpsy.nl

Abstract

A cornerstone of the target article is that, in a predictive coding framework, attention can be modelled by weighting prediction error with a measure of precision. We argue that this is not a complete explanation, especially in the light of event-related potential (ERP) data showing large evoked responses to frequently presented target stimuli, which are therefore predicted.

Type: Open Peer Commentary
Information: Behavioral and Brain Sciences, Volume 36, Issue 3, June 2013, pp. 206–208

DOI: https://doi.org/10.1017/S0140525X12002324
Copyright: Copyright © Cambridge University Press 2013

The target article by Andy Clark champions predictive coding as a theory of brain function. Perception is the domain in which many of the strongest claims for predictive coding have been made, and we focus on that faculty. It is important to note that there are other unifying explanations of perception, one being that the brain is a salience detector, with salience referring broadly to relevance to an organism's goals. These goals reflect a short-term task set (e.g., searching a crowd for a friend's face), or more ingrained, perhaps innate motivations (e.g., avoiding physical threat). A prominent perspective is precisely that one role of attention is to locate salient stimuli and direct perception towards them.

The target article emphasises the importance of evoked responses, particularly EEG event-related potentials (ERPs), in adjudicating between theories of perception. The core idea is that the larger the difference between an incoming stimulus and the prediction, the larger the prediction error and thus the larger the evoked response. There are indeed ERPs that are clearly modulated by prediction error, for example, the Mismatch Negativity (evoked by deviation from a repeating pattern of stimulus presentation), the N400 (evoked by semantic anomalies), and P3 responses to oddball stimuli. In addition, stimuli that violate our expectations do often capture attention (Horstmann 2002), consistent with predictive coding. However, such surprise-driven orienting is just one aspect of attention, and we question whether prediction error provides an adequate explanation for attentional functioning as a whole.

A central aspect of attention, which makes perception highly adaptive, is that it can purposefully select and enhance expected stimuli. This arises when an arrow cues where a target will appear, or a verbal instruction indicates it will be red. However, in this context, ERPs are largest to the target stimuli (P1, N1, N2pc, P3; Luck 2006), in line with a saliency account. Such heightened responses to predicted stimuli do not seem to sit comfortably with predictive coding. As Clark highlights, resolution of this conundrum has, in analogy with statistical tests, focused on precision (Feldman & Friston 2010). The two-sample t-test, say, is a ratio of the difference between two means to the variability in the estimate of that difference. Precision-weighted prediction error is such a test: the difference between prediction and observation is weighted by the precision of, or confidence in, that difference (that is, the inverse of its variability). In other words, the signal fed back up the sensory pathway, the evoked response, is a precision-weighted prediction error. Importantly, attention is proposed to increase precision; that is, the brain has greater confidence in its estimate of the disparity between predicted and observed when that observation is being spot-lit by attention, and, indeed, perception does seem more accurate in the presence of attention (Chennu et al. 2009). This then enables predictive coding to generate big bottom-up responses to expected (in the sense of attended) stimuli, as simulated for spatial attention in Feldman & Friston (2010).
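The precision-weighting idea described above can be illustrated with a minimal sketch. It assumes a scalar setting in which precision is simply the inverse of the error variance; the function name and all numerical values are hypothetical and are not taken from Feldman & Friston (2010).

```python
def precision_weighted_error(observed, predicted, variance):
    """Prediction error scaled by precision (assumed here to be 1 / variance)."""
    precision = 1.0 / variance
    return precision * (observed - predicted)


# The same raw mismatch, with low and high confidence in it (illustrative values).
unattended = precision_weighted_error(observed=1.2, predicted=1.0, variance=1.0)
attended = precision_weighted_error(observed=1.2, predicted=1.0, variance=0.1)

print(unattended, attended)
```

Under these assumptions, lowering the variance (as attention is proposed to do) amplifies the signal passed up the pathway even though the raw mismatch is unchanged.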

Although predictive coding is an elegant and intriguing approach, obstacles remain to its being fully reconciled with the saliency perspective. First, precision-weighting has a multiplicative effect. Hence, there has to be a difference between observed and predicted in the first place for precision to work on. If observed is exactly as expected, then however big precision might be, the precision-weighted prediction error will be zero. Yet classic EEG experiments show that attentional enhancement of ERP components (e.g., P1 and N1) is greatest when targets appear in the same location for many trials (Van Voorhis & Hillyard 1977). One could of course argue that there is always some error, and that the effects of attention on precision are extremely large relative to that error. However, depending upon the extent to which precision modulates the prediction error, one could obtain classically predictive or anti-predictive (i.e., salience-sensitive) patterns, and both patterns are found experimentally. Thus, the theory really requires a computational explanation of how the modulatory effect of precision varies across experimental contexts; otherwise, there is a risk that it becomes effectively unfalsifiable.
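A small sketch may help to make the multiplicative argument concrete. It assumes, purely for illustration, that the evoked response is the magnitude of precision times prediction error; the function names, error sizes, and precision values are hypothetical.

```python
def evoked_response(error, precision):
    # Assumption: response magnitude is |precision * error|.
    return abs(precision * error)


# An exact match gives zero response, however large the precision.
exact_match = [evoked_response(0.0, p) for p in (1.0, 10.0, 1000.0)]


def pattern(target_precision, distractor_precision=1.0,
            target_error=0.05, distractor_error=1.0):
    """Compare a frequent target (small residual error) with a rare distractor."""
    target = evoked_response(target_error, target_precision)
    distractor = evoked_response(distractor_error, distractor_precision)
    return "anti-predictive" if target > distractor else "predictive"


print(exact_match)
print(pattern(target_precision=5.0))
print(pattern(target_precision=100.0))
```

With these illustrative numbers, the same mechanism yields either pattern depending only on how strongly precision is assumed to scale the residual error, which is the source of the falsifiability concern.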

Second, prediction error is passed back up the sensory pathway so that parameters can be adjusted to improve predictions (i.e., learning), and the amount by which parameters change is a function of the size of the precision-weighted prediction error. This, however, raises a further problem when a big precision-weighted prediction error is generated through a large (attention-governed) precision while observed and predicted are similar. Specifically, in this case, the parameters should not change, and certainly not by much, even though the precision-weighted prediction error might mandate it.
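The learning concern can be sketched as follows, assuming a simple unnormalised delta rule in which the parameter update is proportional to the precision-weighted prediction error. This rule, the learning rate, and the numbers are simplifying assumptions for illustration and are not the update scheme of any specific predictive coding model.

```python
def parameter_update(observed, predicted, precision, learning_rate=0.5):
    """Assumed delta rule: update = learning_rate * precision * error."""
    return learning_rate * precision * (observed - predicted)


# Observed and predicted are nearly identical (illustrative values).
observed, predicted = 1.01, 1.0
raw_error = observed - predicted

update_unattended = parameter_update(observed, predicted, precision=1.0)
update_attended = parameter_update(observed, predicted, precision=100.0)

print(raw_error, update_unattended, update_attended)
```

Under these assumptions, the attended case calls for a parameter change far larger than the mismatch it is meant to correct, although the prediction was already close to the observation.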

Third, directing attention, and thus improving precision, at a pre-determined location is one thing. But what makes attention so adaptive is that it can guide towards an object at an unpredictable location – simply on the basis of features. For example, we could ask the reader to find the nearest word printed in bold. Attention will typically shift to one of the headers, and indeed momentarily increase precision there, improving reading. But this makes precision weighting a consequence of attending. At least as interesting is the mechanism enabling stimulus selection in the first place. The brain has to first deploy attention before a precision advantage can be realised for that deployment. Salience theory proposes that stimuli carrying a target feature become more salient and thus draw attention. But which predictive coding mechanism is sensitive to the match between a stimulus feature and the target description? In typical visual search experiments, observers are looking for, and finding, the same target in trial after trial. For example, in our rapid serial visual presentation experiments, each specific distractor appears very rarely (once or twice), while pre-described targets appear very frequently. We obtained effectively no evoked response for distractors but a large deflection for the target (see Fig. 1). It seems that predictive coding mandates little if any response for this scenario. If anything, should the distractors not have generated the greatest response, since they were (a) rare, and (b) not matching predictions?

Figure 1. An anti-predictive ERP pattern.

Even if one could devise a predictive coding framework that allocated a higher precision to the target representation (which is a step beyond its spatial allocation in Feldman & Friston 2010), it is unclear how it could generate a massive precision-weighted prediction error specifically for targets, where predicted and observed match exactly. It is also unclear why such an error is needed.

References

Chennu, S., Craston, P., Wyble, B. & Bowman, H. (2009) Attention increases the temporal precision of conscious perception: Verifying the neural ST² model. PLOS Computational Biology 5(11):1–13.

Feldman, H. & Friston, K. J. (2010) Attention, uncertainty, and free-energy. Frontiers in Human Neuroscience 4:215. doi:10.3389/fnhum.2010.00215.

Horstmann, G. (2002) Evidence for attentional capture by a surprising color singleton in visual search. Psychological Science 13(6):499–505.

Luck, S. J. (2006) The operation of attention – millisecond by millisecond – over the first half second. In: The first half second: The microgenesis and temporal dynamics of unconscious and conscious visual processing, ed. Öğmen, H. & Breitmeyer, B. G., pp. 187–206. MIT Press.

Van Voorhis, S. & Hillyard, S. A. (1977) Visual evoked potentials and selective attention to points in space. Perception and Psychophysics 22(1):54–62.