Much in Clark's review is of fundamental importance. Probabilistic inference is crucial to life in general and neural systems in particular, but does it have a single coherent logic? Jaynes (2003) argued that it does, but for that logic to be relevant to brain theory, it must be shown how systems built from local neural processors can perform essential functions that are assumed to be the responsibility of the scientist in Jaynes' theory (Fiorillo 2012; Phillips 2012).
Most crucial of those functions are the selection of the information relevant to the role of each local cell or microcircuit and the coordination of their multiple concurrent activities. The information available to neural systems is so rich that it cannot be used for inference if taken as a single, multi-dimensional whole, because the number of locations in multi-dimensional space increases exponentially with dimensionality. Most events that actually occur in high-dimensional spaces are therefore novel and distant from previous events, precluding learning based on sample probabilities. This constraint, well known to the machine-learning community as the curse of dimensionality, has major consequences for psychology and neuroscience. It implies that, for learning and inference to be possible, large databases must be divided into small subsets, as amply confirmed by the clear selectivity observed within and between brain regions at all hierarchical levels. Creation of the subsets involves both prespecified mechanisms, as in receptive-field selectivity, and dynamic grouping as proposed by Gestalt psychology (Phillips et al. 2010). The criteria for selection must be use-dependent because information crucial to one use would be fatal to another, as in the contrast between dorsal and ventral visual pathways. Contextual modulation is also crucial because interpretations with low probability overall may have high probability in certain contexts. Therefore, the activity of local processors must be guided by the broader context, and their multiple concurrent decisions must be coordinated if they are to create coherent percepts, thoughts, and actions.
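The arithmetic behind this constraint is easy to exhibit. The following sketch (illustrative only; the function name and parameters are my own, not drawn from any of the works cited) discretises a d-dimensional space into k bins per dimension and counts how many cells a fixed sample ever visits; as d grows, almost every cell, and hence almost every future event, lies outside past experience.

```python
# A minimal sketch of the curse of dimensionality: with k bins per
# dimension there are k**d cells, so a fixed sample leaves nearly all
# cells empty once d is moderately large.
import random

def fraction_of_empty_cells(d, k=4, n_samples=10_000, seed=0):
    """Draw n_samples uniform points in [0,1]**d, discretise each
    dimension into k bins, and report the fraction of the k**d cells
    that no sample ever visits."""
    rng = random.Random(seed)
    occupied = set()
    for _ in range(n_samples):
        cell = tuple(int(rng.random() * k) for _ in range(d))
        occupied.add(cell)
    return 1.0 - len(occupied) / (k ** d)

for d in (2, 4, 8, 16):
    print(f"d={d:2d}: {fraction_of_empty_cells(d):.4%} of cells never sampled")
```

With 10,000 samples, virtually every cell is visited at d = 2, but around 86% are empty at d = 8 and essentially all at d = 16, which is why learning from sample probabilities over the undivided whole is hopeless.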
Most models of predictive coding (PC) and Bayesian inference (BI) assume that the information to be coded and used for inference is a given. In those models, it is given: by the modelers. Modelers may assume that in the real world this information is given by the external input, but that provides more information than could be used for inference if taken as a whole. Self-organized selection of the information relevant to particular uses is therefore crucial. Efficient coding strategies, such as PC, are concerned with ways of transmitting information through a hierarchy, not with deciding what information to transmit. They assume lossless transmission of all input information to be the goal, and so provide no way of extracting different information for different uses. Models using BI show how to combine information from different sources when computing a single posterior decision, but they do not show how local neural processors can select the relevant information, nor do they show how multiple streams of processing can coordinate their activities. Thus, local selectivity, dynamic grouping, contextual disambiguation, and coordinating interactions are all necessary within cognitive systems, but are not adequately explained by the essential principles of either PC or BI.
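To be concrete about what BI models do provide, here is a minimal sketch of precision-weighted combination of two Gaussian cues into a single posterior (the function name and the numbers are illustrative, not taken from any particular model). Note that it presupposes exactly what is argued above to be missing: someone has already decided that these two cues, and no others, are the relevant sources.

```python
# A minimal sketch of Bayesian cue combination: two conditionally
# independent Gaussian estimates of the same quantity are fused by
# precision weighting into a single posterior.
def fuse_gaussian_cues(mu_a, var_a, mu_b, var_b):
    """Precisions (1/variance) add; the posterior mean is the
    precision-weighted average of the two cue means."""
    prec_a, prec_b = 1.0 / var_a, 1.0 / var_b
    prec_post = prec_a + prec_b
    mu_post = (prec_a * mu_a + prec_b * mu_b) / prec_post
    return mu_post, 1.0 / prec_post

# e.g. a reliable cue (mean 10.0, variance 1.0) and a noisy cue
# (mean 12.0, variance 4.0): the fused estimate sits nearer the
# reliable cue, with reduced variance.
print(fuse_gaussian_cues(10.0, 1.0, 12.0, 4.0))  # -> (10.4, 0.8)
```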
Clark's review, however, does contain the essence of an idea that could help resolve the mysteries of selectivity and coordination: context-sensitive gain control, for which there are several widely distributed neural mechanisms. A crucial strength of the free-energy theory is that it uses gain-controlling interactions to implement attention (Feldman & Friston 2010), but such mechanisms can do far more than that. For example, they can select and coordinate activities by amplifying or suppressing them as a function of their predictive relationships and current relevance. This is emphasized by the theory of Coherent Infomax (Kay et al. 1998; Kay & Phillips 2010; Phillips et al. 1995), which synthesizes evidence from neuroanatomy, neurophysiology, macroscopic neuroimaging, and psychophysics (Phillips & Singer 1997; von der Malsburg et al. 2010). That theory is further strengthened by evidence from psychopathology, as reviewed by Phillips and Silverstein (2003) and extended by many subsequent studies. Körding and König (2000) argue for a closely related theory.
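The defining property of such modulation can be shown in a few lines. The sketch below is illustrative only: the tanh gain and the parameter k are my assumptions, not the activation function actually used in the Coherent Infomax work cited above. What it captures is the essential asymmetry: context amplifies or suppresses a driving signal according to whether the two agree, but cannot create a response on its own.

```python
# A minimal sketch of context-sensitive gain control: context scales
# the gain of a driving input but cannot drive output by itself.
import math

def modulated_response(drive, context, k=2.0):
    """Response whose presence and sign are set by the driving input,
    while context multiplicatively modulates its gain. With drive == 0
    the response is 0 whatever the context."""
    gain = 1.0 + math.tanh(k * drive * context)  # gain lies in (0, 2)
    return drive * gain

print(modulated_response(1.0, 1.0))   # ~1.96: congruent context amplifies
print(modulated_response(1.0, -1.0))  # ~0.04: incongruent context suppresses
print(modulated_response(0.0, 5.0))   #  0.0: no drive -> no response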
Free-energy theory (Friston 2010) and Coherent Infomax assume that good predictions are vital, and formalize that assumption as an information-theoretic objective. Though these theories have superficial differences, with Coherent Infomax being formulated at the neuronal rather than the system level, it may be possible to unify their objectives as that of maximizing prediction success, which, under plausible assumptions, is equivalent to minimizing prediction error (Phillips & Friston, in preparation). Formulating the objective as maximizing the amount of information correctly predicted directly solves the “dark-room” problem discussed by Clark. That objective, however, does not necessarily imply that prediction errors are the fundamental currency of feedforward communication. Inferences could be computed by reducing prediction errors locally, while communicating the inferences themselves more widely (Spratling 2008a). That version of PC is supported by much neurobiological evidence, though it remains possible that neural systems use both versions.
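For readers who want the information-theoretic framing spelled out: the decomposition below is a standard identity, but the closing gloss of the Coherent Infomax weighting is my informal paraphrase; the precise weighted objective is given in Kay and Phillips (2010).

```latex
% A sketch in standard information-theoretic notation; the exact
% Coherent Infomax objective is in Kay & Phillips (2010) and is only
% paraphrased informally here.
\documentclass{article}
\usepackage{amsmath}
\begin{document}
Let $R$ be a local processor's output, $X$ its driving (receptive-field)
input, and $C$ its contextual input. The information that $R$ conveys
about the pair $(X, C)$ obeys the standard decomposition
\begin{equation}
  I(R; X, C) = I(R; X \mid C) + I(R; C \mid X) + I(R; X; C),
\end{equation}
where $I(R; X; C)$ is the three-way co-information shared by output,
input, and context. Informally, the Coherent Infomax objective favours
outputs rich in $I(R; X; C)$, retains $I(R; X \mid C)$, and keeps
$I(R; C \mid X)$ small, so that context modulates what is transmitted
without itself being transmitted.
\end{document}
```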
Another important issue concerns the obvious diversity of brains and cognition. How could any unifying theory cast light on that? Though possible in principle, detailed answers to this question are largely a hope for the future. Coherent Infomax hypothesizes a local building block from which endlessly many architectures could be built, but using that to explain the observed diversity is a task that has hardly begun. Similarly, though major transitions in the evolution of inferential capabilities seem plausible, the study of what they may be remains a task for the future (Phillips 2012). By deriving algorithms for learning, Coherent Infomax shows in principle how endless diversity can arise from diverse lives, and it has been shown that the effectiveness of contextual coordination varies greatly across people of different ages (Doherty et al. 2010), sex (Phillips et al. 2004), and culture (Doherty et al. 2008). Using this possible source of variability to explain diversity across and within species, however, is a project that still has far to go.
Overall, I expect theories such as those examined by Clark to have far-reaching consequences for philosophy and for human thought in general, so I fully endorse the journey on which he has embarked.