
Representation and agency

Published online by Cambridge University Press:  19 June 2020

Karl Friston*
Affiliation:
The Wellcome Centre for Human Neuroimaging, UCL Queen Square Institute of Neurology, London WC1N 3AR, UK. k.friston@ucl.ac.uk https://www.fil.ion.ucl.ac.uk/~karl/

Abstract

Gilead et al. raise some fascinating issues about representational substrates and structures in the predictive brain. This commentary drills down on a core theme in their arguments; namely, the structure of models that generate predictions. In particular, it highlights their factorial nature – both in terms of deep hierarchies over levels of abstraction and, crucially, time – and how this underwrites agency.

Type
Open Peer Commentary
Copyright
Copyright © The Author(s), 2020. Published by Cambridge University Press

Gilead et al. raise a myriad of enticing issues. I will focus on a theme that emerges in different guises throughout their treatment: the structure of the implicit generative models that the brain uses to furnish predictions of its sensorium. The nature of generative models is especially important from the perspective of active inference – a corollary of the free energy principle (Friston 2013) – where many interesting aspects of generative models boil down to their factorial structure.

In what follows, I try to explain why generative models are so central to representation in active (Bayesian) inference as planning (Attias 2003; Baker & Tenenbaum 2014; Friston et al. 2011). I then consider the factorial nature of these models, which endows them with deep (hierarchical) structure; from the concrete to the abstract – and, crucially, from the past to the future (Friston et al. 2017d; Russek et al. 2017). Underwriting this treatment is an enactive aspect of representational processing; namely, the notion that inference about the causes of our sensations is the easy problem: the hard part is inferring the best way to gather those sensations (Davison & Murray 2002; Ferro et al. 2010; MacKay 1992).

Gilead et al. refer often to the formalism of active inference. I think this is perfectly appropriate, because a formal treatment of representational structure is, in its essence, a treatment of the generative models that underwrite inference. Technically, a generative model is just a probability distribution over some causes and their consequences. In the setting of the embodied brain, the causes are states of the world "out there" that are hidden behind our sensations; the sensations are their consequences. Inverting a generative model refers to the inverse mapping from (sensory) consequences to their (worldly) causes. These causes are the abstracta and concreta that constitute different kinds of representations in Gilead et al. The generative model is important because most of the heavy lifting – in terms of understanding structure–function relationships in the brain – rests on its form. In other words, if one knows the generative model, model inversion can be cast in terms of the Bayesian brain hypothesis (in a normative sense) (Doya 2007; Knill & Pouget 2004) or combined with standard inversion schemes to generate neuronal process theories about computational brain architectures and neuronal message passing (Friston et al. 2017c).
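
In the simplest (static, single-level) case this can be written out explicitly; the notation below (o for sensory consequences, s for their hidden causes) is generic, rather than anything drawn from Gilead et al.:

```latex
p(o, s) = p(o \mid s)\, p(s)
\qquad\Longrightarrow\qquad
p(s \mid o) = \frac{p(o \mid s)\, p(s)}{p(o)}
```

The likelihood p(o | s) specifies how hidden causes generate sensory consequences, the prior p(s) encodes beliefs about those causes, and inversion, via Bayes' rule, takes us from consequences back to causes.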

These theories are usually cast in terms of belief updating via a gradient descent on variational free energy. Several schemes fall under this class, all of which have been used as biologically plausible process theories for perceptual inference. Crucially, exactly the same quantity is optimised by action, thereby providing a formal account of the action–perception cycle (Fuster 2004). Particular instances include predictive coding (Rao & Ballard 1999) and variational message passing, for generative models based upon continuous and discrete states, respectively. These process theories constitute a field in cognitive neuroscience that has become known as predictive processing (Clark 2013; Seth 2014). So, what are the most important aspects of a generative model?
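
For readers who want the quantity spelt out, the standard form of the variational free energy, written for an approximate posterior q(s) over hidden states and not specific to any one process theory, is:

```latex
F[q, o] = \mathbb{E}_{q(s)}\!\left[\ln q(s) - \ln p(o, s)\right]
        = D_{\mathrm{KL}}\!\left[q(s) \,\|\, p(s \mid o)\right] - \ln p(o)
```

Perception corresponds to a gradient descent of beliefs q(s) on F, tightening the approximation to the true posterior; action changes the sensory data o themselves so that the same F falls, which is what licenses the formal account of the action–perception cycle above.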

One aspect has already been mentioned; namely, the distinction between continuous and discrete models. However, a feature that is common to both is their factorial structure. In fact, from a technical perspective, the way in which we factorise our (non-propositional) posterior beliefs about hidden causes (i.e., how we come to represent things "out there") rests upon a factorisation known as a mean field approximation in physics and machine learning. Key examples emerge throughout Gilead et al. The first is a factorisation over the levels of a deep (hierarchical) generative model. Typically, the lowest levels – that generate sensory data – are concrete and modality bound. As one ascends the hierarchy, the represented states of the world become more abstract and inclusive.
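
In generic symbols, with subscripts indexing factors and superscripts indexing hierarchical levels, the two factorisations at issue read:

```latex
q(s) \approx \prod_i q(s_i)
\qquad \text{(mean field factorisation of posterior beliefs)}

p(o, s^{(1)}, \ldots, s^{(L)}) = p(o \mid s^{(1)})\, p(s^{(L)}) \prod_{l=1}^{L-1} p(s^{(l)} \mid s^{(l+1)})
\qquad \text{(deep hierarchical generative model)}
```

Only the lowest level touches the (modality-bound) sensory data; each level above generates the causes of the level below, becoming progressively more abstract and inclusive.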

Another important aspect of factorisation is a carving of putative hidden states of the world within any hierarchical level. My favourite example is the factorisation into "what" and "where" (Ungerleider & Haxby 1994). In short, knowing what something is does not tell you where it is, and vice versa. This (conditional) independence is manifest beautifully in the functional anatomy of the dorsal and ventral streams in the brain. This sort of factorisation emerges frequently in Gilead et al. One intriguing example is the notion of predicators; namely, representations that behave like functions. An interesting question here is whether one needs to treat relationships in a way that is fundamentally different from objects. For example, how is a representation of "what" formally distinct from a representation of "where," when generating visual input?
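
To make this conditional independence concrete, here is a deliberately toy sketch (my own Python illustration, not code or a model from the target article): the hidden states factorise into an object identity ("what") and a location ("where"), which are sampled independently and only combined when generating the visual outcome.

```python
import numpy as np

# Toy illustration (not from the target article): hidden states factorise into
# independent "what" and "where" factors that are only combined when generating
# the sensory (visual) outcome.
rng = np.random.default_rng(0)

# "What": a small library of 3x3 object templates (identities).
templates = {
    "bar":   np.array([[0, 1, 0], [0, 1, 0], [0, 1, 0]], dtype=float),
    "cross": np.array([[0, 1, 0], [1, 1, 1], [0, 1, 0]], dtype=float),
}

def generate_image(what, where, size=8, noise=0.05):
    """Generate an image by placing the 'what' template at the 'where' location."""
    image = np.zeros((size, size))
    row, col = where
    image[row:row + 3, col:col + 3] = templates[what]
    return image + noise * rng.standard_normal(image.shape)

# Independent priors: knowing what something is says nothing about where it is.
what = rng.choice(list(templates))           # sample from p(what)
where = tuple(rng.integers(0, 6, size=2))    # sample from p(where)
observation = generate_image(what, where)    # sample from p(o | what, where)
print(what, where, observation.shape)
```

The point is simply that the generative mapping p(o | what, where) needs both factors, even though the priors, and (under a mean field assumption) the posterior beliefs, over the two factors are carried separately.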

The final factorisation is over time. This theme emerges in modality-specific features, objects, and relationships – that rest upon the notion of object permanence. This sort of permanence has to be written into a generative model of a capricious world. The theme reappears in terms of spatiotemporal contiguity in the treatment of multimodal features. Indeed, the premise of Gilead et al. rests upon integrating influential theories in the "predictive brain camp" with "prospection (or future oriented mental time travel)." This is a big move, because it entails generative models of dynamics, narratives, or trajectories – with representations of the past and future. In turn, this enables the representation of states that have not yet been realised. These states undergird the "simulation of future events" (intro., para. 4) and a sense of agency. In other words, the notion of a model that can "generate a representation that models the specific problem at hand" (sect. 3.1, para. 3) is exactly a generative model of the future, with "my" action as a latent state that has to be inferred. This is important from the point of view of active inference, because it suggests that much of our inference is not about states of affairs "out there" but more about "what would happen if I did that" (Schmidhuber 2006). This is nicely summarised as follows (ibid., p. 23):

The functionality of a simulation stems from the fact that the person running the simulation self-projects into it, that is, becomes an agent in the simulated situation.
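
Read through the lens of active inference, this self-projection is exactly what a generative model unrolled over time affords: candidate courses of action (policies) become latent variables, and each policy generates its own predicted future. The sketch below is my own toy Python illustration, not code from the target article; its preference-based score is a simplified stand-in for the expected free energy used in active inference proper.

```python
import numpy as np

# Toy "planning as inference" sketch: a discrete generative model unrolled over
# time, with the agent's own action as a latent variable. Each candidate policy
# generates a predicted trajectory of observations, which is scored against
# prior preferences ("what would happen if I did that?").
n_states = 4
A = np.eye(n_states)                              # likelihood p(o_t | s_t): identity, for simplicity
B = {                                             # transitions p(s_{t+1} | s_t, action)
    "stay":  np.eye(n_states),
    "right": np.roll(np.eye(n_states), 1, axis=0),
}
C = np.array([0.05, 0.05, 0.05, 0.85])            # prior preferences over observations
s0 = np.array([1.0, 0.0, 0.0, 0.0])               # belief about the current state

def predicted_outcomes(policy, s=s0):
    """Roll the model into the future under a sequence of actions."""
    outcomes = []
    for action in policy:
        s = B[action] @ s                         # predicted next state
        outcomes.append(A @ s)                    # predicted observation distribution
    return outcomes

def score(policy):
    """Expected surprise of predicted outcomes, relative to preferred outcomes."""
    return sum(-o @ np.log(C) for o in predicted_outcomes(policy))

policies = [("stay", "stay", "stay"), ("right", "right", "right")]
print(min(policies, key=score))                   # the policy whose imagined future is preferred
```

Here, inference about what I should do next amounts to evaluating each imagined future against prior preferences; a full treatment would use the expected free energy, which also scores the epistemic (uncertainty-resolving) value of each policy.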

Gilead et al. then offer a compelling conclusion about mental travel (sect. 4, para. 2):

Representational structures … form the bridges that allow us to traverse uncertainty. In light of this … the link between abstraction and mental travel is fundamental to any consideration of these constructs; there is no mental travel without abstraction, and there is no need for abstraction but to support mental travel.

I would add:

and there is no need for mental travel but to support inference about what I should do next.

References

Attias, H. (2003) Planning by probabilistic inference. Proceedings of the 9th International Workshop on Artificial Intelligence and Statistics.
Baker, C. L. & Tenenbaum, J. B. (2014) Modeling human plan recognition using Bayesian theory of mind. In: Plan, activity, and intent recognition, ed. Sukthankar, G., Geib, C., Bui, H. H., Pynadath, D. V. & Goldman, R. P., pp. 177–204. Morgan Kaufmann.
Clark, A. (2013) Whatever next? Predictive brains, situated agents, and the future of cognitive science. Behavioral and Brain Sciences 36(3):181–204.
Davison, A. J. & Murray, D. W. (2002) Simultaneous localization and map-building using active vision. IEEE Transactions on Pattern Analysis and Machine Intelligence 24(7):865–80. doi: 10.1109/tpami.2002.1017615.
Doya, K. (2007) Bayesian brain: Probabilistic approaches to neural coding. MIT Press.
Ferro, M., Ognibene, D., Pezzulo, G. & Pirrelli, V. (2010) Reading as active sensing: A computational model of gaze planning during word recognition. Frontiers in Neurorobotics 4:1.
Friston, K. (2013) Life as we know it. Journal of the Royal Society, Interface 10(86):20130475.
Friston, K., Mattout, J. & Kilner, J. (2011) Action understanding and active inference. Biological Cybernetics 104:137–60.
Friston, K. J., Parr, T. & de Vries, B. (2017c) The graphical brain: Belief propagation and active inference. Network Neuroscience 1(4):381–414. doi: 10.1162/NETN_a_00018.
Friston, K. J., Rosch, R., Parr, T., Price, C. & Bowman, H. (2017d) Deep temporal models and active inference. Neuroscience & Biobehavioral Reviews 77:388–402. doi: 10.1016/j.neubiorev.2017.04.009.
Fuster, J. M. (2004) Upper processing stages of the perception–action cycle. Trends in Cognitive Sciences 8(4):143–45.
Knill, D. C. & Pouget, A. (2004) The Bayesian brain: The role of uncertainty in neural coding and computation. Trends in Neurosciences 27(12):712–19.
MacKay, D. J. C. (1992) Information-based objective functions for active data selection. Neural Computation 4(4):590–604. doi: 10.1162/neco.1992.4.4.590.
Rao, R. P. N. & Ballard, D. H. (1999) Predictive coding in the visual cortex: A functional interpretation of some extra-classical receptive-field effects. Nature Neuroscience 2(1):79–87. doi: 10.1038/4580.
Russek, E. M., Momennejad, I., Botvinick, M. M., Gershman, S. J. & Daw, N. D. (2017) Predictive representations can link model-based reinforcement learning to model-free mechanisms. PLOS Computational Biology 13(9):e1005768. doi: 10.1371/journal.pcbi.1005768.
Schmidhuber, J. (2006) Developmental robotics, optimal artificial curiosity, creativity, music, and the fine arts. Connection Science 18(2):173–87. doi: 10.1080/09540090600768658.
Seth, A. K. (2014) A predictive processing theory of sensorimotor contingencies: Explaining the puzzle of perceptual presence and its absence in synesthesia. Cognitive Neuroscience 5(2):97–118. doi: 10.1080/17588928.2013.877880.
Ungerleider, L. G. & Haxby, J. V. (1994) "What" and "where" in the human brain. Current Opinion in Neurobiology 4(2):157–65. doi: 10.1016/0959-4388(94)90066-3.