
Representation, abstraction, and simple-minded sophisticates

Published online by Cambridge University Press: 19 June 2020

Peter Dayan*
Affiliation:
Max-Planck-Gesellschaft, Max Planck-Ring 8, 72076 Tübingen, Germany. dayan@tue.mpg.de https://www.kyb.tuebingen.mpg.de/publication-search/60427?person=persons217460

Abstract

Bayesian decision theory provides a simple formal elucidation of some of the ways that representation and representational abstraction are involved with, and exploit, both prediction and its rather distant cousin, predictive coding. Both model-free and model-based methods are involved.

Type: Open Peer Commentary
Copyright: © The Author(s), 2020. Published by Cambridge University Press

Bayesian decision theorists (BDTs), a group which active inferencers might beneficially pupate to join, are sophisticated simpletons. The simpleton half of this oxymoron comes from the straightforward inferential crank that they turn to generate behaviour (Berger 1985): agents should characterize their probabilistic beliefs about the state of the world; evaluate the expected present worth of the potential long-run future consequences of their available choices or actions given this characterization; and make an appropriate choice in light of these evaluations. BDTs are sophisticated because, done correctly, this leads to optimal behaviour in both individual and collective (Harsanyi 1967) settings, and because of the statistical and computational complexities they have to overcome to execute each of these steps correctly.
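The three steps of the inferential crank can be sketched numerically; the belief, worths, and states below are invented purely for illustration.

```python
import numpy as np

# Toy sketch of the three BDT steps; all numbers are invented.
belief = np.array([0.7, 0.2, 0.1])       # step 1: p(state of the world)
worth = np.array([[5.0, -1.0, 0.0],      # worth[a, s]: expected present worth
                  [2.0,  2.0, 2.0]])     # of long-run consequences of action a in state s

expected_worth = worth @ belief          # step 2: E[worth | action], under the belief
choice = int(np.argmax(expected_worth))  # step 3: choose in light of the evaluations

print(expected_worth)  # [3.3 2. ]
print(choice)          # 0
```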

From a formal perspective, we can see the centrality of predictions about the future (a rider that will seem less odd shortly) – because it is the portended worth of those consequences that matters. Indeed, agents' very characterizations of the current state of the world should only make distinctions that make a difference in terms of what the future might hold (Dayan 1993; Littman et al. 2001). However, BDTs only need to predict evaluations – predicting in more detail what will happen is at most a means to this particular end.

I hope that Bayesian decision theory helps put all the rich representational and process distinctions in the target article into slightly starker relief. In terms of representation, abstraction is a useful, and indeed sometimes normative, approach to the complexities mentioned above. Throwing away distinctions that do not matter (or perhaps do not matter very much) allows one to generalize predictions about future worth (perhaps approximately), obviating the need for more learning, more computation, or indeed both. One might quibble about the particular forms of abstraction considered here – for instance, the article frequently flirts with deterministic, rather than probabilistic, criteria for substitutability, which would seem likely to be somewhat too rigid in most circumstances.
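A toy sketch of why throwing away irrelevant distinctions pays off (my own construction, with invented state names, not an example from the target article): if two raw states never differ in future worth, they can share one abstract state, so a value learned from experience with one generalizes to the other for free.

```python
# Hypothetical abstraction: raw states mapped to abstract states that
# only preserve distinctions which make a difference to future worth.
abstract = {"red_door_left": "door", "red_door_right": "door", "blank_wall": "wall"}
value = {}  # one learned value per abstract state

def update(raw_state, target, lr=0.5):
    """Move the abstract state's value toward an observed worth target."""
    s = abstract[raw_state]
    value[s] = value.get(s, 0.0) + lr * (target - value.get(s, 0.0))

update("red_door_left", 1.0)              # experience with one raw state...
print(value[abstract["red_door_right"]])  # ...transfers to the other: prints 0.5
```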

Second, in terms of processes, we can see that neither simulation theory nor the “theory–theory” that the article puts in partial competition with it is really a fundamental construct – because we only really need to predict evaluations rather than actual future outcomes. It is this observation that underlies the sorts of model-free reinforcement learning (RL) to which the target article refers (Sutton & Barto 1998), and which can also exploit rich representations. Of course, there are statistical benefits (though computational costs; Daw et al. 2005; Keramati et al. 2011; 2016; Pezzulo et al. 2013) to model-based (MB) RL – in which more elaborate aspects of the future are predicted as a means of making long-run evaluations. However, one might note that even this conventional sort of MB RL already includes the sort of flexible incorporation of inferential abstraction which is referenced – there is nothing that requires any vividness of simulation. Perhaps the term “cognitive model” in MB RL might have seemed a bit overly ascetic. Equally, one might note the active investigation of how episodic and semantic contributions to various forms of RL are integrated (Collins & Frank 2012; Gershman & Daw 2017; Lengyel & Dayan 2007).
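The model-free versus model-based contrast can be made concrete in a two-state toy problem of my own construction: the model-based route derives evaluations from an explicit model of what will happen, while the model-free route caches the very same evaluations from sampled experience, never representing future outcomes at all.

```python
import numpy as np

# Two states; state 1 yields reward 1 and is absorbing; discount gamma.
gamma = 0.9
r = np.array([0.0, 1.0])
P = np.array([[0.0, 1.0],   # transition model: state 0 -> state 1
              [0.0, 0.0]])  # state 1 is terminal

# Model-based: predict outcomes via P, then solve v = r + gamma * P v.
v_mb = np.linalg.solve(np.eye(2) - gamma * P, r)

# Model-free: learn the same evaluations by temporal-difference updates
# on sampled transitions, with no access to P at all.
v_mf = np.zeros(2)
for _ in range(1000):
    v_mf[1] += 0.1 * (r[1] - v_mf[1])                    # terminal state
    v_mf[0] += 0.1 * (r[0] + gamma * v_mf[1] - v_mf[0])  # TD update

print(v_mb)  # [0.9 1. ]
```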

A third elucidation concerns the fact that predictive coding models (MacKay 1956; Rao & Ballard 1999) also consider prediction about the present – a sort of ersatz prediction that should be kept conceptually completely separate from predictions about the future. That is, such models specify hierarchical abstractions of the current state as a way of analysing that state. They do this by considering how this state might have been generated, that is, how it might have been synthesized. An example of this sort of analysis by synthesis (Neisser 1967) is to perform computer vision on a visual scene – analysing it into its underlying contents by determining all the settings of the graphics engine in a computer game that could have synthesized the scene. Each setting would provide a description of the objects, their positions, the lighting, the location of the observer, the shot noise, etc., that could have produced the scene. One way to perform this analysis is to start from some likely settings, predict what the scene should look like if those settings were indeed responsible, look at how the actual scene differs (this is the prediction error), and change the settings accordingly. Ultimately, though, it is the analysis that matters (a conclusion that the target article steps around somewhat balletically).
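The predict–compare–adjust loop at the heart of this analysis can be written down in a few lines. This is a deliberately minimal sketch with an invented one-parameter "graphics engine" standing in for the generative model; it simply shows the loop structure, not any particular predictive coding architecture.

```python
def synthesize(setting):
    """Invented stand-in for a graphics engine: one setting -> one 'scene'."""
    return 3.0 * setting + 1.0

observed = synthesize(2.0)  # the scene was actually produced by setting = 2.0

setting = 0.0               # start from some (here, poor) candidate setting
for _ in range(100):
    predicted = synthesize(setting)
    error = observed - predicted   # the prediction error
    setting += 0.05 * 3.0 * error  # adjust the setting to reduce the error

# The loop recovers the setting responsible for the scene: the analysis.
print(round(setting, 6))  # 2.0
```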

Analysis by synthesis turns out to be a powerful idea about how to create abstractions in what is known as an unsupervised manner (Hinton & Sejnowski 1999). Furthermore, the target article points to aspects of such generative models that could usefully be structurally far more sophisticated (indeed, the sorts of probabilistic programming notions that are becoming popular in some circles (Goodman et al. 2012) arguably generalize even the highest-order representational construct considered by the target article, namely predication). However, even the earliest thoughts about unsupervised learning (Marr 1970) were suffused with concern about the fundamental lack of justification for these sorts of representational ideas for the task of making good decisions for the future – a dilemma that is, however, not resolved here.

In sum, I applaud the authors for their lucid challenge to overly simplistic notions of representations and processes. Abstraction is of tremendous benefit in many ways to real prediction and thus real control, and therefore much work in RL is attempting to find ways of determining and exploiting appropriate representational structures both within single domains of decision-making, and across multiple such domains.

References

Berger, J. (1985) Statistical decision theory and Bayesian analysis. Springer.
Collins, A. G. E. & Frank, M. J. (2012) How much of reinforcement learning is working memory, not reinforcement learning? A behavioral, computational, and neurogenetic analysis. European Journal of Neuroscience 35:1024–35.
Daw, N. D., Niv, Y. & Dayan, P. (2005) Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control. Nature Neuroscience 8(12):1704–11.
Dayan, P. (1993) Improving generalization for temporal difference learning: The successor representation. Neural Computation 5:613–24.
Gershman, S. J. & Daw, N. D. (2017) Reinforcement learning and episodic memory in humans and animals: An integrative framework. Annual Review of Psychology 68(1):101–28. https://doi.org/10.1146/annurev-psych-122414-033625.
Goodman, N. D., Mansinghka, V. K., Roy, D. M., Bonawitz, K. & Tenenbaum, J. B. (2012) Church: A language for generative models. CoRR abs/1206.3255.
Harsanyi, J. C. (1967) Games with incomplete information played by “Bayesian” players, I–III. Part I: The basic model. Management Science 14(3):159–82.
Hinton, G. & Sejnowski, T., eds. (1999) Unsupervised learning: Foundations of neural computation. MIT Press.
Keramati, M., Dezfouli, A. & Piray, P. (2011) Speed/accuracy trade-off between the habitual and the goal-directed processes. PLOS Computational Biology 7:e1002055.
Keramati, M., Smittenaar, P., Dolan, R. J. & Dayan, P. (2016) Adaptive integration of habits into depth-limited planning defines a habitual-goal-directed spectrum. Proceedings of the National Academy of Sciences 113:12868–73.
Lengyel, M. & Dayan, P. (2007) Hippocampal contributions to control: The third way. In: Advances in Neural Information Processing Systems 20, Proceedings of the Twenty-First Annual Conference on Neural Information Processing Systems, Vancouver, British Columbia, Canada, December 3–6, 2007, pp. 889–96.
Littman, M. L., Sutton, R. S. & Singh, S. P. (2001) Predictive representations of state. In: Advances in Neural Information Processing Systems 14 [Neural Information Processing Systems: Natural and Synthetic, NIPS 2001, December 3–8, 2001, Vancouver, British Columbia, Canada], pp. 1555–61.
MacKay, D. (1956) The epistemological problem for automata. In: Automata studies, ed. Shannon, C. & McCarthy, J., pp. 235–51. Princeton University Press.
Marr, D. (1970) A theory for cerebral neocortex. Proceedings of the Royal Society B: Biological Sciences 176:161–234.
Neisser, U. (1967) Cognitive psychology. Appleton-Century-Crofts.
Pezzulo, G., Rigoli, F. & Chersi, F. (2013) The mixed instrumental controller: Using value of information to combine habitual choice and mental simulation. Frontiers in Psychology 4:92.
Rao, R. P. N. & Ballard, D. H. (1999) Predictive coding in the visual cortex: A functional interpretation of some extra-classical receptive-field effects. Nature Neuroscience 2(1):79–87. http://doi.org/10.1038/4580.
Sutton, R. S. & Barto, A. G. (1998) Reinforcement learning: An introduction. MIT Press.