Bayesian decision theorists (BDTs), a group which active inferencers might beneficially pupate to join, are sophisticated simpletons. The simpleton half of this oxymoron comes from the straightforward inferential crank that they turn to generate behaviour (Berger 1985): agents should characterize their probabilistic beliefs about the state of the world; evaluate the expected present worth of the potential long-run future consequences of their available choices or actions given this characterization; and make an appropriate choice in light of these evaluations. BDTs are sophisticated because, done correctly, this leads to optimal behaviour in both individual and collective (Harsanyi 1967) settings, and because of the statistical and computational complexities they have to overcome to execute each of these steps correctly.
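The three turns of this crank can be sketched in a few lines. This is a minimal illustration, not a serious model: the states, beliefs, actions, and utilities below are all invented for the example.

```python
# A toy turn of the BDT crank: beliefs -> expected worth -> choice.
# All numbers here are hypothetical illustrative values.

# Step 1: characterize probabilistic beliefs about the state of the world.
beliefs = {"sunny": 0.7, "rainy": 0.3}

# Step 2: the worth of the consequences of each (action, state) pair.
utility = {
    ("picnic", "sunny"): 10.0, ("picnic", "rainy"): -5.0,
    ("museum", "sunny"):  4.0, ("museum", "rainy"):  6.0,
}

def expected_utility(action):
    # Evaluate the expected present worth of an action under the beliefs.
    return sum(p * utility[(action, s)] for s, p in beliefs.items())

# Step 3: make an appropriate choice in light of these evaluations.
best = max(["picnic", "museum"], key=expected_utility)
```

The "sophisticated" half of the oxymoron lies in how hard each of these steps becomes once the state space is large, the beliefs must be inferred, and the consequences stretch far into the future.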
From a formal perspective, we can see the centrality of predictions about the future (a rider that will seem less odd shortly) – because it is the portended worth of those consequences that matters. Indeed, agents’ very characterizations of the current state of the world should only make distinctions that make a difference in terms of what the future might hold (Dayan 1993; Littman et al. 2001). However, BDTs only need to predict evaluations – predicting in more detail what will happen is at most a means to this particular end.
I hope that Bayesian decision theory helps put all the rich representational and process distinctions in the target article into slightly starker light. In terms of representation, abstraction is a useful, and indeed sometimes normative, approach to the complexities mentioned above. Throwing away distinctions that do not matter (or perhaps do not matter very much) allows one to generalize predictions about future worth (perhaps approximately), obviating more learning, more computation, or indeed both. One might quibble about the particular forms of abstraction considered here – for instance, the article frequently flirts with deterministic, rather than probabilistic, criteria for substitutability. This would seem likely to be somewhat too rigid in most circumstances.
Second, in terms of processes, we can see that neither simulation theory nor the “theory–theory” that the article puts in partial competition with it is really a fundamental construct – because we only really need to predict evaluations rather than actual future outcomes. It is this observation that underlies the sorts of model-free reinforcement learning (RL) to which the target article refers (Sutton & Barto 1998), and which can also exploit rich representations. Of course, there are statistical benefits (though computational costs; Daw et al. 2005; Keramati et al. 2011; 2016; Pezzulo et al. 2013) to model-based (MB) RL – in which more elaborate aspects of the future are predicted as a means of making long-run evaluations. However, one might note that even this conventional sort of MB RL already includes the sort of flexible incorporation of inferential abstraction which is referenced – there is nothing that requires any vividness of simulation. Perhaps the term “cognitive model” in MB RL might have seemed a bit overly ascetic. Equally, one might note the active investigation of how episodic and semantic contributions to various forms of RL are integrated (Collins & Frank 2012; Gershman & Daw 2017; Lengyel & Dayan 2007).
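The model-free/model-based contrast can be made concrete with a toy sketch, assuming an invented one-step environment: the model-free learner caches evaluations directly via prediction errors, never representing what will happen; the model-based learner predicts outcomes explicitly and evaluates them on the fly.

```python
import random

# Hypothetical one-step environment: each action yields an outcome with
# some probability, and each outcome has a worth. All values are invented.
random.seed(0)
transition = {"a": {"gold": 0.2, "dust": 0.8}, "b": {"gold": 0.6, "dust": 0.4}}
worth = {"gold": 1.0, "dust": 0.0}

def sample_outcome(action):
    return "gold" if random.random() < transition[action]["gold"] else "dust"

# Model-free: learn a cached evaluation Q(action) by trial and error,
# updating on the reward prediction error alone.
Q = {"a": 0.0, "b": 0.0}
alpha = 0.05
for _ in range(5000):
    action = random.choice(["a", "b"])
    r = worth[sample_outcome(action)]
    Q[action] += alpha * (r - Q[action])   # prediction-error update

# Model-based: use the model to predict outcomes and evaluate them directly.
V = {a: sum(p * worth[o] for o, p in outcomes.items())
     for a, outcomes in transition.items()}
```

Both routes converge on the same evaluations; they differ in the statistics of learning and the cost of computation, which is exactly the trade-off the citations above analyse.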
A third elucidation concerns the fact that predictive coding models (MacKay 1956; Rao & Ballard 1999) also consider prediction about the present – a sort of ersatz prediction that should be kept conceptually completely separate from predictions about the future. That is, such models specify hierarchical abstractions of the current state as a way of analysing that state. They do this by considering how this state might have been generated, that is, how it might have been synthesized. An example of this sort of analysis by synthesis (Neisser 1967) is to consider performing computer vision to analyse a visual scene into its underlying contents by determining all the settings of the graphics engine in a computer game that could synthesize the scene. Each setting would provide a description of the objects, their positions, the lighting, the location of the observer, the shot noise, etc., that could have produced the scene. One way to perform this analysis is to start from some likely settings, predict what the scene should look like if those settings were indeed responsible, look at how the actual scene differs (this is the prediction error), and change the settings accordingly. Ultimately, though, it is the analysis that matters (a conclusion that the target article steps around somewhat balletically).
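This settings–predict–compare–adjust loop can be sketched with a deliberately trivial linear "graphics engine". The generative matrix, the true settings, and the step-size rule below are all hypothetical choices made for the illustration.

```python
import numpy as np

# Toy analysis by synthesis: recover the "settings" that generated a scene
# by iteratively reducing the prediction error. All quantities are invented.
rng = np.random.default_rng(0)
W = rng.normal(size=(8, 3))            # linear "graphics engine": settings -> scene
true_settings = np.array([1.0, -0.5, 2.0])
scene = W @ true_settings              # the observed scene

# Analysis: start from some settings, predict the scene they would produce,
# compare against the actual scene, and adjust to reduce the discrepancy.
settings = np.zeros(3)
lr = 0.5 / np.linalg.norm(W, 2) ** 2   # step size chosen for stable descent
for _ in range(5000):
    predicted = W @ settings
    error = scene - predicted          # the prediction error
    settings += lr * W.T @ error       # move settings to shrink the error
```

The prediction (synthesis) is purely instrumental here: what the loop delivers, and what matters, is the recovered description of the scene – the analysis.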
Analysis by synthesis turns out to be a powerful idea about how to create abstractions in what is known as an unsupervised manner (Hinton & Sejnowski 1999). Furthermore, the target article points to aspects of such generative models that could usefully be structurally far more sophisticated (although the sorts of probabilistic programming notions that are becoming popular in some circles (Goodman et al. 2012) arguably generalize even the highest-order representational construct considered by the target article, namely predication). However, even the earliest thoughts about unsupervised learning (Marr 1970) were suffused with concern about the fundamental lack of justification for these sorts of representational ideas for the task of making good decisions for the future – a dilemma that is, however, not resolved here.
In sum, I applaud the authors for their lucid challenge to overly simplistic notions of representations and processes. Abstraction is of tremendous benefit in many ways to real prediction and thus real control, and therefore much work in RL is attempting to find ways of determining and exploiting appropriate representational structures both within single domains of decision-making, and across multiple such domains.