Clark impressively surveys the prospects, based on current evidence and speculations tethered to clearly specified models, that action-oriented predictive processing (AOPP) accounts of cortical activity offer the basis for a deeply unified account of perception, cognition, and action. It is indeed clear that such accounts provide, at the very least, a fresh and stimulating framework for explaining the apparently expectation-driven nature of perception. And once one gets this far, it would be a strangely timid modeler who did not see value in exploring the hypothesis that such perception was closely linked to preparation of action and to monitoring of its consequences. However, Clark structures his critical discussion around the most ambitious efforts to use AOPP as the basis for a reductive unification of “all elements of systemic organization” in the brain (sect. 1.6, para. 3), efforts mainly associated with the work of Karl Friston and his co-authors. Clark expresses some reservations about this strong, overarching hypothesis. My commentary amplifies some of these reservations, focusing on the strong hypothesis's neglect of specialized subsystems that may integrate valuation, attention, and motor preparation semi-independently of general cortical processing.
Clark's survey is notable for the absence of any discussion of relative reward-value computation. Studies of such valuation based on single-cell recordings in rat striatum were the original locus of models of neural learning as adjustment of synaptic weights and connections through prediction-error correction (Schultz et al. 1997). The temporal difference (TD) learning that has been progressively generalized in descendants of Schultz et al.'s model is a form of Rescorla-Wagner conditioning, not Bayesian equilibration, and so could not plausibly be expected to provide a general account of mammalian cognition. However, neuroeconomists have subsequently embedded TD learning in models of wider scope that exploit drift diffusion and meta-conditioning to track such complex targets as stochastic dominance of strategies in games with shifting mixed-strategy equilibria (Glimcher 2010; Lee & Wang 2009). Such models can effectively approximate Bayesian learning. However, as Clark reports, Friston's most recent work “looks to involve a strong commitment … to the wholesale replacement of value functions, considered as determinants of action, with expectations … about action” (see Note 12 in the target article).
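To pin down the formalism at issue: the heart of the Schultz et al. account is an error-correcting value update of the Rescorla-Wagner/TD kind, with the phasic dopamine response identified with the prediction error itself. The sketch below is purely illustrative; the state names, learning rate, discount factor, and reward schedule are invented for the example and are not drawn from any of the cited models.

```python
# Illustrative TD(0) update in the Rescorla-Wagner/prediction-error style.
# All parameter values and states here are invented for the example.

alpha = 0.1   # learning rate
gamma = 0.9   # temporal discount factor

# Value estimates for a hypothetical cue, the rewarded outcome it predicts,
# and an absorbing end state whose value stays at zero.
V = {"cue": 0.0, "outcome": 0.0, "end": 0.0}

def td_update(state, next_state, reward):
    """Nudge V[state] by the reward-prediction error (the modeled phasic signal)."""
    delta = reward + gamma * V[next_state] - V[state]  # prediction error
    V[state] += alpha * delta                          # error-correcting adjustment
    return delta

# Repeated cue -> outcome pairings with reward delivered at the outcome:
# the error at reward delivery shrinks as the outcome and then the cue acquire value.
for trial in range(500):
    td_update("cue", "outcome", reward=0.0)
    td_update("outcome", "end", reward=1.0)

print(V)  # V["outcome"] approaches 1.0; V["cue"] approaches gamma * V["outcome"]
```

The update corrects the estimate toward observed outcomes rather than revising a posterior over hypotheses, which is the contrast with Bayesian equilibration drawn above.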
One theorist's elimination is frequently another theorist's construct implementation. Neuroeconomic models of the striatal dopamine circuit do away with the need to posit learned or innate reward-value hierarchies that provide targets for the learning of action and the training of attention. Like AOPP theory, such models effectively fuse attentional capture and entrenchment with reward, explaining both as functional products of the prediction-error learning encoded by dopamine signals. Extensions of neuroeconomic models to account for pathologies of attention and valuation, such as addiction, have incorporated evidence for direct dopaminergic/striatal signaling to motor preparation areas. For example, Everitt et al. (2001) suggest that direct signals to motor systems to prepare to consume addictive targets, when attention is drawn to predictors of their availability, are the basis for the visceral cravings that, in turn, cause addictive preoccupation. More basically, Glimcher's (2003) proposal to model neural responses using economics was originally motivated by observations of activity in cells that control eye saccades when monkeys implement incentivized choices through gaze direction (Platt & Glimcher 1999).
This integration of attention and neural learning with action is crucial in the present context, because it allows the dopaminergic prediction-error signals, like the prediction errors modeled in AOPP, to “carry information not just about the quantity of error but … about the mismatched content itself,” as Clark says (Note 9 of the target article).
So far, we might seem to have only a semantic difference between neuroeconomics and Friston's radical interpretation of AOPP: Neuroeconomists take themselves to be furnishing a theory of neural value functions, while Friston proposes to eliminate them. But the difference in fact marks substantive divergences, all of which reflect worries that Clark notes but doesn't connect with particular alternative accounts.
First, consider the problem of why, if AOPP is the general account of cognitive dynamics, animals do not just sit still in dark rooms to maintain error-minimizing equilibria. Clark cites Friston's suggestion in response that “some species are equipped with prior expectations that they will engage in exploratory or social play” (Friston 2011a; see sect. 3.2, para. 2, in the target article). However, good biological methodology recommends against positing speculative innate knowledge merely as an inference to the best explanation conditional on one's own hypothesis. The neuroeconomic model of striatal valuation makes this posit unnecessary – or, on another philosophical interpretation, replaces the dubious IBE with evidence for a mechanism – by suggesting that discovery of mismatches between expectations and the consequences of action is the basis of phasic dopamine release, and that such release is the foundation of reward, attention, and further action.
Second, allowing for a relatively encapsulated and cognitively impenetrable striatal mechanism that integrates attention and action partly independently of general cognition lets us straightforwardly model the disconnect Clark identifies between surprise to the brain (“surprisal”) and surprise to the agent. Clark's example is of a surprise-minimizing perceptual inference that surprises the agent. But disconnects in the other direction are also important. Gambling addiction may result from the fact that the midbrain reward circuit is incapable of learning that there is nothing to learn from repeatedly playing a slot machine, even after the mechanism's victim/owner has become sadly aware of this truth (Ross et al. 2008).
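A toy simulation makes the slot-machine point concrete. The payoff probabilities and learning rate below are invented for illustration, not taken from Ross et al. (2008); the sketch shows only that a simple error-correcting learner facing an independently random payoff keeps generating sizeable per-spin prediction errors long after its value estimate has stabilized – that is, even when, at the level of the agent, there is nothing left to learn.

```python
import random

# Invented illustration: a Rescorla-Wagner/TD-style learner playing a toy slot machine.
random.seed(1)
alpha = 0.05               # learning rate (arbitrary)
p_win, payout = 0.1, 5.0   # arbitrary payoff schedule; expected value per spin = 0.5

V = 0.0                    # the circuit's running estimate of the machine's per-spin value
late_errors, late_V = [], []

for spin in range(20000):
    r = payout if random.random() < p_win else 0.0
    delta = r - V          # per-spin reward-prediction error (the modeled phasic signal)
    V += alpha * delta     # error-correcting update
    if spin >= 19000:      # record only after learning has long since stabilized
        late_errors.append(abs(delta))
        late_V.append(V)

print(round(sum(late_V) / len(late_V), 2))            # approximately 0.5: the mean payoff is learned
print(round(sum(late_errors) / len(late_errors), 2))  # yet per-spin prediction errors remain far from zero
```

On the reading sketched here, those persistent phasic signals continue to drive reward and attentional capture whether or not the agent endorses them, which is one way of picturing a subpersonal “surprisal” that diverges from agent-level surprise.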
The suggestion here is that neuroeconomics is one resource – of course we should expect there to be others – for addressing Clark's concern that “even taken together, the mathematical model (the Bayesian brain) and the hierarchical, action-oriented, predictive processing implementation fail to specify the overall form of a cognitive architecture. They fail to specify, for example, how the brain … divides its cognitive labors between multiple cortical and subcortical areas” (sect. 3.3, para. 4). But in that case it seems most natural to join the neuroeconomists in understanding sub-cognitive valuation as an input to cognition, rather than as something that a model of cognitive activity should reduce away.