Andy Clark acknowledges the “challenging vision” of prediction error minimization (PEM), according to which representation is inner and skull-bound such that perception is a fantasy that coincides with reality (Frith 2007). This view does not require homunculi and sense-data but does convey a somewhat indirect mind–world relation.
Clark resists indirectness. He states that PEM “makes structuring our worlds genuinely continuous with structuring our brains and sculpting our actions” (sect. 3.4, para. 1), and that “what we perceive is not some internal representation or hypothesis but (precisely) the world” (sect. 4.4, para. 3, emphasis Clark's).
The sentiment is right, but caution about directness is needed. Without indirectness we ignore how the mind is always precariously hostage to the urge to rid itself of prediction error. This urge forces very improbable and fantastical perceptions upon us when the world does not collaborate in its usual, uniform way. For example, in the contemporary swathe of rubber-hand and full-body illusions, we easily and compellingly experience having a rubber hand (or two), occupying another's body or a little doll's body, or having magnetic forces or spectral guns operating on our skin (Hohwy & Paton 2010; Lenggenhager et al. 2007; Petkova & Ehrsson 2008). Moreover, more stable and fundamental aspects of mind, such as our sense of agency, privileged access to self, and mentalizing, all seem to make sense only in terms of perceptual fantasizing (Frith 2007).
This leaves a puzzle. On PEM, the perceptual relation cannot be direct. But neither is it wholly indirect. The challenge is then to reconceive the mind–world relation to encompass both aspects. We suggest a causal conception, and use its internal aspect to leverage an understanding of situated and social cognition.
The implicit inversion of a generative model happens when prediction error is minimized between the model maintained in the brain and the sensory input (how the world impinges on the senses). This yields causal inference on the hidden causes (the states of affairs in the world) of the sensory input. This is a distinctly causal conception of how the brain recapitulates – provides a multilayered mirror image of – the causal structure of the world. This representational relation is direct in the sense that causation is direct: There is an invariant relation between the model and world, such that, given how the model is, it changes in certain ways when the world changes in certain ways. But, seen from the inside, there is indirectness in the sense that causal relata are distinct existences, giving rise to a need for causal inference on hidden, environmental causes.
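The inversion described above can be illustrated with a deliberately minimal sketch. It assumes a toy, one-dimensional linear generative model in which a hidden cause `v` produces sensory input `gain * v`; the function name, the `gain` value, and the learning rate are hypothetical choices made for illustration, not anything specified by PEM itself. The estimate of the hidden cause is adjusted step by step until the prediction error between the model's prediction and the input is minimized.

```python
def infer_hidden_cause(sensory_input, gain=2.0, prior_guess=0.0,
                       steps=200, learning_rate=0.05):
    """Estimate a hidden cause by gradient descent on squared prediction error.

    Assumed toy generative model: predicted_input = gain * hidden_cause.
    """
    estimate = prior_guess
    for _ in range(steps):
        # Prediction error: what was sensed minus what the model predicts.
        error = sensory_input - gain * estimate
        # Move the estimate in the direction that reduces the squared error.
        estimate += learning_rate * gain * error
    return estimate


estimate = infer_hidden_cause(sensory_input=6.0)
print(estimate)
```

The model never touches the hidden cause itself; it only has the input and its own prediction to go on, which is the sense in which the inference is made from the inside.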
Though the brain can optimize precisions on its prediction error, it is hostage to the causal link from environmental causes to sensory input. If the variance in the signal from the world to the senses is large, then there is only so much the brain can do there and then to ensure optimal encoding. Precisely because the mind is destined to be behind the veil of sensory input, it then makes sense for it to devise ways of optimizing the information channel from the world to the senses. Thus, through active inference prediction error is minimized, not only by selective sampling, but also by optimizing its precision: removing sources of noise in the environment and amplifying sensory input.
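The role of precision can be made concrete with a small sketch under simplifying assumptions: a single Gaussian prior over the hidden cause and a single Gaussian sensory observation, combined by the standard precision-weighted Bayesian update. The function name and the numbers are hypothetical. The point is only that, for the same observation, lowering the variance of the sensory channel lets the input dominate the prior and leaves a sharper posterior.

```python
def precision_weighted_posterior(prior_mean, prior_var, observation, sensory_var):
    """Combine a Gaussian prior and a Gaussian observation.

    Precision is inverse variance; each source is weighted by its precision.
    Returns (posterior_mean, posterior_variance).
    """
    prior_precision = 1.0 / prior_var
    sensory_precision = 1.0 / sensory_var
    posterior_precision = prior_precision + sensory_precision
    posterior_mean = (prior_precision * prior_mean
                      + sensory_precision * observation) / posterior_precision
    return posterior_mean, 1.0 / posterior_precision


# Same prior and same observation, but two different sensory channels.
noisy_channel = precision_weighted_posterior(0.0, 1.0, 4.0, sensory_var=4.0)
clean_channel = precision_weighted_posterior(0.0, 1.0, 4.0, sensory_var=0.25)
print(noisy_channel, clean_channel)
```

With the noisy channel the estimate stays close to the prior; with the cleaner channel it moves most of the way to the observation and its variance shrinks. Removing noise from the environment corresponds, in this toy setting, to reducing `sensory_var`.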
Many of the technical, social and cultural ways we interact with the world can be characterized as attempts to make the link between sensory input and environmental causes less volatile. We see this in the benefits of the built environment (letting us engage in activities unperturbed by wind and weather), in technical and electronic devices (radio lets us hear things directly rather than through hearsay), and in language (communicating propositional content). This picture relies on the internal nature of the neural mechanism that minimizes prediction error, relative to which all our cultural and technological trappings are external. Culture and technology situate the mind closer to the world through improving the reliability of its sensory input. But perception remains an inferred fantasy about what lies behind the veil of input.
By maintaining focus on the internal nature of perceptual processes in this causal setting, we can appreciate a perspective on social interaction and culture other than the “mutual prediction error reduction” that Clark rightly points to.
As Locke insisted, communication is the sharing of each other's hidden ideas. Ideas are well-hidden causes, so PEM is the tool for inferring them through a mix of prediction (“after saying A, he tends to say B”) and active inference (asking something to elicit a predicted answer). An overlooked aspect here is how this is facilitated not just by representing the other's mental states but also by aligning our mental states with each other in a process of neural hermeneutics – a fusion of expectation horizons. We do this, not to change the sensory input itself, but to enhance the precision with which we can probe each other's current mental states, perhaps to such an extent that the receiver in a social interaction ends up having more precise information about the sender's mental states than the sender him- or herself (Frith & Wentzer, in press).
Perhaps culture too, in a very wide sense, can be seen as, at least partly, a tool for precision optimization through shared context. Ritual, convention, and shared practices enhance mutual predictability between people's hidden mental states. This would make sense of cultural diversity because this process is concerned with signal reliability rather than with what the signals are about, and there are many different ways of using cultural tools to align our mental states. Furthermore, when precision has been optimized, alignment enables simple, information-rich signaling and thereby communication efficiency.
If alignment of mental states is an integral part of how culture optimizes precision and communication efficiency, then culture should be seen as providing a set of frameworks for interpretation, rather than merely for scaffolding interpretation. If the brain is a hierarchical Bayesian network providing a perceptual fantasy of the world, then culture determines and constrains the hyperpriors needed by such a neural system.