Despite their theologically contradictory nature, both of the familiar aphorisms "God is in the details" and "the devil is in the details" are true: the first notes that details are important, and the second that getting the details right is difficult. It is for exactly this pair of reasons that we believe the predictive processing framework is limited in its ability to contribute, in a deep way, to our understanding of brain function.
This is not to deny that the brain does prediction. That view has been beautifully articulated by Clark, and it lies in a great tradition. For instance, in his 1943 book, Kenneth Craik devotes several chapters to his central hypothesis that "One of the most fundamental properties of thought is its power of predicting events" (Craik 1943, p. 50). The evidence for prediction-related signals is strong, and the high-level models are often tantalizing. However, we (and, in our experience, most neuroscientists) want more: We want specific neural mechanisms that are employed in specific circumstances, and we want to know how such models can be arranged to explain complex behavior (i.e., we want an architectural specification).
Unfortunately, as Clark himself points out, the predictive processing framework “fail[s] to specify the overall form of a cognitive architecture” and “leaves unanswered a wide range of genuine questions concerning the representational formats used by different brain areas” (sect. 3.3, para. 4). The extent of the predictive processing framework's architectural claims is that the brain is organized in a hierarchical manner, with error signals passing up the hierarchy and predictions of world state passing down. However, this description seems to miss all the interesting details: What is the specific form and function of the connections between levels of this hierarchy? In the human brain, along what neuroanatomical pathways should we expect to see this information flowing? And, more generally, how do different hierarchies interact? How does information pass between them? Is there a unifying representational format? The predictive processing framework leaves all of these details unspecified, but it strikes us that the filling-in of these details is where the framework would gain deep, empirical content.
It may seem as if some of these questions are answered. For instance, the primary method of representation in the brain is supposed to be through probability density functions across the possible states/concepts. However, as Clark mentions, these representations could be implemented with a “wide variety of different schemes and surface forms” (sect. 3.2, para. 4). For example, a probability density p(x) could be represented as a histogram (which explicitly stores how many times each state x has occurred) or as a summary model (e.g., storing just the mean and variance of a normal distribution). These different schemes have enormously different resource implications for a physical implementation. As long as the characterization of representation is left at the level of specifying a general, abstract form, it is difficult to empirically evaluate.
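To make the resource point concrete, here is a minimal sketch of the two schemes just mentioned: an explicit histogram whose storage grows with the number of bins, and a Gaussian summary that stores only a mean and a standard deviation. The code is our own illustration (in Python, with arbitrary sample data), not anything specified by the predictive processing framework.

```python
import numpy as np

# Two illustrative ways to represent a probability density p(x) over a scalar
# state, with very different resource costs.

def histogram_density(samples, bins=50, value_range=(-5.0, 5.0)):
    """Scheme 1: an explicit histogram -- storage grows with the number of bins."""
    counts, edges = np.histogram(samples, bins=bins, range=value_range, density=True)
    return counts, edges                      # O(bins) numbers must be stored

def summary_density(samples):
    """Scheme 2: a summary model -- two numbers, assuming p(x) is roughly Gaussian."""
    return samples.mean(), samples.std()      # O(1) numbers must be stored

samples = np.random.normal(loc=1.0, scale=0.5, size=10_000)
counts, _ = histogram_density(samples)
mu, sigma = summary_density(samples)
print(f"histogram stores {counts.size} values; summary stores 2 (mu={mu:.2f}, sigma={sigma:.2f})")
```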
Even what seems to be the most specific claim of the predictive processing framework – that there exist functionally distinct "error" and "representation" units in the brain – is ambiguous. Given multidimensional neuron tuning (Townsend et al. 2006; Tudusciuc & Nieder 2009), units could be simultaneously sensitive to both error and representation, and still perform the relevant computations (Eliasmith & Anderson 2003). This would be compatible with the neurophysiological evidence showing neurons responsive to prediction error, without requiring a sharp division of the brain into these two sub-populations. Again, the details matter.
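As an illustration of how mixed tuning is compatible with the required computations, the following sketch (our own construction, using rectified-linear rate neurons and least-squares decoding in the spirit of Eliasmith & Anderson 2003) builds a single population in which every unit responds to a mixture of a "represented value" dimension and an "error" dimension, and then reads both quantities back out of the same population.

```python
import numpy as np

rng = np.random.default_rng(0)
n_neurons, n_points = 200, 500

# 2-D state: x[:, 0] is the represented value, x[:, 1] is the prediction error.
x = rng.uniform(-1, 1, size=(n_points, 2))

# Each unit has a random preferred direction in this 2-D space, so it is
# sensitive to a mixture of "representation" and "error" (mixed tuning).
encoders = rng.normal(size=(n_neurons, 2))
encoders /= np.linalg.norm(encoders, axis=1, keepdims=True)
gains = rng.uniform(50, 100, size=n_neurons)
biases = rng.uniform(-20, 20, size=n_neurons)

rates = np.maximum(0, x @ encoders.T * gains + biases)   # rectified-linear tuning curves

# Least-squares decoders recover BOTH dimensions from the same population,
# so no dedicated "error units" or "representation units" are needed.
decoders, *_ = np.linalg.lstsq(rates, x, rcond=None)
x_hat = rates @ decoders
print("decoding RMSE per dimension:", np.sqrt(((x_hat - x) ** 2).mean(axis=0)))
```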
One way to begin to fill in the missing details in the predictive processing framework is by being more specific as to what functions are computed. For example, Kalman filters (Kalman 1960) are standard control-theoretic structures that maintain an internal representation of the state of the world, and then use the difference between the predictions of that internal state and incoming data to update the internal model (as the predictive processing framework uses the prediction error signal to update its representations). Clark claims that the predictive processing framework differs from these structures in that it contains a richer error signal (see Note 9 in the target article). However, the Kalman filter is often employed in a multidimensional form (Villalon-Turrubiates et al. 2004; Wu 1985), allowing the error signal to encode rich and complex information about the world. Making use of these parallels provides many potential advantages. For example, Clark describes the need to adjust the relative weight of the model's predictions versus the incoming information, but he does not indicate how that balance is to be achieved. This is a well-studied problem in Kalman filters, where there are specific mechanisms to adjust these weights depending on the measurement or estimate error (Brown & Hwang 1992). Thus, it may be possible to replace the poorly specified notion of "attention" used to control these weights in the predictive processing framework (sect. 2.3) with well-defined mechanisms, providing a more grounded and concrete description.
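For concreteness, here is a minimal multidimensional Kalman filter in its standard textbook form; the constant-velocity model and noise settings below are illustrative assumptions of ours, not anything specified in the target article. The innovation term is the prediction error, and the gain K is the explicit, uncertainty-dependent mechanism that balances the model's prediction against the incoming measurement.

```python
import numpy as np

# Illustrative 2-D constant-velocity model: state = [position, velocity].
dt = 0.1
A = np.array([[1.0, dt], [0.0, 1.0]])   # state-transition model
H = np.array([[1.0, 0.0]])              # only position is measured
Q = 0.01 * np.eye(2)                    # process-noise covariance
R = np.array([[0.25]])                  # measurement-noise covariance

def kalman_step(x_hat, P, y):
    # Predict: advance the internal model of the world one step.
    x_pred = A @ x_hat
    P_pred = A @ P @ A.T + Q
    # Update: the innovation is the prediction error; the gain K weights the
    # prediction against the measurement according to their uncertainties.
    innovation = y - H @ x_pred
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)
    x_new = x_pred + K @ innovation
    P_new = (np.eye(2) - K @ H) @ P_pred
    return x_new, P_new

x_hat, P = np.zeros(2), np.eye(2)        # initial estimate and its uncertainty
for t in range(50):
    y = np.array([0.5 * t * dt + np.random.normal(scale=0.5)])   # noisy position
    x_hat, P = kalman_step(x_hat, P, y)
print("estimated [position, velocity]:", x_hat)
```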
This is a way of providing computational details to the approach, but we advocate going further – providing implementational details as well. For instance, there is more than one way to implement a Kalman filter in a spiking neural network (Eliasmith & Anderson 2003, Ch. 9), each of which has different implications for the neurophysiological behavior of those networks. Once a neural implementation has been specified, detailed comparisons between computational models and empirical data can be made. More critically, for the grander suggestion that the predictive processing framework is unifying, the implementation of some small set of mechanisms should explain a wide swath of empirical data (see, e.g., Eliasmith et al. [2012] or Eliasmith [in press] for one such attempt).
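To give a flavor of what such an implementational commitment looks like, the sketch below uses the Nengo simulator (which implements the methods of Eliasmith & Anderson 2003, though Nengo itself is not named above) to embed a simplified, one-dimensional steady-state filter in a population of spiking neurons. A fixed gain k stands in for the full covariance update, and all parameter values are illustrative assumptions rather than the construction of Chapter 9.

```python
import numpy as np
import nengo

k = 5.0      # fixed (steady-state) Kalman-style gain
tau = 0.1    # synaptic time constant used to map the dynamics onto neurons

# Desired estimator dynamics: d(x_hat)/dt = k * (y - x_hat), i.e. the estimate
# is continuously corrected in proportion to the prediction error.
model = nengo.Network(seed=0)
with model:
    measurement = nengo.Node(lambda t: np.sin(2 * np.pi * t))  # noise-free "sensor" for brevity
    estimate = nengo.Ensemble(n_neurons=200, dimensions=1)

    # Standard NEF mapping of linear dynamics dx/dt = a*x + b*u onto a
    # recurrent population: recurrent transform = tau*a + 1, input transform
    # = tau*b (here a = -k and b = k).
    nengo.Connection(estimate, estimate, transform=1 - tau * k, synapse=tau)
    nengo.Connection(measurement, estimate, transform=tau * k, synapse=tau)

    probe = nengo.Probe(estimate, synapse=0.02)

with nengo.Simulator(model) as sim:
    sim.run(2.0)
# sim.data[probe] holds the spiking population's decoded estimate over time.
```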
The ideas presented by Clark are compelling, compatible with empirical data, and aim to unify several interesting aspects of cognition. However, given the current lack of implementational detail or firm architectural commitments, it is impossible to determine whether the predictive processing framework is largely correct or empirically vacuous. The real test of these ideas will come when they are used to build a model that unifies perception, cognition, and action in a single system. Such an effort will require a deeper investigation of the details, and will either fill them in with answers or, if answers are not to be found, force a reworking of the theory. Either way, the predictive processing framework will benefit enormously from the exercise.