Predictive coding is in vogue in cognitive neuroscience, probably for good reason, and we are no strangers to the idea in the domain of speech (Hickok et al. 2011; van Wassenhove et al. 2005). The current trendsetters in predictive coding are the motor control crowd who have developed, empirically validated, and promoted the notion of internal forward models as a neural mechanism necessary for smooth, efficient motor control (Kawato 1999; Shadmehr et al. 2010; Wolpert et al. 1995). But the basic idea has been pervasive in cognitive science for decades in the form of theoretical proposals like analysis-by-synthesis (Stevens & Halle 1967) and in the form of empirical observations like priming, context and top-down effects, and the like. So Pickering & Garrod's (P&G's) claim that language comprehension involves prediction is nothing new. Nor is it a particularly novel claim, right or wrong, that the motor system might be involved in receptive language; it has gotten much attention in the domain of speech perception/phonemic processing, for example (Hickok et al. 2011; Rauschecker & Scott 2009; Sams et al. 2005; van Wassenhove et al. 2005; Wilson & Iacoboni 2006) and has been a component of at least some aspects of sentence processing models for decades (Crain & Fodor 1985; Frazier & Flores d'Arcais 1989; Gibson & Hickok 1993).
What appears to be new here is the idea that prediction at the syntactic and semantic levels can come out of the action system rather than being part of a purely perceptual mechanism.
This is an interesting idea worth investigating, but it is important to note that there are computational reasons why motor prediction generally is an inefficient, or even maladaptive, source for predictive coding during receptive functions. Here's the heart of the problem. The computational goal of a motor prediction in the context of action control is to increase perceptual sensitivity to deviations from prediction (because something is wrong and correction is needed) and to decrease sensitivity to accurate predictions (all is well, carry on). Hence, if motor prediction were used in the context of perception, it would tend to suppress sensitivity to that which is predicted, whereas an efficient mechanism should enhance perception. Behavioral evidence bears this out. The system is less sensitive to the perceptual effects of self-generated actions (unless there is a deviation) than to externally generated perceptual events. Some of the “reafference cancellation” effects noted by P&G are good examples: inability to self-tickle, saccadic suppression of motion percepts, and the motor-induced suppression effect measured electrophysiologically. This contrasts with nonmotor forms of prediction, what P&G referred to as the association route, which might include context effects and priming and which tend to facilitate perceptual recognition. Put simply, motor prediction decreases perceptual sensitivity to the predicted sensory event, whereas nonmotor prediction increases it. Why, then, is there so much attention on motor-based prediction?
P&G argue that there is evidence to support a role for motor prediction in language-related perceptual processes–a good reason to focus attention on a motor-based prediction process. There are problems with the evidence they cite, however. One cannot infer causation from motor activation during perception (it could be pure associative priming [Heyes 2010; Hickok 2009a]), the transcranial magnetic stimulation (TMS) evidence in speech perception tasks is likely a response bias effect (Venezia et al. 2012), and the studies showing effects of imitation training on perception do not necessarily imply that imitation is carried out during perception, which is the claim that P&G wish to make.
There is a better way to conceptualize the architecture of the system, one that flows naturally out of fairly well-established models of cortical organization (Hickok & Poeppel 2007; Milner & Goodale 1995). A dorsal stream subserves sensory-motor integration for motor control; it is a highly adaptable system (Catmur et al. 2007) that links sensory targets (objects in space, sequences of phonemes) with motor systems tuned to hit those targets under varying conditions. A ventral stream subserves the linkage between sensory inputs and conceptual memory systems; it is a more stable system designed to abstract over irrelevant sensory details. Both systems enlist predictive coding as a fundamental computational strategy (Friston et al. 2010), but both in the service of what the systems are designed for computationally. Motor prediction facilitates motor behavior (but suppresses perception) and “sensory” or “ventral stream” prediction facilitates perception (Hickok 2012b).
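The contrast between the two kinds of prediction can be made concrete with a toy sketch. Everything in it is an illustrative assumption rather than a model taken from P&G or from the literature cited here: the function names, the scalar "signals," the subtractive form of motor prediction, and the multiplicative gain for sensory prediction are all hypothetical simplifications chosen only to show the opposite sign of the two effects.

```python
def motor_prediction_response(observed, predicted):
    """Forward-model style prediction (toy assumption: subtractive).

    The predicted sensory consequence is cancelled, so the response is
    the prediction error: zero for an accurate prediction, large for a
    deviation that may call for correction.
    """
    return abs(observed - predicted)


def sensory_prediction_response(observed, predicted, gain=0.5, tolerance=0.1):
    """Associative, priming-style prediction (toy assumption: multiplicative).

    When the input matches what context predicts, the response is boosted
    by a hypothetical gain factor; otherwise it is left unchanged.
    """
    matches = abs(observed - predicted) <= tolerance
    return observed * (1.0 + gain) if matches else observed


if __name__ == "__main__":
    predicted = 1.0
    for observed in (1.0, 1.5):
        print(
            f"observed={observed}: "
            f"motor={motor_prediction_response(observed, predicted)}, "
            f"sensory={sensory_prediction_response(observed, predicted)}"
        )
```

In this sketch an accurately predicted input yields no response under the motor scheme and an enhanced response under the sensory scheme, which is the asymmetry the argument turns on. Nothing about the specific numbers should be read as an empirical claim.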
P&G underline that their approach blurs the line between comprehension and production and thus rejects the “cognitive sandwich” view, whereas the alternative perspective just outlined might be interpreted as preserving the comprehension-production distinction. In this context, it is worth pointing out that P&G do not actually blur the distinction between the two slices of bread all that much. They are quite distinct computational and representational components as their c and p notation attests, and they have even added some slices, an action implementation system (p), a forward production model (p-hat), a forward comprehension model (c-hat) and a perceptual system (c), each of which generates phonological, syntactic, and semantic representations–nearly a loaf of bread. They do argue, correctly in my view, and consistent with many speech scientists and motor control researchers as well as the classical aphasiologists (despite P&G's claims to the contrary), that comprehension and production systems must interact. We make the same claims of our dorsal stream (Hickok 2012a; Hickok & Poeppel 2007). But where P&G and others–including myself (Hickok et al. 2011)–have gone wrong, in my view, is that they are trying to shoehorn a motor-control-based mechanism into a perceptual system that it was not designed to serve.
ACKNOWLEDGMENT
Supported by NIH grant DC009659.