Pickering & Garrod (P&G) maintain that their integrated approach to language production and comprehension is compatible with, but does not require, embodiment. I argue that it is incompatible. The fundamental role played by intermediate representations that capture phonological, syntactic, and semantic structure rules out embodiment and preserves important aspects of classical cognitive science.
Integration and embodiment
There is a clear affinity between P&G's project and that of embodied cognition. The basic idea behind embodied cognition is that cognitive processes are partly constituted by wider bodily structures and processes. Glenberg (2010, p. 586) characterized it as the claim that “all psychological processes are influenced by body morphology, sensory systems, motor systems, and emotions.” Such influence clearly requires an intimate relationship between action, perception, and cognition.
Traditionally, researchers have assumed that language processing is inherently modular; they have been committed to what Hurley (2008a) calls the classical sandwich. On this view, central cognition (the meat) intercedes between action and perception (the slices of bread). P&G argue that language production and comprehension are forms of action and action perception, respectively. As they see it, receivers of linguistic messages actively compute action representations during perception to help them predict what they are about to perceive. Similarly, producers of a linguistic message actively compute perception representations to help them predict sensory feedback from their ongoing action. In violation of the classical sandwich, comprehension often involves production processes and production often involves comprehension processes.
One of P&G's primary theoretical innovations is their use of forward and inverse modeling to account for the dynamic nature of language processing. This clearly fits with the central role played by perceptual and motor simulation in many accounts of embodied cognition (for reviews, see Barsalou 2008; Kemmerer 2010; Martin & Zwaan 2008). It also fits with the more recent suggestion that prediction is important to guiding action and perception (Gallese 2009).
The problem posed by intermediate representations
A core aspect of P&G's theory does not fit with embodied cognition: its reliance on disembodied representations. The problem begins with their acknowledgement that language is special: “Unlike many other forms of action and perception, language processing is clearly structured, incorporating well-defined levels of linguistic representation such as semantics, syntax, and phonology” (target article, sect. 1.3, para. 9). To handle the linguistic structure at these three levels, they posit “a series of intermediate representations between message and articulation” (sect. 3.1, para. 3). These intermediate representations are central to their account. Indeed, P&G define production processes as those that map “higher” linguistic representations to “lower” ones and comprehension processes as those that map “lower” linguistic representations to “higher” ones.
One of the difficulties facing any attempt to assess embodied cognition is that it has been associated with several distinct theses (Anderson 2003; Shapiro 2011; Wilson 2002). There is, however, good reason to think that P&G's intermediate representations are incompatible with most versions of embodiment. Obviously, any appeal to representations excludes radical anti-representational forms of embodied cognitive science (Chemero 2009). Less radical forms of embodied cognitive science generally assume that embodiment requires, at a minimum, grounding in modality-specific input/output systems. Pezzulo et al. (2011, p. 3) outline a core feature of this grounding: “Perhaps the first and foremost attribute of a grounded computational model is the implementation of cognitive processes… as depending on modal representations and associated mechanisms for their processing… rather than on amodal representations, transductions, and abstract rule systems.” P&G's intermediate representations clearly fail to meet this criterion.
Traditional cognitive science posits amodal representations for a reason: They provide a means of integrating information associated with distinct perceptual and motor modalities. Psycholinguists often argue that amodal representations are needed for language processing because linguistic structure transcends the particulars of the various modalities associated with production and comprehension (e.g., Jackendoff 2002; 2007; Pinker 2007). On the syntactic front, the well-known structural similarity of signed and spoken languages is taken to provide further support for this claim (Goldin-Meadow 2005; Poizner et al. 1987).
Although the need for amodal representations has typically been formulated against the assumption that production and comprehension are separate processes, this background assumption is not necessary. Indeed, as P&G show, amodal representations can serve as a bridge for the ongoing interaction between production and comprehension. To mangle a cliché, P&G provide a way to avoid throwing the meat out with the sandwich by identifying an important role for the sort of amodal representations posited by traditional cognitive science within a non-modular account of language processing.
Conclusion
P&G's appeal to intermediate representations leaves supporters of embodied cognition with something of a dilemma. On the one hand, they could try to liberalize the notion of embodiment in order to encompass such representations. This move is not without precedent. Meteyard et al. (2012), for example, argue that researchers need to consider the possibility that cognition is weakly embodied because it involves supramodal representations that capture associations between distinct sensorimotor systems (Barsalou et al. 2003; Damasio & Damasio 1994; Gallese & Lakoff 2005). The obvious danger of this strategy is that it could erode the force and novelty of the thesis that cognition is embodied. On the other hand, they could try to offer more robustly embodied accounts of phonological, syntactic, and semantic knowledge. This strategy faces a general challenge: In order to eliminate the need for amodal representations at a given level, it is not enough to show that some phenomena at that level can be handled in an embodied fashion. Instead, what needs to be shown is that all of the phenomena at that level can be handled in this way (Toni et al. 2008). This sets the bar very high, and we can reasonably doubt that it is achievable. As matters stand, neither of these options seems particularly promising.