
Inner speech as a forward model?

Published online by Cambridge University Press:  24 June 2013

Gary M. Oppenheim*
Affiliation:
Center for Research in Language, University of California San Diego, La Jolla, CA 92093-0526. goppenheim@crl.ucsd.edu; http://crl.ucsd.edu/~goppenheim/

Abstract

Pickering & Garrod (P&G) consider the possibility that inner speech might be a product of forward production models. Here I consider the idea of inner speech as a forward model in light of empirical work from the past few decades, concluding that, while forward models could contribute to it, inner speech nonetheless requires activity from the implementers.

Type: Open Peer Commentary

Copyright © Cambridge University Press 2013

Pickering & Garrod (P&G) argue that coarse predictions from forward models can help detect errors of overt speech production before they occur. This error-detecting function is often assigned to inner speech (e.g., Levelt 1983; Levelt et al. 1999; Nooteboom 1969): the little voice in one's head, better known for its role in conscious thought. It is therefore tempting to identify inner speech as a product of these forward models, with ${\hat p}\rightarrow{\hat c}$ providing what we know as the internal loop. In fact, conceiving of inner speech as a forward model could elegantly address three key questions. First, why do we have inner speech at all? Inner speech is a by-product of speakers' need to control their overt verbal behavior. Second, why does inner speech develop so long after overt speech (e.g., Vygotsky 1962)? Inner speech develops as the speaker learns to simulate their own verbal behavior, an ability that may lag behind the ability to produce that behavior. And third, how are people able to produce inner speech without actually speaking aloud? If inner speech is simply the offline use of forward models ( ${\hat p}\rightarrow{\hat c}$ ), then speakers never need to engage the production and comprehension implementers ( ${p}\rightarrow{c}$ ) that are the traditional generators and perceivers of inner speech.

P&G's framework would specifically address two more recently demonstrated qualities of inner speech. First, inner speech involves attenuated access to subphonemic representations. When people say tongue-twisters in their heads, their reported errors are less influenced by subphonemic similarities than their reported errors when saying them aloud (Oppenheim & Dell 2008; 2010; also Corley et al. 2011, as noted by Oppenheim 2012). For instance, /g/ shares more features with /k/ than with /v/, so someone trying to say GOAT aloud would more likely slip to COAT than to VOTE, but this tendency is less pronounced for inner slips. As P&G note, this finding is predicted if the forward models underlying inner speech produce phonologically impoverished predictions (and thus might not reflect the production implementer). Second, inner speech is flexible enough to incorporate additional detail. Although inner slips show less pronounced similarity effects than overt slips, adding silent articulation is sufficient to boost their similarity effect, apparently coercing inner speech to include more subphonemic detail (Oppenheim & Dell 2010). Such flexibility could be problematic for models that assign inner speech to a specific level of the production process (e.g., Levelt et al. 1999), but P&G's account specifically suggests that forward models simulate multiple levels of representation, so it might accommodate the subphonemic flexibility of inner speech by adding motoric predictions ( $\hat{p}$ [sem,syn,phon,art]; forward models' more traditional jurisdiction) that are tied to motor planning.
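The similarity logic behind the GOAT example can be made concrete with a toy sketch (my own illustration, not from the target article or the cited experiments; the three-feature phoneme descriptions are deliberately simplified):

```python
# Toy sketch of feature overlap among phonemes. Simplified feature sets:
# real phonological feature systems are richer, but the ordering holds.
FEATURES = {
    "g": {"velar", "stop", "voiced"},
    "k": {"velar", "stop", "voiceless"},
    "v": {"labiodental", "fricative", "voiced"},
}

def shared_features(a: str, b: str) -> int:
    """Count the phonological features two phonemes have in common."""
    return len(FEATURES[a] & FEATURES[b])

# /g/ and /k/ differ only in voicing, so they overlap heavily;
# /g/ and /v/ share only voicing.
print(shared_features("g", "k"))  # 2
print(shared_features("g", "v"))  # 1
```

On a feature-sensitive account of overt slips, the larger overlap makes GOAT→COAT the likelier error; the finding above is that inner slips are less sensitive to exactly this overlap.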

But forward model simulations cannot provide a complete account of inner speech. One would still need to use what P&G would call "the production implementer" (target article, sect. 3, para. 2). First, inner rehearsal facilitates overt speech production (MacKay 1981; Rauschecker et al. 2008; but cf. Dell & Repka 1992), suggesting that some aspects of the production implementer are also employed in inner speech. Second, there is abundant evidence that people easily detect their inner speech errors (Corley et al. 2011; Dell 1978; Dell & Repka 1992; Hockett 1967; Meringer & Meyer 1895, cited in MacKay 1992; Oppenheim & Dell 2008; 2010; Postma & Noordanus 1996). But since monitoring is described as the resolution of predicted and actual percepts (from forward models and implementers, respectively), it is unclear how one could detect and identify inner slips without having engaged the production implementer. (Conflict monitoring within forward models, e.g., Nozari et al. 2011, might at least allow error detection, but its use there seems to lack independent motivation, and it still leaves the problem of how a speaker could identify the content of an inner slip.) Third, analogues of overt speech effects are often reported in experiments substituting inner-speech-based tasks. For instance, inner slips tend to create words, just like their overt counterparts (Corley et al. 2011; Oppenheim & Dell 2008; 2010), and their distributions resemble overt slips in other ways (Dell 1978; Postma & Noordanus 1996). And though inner and overt speech can diverge, they tend to elicit similar behavioral and neurophysiological effects in other domains (e.g., Kan & Thompson-Schill 2004), and their impairments are highly correlated (e.g., Geva et al. 2011). Though more ink has been spilled cautioning about differences between inner and overt speech, similarities between the two are the rule rather than the exception (at least for pre-articulatory aspects).
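The monitoring problem raised in the second point can be sketched schematically (a minimal sketch of my own, not P&G's implementation; the string-valued "percepts" stand in for whatever representations the comparison actually operates over):

```python
# Minimal sketch of monitoring as the resolution of predicted and actual
# percepts: the forward model supplies a prediction, the implementer
# supplies the actual percept, and the comparator flags any mismatch.
from typing import Optional

def monitor(predicted: str, actual: Optional[str]) -> Optional[bool]:
    """Return True if a slip is detected, False if speech matched the
    prediction, and None when there is no implementer output to compare."""
    if actual is None:
        # Pure forward-model inner speech: the implementer never ran,
        # so there is no actual percept to resolve against the prediction.
        return None
    return predicted != actual

# Overt speech: predicted "goat", implementer produced "coat" -> slip found.
print(monitor("goat", "coat"))  # True
# Inner speech without the implementer: monitoring has nothing to compare.
print(monitor("goat", None))    # None
```

The `None` branch is the crux of the argument: if inner speech were only the forward model's prediction, the comparator would have nothing to resolve, yet people demonstrably detect and identify their inner slips.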

Given the impoverished character of P&G's forward models, it seems difficult to account for such parallels without assuming a role for production implementers in the creation of inner speech. We could therefore posit that inner speech works much like overt speech production, recalling P&G's acknowledgment that offline simulations could engage the implementers while actively truncating the process before articulation; forward models would then supply a necessary monitoring component. This more explicit account of inner speech allows us to question P&G's suggestion that the subphonemic attenuation of inner speech might reflect impoverishment of the forward model instead of the generation of an abstract phonological code by the production implementer. Having clarified that the role of forward models is error detection, their suggestion now boils down to the idea that inner slips might simply be hard to "hear." Empirical work suggests that this is not the case. Experiments using noise-masked overt speech (Corley et al. 2011) and silently mouthed speech (Oppenheim & Dell 2010) showed that each acts much like normal overt speech in terms of similarity effects (see also Oppenheim 2012). And, by explicitly modeling biased error detection, Oppenheim and Dell (2010) formally ruled out the suggestion that their evidence for abstraction merely reflected such biases. Thus, better specifying the role of forward models in inner speech supports the conclusion that the subphonemic attenuation of inner speech does have its basis in the production implementer. More generally, conceiving of forward models as components of inner speech can wed the strengths of the forward model account with the fidelity of implementer-based simulations.

References

Corley, M., Brocklehurst, P. H. & Moat, H. S. (2011) Error biases in inner and overt speech: Evidence from tongue twisters. Journal of Experimental Psychology: Learning, Memory, and Cognition 37(1):162–75. DOI:10.1037/a0021321.
Dell, G. S. (1978) Slips of the mind. In: The fourth LACUS forum, ed. Paradis, M., pp. 69–75. Hornbeam Press.
Dell, G. S. & Repka, R. J. (1992) Errors in inner speech. In: Experimental slips and human error: Exploring the architecture of volition, ed. Baars, B. J., pp. 237–62. Plenum.
Geva, S., Bennett, S., Warburton, E. A. & Patterson, K. (2011) Discrepancy between inner and overt speech: Implications for post-stroke aphasia and normal language processing. Aphasiology 25(3):323–43. DOI:10.1080/02687038.2010.511236.
Hockett, C. F. (1967) Where the tongue slips, there slip I. In: To honor Roman Jakobson, pp. 910–36. Mouton.
Kan, I. P. & Thompson-Schill, S. L. (2004) Effect of name agreement on prefrontal activity during overt and covert picture naming. Cognitive, Affective & Behavioral Neuroscience 4(1):43–57.
Levelt, W. J. M. (1983) Monitoring and self-repair in speech. Cognition 14:41–104.
Levelt, W. J. M., Roelofs, A. & Meyer, A. S. (1999) A theory of lexical access in speech production. Behavioral and Brain Sciences 22(1):1–75.
MacKay, D. G. (1981) The problem of rehearsal or mental practice. Journal of Motor Behavior 13(4):274–85.
MacKay, D. G. (1992) Constraints on theories of inner speech. In: Auditory imagery, ed. Reisberg, D., pp. 121–49. Erlbaum.
Meringer, R. & Meyer, K. (1895) Versprechen und Verlesen. Behrs Verlag.
Nooteboom, S. G. (1969) The tongue slips into patterns. In: Leyden studies in linguistics and phonetics, ed. Sciarone, A. G., van Essen, A. J. & van Raad, A. A., pp. 114–32. Mouton.
Nozari, N., Dell, G. S. & Schwartz, M. F. (2011) Is comprehension necessary for error detection? A conflict-based account of monitoring in speech production. Cognitive Psychology 63(1):1–33. DOI:10.1016/j.cogpsych.2011.05.001.
Oppenheim, G. M. (2012) The case for subphonemic attenuation in inner speech: Comment on Corley, Brocklehurst, and Moat (2011). Journal of Experimental Psychology: Learning, Memory, and Cognition 38(3):502–12. DOI:10.1037/a0025257.
Oppenheim, G. M. & Dell, G. S. (2008) Inner speech slips exhibit lexical bias, but not the phonemic similarity effect. Cognition 106(1):528–37. DOI:10.1016/j.cognition.2007.02.006.
Oppenheim, G. M. & Dell, G. S. (2010) Motor movement matters: The flexible abstractness of inner speech. Memory & Cognition 38(8):1147–60. DOI:10.3758/MC.38.8.1147.
Postma, A. & Noordanus, C. (1996) Production and detection of speech errors in silent, mouthed, noise-masked, and normal auditory feedback speech. Language and Speech 39(4):375–92.
Rauschecker, A. M., Pringle, A. & Watkins, K. E. (2008) Changes in neural activity associated with learning to articulate novel auditory pseudowords by covert repetition. Human Brain Mapping 29(11):1231–42. DOI:10.1002/hbm.20460.
Vygotsky, L. S. (1962) Thought and language (Hanfmann, E. & Vakar, G., trans.). MIT Press.