
What does it mean to predict one's own utterances?

Published online by Cambridge University Press:  24 June 2013

Antje S. Meyer
Affiliation:
Max Planck Institute for Psycholinguistics, 6500 AH Nijmegen, The Netherlands. antje.meyer@mpi.nl www.mpi.nl peter.hagoort@mpi.nl Radboud University Nijmegen, 6525 HP Nijmegen, The Netherlands.
Peter Hagoort
Affiliation:
Max Planck Institute for Psycholinguistics, 6500 AH Nijmegen, The Netherlands. antje.meyer@mpi.nl www.mpi.nl peter.hagoort@mpi.nl Radboud University Nijmegen, 6525 HP Nijmegen, The Netherlands.

Abstract

Many authors have recently highlighted the importance of prediction for language comprehension. Pickering & Garrod (P&G) are the first to propose a central role for prediction in language production. This is an intriguing idea, but it is not clear what it means for speakers to predict their own utterances, and how prediction during production can be empirically distinguished from production proper.

Type
Open Peer Commentary
Copyright
Copyright © Cambridge University Press 2013 

Pickering & Garrod (P&G) offer an integrated framework of speech production and comprehension, highlighting the importance of predicting upcoming utterances. Given the growing evidence for commonalities between production and comprehension processes and for the importance of prediction in comprehension, we find their proposal timely and interesting.

Our comment focuses mainly on the role of prediction in language production. P&G propose that speakers predict aspects of their utterance plans and compare these predictions against the actual utterance plans. This monitoring process happens at each processing level, that is, minimally at the semantic, syntactic, and phonological levels.

Given the important role of prediction in comprehension and the well-attested similarities between production and comprehension, the idea that prediction should play a role in speech production follows quite naturally. Nevertheless, to us the proposal that speakers predict their utterance plans does not have immediate appeal. This is because, in everyday parlance, prediction and the predicted event have some degree of independence. It is because of this independence that predictions may or may not be borne out. It makes sense to say a person predicts the outcomes of their hand or jaw movements, as these outcomes are not fully determined by the cognitive processes underlying the predictions, but depend, among other things, on properties of the physical environment that may not be known to the person planning the movement. Similarly, it makes sense to say that a listener predicts what a speaker will say because the speaker's utterances are not caused by the same cognitive processes as those that lead to the listener's prediction. Speaker and listener each have their own, private cognition and therefore the listener's expectations about the speaker's utterance may or may not be met.

We can predict our own utterances. For instance, based on memory of past experience, I can predict how I will greet my family. However, such predictions concern overt behavior rather than plans for behavior, and they occur offline rather than in parallel with the predicted behavior. Just like predictions about other persons, my predictions of my own utterances may or may not be borne out, depending on circumstances not known at the moment of prediction. I may, for instance, deviate from my predicted greeting if I find my family standing on their heads.

Such offline predictions of overt behavior differ from the predictions proposed by P&G. In their framework, speakers predict their utterance plans as they plan them, with prediction at each planning level running somewhat ahead of the actual planning. Importantly, the predictions are based on the same information as the predicted behavior, namely, the speaker's intention (target article, sect. 3.1, “production command” in Fig. 5) and involve closely related cognitive processes, although the predictions are less detailed than the plans and can therefore be created faster. With respect to the information encoded in both representations, plan and prediction will be identical. Discrepancies can only arise when the plan and/or prediction include a fault. This is different from predictions a listener might generate about a speaker's upcoming utterance; no matter how well aligned speaker and listener are in a dialogue, their intentions are not identical, and their cognitive processes are not shared.

P&G invoke prediction during production to support self-monitoring. It is not entirely clear how the monitoring processes would work and how beneficial they would be. Key properties of the predicted representations are that they are more abstract and that they are created faster than the speech plans and not necessarily in the same order. This raises the issues of how it is decided which information to include in the prediction and what to omit and, given that planning and predicting need not follow the same time course, whether and how the cognitive processes leading to plans and predictions differ. It is also not clear why predictions would be more likely to be correct than plans, and how a speaker detects errors concerning features of the utterance that are not included in the predictions. Finally, as P&G point out, there is strong evidence for the involvement of forward modeling in motor planning, but there is as yet no empirical evidence demonstrating that this approach scales up in the way they envision. Finding this evidence is likely to be extremely challenging, as it will, for instance, involve separating the time course of planning and predicting and the properties of the planned and predicted representations.

The question of what and when to predict is also relevant for prediction during comprehension. P&G assume, correctly in our view, that in dialogue there is often not sufficient information in the mind of the comprehender to generate reliable predictions at all conceptual and linguistic levels. This results in two possibilities. One is that predictions are always made, but will often be highly unreliable, creating a need for correction that would not exist in the absence of prediction. The other option is that prediction occurs only if there is sufficient information for making a valid prediction. P&G seem to favor the latter option (“comprehenders make whatever linguistic predictions they can”; sect. 3.2, para. 1). However, their model should then specify a mechanism or procedure that determines when to predict and when not to do so. In passing, we note another gap in P&G's proposal: According to P&G, predictions can be generated via an association route and a simulation route. But how are the contributions of the two prediction routes weighted and integrated? What happens if these predictions do not fully match?

In sum, we are not convinced that prediction plays as crucial a role in speech production as it does in comprehension. Whereas my listener can at best guess (predict) what I might say next, as a speaker I know perfectly well where I am heading, and plans and predictions cannot be separated. Moreover, the account of prediction in both production and comprehension needs further specification of what triggers the decision to predict and of how predictions are derived.