Intentional strategies that make co-actors more predictable: The case of signaling

Published online by Cambridge University Press:  24 June 2013

Giovanni Pezzulo
Affiliation:
Istituto di Linguistica Computazionale “Antonio Zampolli,” CNR, 56124 Pisa, Italy. Istituto di Scienze e Tecnologie della Cognizione, CNR, 00185 Roma, Italy. giovanni.pezzulo@istc.cnr.it http://www.istc.cnr.it/people/giovanni-pezzulo
Haris Dindo
Affiliation:
Computer Science Engineering, University of Palermo, 90128 Palermo, Italy. haris.dindo@unipa.it http://roboticslab.dinfo.unipa.it/index.php/People/HarisDindo

Abstract

Pickering & Garrod (P&G) explain dialogue dynamics in terms of forward modeling and prediction-by-simulation mechanisms. Their theory dissolves a strict segregation between production and comprehension processes, and it links dialogue to action-based theories of joint action. We propose that the theory can also incorporate intentional strategies that increase communicative success: for example, signaling strategies that help interlocutors remain predictable and form common ground.

Type
Open Peer Commentary
Copyright
Copyright © Cambridge University Press 2013 

We highly appreciate Pickering & Garrod's (P&G's) theory for four main reasons. First, P&G address dialogue from a joint action perspective, rather than in isolation from action, perception, and interaction dynamics, as most linguistic theories do. P&G's theory thereby points toward the naturalization of linguistic communication and might help our understanding of how it develops on top of the nonlinguistic “interaction engine” of our earlier evolutionary ancestors (Levinson 2006; Pezzulo 2011b).

Second, to explain how interlocutors predict one another and monitor the ongoing interaction, P&G use the notions of forward modeling and prediction-by-simulation (Dindo et al. 2011; Grush 2004; Wolpert et al. 2003). These notions are increasingly adopted in cognitive and social neuroscience; language studies could greatly benefit from linking to the same mechanistic framework. Note that P&G's theory does not overlook the specificities of language processing, and it assumes that such processing is structured along multiple levels: semantic, syntactic, and phonological.

Third, P&G provide a theoretically sound motivation for the use of production processes in (language) comprehension and comprehension processes in (language) production, dissolving a strict production-comprehension dichotomy. P&G's analysis links well to a large body of evidence documenting the interactions between perception and production processes outside linguistic communication (Bargh & Chartrand 1999; Frith & Frith 2008; Sebanz et al. 2006a), making it an excellent entry point to study dialogue within an action-based framework.

Fourth, P&G's theory explains how prediction and covert imitation increase communication success and produce the automatic alignment of linguistic representations, which they have extensively documented empirically (Garrod & Pickering 2004; Pickering & Garrod 2004).

These four strengths notwithstanding, we propose not only that producers and comprehenders predict and covertly imitate their interlocutors, but also that they adopt intentional strategies that make their actions and intentions more predictable and comprehensible (D'Ausilio et al. 2012a; Pezzulo 2011c; Sartori et al. 2009; Vesper et al. 2011). In other words, to increase communication success, we propose that they facilitate one another's predictive (but also perceptual, inferential, attentional, and memory) processes. For example, they can adopt signaling strategies to remain predictable and form common ground.

There are various demonstrations that producers modulate their behavior (e.g., loudness, choice of words, speech rate) – depending on contextual factors (e.g., amount of noise, prior knowledge or uncertainty of the comprehender) – to help the comprehender predict and understand. A well-known case is the exaggeration of vowels in child-directed speech (“motherese”; Kuhl et al. 1997). These modulations are used not only during teaching but also when comprehension is difficult, as is evident to those who find themselves speaking more loudly and over-articulating in noisy environments (the so-called “Lombard effect”).

Signaling strategies can be characterized in terms of efficient management of resources (e.g., articulatory effort, time) within a framework that optimizes the joint goal of communication success (Pezzulo 2011c; Pezzulo & Dindo 2011; see also Moore 2007). Signaling consists of the intentional modulation of one's own behavior (e.g., over-articulation) so that, in addition to its usual pragmatic or communicative goals (e.g., informing the interlocutor), the performed action fulfills the additional goal of facilitating the interlocutor's prediction and monitoring processes (e.g., by lowering the interlocutor's uncertainty or cognitive load). Compared to an optimal action, this modulation comes at a “cost” (e.g., a motor cost associated with over-articulation). However, as signaling ultimately helps to maximize the joint goal of communicative success, it is part of a joint action optimization process and is not (necessarily) altruistic.

Within the optimization framework, a cost-benefit analysis determines the decision to signal or not. This implies that signaling should be more frequent in uncertain contexts, when prediction is harder. To assess the theory, we designed a (nonlinguistic) joint task in which signaling incurred a motor cost. We reported that the producers' signaling probability depended on the comprehenders' uncertainty; producers stopped signaling when that uncertainty was low (Pezzulo & Dindo 2011). As this was the case even when producers received no online feedback from the comprehender, we hypothesized that they maintained an internal model of the comprehender's uncertainty. Computational and empirical arguments suggest that the choice of signaling was strategic and intentional (although not necessarily conscious) rather than a by-product of interaction dynamics.
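
The cost-benefit logic described above can be illustrated with a toy sketch. It is not the model used in Pezzulo & Dindo (2011); every name and number in it (`should_signal`, `plain_lik`, `signal_lik`, `motor_cost`, `success_value`) is a hypothetical introduced only for illustration. We assume that the producer keeps an internal estimate of the comprehender's belief over two possible goals, that a signaled action discriminates between the goals better than a plain one, and that the producer signals only when the expected gain in correct goal recognition outweighs the motor cost.

```python
import math


def entropy(belief):
    """Shannon entropy (bits) of a discrete belief; a simple proxy for uncertainty."""
    return -sum(p * math.log2(p) for p in belief if p > 0)


def posterior(belief, likelihood):
    """Bayes' rule: update a belief over goals given per-goal action likelihoods."""
    unnorm = [b * l for b, l in zip(belief, likelihood)]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def should_signal(belief, true_goal, plain_lik, signal_lik,
                  motor_cost, success_value=1.0):
    """Return True if signaling is worth its motor cost.

    belief      : producer's estimate of the comprehender's belief over goals
    plain_lik   : P(observed action | goal) for the unmodulated action (assumed)
    signal_lik  : P(observed action | goal) for the signaled action (assumed)
    motor_cost  : extra cost of the modulated action, in the same units as
                  success_value (assumed)
    """
    p_plain = posterior(belief, plain_lik)[true_goal]
    p_signal = posterior(belief, signal_lik)[true_goal]
    benefit = success_value * (p_signal - p_plain)
    return benefit > motor_cost


if __name__ == "__main__":
    plain = [0.55, 0.45]    # plain action barely discriminates the two goals
    signal = [0.90, 0.10]   # signaled action is highly diagnostic of goal 0
    for belief in ([0.5, 0.5], [0.95, 0.05]):
        print(round(entropy(belief), 3),
              should_signal(belief, 0, plain, signal, motor_cost=0.2))
```

With these illustrative numbers, the producer signals when the comprehender is maximally uncertain and refrains when the comprehender is already nearly certain, which mirrors the qualitative pattern reported above. Because the decision uses only the producer's internal estimate of the comprehender's belief, it requires no online feedback.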

In repeated interactions, signaling and other forms of sensorimotor communication also help in sharing representations and maintaining a reliable common ground (Clark 1996; Pezzulo & Dindo 2011; Sebanz et al. 2006a). For example, by emphasizing the change of topic during a dialogue, a producer can reduce the comprehender's uncertainty at the level of task representations (say, dialogue topics) rather than only relative to the current utterance and so form a common ground (i.e., “align” the task representations of the interlocutors). Considerations of parsimony also apply: Although costly to maintain, the common ground facilitates the continuation of the interaction, as both interlocutors can use it to predict what comes next. Furthermore, alignment entails parsimony: The shared part need not be maintained in two distinct forward models (one for each interlocutor), but the same forward model can be used as a “production model” for one interlocutor and “recognition model” for the other. We modeled the interplay between shared task representations and online action predictions using a hierarchical generative (Bayesian) architecture in which the former provide priors to the latter, and signaling strategies can be used to share task representations intentionally (Pezzulo 2011c; Pezzulo & Dindo 2011).
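
The two-level idea, in which task representations provide priors for action predictions, can be sketched in a few lines. This is a deliberately minimal illustration under our own assumptions (two tasks, two actions, a hand-written table `action_given_task`), not the architecture of Pezzulo (2011c) or Pezzulo & Dindo (2011); all function and variable names are hypothetical.

```python
def predict_actions(task_belief, action_given_task):
    """Top-down step: the task-level belief supplies the prior over the next action.

    P(action) = sum over tasks of P(task) * P(action | task)
    """
    n_tasks = len(task_belief)
    n_actions = len(action_given_task[0])
    return [
        sum(task_belief[t] * action_given_task[t][a] for t in range(n_tasks))
        for a in range(n_actions)
    ]


def update_task_belief(task_belief, action_given_task, observed_action):
    """Bottom-up step: an observed action revises the belief over tasks (Bayes' rule)."""
    unnorm = [
        task_belief[t] * action_given_task[t][observed_action]
        for t in range(len(task_belief))
    ]
    total = sum(unnorm)
    return [u / total for u in unnorm]


if __name__ == "__main__":
    # Rows: tasks (e.g., dialogue topics); columns: actions. Values are assumed.
    action_given_task = [[0.8, 0.2],
                         [0.3, 0.7]]
    belief = [0.5, 0.5]                      # comprehender is unsure of the task
    print(predict_actions(belief, action_given_task))
    belief = update_task_belief(belief, action_given_task, observed_action=0)
    print(belief)                            # belief shifts toward task 0
    print(predict_actions(belief, action_given_task))
```

In this sketch, a signaling action would correspond to an action whose likelihood differs sharply across tasks, so that a single observation moves the task-level belief substantially. Once both interlocutors hold similar task beliefs, their action-level predictions also converge, which is one way to read “alignment” of task representations.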

Our proposals on signaling and joint action optimization can be expressed in the neurocomputational architecture of P&G's theory. For example, producers can modulate their behavior online by using prior knowledge and a forward model of the comprehender's comprehension processes (e.g., they can over-articulate if the environment is noisy or if they foresee prediction errors). In turn, comprehenders can use the feedback channel strategically to expose their mental states and uncertainty by using a forward model of the producer's prediction and monitoring process. Furthermore, producers can use offline predictions (briefly mentioned in P&G's theory) to maintain a model of the comprehender's prior knowledge and uncertainty, and to foresee the long-term communicative effects of their intended messages (a form of recipient design).

By incorporating these (and similar) mechanisms, P&G's theory can explain intentional strategies such as signaling that – we argue – act in concert with automatic processes of alignment and mutual imitation to facilitate prediction, align representations, and form common ground.

References

Bargh, J. A. & Chartrand, T. L. (1999) The unbearable automaticity of being. American Psychologist 54:462–79.
Clark, H. H. (1996) Using language. Cambridge University Press.
D'Ausilio, A., Badino, L., Li, Y., Tokay, S., Craighero, L., Canto, R., Aloimonos, Y. & Fadiga, L. (2012a) Leadership in orchestra emerges from the causal relationships of movement kinematics. PLoS ONE 7(5): e35757. DOI:10.1371/journal.pone.0035757.
Dindo, H., Zambuto, D. & Pezzulo, G. (2011) Motor simulation via coupled internal models using sequential Monte Carlo. Proceedings of IJCAI 2011:2113–19.
Frith, C. D. & Frith, U. (2008) Implicit and explicit processes in social cognition. Neuron 60(3):503–10. DOI:10.1016/j.neuron.2008.10.032.
Garrod, S. & Pickering, M. J. (2004) Why is conversation so easy? Trends in Cognitive Sciences 8(1):8–11.
Grush, R. (2004) The emulation theory of representation: Motor control, imagery, and perception. Behavioral and Brain Sciences 27(3):377–96.
Kuhl, P. K., Andruski, J. E., Chistovich, I. A., Chistovich, L. A., Kozhevnikova, E. V., Ryskina, V. L., Stolyarova, E. I., Sundberg, U. & Lacerda, F. (1997) Cross-language analysis of phonetic units in language addressed to infants. Science 277(5326):684–86.
Levinson, S. C. (2006) On the human “interaction engine.” In: Roots of human sociality: Culture, cognition and interaction, ed. Enfield, N. J. & Levinson, S. C., pp. 39–69. Berg.
Moore, R. K. (2007) PRESENCE: A human-inspired architecture for speech-based human–machine interaction. IEEE Transactions on Computers 56(9):1176–88.
Pezzulo, G. (2011b) The “interaction engine”: A common pragmatic competence across linguistic and non-linguistic interactions. IEEE Transactions on Autonomous Mental Development 4(2):105–23.
Pezzulo, G. (2011c) Shared representations as coordination tools for interactions. Review of Philosophy and Psychology 2(2):303–33.
Pezzulo, G. & Dindo, H. (2011) What should I do next? Using shared representations to solve interaction problems. Experimental Brain Research 211(3):613–30.
Pickering, M. J. & Garrod, S. (2004) Toward a mechanistic psychology of dialogue. Behavioral and Brain Sciences 27(2):169–226.
Sartori, L., Becchio, C., Bara, B. G. & Castiello, U. (2009) Does the intention to communicate affect action kinematics? Consciousness and Cognition 18(3):766–72. DOI:10.1016/j.concog.2009.06.004.
Sebanz, N., Bekkering, H. & Knoblich, G. (2006a) Joint action: Bodies and minds moving together. Trends in Cognitive Sciences 10(2):70–76.
Vesper, C., van der Wel, R. P. R. D., Knoblich, G. & Sebanz, N. (2011) Making oneself predictable: Reduced temporal variability facilitates joint action coordination. Experimental Brain Research 211(3–4):517–30. DOI:10.1007/s00221-011-2706-z.
Wolpert, D. M., Doya, K. & Kawato, M. (2003) A unifying computational framework for motor control and social interaction. Philosophical Transactions of the Royal Society B: Biological Sciences 358(1431):593–602. DOI:10.1098/rstb.2002.1238.