
“Well, that's one way”: Interactivity in parsing and production

Published online by Cambridge University Press:  24 June 2013

Christine Howes
Affiliation:
Queen Mary University of London, Cognitive Science Research Group, School of Electronic Engineering and Computer Science, London E1 4NS, United Kingdom. c.howes@qmul.ac.uk http://www.eecs.qmul.ac.uk/~chrizba/
Patrick G. T. Healey
Affiliation:
Queen Mary University of London, Cognitive Science Research Group, School of Electronic Engineering and Computer Science, London E1 4NS, United Kingdom. ph@eecs.qmul.ac.uk
Arash Eshghi
Affiliation:
Queen Mary University of London, Cognitive Science Research Group, School of Electronic Engineering and Computer Science, London E1 4NS, United Kingdom. arash@eecs.qmul.ac.uk
Julian Hough
Affiliation:
Queen Mary University of London, Cognitive Science Research Group, School of Electronic Engineering and Computer Science, London E1 4NS, United Kingdom. julian.hough@eecs.qmul.ac.uk

Abstract

We present empirical evidence from dialogue that challenges some of the key assumptions in the Pickering & Garrod (P&G) model of speaker-hearer coordination in dialogue. The P&G model also invokes an unnecessarily complex set of mechanisms. We show that a computational implementation, currently in development and based on a simpler model, can account for more of this type of dialogue data.

Type
Open Peer Commentary
Copyright
Copyright © Cambridge University Press 2013 

Pickering & Garrod's (P&G's) programmatic aim is to develop an integrated model of production and comprehension that can explain intra-individual and inter-individual language processing (Pickering & Garrod 2004; 2007). The mechanism they propose, built on an analogy to neuro-computational theories of hand movements, involves producing and comparing two representations of each utterance: a full one containing all the structure necessary to produce the utterance and an “impoverished” efference copy that can predict the approximate shape the utterance should have.

Although not our central concern, there is a tension between endowing the efference copy with enough structure to be able to predict semantic, syntactic, and phonetic features of an utterance and nonetheless making it reduced enough that it can be produced ahead of the utterance itself. To avoid a situation in which the “impoverishment” proposed for the efference copy is just those things not required to fit the data, we need independently motivated constraints on its structure.

Neuro-computational considerations might provide such constraints, but there are disanalogies with the models of motor control P&G use as motivation. Efferent copies were originally proposed to enable rapid cancellation of self-produced sensory feedback, for example, to maintain a stable retinal image by cancelling out changes due to eye movements. However, the claim that we use an analogous mechanism to predict, and correct, linguistic structure before an utterance is produced involves something conceptually different. The awkwardness of phrases such as “semantic percept” highlights this difference; until the utterance is actually produced there is nothing to generate the appropriate sensory percept. Conversely, if the “percept” is internal we are still in the cognitive sandwich.

These points aside, the target article provides a valuable overview of the evidence that language production and comprehension are tightly interwoven. P&G's main target, the “traditional model”, treats whole sentences, “messages” or utterances as the basic unit of production and comprehension. However, there is evidence from cognitive psycholinguistics and neuroscience to show that language processing is tightly interleaved around smaller units. The close interconnections between production and comprehension are especially clear in dialogue where fragmentary utterances are commonplace and people often actively collaborate with each other in the production of each turn (Goodwin 1979).

It is unclear whether the interleaving of production and comprehension requires internally structured predictive models. Recent progress on incremental models of dialogue suggests a more parsimonious approach. In our computational implementation based on Dynamic Syntax (Purver et al. 2006; 2011; Hough 2011), parsing does not need to carry the burden of predicting full utterances, because speakers and hearers have incremental access to representations of utterances as these emerge. By contrast, P&G's approach to self-repairs is analogous to Skantze and Hjalmarsson's (2010), which compares string-based plans and computes the difference between the input speech plan and the current state of realisation. In our model, instead of having to regenerate a new speech plan from scratch, we can repair the necessary increments, reusing representations already built up in context, which are accessible to both speaker and hearer. Currently, it is difficult to distinguish empirically between a dual-path model with predictions and a single-path incremental model because both combine production and comprehension.
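
The idea of repairing only the necessary increments, rather than regenerating a plan from scratch, can be illustrated with a toy sketch. This is not the Dynamic Syntax implementation: here an increment is simply a word paired with a made-up semantic atom, and the names `Increment` and `repair` are hypothetical. The sketch only shows how increments already in context that remain compatible with a revised goal could be kept, while the rest are retracted and replaced.

```python
from typing import List, Tuple

# Hypothetical toy representation: (word, semantic atom the word contributes).
Increment = Tuple[str, str]


def repair(
    context: List[Increment], new_goal: List[Increment]
) -> Tuple[List[Increment], List[Increment], List[Increment]]:
    """Return (reused, retracted, added) increments for a revised goal.

    Toy assumption: an increment is compatible with the goal if its
    semantic atom occurs in the goal. A real incremental grammar would
    check partial semantic/syntactic structures instead.
    """
    goal_atoms = {atom for _, atom in new_goal}

    # Keep the longest initial stretch of context that still fits the goal.
    keep = 0
    for _, atom in context:
        if atom not in goal_atoms:
            break
        keep += 1

    reused = context[:keep]
    retracted = context[keep:]

    # Only goal material not already covered by the reused context is new.
    covered = {atom for _, atom in reused}
    added = [inc for inc in new_goal if inc[1] not in covered]
    return reused, retracted, added


# Invented example: the speaker switches destination mid-utterance.
context = [("go", "event:go"), ("to", "rel:dest"), ("Paris", "dest:paris")]
new_goal = [("go", "event:go"), ("to", "rel:dest"), ("London", "dest:london")]
reused, retracted, added = repair(context, new_goal)
```

In this invented example the first two increments are reused, only the final one is retracted, and a single new increment is added, so nothing that is still valid has to be rebuilt.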

As the paper highlights, the “vertical” issue of interleaving production and comprehension is independent of the “horizontal” problem of accounting for how language use is coordinated in dialogue. Nonetheless, this article extends previous Pickering and Garrod work (2004; 2007) in claiming that the model of intra-individual processing can be extended to inter-individual language processing (conversation). Unlike previous work, the new model operates in different ways for speakers and hearers, and the potential for differences between people's dialogue contexts is acknowledged (although not directly modelled).

The problem with this generalisation is that in dialogue we do not just predict what people are going to say, we also respond. Even if I could predict what question you are about to ask, this does not determine my answer (although it might allow me to respond more quickly). In terms of turn structure, all a prediction can do is make it easier for me to repeat you. Repetition does occur in dialogue but is rare and limited to special contexts. Corpus studies (Healey et al. 2010) indicate that we repeat few words (less than 4%) and little more syntactic structure (less than 1%) than would be expected by chance. Crudely, a cross-person prediction model of production-comprehension cannot explain 96% of what is actually said in ordinary conversation.
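
To make concrete what a cross-turn repetition rate counts, the following is a minimal sketch. It is not the measure used in the corpus studies cited above: the function names `tokens` and `lexical_repetition` and the example turns are invented for illustration, and the sketch computes only raw word overlap between two adjacent turns, without the comparison against a chance baseline that the reported figures rely on.

```python
import re
from typing import List


def tokens(turn: str) -> List[str]:
    """Lower-case a turn and split it into word tokens."""
    return re.findall(r"[a-z']+", turn.lower())


def lexical_repetition(previous_turn: str, current_turn: str) -> float:
    """Share of word tokens in current_turn that also occur in previous_turn."""
    previous_words = set(tokens(previous_turn))
    current_words = tokens(current_turn)
    if not current_words:
        return 0.0  # an empty turn repeats nothing
    repeated = sum(1 for word in current_words if word in previous_words)
    return repeated / len(current_words)


# Invented question-answer pair: only "the" is shared between the turns.
rate = lexical_repetition("where did you put the keys", "on the kitchen table")
```

Here one of the four words in the answer also occurs in the question, giving a raw rate of 0.25. A raw figure like this says little on its own; it would still have to be compared with the overlap found between turns that are not adjacent or not from the same conversation.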

One conversational context that seems to depend on the ability to make online predictions about what someone is about to say is compound contributions, in which one dialogue contribution continues another, as in this excerpt from Lerner (1991):

Daughter: Oh here Dad, one way to get those corners out;

Father: is to stick your fingers inside;

Daughter: Well, that's one way.

Although it is unclear whether a predictive model accounts for the father's continuation better than one in which he builds a response based on his partial parse of the linguistic input, the daughter's response seems to be based on the mismatch between what was said and what she had planned to say. Although it is possible that she was predicting he would say what she herself had planned to say, there is no need for this additional assumption. Many cases of other-repair (Schegloff 1992), such as clarification requests asking what was meant by what was said (e.g., “what?”), also seem to require that any predictability used is impoverished at precisely the level it might be useful.

In a study on responses to incomplete utterances in dialogue (Howes et al. 2012), increased syntactic predictability led to more clarification requests. Although participants made use of different types of predictability in producing continuations, predictability was neither necessary nor sufficient to prompt completion, and, in extremely predictable cases, participants did not complete the utterance, responding as if the predictable elements had been produced. Our assumption is that it is the things we cannot predict that are the most important parts of conversation. Otherwise, it is hard to see why we should speak at all.

References

Goodwin, C. (1979) The interactive construction of a sentence in natural conversation. In: Everyday language: Studies in ethnomethodology, ed. Psathas, G., 97–121. Irvington Publishers.
Healey, P. G. T., Purver, M. & Howes, C. (2010) Structural divergence in dialogue. Proceedings of 20th Annual Meeting of the Society for Text & Discourse.
Hough, J. (2011) Incremental semantics driven natural language generation with self-repairing capability. Proceedings of RANLP 2011 Student Conference, September 2011, Hissar, Bulgaria, 79–84.
Howes, C., Healey, P. G. T., Purver, M. & Eshghi, A. (2012) Finishing each other's… Responding to incomplete contributions in dialogue. In: Proceedings of the 34th Annual Conference of the Cognitive Science Society. August 2012, Sapporo, Japan, 479–85.
Lerner, G. (1991) On the syntax of sentences-in-progress. Language in Society 20(3):441–58.
Pickering, M. J. & Garrod, S. (2004) Toward a mechanistic psychology of dialogue. Behavioral and Brain Sciences 27(2):169–226.
Pickering, M. J. & Garrod, S. (2007) Do people use language production to make predictions during comprehension? Trends in Cognitive Sciences 11(3):105–10.
Purver, M., Cann, R. & Kempson, R. (2006) Grammars as parsers: The dialogue challenge. Research in Language and Computation 4:289–326.
Purver, M., Eshghi, A. & Hough, J. (2011) Incremental semantic construction in a dialogue system. Proceedings of the 9th International Conference on Computational Semantics (IWCS). January 2011, Oxford, UK, 365–69.
Schegloff, E. A. (1992) Repair after next turn: The last structurally provided defense of intersubjectivity in conversation. American Journal of Sociology 97(5):1295–345.
Skantze, G. & Hjalmarsson, A. (2010) Towards incremental speech generation in dialogue systems. Proceedings of the 11th Annual Meeting of the Special Interest Group on Discourse and Dialogue, 1–8.