
Goals are not implied by actions, but inferred from actions and contexts

Published online by Cambridge University Press:  08 April 2008

Iris van Rooij
Affiliation: Nijmegen Institute for Cognition and Information, Radboud University Nijmegen, 6500 HE Nijmegen, The Netherlands. i.vanrooij@nici.ru.nl
Willem Haselager
Affiliation: Nijmegen Institute for Cognition and Information, Radboud University Nijmegen, 6500 HE Nijmegen, The Netherlands. w.haselager@nici.ru.nl
Harold Bekkering
Affiliation: Nijmegen Institute for Cognition and Information, Radboud University Nijmegen, 6500 HE Nijmegen, The Netherlands. h.bekkering@nici.ru.nl

Abstract

People cannot understand intentions behind observed actions by direct simulation, because goal inference is highly context dependent. Context dependency is a major source of computational intractability in traditional information-processing models. An embodied embedded view of cognition may be able to overcome this problem, but then the problem needs recognition and explication within the context of the new, layered cognitive architecture.

Type: Open Peer Commentary
Copyright © Cambridge University Press 2008

Susan Hurley proposes a layered cognitive architecture to model, among other things, the human capacity for understanding people's actions. We applaud the effort because we believe cognitive science can benefit from pursuing alternatives to the traditional cognitive-sandwich account, especially when it comes to higher cognition (Haselager et al. 2003; van Rooij et al. 2002). We do see one potential problem with Hurley's conception of how layers 3 and 4 of the shared circuits model (SCM) implement our ability to understand the goals that drive people's actions.

According to the SCM, people understand why people act by "mirroring" the "means/ends structure of observed actions" (sect. 4, para. 5 [layer 3]). It is not entirely clear from the target article what mechanism underlies the activity of mirroring, but Hurley seems to have in mind a non-inferential mechanism in which goals and actions are directly coupled. According to Hurley, this is made possible by the fact that humans can reverse the direction of the goal–action associations generated by their own goal-directed actions. As a result, Hurley argues, "observing movements generates motor signals in the observer that tend to cause similar movements" (sect. 4, para. 5 [layer 3]). When the motor outputs are inhibited to prevent overt copying, the system is able to engage in a form of "mirroring [that] simulates in the observer the causes of observed action" (sect. 3.4, para. 5, layer 4 of the SCM).

This conception of inferred goals and their relationship to observed actions is not unproblematic. It seems implausible that a simple one-to-one association between action and goal can account for the intelligent ways in which humans infer goals from observed actions. Research shows that the goals people infer depend in complex ways on the context in which the actions are observed. For example, the action "pushing a button with one's head" can suggest the goal "that the button be pushed" (e.g., when the person's hands are occupied holding a towel), or the goal "that the button be pushed with the head" (when the hands are free to do the pushing as well). Even infants are sensitive to such contextual factors: they push the button with their hands after seeing an adult push it with her head while holding a towel in her hands, but push it with their heads when the adult's hands were free during the action (Gergely et al. 2002). These observations underscore the problematic nature of Hurley's idea that "observing movements generates motor signals in the observer that tend to cause similar movements" (sect. 4, para. 5 [layer 3]). From the perspective of motor plans, after all, pushing a button with the hand is very dissimilar from pushing it with the head, yet infants will "copy" observed actions of adults in dissimilar ways when the context makes this appropriate.
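The context dependency just described can be made concrete in a small sketch. The following Python fragment is purely illustrative; the action and context labels are our own hypothetical encodings, not part of any proposed model. It shows why a context-free mapping from actions to goals is not even well defined: the same observed action yields different inferred goals in different contexts.

```python
def infer_goal(action, context):
    """Illustrative goal inference in the spirit of Gergely et al. (2002).

    The same observed action maps to different goals depending on context,
    so no context-free action->goal association suffices.
    """
    if action == "push_button_with_head":
        if context.get("hands_occupied"):
            # Unusual means are explained by the constraint (hands busy):
            # the inferred goal is just the outcome.
            return "button_pushed"
        # With free hands the unusual means look deliberately chosen:
        # the inferred goal includes the means.
        return "button_pushed_with_head"
    raise ValueError(f"no associations stored for {action!r}")


# Same action, different contexts, different inferred goals:
print(infer_goal("push_button_with_head", {"hands_occupied": True}))   # button_pushed
print(infer_goal("push_button_with_head", {"hands_occupied": False}))  # button_pushed_with_head
```

A direct action-to-goal lookup, as a simulationist reading of "mirroring" suggests, would have to return the same goal in both cases, contrary to what infants do.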

Two defenses of the SCM could be formulated at this point: First, one could propose that the action-goal associations in the SCM are not necessarily one to one. That is, multiple goals could become associated with one and the same action (e.g., picking up a pen could be associated with writing, pointing, giving, etc.), and multiple actions could become associated with one and the same goal (the goal to go to work can be associated with walking, biking, driving, etc.). By "mirroring" one could then retrieve multiple (hypothetical) goals for any given observed action. Although it is conceivable that our brains build such complexes of action-goal associations, the question remains how the brain selects which of the – potentially very many – possible goals is the most plausible or likely goal in the current context. It is known that context sensitivity of such abductive inferences can lead traditional information-processing models into the problem of computational intractability, be they logicist (Bylander et al. 1991), connectionist (Thagard 2000), or Bayesian (Cooper 1990). It remains a challenge for the SCM, or other layered architectures, to incorporate abductive inference processes that can circumvent this classical intractability problem (e.g., Cuijpers et al. 2006).
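The combinatorial character of this selection problem can be illustrated with a toy enumeration. The sketch below is our own illustration, and the goal labels are hypothetical; it simply counts the candidate goal sets a naive abductive search would have to score. With n associated goals there are 2^n subsets, so exhaustive evaluation becomes infeasible long before n reaches realistic sizes.

```python
from itertools import chain, combinations

def candidate_goal_sets(goals):
    """Enumerate every subset of goals that a naive abductive search
    would have to evaluate as a possible explanation of an observed
    action (the powerset of the associated goals)."""
    goals = list(goals)
    return list(chain.from_iterable(
        combinations(goals, r) for r in range(len(goals) + 1)))

# Hypothetical goals associated with observing someone pick up a pen:
goals = ["write", "point", "give", "doodle", "inspect"]
print(len(candidate_goal_sets(goals)))  # 2**5 = 32

# Growth is exponential: 20 associated goals already yield over a
# million candidate explanations.
print(2 ** 20)  # 1048576
```

Of course the hard part is not the enumeration itself but scoring plausibility in context; the point of the count is that any method which must consider the candidates one by one inherits this exponential growth.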

Second, one could argue that from the perspective of the observer, two actions do not constitute one and the same observed action if the context of the actions differs. The argument could go as follows: the notion of "observed action" is to be understood to include relevant parts of the context (in our foregoing example: whether the hands are occupied or not); then a unique mapping from action-context pairs to goals can possibly be achieved by a mere "mirroring." Note, however, that such a proposal serves only to move the problem from understanding the role of context in goal inference to the problem of understanding how people decide which aspects of the context are relevant parts of the current action. This is one of the many disguises in which the infamous frame problem shows itself (Ford & Pylyshyn 1996; Haselager 1997; Pylyshyn 1987): Figuring out the proper demarcation of what constitutes an "action" is computationally no less challenging than finding the most likely goal in a set of possible goals.
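The regress can also be quantified. If "observed action" is redefined to fold in context, the number of distinct observed-action types grows exponentially in the number of candidate context features, as the following toy count shows (the action and feature names are our own hypothetical examples):

```python
from itertools import product

def action_context_types(actions, features):
    """Enumerate the distinct 'observed actions' that result once
    context is folded into the action: every action paired with every
    truth assignment to the binary context features."""
    assignments = list(product([False, True], repeat=len(features)))
    return [(action, dict(zip(features, values)))
            for action in actions for values in assignments]

actions = ["push_button_with_head", "push_button_with_hand"]
features = ["hands_occupied", "object_in_reach", "observer_present"]
types = action_context_types(actions, features)
print(len(types))  # 2 actions * 2**3 contexts = 16
```

A mirroring system that stored one goal per action-context type would thus need exponentially many entries, and deciding which features to include in the demarcation is itself a choice among exponentially many subsets.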

By claiming that goal understanding involves in part an inferential process, we do not mean to suggest that the process is necessarily conscious, controlled, or reasoned in any way. The mechanism can be highly automatic and unconscious, and may even be built on associative principles. Its implementation may involve the so-called mirror neuron system (Newman-Norlund et al. 2007), but it may also draw upon different neural systems, depending on the nature or complexity of the inferential task (e.g., de Lange et al., submitted). We see it as a challenge for future research to reconcile functional, computational, and neural explanations of goal inference in a way that explains how people can effectively and efficiently make plausible inferences about other people's goals and intentions in contexts of real-world complexity. So far, traditional information-processing models have failed in this pursuit, due to the apparently insurmountable problem of computational intractability. This is not the place for a full sketch of our views, but we would like to suggest that an embodied embedded view of cognition may prove useful in addressing this problem. First, Hurley's layered (rather than "sandwiched") view of the cognitive architecture may invite an alternative, nontraditional conception of the inferential task posed to the brain (e.g., van Dijk et al., in press). Second, properties of world and body can serve as cognitive resources that may reduce the computational complexity of the inferential task (van Rooij & Wareham, in press).

In sum, Hurley's model is to be welcomed as a nontraditional model of action understanding, but the mechanisms behind layers 3 and 4 need clarification in view of the computational problems they are supposed to be solving. Embodiment and embeddedness may help to provide clues for such clarification, although currently this is more a way to formulate the challenge than to answer it.

References

Bylander, T., Allemang, D., Tanner, M. C. & Josephson, J. R. (1991) The computational complexity of abduction. Artificial Intelligence 49:25–60.
Cooper, G. F. (1990) The computational complexity of probabilistic inference using Bayesian belief networks. Artificial Intelligence 42(2–3):393–405.
Cuijpers, R. H., van Schie, H. T., Koppen, M., Erlhagen, W. & Bekkering, H. (2006) Goals and means in action observation: A computational approach. Neural Networks 19:311–22.
de Lange, F. P., Spronk, M., Willems, R. M., Toni, I. & Bekkering, H. (submitted) Complementary systems for understanding action intentions.
Ford, K. M. & Pylyshyn, Z. W., eds. (1996) The robot's dilemma revisited: The frame problem in artificial intelligence. Ablex.
Gergely, G., Bekkering, H. & Király, I. (2002) Rational imitation in preverbal infants. Nature 415:755.
Haselager, W. F. G. (1997) Cognitive science and folk psychology: The right frame of mind. Sage.
Newman-Norlund, R. D., van Schie, H. T., van Zuijlen, A. M. J. & Bekkering, H. (2007) The mirror neuron system is more active during complementary compared with imitative action. Nature Neuroscience 10(7):817–18.
Pylyshyn, Z. W., ed. (1987) The robot's dilemma: The frame problem in artificial intelligence. Ablex.
Thagard, P. (2000) Coherence in thought and action. MIT Press.
van Dijk, J., Kerkhofs, R., van Rooij, I. & Haselager, P. (in press) Can there be such a thing as embodied embedded cognitive neuroscience? Theory and Psychology.
van Rooij, I., Bongers, R. M. & Haselager, W. F. G. (2002) A non-representational approach to imagined action. Cognitive Science 26(3):345–75.
van Rooij, I. & Wareham, T. (in press) Parameterized complexity in cognitive modeling: Foundations, applications and opportunities. Computer Journal.