
Mirror representations innate versus determined by experience: A viewpoint from learning theory

Published online by Cambridge University Press:  29 April 2014

Martin A. Giese*
Affiliation:
Section for Computational Sensomotorics, Department of Cognitive Neurology, Hertie Institute for Clinical Brain Research, and Centre for Integrative Neuroscience, University Clinic Tübingen, D-72076 Tübingen, Germany. martin.giese@uni-tuebingen.de http://www.compsens.uni-tuebingen.de

Abstract

From the viewpoint of pattern recognition and computational learning, mirror neurons form an interesting multimodal representation that links action perception and planning. While it seems unlikely that all details of such representations are specified by the genetic code, robust learning of such complex representations likely requires an appropriate interplay between plasticity, generalization, and anatomical constraints of the underlying neural architecture.

Type: Open Peer Commentary
Copyright © Cambridge University Press 2014

Mirror neurons (MNs) have stimulated extensive discussions in cognitive neuroscience and related disciplines, often based on relatively limited empirical data. The article by Cook et al. provides an excellent overview of an ongoing discussion concerning the possible origins of MNs and, especially, the question of whether their properties are innate or learned.

Mirror neurons, originally found in premotor and parietal cortex, constitute an interesting representation that links the perceptual processing of actions with motor planning (Rizzolatti et al. 2001; Rizzolatti & Sinigaglia 2008). Meanwhile, MN-like sensory-motor representations have been found in a large variety of systems, for example, at different sites in the primate brain (Mukamel et al. 2010; Shepherd et al. 2009; Tkach et al. 2007), and even in non-primates such as birds (Prather et al. 2008), that is, in substrates that are not homologous to the primate mirror neuron system (MNS). Following the arguments of Cook et al., this suggests that MN-like properties might emerge from mechanisms that apply to brains in general, instead of being pre-programmed in detail by the genetic code or evolutionary processes that changed a particular subsystem in the brain.

From the viewpoint of pattern recognition, MNs seem jointly to encode equivalent classes of perceived actions, and fragments or primitives of motor programs relevant for the control of action execution. MNs have also been associated with the encoding of “semantic properties” of actions (Arbib 2008; Kemmerer & Gonzalez-Castillo 2010; Pulvermüller 2005), where the precise mathematical definition of action semantics or the critical underlying features remains an open problem. Although it seems likely that MNs represent certain aspects of actions that are invariant, and specifically useful for motor planning, the principles of the neural encoding of such properties within populations of MNs are completely unknown. For example, a recent experiment shows that many mirror neurons are view-dependent, contradicting the interpretation that MNs encode abstract semantic properties, invariant with respect to visual stimulus parameters such as the view (Caggiano et al. 2011). Even less is known about neural mechanisms supporting the efficient learning of such representations and their critical invariance properties.

In the domain of sensory pattern recognition, substantial progress has been made with respect to the understanding of computational and neural principles of the learning of complex sensory patterns, e.g., in vision (Poggio & Edelman 1990; Tarr & Bülthoff 1998; Ullman 1996). Sensory pattern recognition is based on learning, and the efficiency of such learned representations is essentially dependent on maintaining a balance between their selectivity (the accuracy with which individual complex features are encoded) and the invariance against unimportant, semantically irrelevant details of encoded patterns (Vapnik 1998). For object as well as action recognition, it has been shown that this problem can be solved by hierarchical architectures of learned detectors, or classifiers, that increase feature complexity and invariance along the hierarchy, and such architectures have been used to account for visual properties of mirror neurons (Fleischer et al. 2013). The same problem of balancing selectivity versus invariance applies equally to the encoding and recognition of motor behavior (Poggio & Bizzi 2004), and thus also to the encoding of sensorimotor patterns in the MNS. However, it is much less clear how selectivity and generalization in spaces of complex and goal-directed motor patterns can be appropriately defined. Recent work has started to explore which learned structures might enable generalization between different motor tasks, and how expected reward might interact with the control of motor behavior (Wolpert et al. 2011). The influence of expected reward is likely important for the understanding of the function of MNs, since many of them encode the expected amount of reward (Caggiano et al. 2012).
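
The balance between selectivity and invariance described above can be illustrated with a minimal two-stage sketch. It is not the model of any of the cited studies: the one-dimensional signals, the Gaussian tuning function, the parameter `sigma`, and all function names are assumptions chosen for illustration only. A tuned stage responds selectively to one template at each position, and a pooling stage takes the maximum over positions, which makes the output independent of where the pattern occurs.

```python
import numpy as np

def tuned_responses(signal, template, sigma=1.0):
    """Selectivity stage: Gaussian-tuned match to the template at every position."""
    n = len(template)
    # One window per possible position of the template within the signal.
    windows = np.array([signal[i:i + n] for i in range(len(signal) - n + 1)])
    squared_distance = np.sum((windows - template) ** 2, axis=1)
    return np.exp(-squared_distance / (2.0 * sigma ** 2))

def pooled_response(signal, template, sigma=1.0):
    """Invariance stage: maximum over positions discards where the match occurred."""
    return float(np.max(tuned_responses(signal, template, sigma)))

def embed(pattern, offset, length=30):
    """Place a pattern at a given offset inside an otherwise zero signal."""
    signal = np.zeros(length)
    signal[offset:offset + len(pattern)] = pattern
    return signal

# Hypothetical stimuli: the preferred pattern at two positions, and a different pattern.
template = np.array([0.0, 1.0, 2.0, 1.0, 0.0])
early = embed(template, 3)
late = embed(template, 17)
other = embed(np.array([2.0, 0.0, 2.0, 0.0, 2.0]), 3)

for name, stimulus in [("early", early), ("late", late), ("other", other)]:
    peak_position = int(np.argmax(tuned_responses(stimulus, template)))
    print(name, peak_position, round(pooled_response(stimulus, template), 3))
```

In this sketch the position-specific responses distinguish the two placements of the preferred pattern, whereas the pooled response is identical for both and remains low for the non-preferred pattern. Stacking several such tuned and pooling stages is one common way to increase feature complexity and invariance along a hierarchy.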

Extensive research in visual pattern recognition has investigated how hierarchical representations with good generalization properties can be learned. Whereas initial approaches optimized intermediate feature detectors by learning, often using large amounts of training data (Olshausen & Field 1996; Serre et al. 2007; Ullman 2007), more recent approaches, often referred to as “deep learning,” try to learn whole hierarchical recognition architectures in an unsupervised manner, enabling generalization even from very limited datasets (Bengio & Le Cun 2007; Hinton 2007). To make such architectures work, it is essential to constrain the local learning processes and the overall learning strategy, and to choose general network architectures with bottom-up and top-down connections that ensure an efficient information transfer through the network during learning. Recent work suggests that similar hierarchical architectures might also be suitable for the encoding and recognition of motor patterns (Taylor et al. 2011; Yildiz & Kiebel 2011), and it has been postulated that hierarchical predictive architectures might be essential in the MN system (Grafton & Hamilton 2007; Kilner et al. 2007a).
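
As a rough illustration of what learning a hierarchy in an unsupervised, layer-by-layer manner can mean, the following sketch trains each layer on the output of the layer below without using any labels. It makes strong simplifying assumptions: principal component analysis stands in for the unsupervised learning rule (it is not the method of the cited deep learning work), the data are synthetic, and all names and layer sizes are hypothetical.

```python
import numpy as np

def fit_pca_layer(data, n_components):
    """Learn one layer without labels: the leading principal directions of its input."""
    mean = data.mean(axis=0)
    _, _, directions = np.linalg.svd(data - mean, full_matrices=False)
    return mean, directions[:n_components]

def apply_layer(data, layer):
    """Project onto the learned directions and apply a saturating nonlinearity."""
    mean, components = layer
    return np.tanh((data - mean) @ components.T)

def train_greedy_stack(data, layer_sizes):
    """Train layers one at a time; each layer only sees the output of the layer below."""
    layers, activity = [], data
    for size in layer_sizes:
        layer = fit_pca_layer(activity, size)
        layers.append(layer)
        activity = apply_layer(activity, layer)
    return layers, activity

# Synthetic data (an assumption): 20 observed dimensions driven by 3 latent causes.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 20))
observations = latent @ mixing + 0.05 * rng.normal(size=(200, 20))

layers, top_activity = train_greedy_stack(observations, [8, 3])
print(top_activity.shape)
```

The learning here is purely local to each layer. A closed-loop alternative would subsequently adjust all layers jointly, for example by exploiting top-down predictions, which this sketch deliberately omits.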

Although we are still quite far from an understanding of the principles of the robust learning of flexible representations for action encoding, the lessons learned from sensory pattern recognition suggest a slightly different view of the debate as to whether MNs are learned or innate. Efficient learning in mirror representations will likely depend on an interplay between anatomical constraints (e.g., basic connectivity patterns between specialized areas, specific local circuitry principles, or “canonical microcircuits”; Bastos et al. 2012; Douglas & Martin 2004) that ensure sparse encoding and dynamic network stability, and potentially suitable forms of bottom-up and top-down connectivity (e.g., Bastos et al. 2012). These principles might in fact be genetically encoded. In addition, an appropriate control and scheduling of relevant plasticity processes (e.g., ensuring local and layer-wise learning vs. closed-loop optimization of larger parts of the representation exploiting top-down predictions) might be critical. This factor might additionally depend on ontogenetic factors, for example, how sensorimotor patterns are acquired and trained during human development. Beyond these factors, as stressed by Cook et al., the efficient context- and attention-dependent control of the activity of MNs and related plasticity processes is critical to avoid spurious learning, and such control seems compatible with recent electrophysiological results from mirror neurons (Caggiano et al. 2009; 2012).

ACKNOWLEDGMENTS

Supported by EC grants FP7-ICT-249858 TANGO, FP7-ICT-248311 AMARSi, FP7-PEOPLE-2011-ITN, ABC PITN-GA-011-290011, FP7-ICT-2013-FET-F/ 604102 HBP; FP7-ICT-2013-10/ 611909 KOROIBOT; Deutsche Forschungsgemeinschaft: DFG GI 305/4-1, DFG GZ: KA 1258/15-1; and German Federal Ministry of Education and Research: BMBF; FKZ: 01GQ1002A.

References

Arbib, M. (2008) From grasp to language: Embodied concepts and the challenge of abstraction. Journal of Physiology (Paris) 102(1–3):4–20.
Bastos, A. M., Usrey, W. M., Adams, R. A., Mangun, G. R., Fries, P. & Friston, K. J. (2012) Canonical microcircuits for predictive coding. Neuron 76(4):695–711.
Bengio, Y. & Le Cun, Y. (2007) Scaling learning algorithms towards AI. In: Large-scale kernel machines, ed. Bottou, L., Chapelle, O., DeCoste, D. & Weston, J., pp. 321–88. MIT Press.
Caggiano, V., Fogassi, L., Rizzolatti, G., Casile, A., Giese, M. A. & Thier, P. (2012) Mirror neurons encode the subjective value of an observed action. Proceedings of the National Academy of Sciences USA 109(29):11848–53.
Caggiano, V., Fogassi, L., Rizzolatti, G., Pomper, J. K., Thier, P., Giese, M. A. & Casile, A. (2011) View-based encoding of actions in mirror neurons of area F5 in macaque premotor cortex. Current Biology 21(2):144–48.
Caggiano, V., Fogassi, L., Rizzolatti, G., Thier, P. & Casile, A. (2009) Mirror neurons differentially encode the peripersonal and extrapersonal space of monkeys. Science 324(5925):403–406.
Douglas, R. J. & Martin, K. A. C. (2004) Neuronal circuits of the neocortex. Annual Review of Neuroscience 27:419–51.
Fleischer, F., Caggiano, V., Thier, P. & Giese, M. A. (2013) Physiologically inspired model for the visual recognition of transitive hand actions. Journal of Neuroscience 33(15):6563–80.
Grafton, S. T. & Hamilton, A. F. (2007) Evidence for a distributed hierarchy of action representation in the brain. Human Movement Science 26(4):590–616.
Hinton, G. E. (2007) Learning multiple layers of representation. Trends in Cognitive Sciences 11:428–34.
Kemmerer, D. & Gonzalez-Castillo, J. (2010) The two-level theory of verb meaning: An approach to integrating the semantics of action with the mirror neuron system. Brain and Language 112(1):54–76.
Kilner, J. M., Friston, K. J. & Frith, C. D. (2007a) Predictive coding: An account of the mirror neuron system. Cognitive Processing 8(3):159–66.
Mukamel, R., Ekstrom, A. D., Kaplan, J., Iacoboni, M. & Fried, I. (2010) Single-neuron responses in humans during execution and observation of actions. Current Biology 20(8):750–56.
Olshausen, B. A. & Field, D. J. (1996) Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature 381:607–609.
Poggio, T. & Bizzi, E. (2004) Generalization in vision and motor control. Nature 431:768–74.
Poggio, T. & Edelman, S. E. (1990) A network that learns to recognize 3D objects. Nature 343:263–66.
Prather, J. F., Peters, S., Nowicki, S. & Mooney, R. (2008) Precise auditory-vocal mirroring in neurons for learned vocal communication. Nature 451(7176):305–10.
Pulvermüller, F. (2005) Brain mechanisms linking language and action. Nature Reviews Neuroscience 6(7):576–82.
Rizzolatti, G., Fogassi, L. & Gallese, V. (2001) Neurophysiological mechanisms underlying the understanding and imitation of action. Nature Reviews Neuroscience 2(9):661–70.
Rizzolatti, G. & Sinigaglia, C. (2008) Mirrors in the brain. How our minds share actions and emotions. Oxford University Press.
Serre, T., Wolf, L., Bileschi, S., Riesenhuber, M. & Poggio, T. (2007) Robust object recognition with cortex-like mechanisms. IEEE Transactions on Pattern Analysis and Machine Intelligence 29(3):411–26.
Shepherd, S. V., Klein, J. T., Deaner, R. O. & Platt, M. L. (2009) Mirroring of attention by neurons in macaque parietal cortex. Proceedings of the National Academy of Sciences USA 106:9489–94.
Tarr, M. J. & Bülthoff, H. H. (1998) Image based object recognition in man, monkey and machine. Cognition 67:1–20.
Taylor, G. W., Hinton, G. & Roweis, S. (2011) Two distributed-state models for generating high-dimensional time series. Journal of Machine Learning Research 12:1025–68.
Tkach, D., Reimer, J. & Hatsopoulos, N. G. (2007) Congruent activity during action and action observation in motor cortex. Journal of Neuroscience 27(48):13241–50.
Ullman, S. (1996) High-level vision: Object recognition and visual cognition. MIT Press.
Ullman, S. (2007) Object recognition and segmentation by a fragment-based hierarchy. Trends in Cognitive Sciences 11(2):58–64.
Vapnik, V. N. (1998) Statistical learning theory. Wiley-Interscience.
Wolpert, D. M., Diedrichsen, J. & Flanagan, J. R. (2011) Principles of sensorimotor learning. Nature Reviews Neuroscience 12:739–51.
Yildiz, I. B. & Kiebel, S. J. (2011) A hierarchical neuronal model for generation and online recognition of birdsongs. PLOS Computational Biology 7(12):e1002303.