The target article argues for the importance of “developmental start-up software” (sects. 4.1 and 5.1), but neglects the nature of that software and how it is acquired. The embodied interaction of an organism with its environment provides a foundation for its understanding of “intuitive physics” and physical causality. Animal nervous systems control their complex physical bodies in their complex physical environments in real time, and this competence is a consequence of innate developmental processes and, especially in more complex species, of subsequent developmental processes that fine-tune neural control, such as prenatal and postnatal “motor babbling” (non-goal-directed motor activity) (Meltzoff & Moore 1997). Through these developmental processes, animals acquire a non-conceptual understanding of their bodies and physical environments, which provides a foundation for higher-order imaginative and conceptual physical understanding.
Animals acquire physical competence through interaction with their environments (both phylogenetically, through evolution, and ontogenetically, through development), and robots can acquire physical competence similarly, for example through motor babbling (Mahoor et al. 2016); this is one goal of epigenetic and developmental robotics (Lungarella et al. 2003). In principle, comparable competence can be acquired by simulated physical agents behaving in simulated physical environments, but it is difficult to develop physical simulations accurate enough for agents to acquire genuine physical competence (i.e., competence in the real world, not some simulated world). It should be possible to transfer physical competence from one agent to others that are sufficiently similar physically, but the tight coupling of body and nervous system suggests that physical competence will remain tied to a “form of life.”
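To make the idea concrete, the following sketch shows motor babbling in a simulated two-link arm. The arm geometry, the number of babbling trials, and the nearest-neighbour lookup are assumptions chosen only for illustration, not a description of any particular system: the agent issues random motor commands, remembers the outcomes, and then uses that remembered experience as an implicit model of its own body.

```python
# A minimal sketch of motor babbling in a simulated 2-link planar arm.
# All names and parameters are illustrative assumptions, not a published
# implementation: the agent issues random (non-goal-directed) joint
# commands, observes where its hand ends up, and gradually builds an
# implicit forward model of its own body.

import numpy as np

L1, L2 = 0.5, 0.4          # assumed link lengths (metres)
rng = np.random.default_rng(0)

def forward_kinematics(q):
    """True body/environment: joint angles -> hand position."""
    x = L1 * np.cos(q[0]) + L2 * np.cos(q[0] + q[1])
    y = L1 * np.sin(q[0]) + L2 * np.sin(q[0] + q[1])
    return np.array([x, y])

# --- Motor babbling: random joint configurations, observed outcomes ----
babble_q  = rng.uniform(-np.pi, np.pi, size=(2000, 2))
babble_xy = np.array([forward_kinematics(q) for q in babble_q])

# --- Implicit forward model: here just a nearest-neighbour lookup ------
def predict_hand(q_query):
    """Predict hand position from babbled experience alone (the agent
    never consults explicit kinematic equations)."""
    i = np.argmin(np.linalg.norm(babble_q - q_query, axis=1))
    return babble_xy[i]

# --- Crude goal-directed use of the babbled model ----------------------
def reach(target_xy):
    """Pick the babbled posture whose remembered outcome is closest to
    the target: competence without conceptual understanding."""
    i = np.argmin(np.linalg.norm(babble_xy - target_xy, axis=1))
    return babble_q[i]

if __name__ == "__main__":
    target = np.array([0.6, 0.3])
    q = reach(target)
    print("chosen posture:", q, "-> reaches", forward_kinematics(q))
    print("model's guess for that posture:", predict_hand(q))
```

Note that the agent's “understanding” here is nothing but a record of embodied interaction, which is also why competence learned in a crude simulation like this one need not transfer to the real world.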
Animals are said to be situated because cognition primarily serves behavior, and behavior is always contextual. For most animals, situatedness involves interaction with other animals; it conditions the goals, motivations, and other factors that cause an animal's own behavior, and these factors can be projected onto other agents, providing a foundation for “intuitive psychology.” Psychological competence is grounded in the fact that animals are situated physical agents with interests, desires, goals, fears, and so on. They therefore have a basis for a non-conceptual understanding of other agents (through imagination, mental simulation, projection, mirror neurons, etc.). In particular, they can project their experience of psychological causality onto other animals. This psychological competence is acquired through phylogenetic and ontogenetic adaptation.
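As a toy illustration of such projection (everything here, from the grid world to the “feared” hazard, is an assumption invented for the example), an agent can predict another agent's behaviour simply by running its own evaluation from the other's position:

```python
# A purely illustrative sketch of "like-me" projection: an agent that has
# its own aversion to a hazard predicts another agent's behaviour by
# applying its own evaluation from the other's position.  The grid world
# and all parameters are invented for illustration.

import numpy as np

HAZARD = np.array([2, 2])          # assumed location of something "feared"
MOVES = [np.array(m) for m in ((1, 0), (-1, 0), (0, 1), (0, -1))]

def my_evaluation(pos):
    """The agent's own (implicit) concern: farther from the hazard is better."""
    return np.linalg.norm(pos - HAZARD)

def predict_other(other_pos):
    """Mental simulation by projection: assume the other shares my concern
    and will take the move I would take in its place."""
    candidates = [other_pos + m for m in MOVES]
    return max(candidates, key=my_evaluation)

print(predict_other(np.array([3, 2])))   # predicts the other will flee the hazard
```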
The problem hindering AI systems from acquiring psychological competence is that most artificial agents do not have interests, desires, goals, fears, and so on that they can project onto others or use as a basis for mental simulation. For example, computer vision systems do not “care” in any significant way about the images they process. Because we can be injured and die, because we can feel fear and pain, we perceive immediately (i.e., without the mediation of conceptual thought) the significance of a man being dragged by a horse, or of a family fleeing a disaster (Lake et al., Fig. 6). Certainly, through artificial evolution and reinforcement learning, we can train artificial agents to interact competently with other (real or simulated) agents, but because they are a different form of life, it will be difficult to give them the same cares and concerns that we have and that are relevant to many of our practical applications.
The target article does not directly address the important distinction between explicit and implicit models. Explicit models are the sort scientists construct, generally in terms of symbolic (lexical-level) variables; we expect to be able to understand explicit models conceptually, to communicate them in language, and to reason about them discursively (including mathematically). Implicit models are the sort that neural networks construct, generally in terms of large numbers of densely interrelated sub-symbolic variables. Implicit models often allow an approximate emergent symbolic description, but such descriptions typically capture only the largest effects and interrelationships implicit in the sub-symbolic model, and so they may lack the subtlety and context sensitivity of implicit models; this is why it is difficult, if not impossible, to capture expert behavior in explicit rules (Dreyfus & Dreyfus 1986). Terms such as “intuitive physics,” “intuitive psychology,” and “theory of mind” are therefore misleading, because they connote explicit models, whereas implicit models (especially those acquired by virtue of embodiment and situatedness) are more likely to be relevant to the sorts of learning discussed in the target article. It is less misleading to refer to competencies, because humans and other animals can use their physical and psychological understanding to behave competently even in the absence of explicit models.
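The contrast can be illustrated with a small sketch, under assumptions chosen purely for the example: an “implicit model” built from many densely interacting sub-symbolic features fits a context-sensitive regularity well, while a short “explicit” linear summary distilled from it keeps only the largest effects and loses the interaction.

```python
# A small sketch of the implicit/explicit distinction, with all details
# assumed for illustration.  The "implicit model" is a regression over many
# densely interacting random features; the "explicit model" is a two-term
# linear description distilled from it, which keeps only the largest
# effects and loses the interaction (context sensitivity).

import numpy as np

rng = np.random.default_rng(1)

# Ground truth with a subtle interaction between the two variables.
def world(x):
    return 2.0 * x[:, 0] + 0.5 * x[:, 1] + 1.5 * x[:, 0] * x[:, 1]

X = rng.uniform(-1, 1, size=(500, 2))
y = world(X)

# --- Implicit model: many sub-symbolic features, densely interrelated ---
W = rng.normal(size=(2, 200))                 # random projection
def features(x):
    return np.tanh(x @ W)                     # 200 hidden features
beta = np.linalg.lstsq(features(X), y, rcond=None)[0]
implicit = lambda x: features(x) @ beta

# --- Explicit description: a short linear rule distilled from it --------
coef = np.linalg.lstsq(np.c_[X, np.ones(len(X))], implicit(X), rcond=None)[0]
explicit = lambda x: np.c_[x, np.ones(len(x))] @ coef

Xt = rng.uniform(-1, 1, size=(500, 2))
print("implicit model error: ", np.mean((implicit(Xt) - world(Xt)) ** 2))
print("explicit summary error:", np.mean((explicit(Xt) - world(Xt)) ** 2))
```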
The target article shows the importance of hierarchical compositionality to the physical competence of humans and other animals (sect. 4.2.1); therefore, it is essential to understand how hierarchical structure is represented in implicit models. Recognizing the centrality of embodiment can help, for our bodies are hierarchically articulated and our physical environments are hierarchically structured. The motor affordances of our bodies provide a basis for non-conceptual understanding of the hierarchical structure of objects and actions. However, it is important to recognize that hierarchical decompositions need not be unique; they may be context dependent and subject to needs and interests, and a holistic behavior may admit multiple incompatible decompositions.
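A trivial sketch (with invented action names) makes the point about non-unique decompositions: the same flat behaviour admits more than one hierarchical grouping, and neither grouping is privileged.

```python
# Illustrative only: one holistic action sequence, two incompatible but
# equally legitimate hierarchical decompositions, chosen by context.

flat = ["reach", "grasp", "lift", "carry", "place", "release"]

# Decomposition A: two sub-goals (acquire the object, then deliver it).
by_subgoal = [["reach", "grasp", "lift"], ["carry", "place", "release"]]

# Decomposition B: three phases (pick up, move, put down).
by_phase = [["reach", "grasp"], ["lift", "carry"], ["place", "release"]]

def leaves(tree):
    """Flatten a decomposition back to the underlying action sequence."""
    return [action for group in tree for action in group]

# Both decompositions describe exactly the same behaviour.
assert leaves(by_subgoal) == leaves(by_phase) == flat
```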
The target article points to the importance of simulation-based and imagistic inference (sect. 4.1.1); therefore, we need to understand how these are implemented in implicit models. Fortunately, neural representations, such as topographic maps, permit analog transformations, which are better suited than symbolic digital computation to simulation-based and imagistic inference. Attention to the facts of neural implementation can reveal modes of information processing and control beyond the symbolic paradigm.
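A minimal sketch suggests what such imagistic inference might look like, with the grid size, bump shape, and dynamics assumed only for illustration: the representation of an object is an activity bump on a topographic map, and its future position is inferred by an analog transformation (a shift) applied to the whole map rather than by symbolic reasoning over coordinates.

```python
# A minimal, assumption-laden sketch of imagistic, simulation-based
# inference on a topographic map: the "image" of an object is an activity
# bump on a 2-D map, and its future position is inferred by repeatedly
# transforming the map itself.

import numpy as np

N = 32
yy, xx = np.mgrid[0:N, 0:N]

def bump(cx, cy, sigma=2.0):
    """Topographic activity pattern centred on column cx, row cy."""
    return np.exp(-((xx - cx) ** 2 + (yy - cy) ** 2) / (2 * sigma ** 2))

activity = bump(5, 16)                     # current "percept"
velocity = (3, 0)                          # assumed rightward motion per step

# Imagistic simulation: an analog shift of the whole map at each step.
for _ in range(4):
    activity = np.roll(activity, shift=velocity, axis=(1, 0))

predicted = np.unravel_index(np.argmax(activity), activity.shape)
print("predicted peak (row, col):", predicted)   # ~ (16, 17)
```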
Connectionism consciously abandoned the explicit models of symbolic AI and cognitive science in favor of implicit, neural network models, and this had a liberating effect on cognitive modeling, AI, and robotics. With 20/20 hindsight, we know that many of the successes of connectionism could have been achieved with existing statistical methods (e.g., Bayesian inference), without any reference to the brain, but they were not. Progress had been retarded by the desire for explicit, human-interpretable models, which connectionism abandoned in favor of neural plausibility. We are ill-advised to ignore the brain again.