Susan Hurley has assembled an impressive work describing the model of social cognition. In particular, I was surprised and gratified that the theory we have proposed can be fit into the shared circuits model (SCM) hypothesis without inconsistency. We have been studying mathematical models for mutual and recursive mindreading relations (Makino & Aihara 2003; 2006; Makino et al. 2005), and we have proposed the self-observation principle (SOP), stating that, to achieve mindreading, one needs to develop a prediction model of the observation of movement of oneself, which can be driven only from the observation of oneself without an efferent copy of motor commands. The SOP can be regarded as a natural extension of the predictive simulation circuit in the SCM's layer 2 and can provide the basis of the “first-person plural” mirroring system in layer 3. I am happy that our work can be placed within the common framework provided by the SCM.
However, from the standpoint of the SOP, I found one possible point of improvement for the SCM, namely, the monitoring of output inhibition. Although I agree that human adults have selective inhibition of imitation, as demonstrated in Lhermitte's imitation syndrome (Lhermitte 1986), the same may not hold in a phylogenetic (and possibly ontogenetic) account. In the following, I discuss why it may be better to avoid monitoring selective inhibition, and I propose an alternative: that is, monitoring the failure to perform actions.
It is clear that the output inhibition in the SCM needs to be selective. The target article assumes that the mirror/canonical neurons in layer 3 hold the representation of actions, without information on whether it is the action of oneself or the observed action of another. Hence, the output inhibition, which operates somewhere in the route from the representation of actions to their motor outputs, needs to be selective; otherwise, no action could be performed at all.
However, the target article does not discuss what causes this selectivity in inhibition. Two questions remain unanswered: (1) Upon what criteria is the inhibition selected? (2) Why is the output inhibition monitored, rather than the selection of inhibition, when the latter would be an easier alternative?
Regarding question (1), the criteria cannot depend on the self/other distinction because it is introduced by monitoring the inhibition. One possible criterion might be the estimation of benefit; that is, an action is inhibited if it is estimated to be non-beneficial or hazardous. However, I am skeptical that a creature that cannot distinguish its own action from that of another would be able to distinguish its own benefit from that of others.
I argue that both questions are answered if one assumes that failure to perform actions, instead of output inhibition, is monitored. It is not difficult to imagine a failure to perform an action. Consider a creature with a primitive, immature mirror system, in a phylogenetically transitional phase from layer 2 to layer 3. The primitive mirror system would be activated when observing another's action, and would produce some weak and partial representation of the action within the shared circuit. In cases where the representation is close enough to the full representation of the observer's equivalent action, and the contextual input (including posture and other environmental conditions) also matches the action, the shared circuit would cause the represented action to be performed, resulting in priming or imitation. These cases would be rare, however, because the mirror system is still primitive. In most cases, a partially represented action or mismatched contextual input would cause the incomplete performance of the action, resulting in failure. If the representation is weaker, or if the context is too distant (that is, if the mismatch is too big), then the shared circuit would totally fail to trigger the action, and, as a result, no imitation would occur at all.
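The three cases just described (imitation, failed performance, and no triggering at all) can be illustrated with a toy sketch. It is not a model proposed in this commentary or in the target article: the function name, the numerical thresholds, and the rule that the weaker of the two factors limits the drive are all assumptions made only for illustration.

```python
from enum import Enum


class Outcome(Enum):
    IMITATION = "imitation"    # action fully performed (priming or imitation)
    FAILURE = "failure"        # action triggered but performed incompletely
    NO_ACTION = "no_action"    # shared circuit does not trigger the action


# Hypothetical thresholds; the commentary gives no numerical values.
TRIGGER_THRESHOLD = 0.3    # below this drive, the action is not triggered
COMPLETE_THRESHOLD = 0.8   # at or above this drive, the action completes


def mirror_outcome(representation: float, context_match: float) -> Outcome:
    """Classify what a primitive mirror system does with an observed action.

    representation: strength of the mirrored representation, in [0, 1].
    context_match: how well posture and environment fit the action, in [0, 1].
    """
    if not (0.0 <= representation <= 1.0 and 0.0 <= context_match <= 1.0):
        raise ValueError("both inputs must lie in [0, 1]")
    # Assumed combination rule: the weaker factor limits the drive.
    drive = min(representation, context_match)
    if drive >= COMPLETE_THRESHOLD:
        return Outcome.IMITATION
    if drive >= TRIGGER_THRESHOLD:
        return Outcome.FAILURE
    return Outcome.NO_ACTION
```

Under these assumed thresholds, a strong representation in a mismatched context yields `Outcome.FAILURE` rather than imitation, which is the case the argument treats as most common in a primitive mirror system.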
As an answer to question (1), failure gives a good criterion for inhibition. If a creature has only a partial representation or a mismatched context for an observed action, imitating that action is more likely to bring undesirable results for the creature, so it would be better off inhibiting the imitation. Failure in triggering the action implements this in a simple way.
Question (2) is also answered because the failure is not controlled or selected. One can detect the failure of an action not by monitoring the control signal, but by monitoring the result of the action, including motor output and its reafferent feedback input. Note that this requires a change in the SCM, which originally monitors only the motor output, but I believe that this change is consistent with the design of the SCM.
Moreover, the phylogenetic origin of both the monitoring and the output inhibition can be explained better if one assumes that the action failure is monitored. I suggest that the failure is monitored as a result of exaptation from the detection of the prediction error. This may be more likely because error detection within the learning of simulative prediction in layer 2 is essentially the same information process as failure monitoring. The failure monitoring can also explain output inhibition in humans. Since it is better to inhibit actions that are about to fail, it is reasonable to assume phylogenetic development of the output inhibition function by using monitored failure. Such an inhibition would naturally be extended to be more selective, possibly by using self/other distinction.
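The suggestion that failure monitoring reuses prediction-error detection can be sketched in a few lines. This is only an illustrative sketch under stated assumptions: feedback is represented as a list of numbers, the error measure is a mean absolute difference, and the tolerance value and both function names are hypothetical.

```python
from typing import Sequence

# Hypothetical tolerance; the commentary does not specify one.
FAILURE_TOLERANCE = 0.2


def prediction_error(predicted: Sequence[float],
                     observed: Sequence[float]) -> float:
    """Mean absolute difference between predicted and observed feedback."""
    if len(predicted) != len(observed) or len(predicted) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    total = sum(abs(p - o) for p, o in zip(predicted, observed))
    return total / len(predicted)


def action_failed(predicted: Sequence[float],
                  observed: Sequence[float],
                  tolerance: float = FAILURE_TOLERANCE) -> bool:
    """Flag a failure when the result of the action departs from prediction.

    The same error signal that would drive learning of the predictive
    simulation is reused here as a failure signal.
    """
    return prediction_error(predicted, observed) > tolerance
```

The point of the sketch is that `action_failed` needs no access to a control or inhibition signal: it reads only the predicted and the observed result of the action.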
The discussion so far, about the inhibition of mirrored action in layer 3, can be applied to the inhibition of simulated action in layer 2. The same questions, about the criteria for selecting inhibition and the reason for not monitoring selection, will be answered by assuming failure monitoring. Failure would occur in a primitive version of instrumental deliberation in layers 2+4, which might sometimes succeed in taking the simulated action in advance, but would fail in most cases because of partial representation of the action or contextual mismatch. In such cases, the action should fail to avoid undesirable results, but the failure would be monitored. Later in phylogenetic time, the monitored failure is used to distinguish actual from possible actions, as well as to selectively inhibit simulated actions from being actually performed.
My point is that layer 4 should not depend on inhibition. Rather, failure monitoring can be used for both the actual/possible and self/other distinction. This provides a more concrete basis for the SCM than does the original formulation, which uses monitored inhibition for these distinctions.
Some predictions, different from those in section 3.4 of the target article, derive from failure monitoring. First, if some species have copying without inhibition, they will rarely show such copying, unlike patients with imitation syndrome or echopraxia. Second, there may be creatures with the capacity to inhibit copying, but without self/other distinction. A further prediction is that the probability of imitation depends on contextual difference.
ACKNOWLEDGMENTS
I express my deep gratitude to Prof. Kazuyuki Aihara for his valuable advice and for his patience. I am indebted to Prof. Toshihisa Takagi, Prof. Steven Kraines, and Mr. Yohei Akada for their kind support.