WHAT?
For perhaps millions of years, the overwhelming prevalence of right-handedness has provided human beings, and perhaps even their Australopithecine ancestors, with evidence that their brains, no less than their bodies, are asymmetric (see Corballis, 2009). However, it was not until the seminal discoveries of Paul Broca that the scientific investigation of cerebral hemispheric asymmetry began (Pearce, 2009). Perhaps surprisingly, the countless studies that have since been conducted have given us no more than a general understanding of the domains and flavors of this asymmetry, testimony to the enormous challenge of the problem. We are thwarted by the number, complexity, and subtlety of the asymmetries; the mix of complementarity and redundancy of hemispheric function; inter-individual and gender-related variation in asymmetries; our tendency to anthropomorphize neural and network function, imputing behaviorally transparent roles to neurons throughout the brain when in fact, but for the input and output ends, the brain is composed of “hidden units” whose functions tend to be complex and inscrutable (see below); and by the limited tools we have to investigate cerebral function.
WHY?
The “why” of cerebral hemispheric asymmetry substantially eludes us. One way of approaching this problem is to recognize that the brain is almost entirely composed of hidden units involved in computational processes, and to ask, what might be the computational advantages of hemispheric asymmetry? We know that, as a first approximation, the left hemisphere is predominantly involved in linear, sequential processes (e.g., language and praxis), and the right hemisphere in Gestalt, all-at-once processes (e.g., facial recognition and visuospatial function). Are there specific neural network architectures that might be advantageous in supporting these two domains of function? I am not aware of any for right hemispheric function, but for 20 years computational neuroscientists in the field of parallel distributed processing (PDP) have been investigating neural networks, known as recurrent networks, that are particularly adept at acquiring the knowledge required to support linear, sequential processes.
A simple recurrent network (SRN) is composed of a layer of input units, a layer of hidden units, a layer of context units, and a layer of output units (Figure 1). Every unit in each layer is connected to every unit in connected layers. The strengths of the connections constitute the knowledge instantiated in the network. Input and output representations consist of patterns of activity across the units in their domains. Thus, they are distributed. The representations at the hidden unit layer cannot be related in any transparent way to input or output, but the patterns of activity of units in the hidden unit layer can provide important clues to the regularities in the knowledge that the network acquires through experience. The activity levels and the outputs of the units throughout the network are nonlinear. By virtue of the hidden unit layer and this nonlinearity, the network is capable of learning associations between essentially orthogonal domains (e.g., word meaning and word sound).
Fig. 1. A simple recurrent network.
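The layered organization described above can be made concrete with a minimal sketch of a single processing cycle. It is an illustration under stated assumptions, not the implementation used in any of the cited studies: the class name, the layer sizes, the logistic activation function, the random initial weights, and the zero starting context are all choices made here for the example.

```python
import math
import random


def logistic(x):
    """Nonlinear activation: squashes any net input into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def make_weights(n_receiving, n_sending, rng):
    """One row of connection strengths per receiving unit (assumed random start)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_sending)]
            for _ in range(n_receiving)]


class SimpleRecurrentNetwork:
    """Hypothetical SRN: input, hidden, context, and output layers, fully connected."""

    def __init__(self, n_input, n_hidden, n_output, seed=0):
        rng = random.Random(seed)
        self.w_input_hidden = make_weights(n_hidden, n_input, rng)
        self.w_context_hidden = make_weights(n_hidden, n_hidden, rng)
        self.w_hidden_output = make_weights(n_output, n_hidden, rng)
        # Context units start silent (an assumption made for this sketch).
        self.context = [0.0] * n_hidden

    def step(self, input_pattern):
        """Spread activity from input and context units through to the output layer."""
        hidden = [
            logistic(sum(w * x for w, x in zip(row_in, input_pattern))
                     + sum(w * c for w, c in zip(row_ctx, self.context)))
            for row_in, row_ctx in zip(self.w_input_hidden, self.w_context_hidden)
        ]
        output = [logistic(sum(w * h for w, h in zip(row, hidden)))
                  for row in self.w_hidden_output]
        # The context layer keeps a copy of the hidden pattern for the next cycle.
        self.context = list(hidden)
        return output


net = SimpleRecurrentNetwork(n_input=3, n_hidden=4, n_output=3)
first = net.step([1.0, 0.0, 0.0])
second = net.step([1.0, 0.0, 0.0])  # same input, different context
```

Presenting the same input pattern twice yields two different output patterns, because on the second cycle the hidden units also receive the context copy of their own earlier activity.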
The network is trained by successively presenting input representations as patterns of activity across the input layer. The activity of these input units then spreads through the network to eventually elicit a pattern of activity in units of the output layer. The activity of the units in the context layer represents a copy of the pattern of activity across the hidden unit layer from the prior cycle of training. During training, the pattern of activity across the units of the output layer is compared quantitatively with the targeted output pattern. The strengths of the connections throughout the network are then adjusted slightly, in proportion to the magnitude of their contribution to the error, such that with the next presentation of the same input, the output pattern elicited will be closer to the target. Over time, the network asymptotically approaches a high level of performance.
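The training cycle just described can be sketched as follows. The text does not name a learning algorithm; this sketch assumes error backpropagation with the context pattern treated as an ordinary input on each cycle, and the toy task (predicting the next symbol in a repeating 0, 0, 1 sequence), the function name, the layer sizes, and the learning rate are all hypothetical choices made for illustration.

```python
import math
import random


def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))


def random_weights(n_receiving, n_sending, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_sending)]
            for _ in range(n_receiving)]


def train_srn(sequence, n_symbols, n_hidden=6, epochs=300, rate=0.5, seed=1):
    """Train a small SRN to predict the next symbol; return summed error per epoch."""
    rng = random.Random(seed)
    # Hidden units receive the input units followed by the context units.
    w_hid = random_weights(n_hidden, n_symbols + n_hidden, rng)
    w_out = random_weights(n_symbols, n_hidden, rng)
    epoch_errors = []
    for _ in range(epochs):
        context = [0.0] * n_hidden
        total_error = 0.0
        for current, following in zip(sequence, sequence[1:]):
            x = [1.0 if i == current else 0.0 for i in range(n_symbols)] + context
            target = [1.0 if i == following else 0.0 for i in range(n_symbols)]
            hidden = [logistic(sum(w * v for w, v in zip(row, x))) for row in w_hid]
            output = [logistic(sum(w * h for w, h in zip(row, hidden)))
                      for row in w_out]
            # Compare the output pattern with the targeted pattern.
            total_error += sum((t - o) ** 2 for t, o in zip(target, output))
            out_delta = [(t - o) * o * (1.0 - o) for t, o in zip(target, output)]
            # Each hidden unit's contribution to the error, via its outgoing weights.
            hid_delta = [h * (1.0 - h)
                         * sum(out_delta[k] * w_out[k][j] for k in range(n_symbols))
                         for j, h in enumerate(hidden)]
            # Adjust each connection slightly, in proportion to its contribution.
            for k in range(n_symbols):
                for j in range(n_hidden):
                    w_out[k][j] += rate * out_delta[k] * hidden[j]
            for j in range(n_hidden):
                for i in range(len(x)):
                    w_hid[j][i] += rate * hid_delta[j] * x[i]
            # Copy the hidden pattern into the context layer for the next cycle.
            context = hidden
        epoch_errors.append(total_error)
    return epoch_errors


# Toy sequence: after a 0, the next symbol depends on what preceded that 0.
sequence = [0, 0, 1] * 10
errors = train_srn(sequence, n_symbols=2)
```

In this toy sequence the symbol 0 is followed sometimes by 0 and sometimes by 1, so the input alone is ambiguous and only the context copy of the preceding hidden pattern can distinguish the two cases. Comparing `errors[0]` with `errors[-1]` shows how far the summed error has fallen over training.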
The crucial role of the context units is that they enable the network to learn the patterns of sequential relationships between inputs. Thus, in very early work in this field, Jeffrey Elman demonstrated the remarkable capacity of an SRN to learn substantial English syntax, including such things as embedded clauses (Elman, 1990). There are now many studies exploring the application of SRNs to syntax (Cartling, 2008; Joanisse & Seidenberg, 2003; St. John & Gernsbacher, 1998). More recently, SRNs have been used successfully to emulate the function of the neural network linking the substrates for acoustic representations in auditory association cortex and articulatory representations in Broca’s area in a way that supports both phonological function and capacity for auditory verbal short-term memory (Botvinick & Plaut, 2006).
The intricacies of primate cortical network function are not sufficiently well understood for us to determine the neural instantiation of something approximating an SRN. We do know that PDP networks emulate the organization of the brain in their nonlinear processing dynamics; their population encoding of representations; their representation of knowledge and skills as strengths of connections; and their acquisition of knowledge through experience, and with it, implicit rules corresponding to statistical regularities of that experience. Simulations involving these networks have been remarkably successful in recapitulating behavior in normal and brain-injured individuals. There is good evidence from nonhuman primate single unit studies that there exist networks capable of learning sequences (Carpenter, Georgopoulos, & Pellizzer, 1999).
To return to the “why” of human hemispheric asymmetry: in a proto-brain in which networks throughout the cortex have basically the same architecture, the networks might be characterizable as “jacks of all trades,” but they will be expert at none. The evolution of the extraordinary human capacity for sequential processes must have depended on the evolution of networks with SRN properties. This would come at a cost, however, because these networks would lose their jack-of-all-trades capabilities, unless there were co-evolution of separate networks uniquely qualified to support Gestalt processes (although I do not mean to suggest that one had primacy over the other). Thus, viewed in computational terms, there is a plausible and relatively transparent explanation for evolution of regional cortical specialization. Why this should be along substantially hemispheric lines remains unclear. Why there should be a consistent pattern of cerebral hemispheric asymmetry at the population level is also unclear, although the molecular basis for certain organ asymmetries in the body has been understood for some time. For example, the characteristic locations of the vital organs (heart and spleen on left; liver on right), and the direction of rotation of our intestines, are dependent on the presence of specialized cilia at the anterior end of the primitive streak in the embryo that, in their beating, induce a net leftward flow (Nonaka, Shiratori, Saijoh, & Hamada, 2002). People who are homozygous for certain mutations of a gene coding an essential protein in these cilia have Kartagener’s syndrome, which, among other things, is characterized by situs inversus, in which the laterality of internal organs is the mirror image of normal.
AT WHAT COST?
It is often thought that any human capability that has evolved must be evolutionarily advantageous. However, as Richard Dawkins has pointed out, Mother Nature’s modus operandi is as a compromiser (Dawkins, 2009). Any change that evolves comes at a cost, and the nature of the organism at any one stage of evolution represents the balance of costs and benefits provided by that adaptation in the particular environmental context in which the species lives. Individual bodies are but vessels for genes, so from the point of view of genes, the success of the gene pool at the population level is the bottom line (Dawkins, 1976). Examples of benefits and costs abound. The manifold advantages of human upright posture are offset by the toll taken because of maladapted lower backs and overloaded hip and knee joints. The tremendous value of our giant brains is offset by the high neonatal and maternal mortality associated with delivery, and our extended period of vulnerability as infants and children as we grow and wire our brains. Homosexuality carries with it a terminal evolutionary disadvantage, but we can presume that it continues to be prevalent because Mother Nature continues to be willing to roll a set of genetic dice that may occasionally yield nonprocreative individuals but also a strong evolutionary advantage at the population level.
Thus, there is no reason that, uniquely, all variations of human hemispheric asymmetry, manifested in all its various ways, including hand preference and dexterity, should have evolutionary advantage. Furthermore, it is conceivable that an extreme, for example, marked left handedness, might be associated, on average, with slightly reduced cognitive ability, even as it might be associated with increased probability of a rarified skill, for example, great mathematical ability.
EMPIRICAL STUDIES
In this issue of JINS, Nicholls and colleagues (Nicholls, Chapman, Loetscher, & Grimshaw, 2010) report a study of general cognitive ability (GCA) in 825 individuals as a function of hand preference (Annett Handedness Questionnaire) and hand performance (finger tapping). The results reveal a strong association between GCA and hand performance. GCA was slightly but significantly lower in individuals with a strong performance asymmetry (left or right), and left handers had lower GCA scores than right handers. These results were thought to be consistent with the genetic model of handedness proposed by Annett (1985), which posits that a gene associated with equal probability of left and right handedness and reduced cognitive ability persists because of the survival advantage conferred by greater cognitive ability associated with the heterozygous state, that is, the pairing of this allele with an allele coding for right handedness.
Hand performance asymmetries are clearly far less dramatic than other lateralized cerebral functional asymmetries, for example, language, and they are complex in their origins. Nonetheless, one can presume that on average, they dimly reflect hemispheric biases in the performance of sequential movements, and hence hemispheric asymmetries in the mix of neural network structures. However, we cannot know why right handers have, on average, a very slight cognitive advantage over left handers, as Nicholls et al. (this issue) showed, without knowing the molecular biology underlying the population basis for left brain language/superior right hand deftness (the brain counterpart of the molecular biology underlying body organ asymmetry). We cannot know the reasons for the modest cognitive disadvantage of extreme right or left handers without knowing more about the factors defining the optimal mixes of network types in the two hemispheres. Although the left hemisphere is substantially involved in linear, sequential processes, processes that likely benefit from the neural instantiation of recurrent networks, there are many left hemisphere processes, for example, semantic representations and access (Rogers et al., 2004), that are not likely to be intrinsically sequential. Thus, one can easily envision how extremes of lateralization of network types might confer cognitive disadvantage, at least as measured by broad batteries, consistent with the findings of Nicholls et al. (this issue).
Even more provocative are studies suggesting a dip in cognitive performance in subjects with no hand preference (Corballis, Hattie, & Fletcher, 2008; Crow, Crow, Done, & Leask, 1998). This finding cannot be reconciled with Annett’s theory, but it is entirely consistent with the concept that ambidextrous subjects may have the least hemispheric asymmetry in neural network types. Further studies, with a focus on specific functions (e.g., specific components of language: phonological processing and grammar are intrinsically sequential, semantics is not), and using purer measures of differential hemispheric engagement (e.g., functional imaging asymmetries) can begin to define the topology of optimal network engagement in these various domains. Nicholls and colleagues (this issue) did not find an “ambidextrous dip.” They offer two plausible explanations for this, both warranting further study: 1) that the means of measuring asymmetry of hand performance/hand preference may be important and that the “ambidextrous dip” may be most apparent for ambidexterity in writing; and 2) that the “ambidextrous dip” may be, in part, an ontogenetic phenomenon, evident in the 11-year-old children studied by Crow and colleagues (Crow et al., 1998) but not in the study of adults by Nicholls et al. (this issue), that is, a transient performance deficit that is ultimately compensated by adaptive neuroplasticity.