In this volume, Savage et al. and Mehr et al. propose two theories of the evolution of musicality. Both theories stress the prosocial function of musicality, but whereas Savage et al. emphasize the role of music as a social bonding facilitator (the MSB hypothesis), Mehr et al. argue that music evolved as a credible signal in at least two distinct contexts, namely coalitional interactions and infant care.
Before considering these theories, let us backtrack slightly. Synchronization to an isochronous beat, which may be construed as a prerequisite for both Savage et al.'s (sect. 2.2) and Mehr et al.'s (sect. 5.1) hypotheses, can be viewed as a specific instantiation of more general self-organization phenomena seen in social animals across a wide range of taxa (O'Keeffe, Hong, & Strogatz, 2017). Along the same lines, mimicry, as well as emotional entrainment and/or movement synchronization, occur spontaneously in pairs or larger groups of humans (Néda, Ravasz, Brechet, Vicsek, & Barabási, 2000; Páez, Rimé, Basabe, Wlodarczyk, & Zumeta, 2015; Zivotofsky, Gruendlinger, & Hausdorff, 2012). It is worth noting that gait synchronization is facilitated much more by tactile and auditory feedback than by visual feedback (Zivotofsky et al., 2012), which may explain why auditory signals, and more specifically music, are preferentially used to facilitate synchronization, particularly in large groups where tactile feedback would be impractical (Savage et al., this volume).
Thus, many, if not most, prolonged interactions between conspecifics lead to spontaneous interpersonal synchrony, particularly with respect to mood and movement. Over time, this interpersonal synchrony could conceivably have coalesced into prototypical dance or musical forms, especially when reinforced by movement-generated acoustic feedback such as audible steps or clapping. At the same time, enhancing this naturally occurring interpersonal synchrony using external acoustic stimulation (which could be anything from a basic isochronous beat to more complex rhythmic and musical structures) would also yield behaviors akin to dance or music and would presumably facilitate bonding and/or signal greater group cohesion, as suggested by a large literature (see Rennung & Göritz, 2016, for a review). Both mechanisms outlined in this paragraph may in fact co-exist, making it difficult to establish a cause-and-effect relationship, or, indeed, an ontogenetic pathway for musicality in the context of social bonding.
The challenge of pinpointing an adaptive function for musicality is compounded by the fact that most musical genres and traditions combine distinct components that can be found in isolation or in non-musical contexts, and may have different evolutionary histories. Thus, rhythm constitutes an efficient tool for synchronization (and by extension social bonding) even in the absence of melody, for instance in drumming. Similarly, melody can convey emotions and affects in the absence of rhythm, for example in the case of the lament. The model proposed by Mehr et al. acknowledges this diversity of components by postulating separate selection pressures for rhythm and melody (sect. 5.1).
Along these lines, the perception and recognition of musical affect is based on several distinct mechanisms ranging from brainstem reflex to episodic memory (Juslin, 2013). Musical expectancy, or more generally the ability to predict upcoming musical events, is but one of these mechanisms (Juslin, Barradas, & Eerola, 2015). Basic acoustic parameters such as sound intensity, rate of change, or frequency spectrum play a major role in conveying emotions induced by music (Gingras, Marin, & Fitch, 2014) and indeed by a wide range of environmental sounds, including non-biological ones (Ma & Thompson, 2015). In contrast, the predictive rewards associated with musical expectancy (Huron, 2006) only apply to certain auditory stimuli and furthermore involve culture-specific features such as tonality. From an evolutionary perspective, it may be more sensible to focus primarily on emotion-inducing mechanisms that have a broader purview, and are presumably phylogenetically more ancient, than to emphasize more specialized ones such as prediction.
Besides providing a credible explanation for the adaptive purposes of musicality, a plausible theory for the origins of musicality should account both for music's shared features across cultures and for the remarkable variability and complexity of musical styles. The signal elaboration and cultural ritualization aspects mentioned by Mehr et al. are, in my view, critical in this regard. Indeed, a simple rhythmic structure, augmented with just enough variation in repetitive melodic formulas to be distinguishable from the productions of other groups or cultures, would suffice to evoke most of the social bonding effects (including outgroup exclusion) predicted by the MSB, thus rendering any additional complexity and diversity superfluous, except as increasingly sophisticated indicators of group identity (Savage et al., sects. 2.2.3 and 2.2.4). In contrast, the mechanisms suggested by Mehr et al. may help explain the diversity observed in many musical cultures.
In conclusion, music's efficacy as a credible signal and/or as a tool for social bonding appears to piggyback on a diverse set of biological and cognitive processes, implying different proximate mechanisms. Rhythmic synchronization, which stems from the basic tendency to mimic and imitate our associates, is tied to group cohesion and sociality. Melody, or more generally the systematic use of pitch variation, is related to prosody and is tied to emotion communication and mood regulation. Finally, basic acoustic attributes such as intensity, rate of change, and frequency range broadly convey information about power, speed, size, and distance, and lead to appropriate responses (e.g., fear). It is likely this multiplicity of mechanisms that explains why it is so difficult to account for music's putative biological role(s), as well as its possible origins, by proposing a single adaptive function.
Financial support
This research received no specific grant from any funding agency, commercial, or not-for-profit sectors.
Conflict of interest
None.