Human musicality poses a longstanding evolutionary puzzle (Darwin, 1871), and Savage et al. and Mehr et al. provide much-needed updates. Their perspectives consolidate and refine ideas from the past two decades of research, marking an important milestone (cf. Brown, Merker, & Wallin, 2000). We focus on Mehr et al., who argue that music's origins lie in credibly signaling coalition quality and parental attention. These adaptive hypotheses are formulated within the well-established framework of signaling theory in evolutionary biology, and build upon comparative evidence for musical behavior in nonhuman animals. We find tremendous value in the breadth and specificity of this work, but weaknesses in its dismissal of alternative hypotheses show that the historical genesis of music remains unclear, if indeed there is a principal one.
Mehr et al. dismiss the music and social bonding (MSB) hypothesis on three counts. The first derives from the premise that primate sociality evolved under predation pressure associated with diurnal foraging. Mehr et al. imply that this ultimate-level pressure renders superfluous any fitness benefits that accrue from variation in group social dynamics. This conflates the selection pressures that drive the evolution of social versus solitary living with those that drive the evolution of social behavior within a group. We are not aware of any evidence that ties variation in social group dynamics to differences in fitness, but differences in fitness between groups are self-evident, and we see no reason for assuming that environmental and/or genetic shifts that facilitate social bonding cannot have profound consequences in this context.
Mehr et al.'s second argument against MSB is that it conflates ultimate and proximate levels of explanation by connecting music's function to the neurobiology of social reward (Machin & Dunbar, 2011; Savage et al., target article; Tarr, Launay, & Dunbar, 2014). They correctly point out that music causing social bonding today (assayed behaviorally or neurobiologically) is not evidence that it evolved to do so. This recalls Gould and Lewontin's (1979) critique of adaptationism. Current function tells us little about evolutionary process, particularly in complex aspects of human behavior/cognition (like musicality), where exaptations are expected to comprise “a mountain to the adaptive molehill” (Gould, 1991). However, we object to the implication that this fundamental issue uniquely undercuts MSB. Mehr et al.'s own adaptive hypotheses derive the majority of their empirical support from current functions of musical behavior (in war, intimidation, territoriality, alliance-forging, and infant-directed song). We are all trapped by the present, and inappropriate evidentiary standards for identifying adaptations are not specific to any particular theory (Andrews, Gangestad, & Mathew, 2002; Williams, 1966).
Mehr et al.'s third argument against MSB is that music is poorly designed to coordinate groups. They derive this counterfactual from the notion that language is a superior facilitator of coordinated collective action, offering the example of a coxswain's use of language (rather than music) to coordinate rowing as support. This reflects a dubious imposition of the modern distinction between music and language onto their evolutionary foundations. Language (or its primary behavioral manifestation, speech) exists on a continuum with music and many intermediates (public oratory, poetry, rap, chant, etc.). Features that are held in common across this continuum (e.g., auditory-vocal channel is default, highly ordered, infinitely generative, fundamentally social) exceed those which may be considered unique to either pole (e.g., music's spectrotemporal regularity, speech's explicit referentiality). From this perspective, the coxswain's rhythmic calls to “row!” appear more musical than linguistic. Their support of coordination, in particular, seems musical, as temporal regularity characterizes music more than speech (Brown & Jordania, 2013; Dauer, 1983). By contrast, more linguistic features (like the meaning of the word “row”) are inessential; a nonsense word or a drum beat (the norm in Chinese dragon boat racing) works just fine. Undoubtedly, speech is superior for coordinating rational thought and planning, but music, and the more musical aspects of speech, clearly support temporal and emotional coordination (Filippi, Hoeschele, Spierings, & Bowling, 2019). The MSB hypothesis is not undone by language.
Finally, Mehr et al. dismiss the mate quality hypothesis (Darwin, 1871). The crux of their argument is that if music evolved via sexual selection in a substantive way, human musicality would be sexually dimorphic, which they argue it is not. There are a number of problems here. One is that it contradicts the authors' earlier acknowledgement that current function does not imply original function. Another is that sexual selection does not always produce sexual dimorphism (Darwin, 1871; Hooper & Miller, 2008; Jones & Ratterman, 2009). Another is that sexual selection has almost certainly shaped the evolution of primate loud calls, which Mehr et al. identify as musical precursors (Delgado, 2006; Dunn et al., 2015). But a more pressing problem is the claim that there are no sex differences in human musicality relevant to this argument. This seems premature given how few studies have addressed the issue directly, particularly when considering the difficulty of separating predisposition from experience at this level (a point which Mehr et al. also acknowledge). The authors’ assertion that musical behavior is invariant across the human lifespan is also suspect. Musical preferences emerge as a critical part of self-identity during adolescence, musical performance peaks in young adulthood when courtship is most intense, and musical tastes support strong assortative mating (Miller, 2000; North & Hargreaves, 1999). Finally, it should be noted that humans are more sexually dimorphic in voice frequency than any other ape (Puts et al., 2016). Male and female singing voices fall roughly an octave apart (Titze, 2000), which has potential implications for the esthetics of chorusing (Bowling & Purves, 2015; Hoeschele, 2017).
In sum, we find Mehr et al.'s proposed hypothesis of music evolution to be extremely valuable for its integration with evolutionary biology, breadth, and specificity, but we see no present reason to rule out any of the other hypotheses discussed above as (co-)functional drivers of human musicality.
Financial support
DLB is supported by NIMH grant K01-MH122730-01; JCD is supported by Royal Society grant RSG/R1/180340.
Conflict of interest
None.