Toward inclusive theories of the evolution of musicality

Published online by Cambridge University Press:  30 September 2021

Patrick E. Savage
Affiliation:
Faculty of Environment and Information Studies, Keio University, Fujisawa 252-0882, Japan. psavage@sfc.keio.ac.jp, http://PatrickESavage.com
Psyche Loui
Affiliation:
College of Arts, Media and Design, Northeastern University, Boston, MA 02115, USA. p.loui@northeastern.edu, http://www.psycheloui.com
Bronwyn Tarr
Affiliation:
Department of Experimental Psychology, Institute of Cognitive and Evolutionary Anthropology, University of Oxford, Oxford OX2 6PN, UK. bronwyn.tarr@anthro.ox.ac.uk, bronwyntarr01@gmail.com, https://www.anthro.ox.ac.uk/people/dr-bronwyn-tarr
Adena Schachner
Affiliation:
Department of Psychology, University of California San Diego, La Jolla, CA 92093, USA. schachner@ucsd.edu, https://madlab.ucsd.edu
Luke Glowacki
Affiliation:
Department of Anthropology, Boston University, Boston, MA 02215, USA. glowacki@fas.harvard.edu, https://www.hsb-lab.org/
Steven Mithen
Affiliation:
Department of Archaeology, University of Reading, Reading RG6 6AB, UK. s.j.mithen@reading.ac.uk, http://www.reading.ac.uk/archaeology/about/staff/s-j-mithen.aspx
W. Tecumseh Fitch
Affiliation:
Department of Behavioral and Cognitive Biology, University of Vienna, Vienna 1090, Austria. tecumseh.fitch@univie.ac.at, https://homepage.univie.ac.at/tecumseh.fitch/

Abstract

We compare and contrast the 60 commentaries by 109 authors on the pair of target articles by Mehr et al. and ourselves. The commentators largely reject Mehr et al.'s fundamental definition of music and their attempts to refute (1) our social bonding hypothesis, (2) byproduct hypotheses, and (3) sexual selection hypotheses for the evolution of musicality. Instead, the commentators generally support our more inclusive proposal that social bonding and credible signaling mechanisms complement one another in explaining cooperation within and competition between groups in a coevolutionary framework (albeit with some confusion regarding terminologies such as “byproduct” and “exaptation”). We discuss the proposed criticisms and extensions, with a focus on moving beyond adaptation/byproduct dichotomies and toward testing of cross-species, cross-cultural, and other empirical predictions.

Type
Authors’ Response
Copyright
Copyright © The Author(s), 2021. Published by Cambridge University Press

“Music is the most powerful form of communication in the world. It brings us all together.”

– Sean Combs aka Puff Daddy (Poggi, 2013)

“Nirgends können zwei Menschen leichter Freunde werden als beim Musizieren.”

(There is no easier way for two people to become friends than by making music together)

– Hermann Hesse, Das Glasperlenspiel (1943, p. 51)

“who hears music, feels his solitude Peopled at once.”

– Robert Browning, Balaustion's adventure (1871, lines 323–324)

R1. Introduction

The joint publication of our target article, the companion target article by Mehr et al., and 60 commentaries on these target articles by 109 experts represents a chance to synthesize in a single discussion the complex debate about the origins of music. Such debates date back at least to Rousseau (1760/1998), were developed by Darwin (1871), and have expanded dramatically in the past few decades – notably with the publication of edited volumes and special issues by MIT Press and Philosophical Transactions of the Royal Society B (Honing, 2018; Honing et al., 2015; Wallin et al., 2000).

Although the Behavioral and Brain Sciences editors only required us to respond to the commentaries specifically addressing our own target article, they provided us with all accepted responses, including those addressing Mehr et al.'s target article. It became clear when reading these responses that doing justice to the debate would require us to simultaneously address responses to both target articles. This is especially true because Mehr et al. not only describe their own “credible signaling” hypothesis, but also devote substantial space to critiquing three of the most prominent alternative hypotheses: (1) the social bonding hypothesis detailed in our target article; (2) the hypothesis originally proposed by Darwin (1871) and championed most notably by Miller (2000) that musicality evolved through sexual selection; and (3) the hypothesis popularized by Pinker (1997) that musicality is a byproduct of the evolution of language or other adaptations (memorably captured by Pinker's description of music as “auditory cheesecake”).

The combined 60 responses analyze all four hypotheses (credible signaling, social bonding, sexual selection, and byproduct). Because all commentaries focus on one or both target articles, we have created Figure R1 and Table R1 to visualize the degree to which – in our subjective evaluation – each commentary is supportive or critical of the ideas proposed in each of the two target articles. This allows us to easily visualize the broad space of agreement/disagreement among the responses and highlight the relationships between particularly notable commentaries.

Figure R1. Visual comparison of the 60 commentaries responding to the pair of target articles, based on our subjective evaluation of the degree to which they are supportive or critical of each target article. Figure R1 plots the average of subjective ratings by PES and PL on a scale from −10 (“strongly critical”) to 10 (“strongly supportive”). Agreement between the two raters was high (intraclass correlation coefficient (ICC) = 0.89). See github.com/comp-music-lab/social-bonding for full data and code. Responses published with our target article are ordered using numbers (1–35; colored blue), whereas those published with Mehr et al. are ordered using letters (A–Y; colored red). Key commentaries discussed in detail in our response are highlighted in bold.
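The rating summary described in the Figure R1 caption (per-commentary averages of two raters' scores, plus an intraclass correlation as the agreement measure) can be illustrated with a minimal sketch. The caption does not state which ICC variant was used, so the two-way random-effects, absolute-agreement, single-rater form, ICC(2,1), is an assumption here, as are the function names and the example scores, which are invented for illustration. The authors' actual data and code are in the repository cited in the caption.

```python
from statistics import mean


def average_ratings(rater_a, rater_b):
    """Return the per-commentary mean of two raters' scores."""
    return [(a + b) / 2 for a, b in zip(rater_a, rater_b)]


def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` is a list of rows, one row per rated item,
    each row holding one score per rater.
    """
    n = len(ratings)      # number of rated items (commentaries)
    k = len(ratings[0])   # number of raters
    if n < 2 or k < 2:
        raise ValueError("need at least two items and two raters")
    grand = mean(x for row in ratings for x in row)
    row_means = [mean(row) for row in ratings]
    col_means = [mean(row[j] for row in ratings) for j in range(k)]
    # Mean squares from a two-way ANOVA without replication.
    ms_rows = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    ms_cols = n * sum((m - grand) ** 2 for m in col_means) / (k - 1)
    ss_err = sum(
        (ratings[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(n)
        for j in range(k)
    )
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


# Hypothetical scores on the -10 ("strongly critical") to 10
# ("strongly supportive") scale; these are NOT the published ratings.
rater_one = [8, -3, 5, 0, -7]
rater_two = [7, -4, 6, 1, -6]

print(average_ratings(rater_one, rater_two))
print(round(icc_2_1([list(pair) for pair in zip(rater_one, rater_two)]), 2))
```

Absolute agreement is the stricter choice for this kind of plot, because a rater who is systematically more generous would shift the averaged positions even if the two raters ranked the commentaries identically.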

Table R1. List of the 60 commentaries accompanying target articles by Mehr et al. and ourselves

Across all commentaries, four key themes repeatedly emerge: (1) defining “music” and “musicality”; (2) relationships between the social bonding and credible signaling hypotheses; (3) distinguishing between adaptations and byproducts; and (4) extensions/applications/tests of the hypotheses. We have highlighted in bold the 16 commentaries that we believe most comprehensively capture the full spectrum of debate. In the following sections, we will address each of these key themes in detail, with a particular emphasis on these 16 commentaries.

R2. Defining “music” and “musicality”

The definitions of the fundamental terms “music” and “musicality” were critiqued by a number of commentators. We avoided providing a precise definition of “music,” citing long-standing debates regarding “practical and ethical challenges involved in defining and comparing ‘music’ and ‘musicality’ in cross-culturally valid ways.” Mehr et al. offered the following definition: “Music is an auditory display built from melodies and rhythms.”

Cross; Iyer; Margulis; and Wald-Fuhrmann, Pearson, Roeske, Grüny, & Polak (Wald-Fuhrmann et al.) all noted the dangers of ethnocentrism in defining music as a purely auditory phenomenon in terms derived from European heritage. Dissanayake; Sievers & Wheatley; and Trehub also pointed out the need for a multimodal treatment including movement (e.g., dance), touch, and so on in addition to sound.

We have previously explained why cross-culturally universal definitions of “music” are not possible, particularly when it comes to delineating speech from song or music from dance (Savage, 2019b; Savage et al., 2015). Instead, a more useful definition cited in our target article is Honing's (2018) distinction between “music” as cultural products (songs, instruments, dance styles, etc.) and “musicality” as the set of biological capacities underlying the creation of those products. Although this circular definition leaves unanswered the unanswerable question of defining music itself, it does allow us to focus on the ways that cultural and biological evolution can work in tandem and in parallel to produce the diverse products around the world that many recognize as “music” or “music-like.”

Honing and Wald-Fuhrmann et al. accurately note that at times both target articles fail to carefully distinguish between “music” and “musicality,” and that in some cases we might have more appropriately focused on musicality, not on music. Indeed, in retrospect a more accurate title for our target article might have been “Musicality as a coevolved system for social bonding” (just as a more accurate title for Mehr et al. might have been “Origins of musicality in credible signaling”).

R3. Social bonding versus credible signaling

We start by noting significant areas of agreement and/or synergy between the two target articles. First, both articles agree that music's social aspects are the strongest candidate for adaptive functions. Second, our focus on gene-culture coevolution in our paper is endorsed by Mehr et al., although they do not pursue this idea in depth. Third, we agree that musicality has deep roots in nonhuman animal vocalizations.

One primary difference between the two target articles is that Mehr et al. spent the bulk of their article refuting alternative theories, whereas we attempted to synthesize several existing theories into a broader, more inclusive framework. Based on our reading, as well as the commentaries, we argue below that Mehr et al.'s critiques do not succeed in showing that credible signaling is the sole or primary cause of the evolution of musicality. Instead, we believe that the credible signaling hypothesis can be incorporated as one sub-component of our broader, more inclusive framework.

R3.1 Social bonding and credible signaling are complementary, not mutually exclusive

The two target articles have an asymmetrical relationship. Although we did attempt to describe ways in which the social bonding and credible signaling hypotheses might produce contrasting predictions that could be tested experimentally (cf. sect. 6.5), ultimately we stated that “Bonding and signaling hypotheses are not mutually exclusive, but rather complementary.” In contrast, Mehr et al. devote over 2,000 words to categorically rejecting the social bonding hypothesis, arguing that “music does not directly cause social cohesion: rather, it signals existing social cohesion that was obtained by other means” (target article, sect. 4.2.1, para. 14).

Overall, the most consistent point unifying multiple commentaries was a consensus in favor of our argument of complementarity (e.g., Benítez-Burraco; Gingras; Honing; Juslin; Morrison; Trainor), and against Mehr et al.'s argument of mutual exclusivity. Only three commentaries (Kennedy & Radford; Pinker; and Zentner) appeared convinced by Mehr et al.'s arguments against social bonding (Footnote 1). In contrast, many commentators rejected these arguments, for a variety of reasons, including: (1) they turn the origin of the social cohesion being signaled into “somebody else's problem” (Rendell, Doolittle, Garland, & South [Rendell et al.]); (2) they are inconsistent with substantial experimental evidence showing causal effects of synchrony on cooperation (Gabriel & Paravati; Wood); (3) they incorrectly assume that music-making is a purely altruistic sacrifice that does not benefit the performer (Harrison & Seale); (4) their criticisms of social bonding apply equally to their own favored hypothesis (Bowling, Hoeschele, & Dunn [Bowling et al.]); and (5) they rely on a “misguided” adaptation–byproduct dichotomy (Killin, Brusse, Currie, & Planer [Killin et al.]) that “do[es] not reflect the nuance of current evolutionary thinking” (Rendell et al.). We will return to this adaptation–byproduct dichotomy in detail in section R4, as it is a primary source of confusion and disagreement.

Our social bonding account incorporates some discussion of ways music may function as an honest social signal (e.g., of social or cultural background), and how this likely contributes to social bonding, rather than simply reflecting pre-existing bonds (sects 2.2.4 and 3.3). Dubourg, André, & Baumard; Harrison & Seale; Kennedy & Radford; and Killin et al. further argue that the two hypotheses are even more complementary than we had implied, suggesting that the social bonding hypothesis would be enhanced by more explicitly integrating the role of signaling. However, as Rendell et al. put it: “one surely has to have a social bond before one can credibly signal about it,” a sequence also endorsed by Benítez-Burraco and Hattori. Popescu, Oesch & Buck go even further to characterize credible signaling as “a special case of [social bonding], albeit that signalling focuses on between-groups and social-bonding focuses on within-group relations,” a distinction also echoed by Hansen & Keller.

After reading the commentaries, we agree that credible signaling should be integrated into our hypothesis to more explicitly account for interactions between groups. Such integration follows naturally from our discussion in section 6.4 of our target article on “Parochial altruism and outgroup exclusion,” and from Figure 1 in our target article, which showed that we see the “war songs and lullabies” (Washington State University, 2020) championed by Mehr et al. as “sub-components of a broader social bonding function.” This also is consistent with Pinker's critique that Mehr et al.'s (2019) own study found that war songs and lullabies were not more widespread than any of the other 18 genres that they analyzed, all 20 of which we argued represent different expressions of social bonding.

Importantly, Mehr and colleagues' critiques are directed at an omnibus “social bonding hypothesis” for which they list 33 references, not including our own (Mehr et al., sect. 3.2, para. 1). This means that many of their critiques do not apply to our current hypothesis (which was intended to extend and clarify previous study). For example, their argument that “the” social bonding hypothesis conflates proximate- and ultimate-level reasoning does not bear on our proposal: We explicitly distinguish between functional and mechanistic levels of explanation, and add phylogenetic and ontogenetic levels (cf. Fig. 2 in our target article). The same applies regarding their requirement for genetic group selection in the evolution of musicality: this is Steven Brown's hypothesis, not ours (we explicitly eschew any such requirement, see section 6.2 in our target article and section R3.2).

There are three major specific differences between our and Mehr and colleagues' arguments: (1) We posit a broad and inclusive hypothesis about the adaptive functions of musicality (which includes both the infant-directed songs and coalition signaling proposed by Mehr et al. as special cases; cf. Fig. 1 in our target article). (2) We argue that the design features of music make it better suited to social bonding than other ancestral bonding mechanisms (ABMs) such as grooming, or than language. Mehr et al. assert that “language adequately provides whatever social functions grooming may have” and that “music thus appears to have no advantages over language and many disadvantages” (sect. 3.2.2, para. 5). We disagree, and our target article specifies how multiple specific features of musicality serve the functions of group coordination and bonding better than language or ABMs do (cf. sects 2, 5.1, and 6.1 in our target article, and cf. Bowling et al.). (3) Mehr et al. see group music-making as broadcasting an honest signal of social bonds, but crucially argue that these bonds are formed through some other unspecified means. In contrast, we see music as providing a medium or domain in which such bonds can be developed and strengthened, and see this as parsimoniously related to the idea that music also serves as a signal of these bonds.

By Mehr et al.'s hypothesis, group singing is a simple, direct signal of coalitionary strength, directed outside the group, that indexes past practice: “a high level of synchronous coordination among signalers requires considerable effort to achieve” (sect. 4.2.1, para. 4). If so, why does group singing have features, such as steady rhythm, that make it easy for an outsider to join in (cf. Wood)? Why isn't maximization of raw acoustic energy – an honest signal of group size and coordination, achieved by simultaneous calling in many insects and frogs (Greenfield, 2005) – the norm in group performances? By our account, rhythm provides a rich domain enabling multiple types of meaningful social interactions, including “crutches” allowing easy engagement (e.g., isochronicity), AND space for individual embellishment and showing high levels of skill (e.g., meter), AND the potential for cultural embellishments that could serve as shibboleths for group membership. For example, Balkan additive meters can be easily parsed by infants but are difficult to process for North American adults (Hannon & Trehub, 2005) – just the developmental characteristics expected for a shibboleth. Each of these expressive channels can serve both inter- and intra-group signaling, and it seems procrustean to single out one as the “proper function” (cf. Gingras) – particularly once cultural evolutionary processes are overlain on ordinary biological evolution by genetic change. Finally, Mehr et al.'s argument that stress-reduction is “superfluous” because “the net fitness benefits of sociality exceed those of solitary life” ignores the fact that once group living is established in a species (as it is for most terrestrial primates; cf. Shultz et al., 2011) any additional adaptations that further reduce the costs of group living and/or increase its benefits will be selected (e.g., better cooperation for group defense or hunting; cf. Bowling et al.).

In summary, we do not see our hypothesis as “diametrically opposed” to that of Mehr et al. (contra Kennedy & Radford), but rather see ours as a broader and more inclusive superset, encompassing aspects of the hypotheses of Mehr et al. and many others.

R3.2 Multilevel selection

The idea that social bonding and credible signaling may be working in parallel at within- and between-group levels provides a potential solution to the issue of multilevel selection raised by Brown; Eirdosh & Hanisch; and Moser, Ackerman, Dayer, Proksch, & Smaldino. These authors were not convinced by our brief attempt in section 6.2 to side-step long-standing debates about group selection by arguing it is not required for our hypothesis. Eirdosh & Hanisch, in particular, argue that the social bonding hypothesis logically requires us to embrace group selection, because “one would be hard pressed to argue that [social bonding] functions of musicality increase the relative fitness of individuals compared to their (presumably equally socially bonding) group members.” We disagree: This statement assumes that musical performance bonds all group members identically. In contrast, (1) within any group individual variation exists, and (2) individuals can and do form sub-groups who share stronger bonds than with others in the group. Individual selection at a local level, because of some group members accruing more or stronger bonds than others, can drive the genetic evolution of musicality without the need for genetic group selection at a global level (although it obviously does not preclude additional between-group selection).

Although we embrace cultural group selection (Boyd & Richerson, 1985; Richerson et al., 2016), we think it is crucial to distinguish this from the genetic group selection endorsed by David Sloan Wilson and colleagues (Eldakar & Wilson, 2011; Sober & Wilson, 1998), particularly when gene-culture coevolution is under discussion (Brown & Richerson, 2014; West, Griffin, & Gardner, 2008). Despite some differences among us (the target authors) regarding our enthusiasm for multi-level selection theory, we agree in rejecting Eirdosh & Hanisch's claim that it is logically necessary for our hypothesis to work.

R3.3 Signaling theory

Contra Kennedy & Radford, we neither reject signaling theory, nor dispute the idea that music conveys information. At issue here is what type of information music conveys, and to whom. We find Mehr et al.'s claim that we focus “on the neurobiology of the performers, rather than…information encoded in music” a false dichotomy: both domains are important and interact, as shown in Figure 2 in our target article. Indeed, as noted by Margulis, we specifically gathered a team of authors with expertise spanning neuroscience, musicology, psychology, anthropology, evolutionary biology, and other fields in order to synthesize these domains and avoid such dichotomies. We see no compelling reason to choose between neuroscience and signaling (cf. Killin et al.; Rendell et al.).

By our hypothesis, information concerning rhythm (e.g., tempo and meter) and melody/harmony (e.g., pitch range and key) is crucial to achieve synchronization and coordination, and thus to achieve optimal social coordination and bonding within a group. This is echoed by Grahn, Bauer, & Zamm (Grahn et al.), with the amendment that although entrainment of bodies and minds may be a key mechanism by which music confers its effects on social bonding, accurate entrainment ability may not be required for such effects. We see musical information as directly serving social bonding functions, rather than solely signaling extra-musical information (e.g., group size or coalition strength) as Mehr et al. hypothesize. However, this does not prevent other listeners from extracting extra-musical information from a performance (e.g., about the sex of performers or group size). Instead, we suggest that such extraction is not necessary for music to have adaptive value.

Turning to the costs of musical signals, we disagree with Kennedy & Radford that high costs are required to “maintain the credibility of diverse signals across the natural world.” Despite its remarkable persistence, Zahavi's “handicap principle” that high costs are required to maintain honesty is argued by some to be a fallacy (Maynard Smith, 1976; Penn & Számadó, 2020; Számadó, 2011). Low-cost signaling can be evolutionarily stable whenever interests are aligned (e.g., among relatives because of inclusive fitness benefits; Bergstrom & Lachmann, 1998), and in so-called “indices,” physical or anatomical constraints that can enforce honesty with zero handicap or “strategic” costs (Fitch & Hauser, 2002; Maynard Smith & Harper, 2003).

We certainly agree that evolutionary models for musicality should take the costs of signaling into account. Unfortunately, there is very little empirical data upon which to base such theorizing. Human vocalization is in general low cost; for quiet speech this cost is almost unmeasurable (Moon & Lindblom, 2003). Based on physiological principles (Titze, 1994) and animal research, loud singing is somewhat more metabolically costly than normal speech (Oberweger & Goller, 2001; Ward, Speakman, & Slater, 2003), and vigorous dancing is probably an order of magnitude more metabolically costly than song. Accepting this presumed ranking, we might hypothesize that high-cost dance can serve as a more honest signal of current energy and investment than lower cost song. Song may instead signal past practice, knowledge, cultural embeddedness, or other social information. Further empirical data are required to ground and test this or similar hypotheses.

Finally, the apparent disagreement between us and Mehr et al. on the intended recipient of the musical signal may reflect a false dichotomy. By our argument, the musical signal is primarily directed within the group, and for Mehr et al. it is directed to other, competing groups. But, even a signal “intended” by its emitter for a particular listener can be intercepted by an eavesdropper (McGregor & Dabelsteen, 1996), and the resulting effects (positive or negative) can in turn lead to selection on the original signal (Ryan, 1985). Thus it seems reasonable to accept that music plays both intra- and inter-group signaling roles.

R3.4 Sexual selection mechanisms cannot be ruled out

Several commentators were unconvinced by Mehr et al.'s argument that the sexual selection hypothesis is refuted by a lack of musical sex differences in humans. Merker and Verpooten & Eens noted that sex differences are not necessarily required for sexual selection, whereas Bowling et al. note that the human voice is in fact unexpectedly sexually dimorphic relative to other primates. Although Mehr et al. argue that “A lone report of sex differences in the frequency of music performance across human societies (Savage et al. 2015) is likely the result of sampling bias,” we note that the predominance of male performers is replicated in other studies by Mehr and colleagues involving a “representative sample of human music” (Mehr & Singh et al., 2018, 2019) (Footnote 2).

We emphasize that cross-cultural sex differences in the frequency of music performance among humans are more likely attributable to the cultural evolution of patriarchal restrictions on female performance than to biology (Savage et al., 2015) (Footnote 3). However, as we have described, such cultural evolution can have feedback effects on the biological evolution of musicality. We restate our position from section 6.5 of our target article that we do not reject the sexual selection hypothesis and that we encourage cross-species and other comparative analyses that might enable quantification and testing of the relative effects of sexual selection, social bonding, and other factors on the evolution of musicality.

We found Merker's statement that we believe “not one of these [mechanisms of musicality] evolved by ordinary natural or sexual selection” puzzling. Our hypothesis is not a blanket appeal to the Baldwin effect for all aspects of the evolutionary process. We fully agree that “ordinary” natural and/or sexual selection must have played a role during certain stages in the protracted evolution of musicality. For example, we agree that vocal learning is a central capacity for musicality, and that the underlying neural circuitry had to evolve biologically (both in humans and other species). We simply observe that, once vocal learning is in place, cultural evolution becomes almost inevitable, and posit that in some cases this could modify selective regimes (“niche construction”), leading to gene-culture coevolution.

R3.5 The evolutionary age of musicality

A surprising number of commentators accepted Mehr et al.'s mischaracterization of our hypothesis as proposing that “musicality arose fairly recently” on the order of “tens of thousands of years.” We made no such claim. Instead, given its universality across the world's cultures, the evolution of human musicality must have been largely completed by the time modern humans expanded out of Africa about 100,000 years ago. The sophistication of 40,000-year-old bone flutes (cf. sect. 3.2 in our target article) suggests that the evolution of musicality was already far progressed at that date, and our coevolutionary model posits cycles of gene-culture coevolution preceding these dates considerably. Although hard evidence is absent, this leads us to suspect that musicality had its beginnings considerably before modern Homo sapiens, probably in Homo erectus or even earlier (Mithen, 2005). Both fossil and comparative evidence suggests that early Homo would have had the ability to make a wide range of vocalizations, body movements, and gestures, especially after the appearance of full bipedalism at c. 1.8 mya, suggesting that some initial form of proto-musicality dates back to that time. We further speculate that our extinct Neanderthal and Denisovan cousins may well have used musicality for social bonding (although a pierced bone claimed to be a Neanderthal flute from Divje Babe cave in Slovenia may simply be a carnivore-chewed bone; cf. D'Errico, Villa, Llona, & Idarraga, 1998; Kunej & Turk, 2000). A rough time period for the evolution of musicality spans over 1 million years (Tomlinson, 2015).

R4. Adaptation, byproducts, and exaptation

The point of most disagreement among commentators revolved around the venerable question of whether musicality is an adaptation or a byproduct of some other adaptation. Harrison & Seale; Leivada; Lieberman & Billingsley; Pinker; Stewart-Williams; and Zhang & Shi appear to support a version of Pinker's (1997) hypothesis that musicality is primarily a byproduct of language evolution (or at least feel there is not enough evidence to reject this hypothesis). Others pointed to domains other than language as the adaptive source of musicality, such as auditory scene analysis (Trainor), prediction reward (Atzil & Abramson; Kraus & Hesselmann), pre-hunt charade (Szamado), artistic symbolism (van Mulukom), hierarchical processing (Hilton, Asano, & Boeckx), and mother–infant mutuality (Dissanayake).

Mehr et al.'s arguments against byproduct explanations were largely rejected by these commentators. But, although some commentators (e.g., Harrison & Seale; Trainor) also believed that we too were trying to overturn byproduct explanations, we stated in our target article that adaptation–byproduct relationships between music, language, and other social behaviors remain “open to debate.” Rather, our goal was to move beyond the “misguided” (Killin et al.), “over-simplistic” (Rendell et al.) adaptation–byproduct dichotomy underlying earlier debates, toward a more nuanced continuum incorporating concepts such as exaptation and gene-culture coevolution. Our argument explicitly built on the proposal of Patel, who was originally one of the strongest supporters of the idea that music was a purely cultural invention (Patel, 2008, 2010), but recently modified his view to include exaptation and gene-culture coevolution of musicality (Patel, 2018). This coevolutionary approach does not reject byproduct explanations entirely; instead, as Degen (2020) noted, it supports “having Pinker's cheesecake and eating it too.”

We particularly wish to emphasize the important distinction between “byproducts” and “exaptations” discussed by Bowling et al.; Dissanayake; and Killin et al. We distinguish byproducts (which have no function) from exaptations (where a trait is put to new use, and is functional, but not shaped by selection for that purpose). Most of the commentators supporting variants of Pinker's byproduct hypothesis appear to miss this distinction (e.g., when Harrison & Seale offer spider webs as an example of a “byproduct account,” or when Trainor uses “byproduct” and “exaptation” interchangeably). As Darwin recognized with his famous example of lungs and swim bladders (Darwin, 1859), and Gould and Vrba stressed when introducing the term exaptation using examples such as feathers, most complex adaptations have gone through multiple changes in function, and thus started life as exaptations (Gould, 1991; Gould & Vrba, 1982).

Note that hypotheses about common phylogenetic origins do not preclude special adaptation to a new function: the fact that mammalian middle ear bones originated as jaw bones does not make them “byproducts” of chewing (Fitch, 2010). They may have constituted exaptations for audition initially, but once variants were selected for this new function they became bona fide adaptations for hearing. Similarly, if Darwin was correct that music and language share a common origin, the function of this original “protolanguage/protomusic” may remain the same in the “daughter” systems (e.g., social bonding) or have changed (e.g., propositional information transfer for language and bonding via prediction enhancement for music). But in neither case would music constitute a “byproduct” of language – more an evolutionary fellow traveler.

Asking whether “music is an adaptation” (as Mehr et al. and Stewart-Williams do) oversimplifies these issues, and obscures precisely the sorts of questions that biomusicology should be confronting, by distinguishing “music” from musicality, exaptations from byproducts, and phylogenetic from adaptive functional explanations (Tinbergen, 1963). For example, we agree with Trainor that the complex perceptual processes underlying pitch perception, where many harmonics are fused into a perceived whole indexed by its fundamental frequency, play an important role in auditory scene analysis and probably evolved in early vertebrates in that context (Trainor, 2018). Their initial use in music was thus an exaptation. But these mechanisms appear likely to have been further fine-tuned in the human musical context of group singing, as relative pitch perception is typical of most humans but not most other animals (Hoeschele et al., 2018). Further evidence for the fine-tuning of pitch perception for music comes from people with congenital amusia, who have selective impairments in fine-grained pitch perception, especially from the lower harmonics (Cousineau, Oxenham, & Peretz, 2015; Peretz et al., 2002), but show no impairments in pitch-based perceptual organization or auditory scene analysis (Foxton, Dean, Gee, Peretz, & Griffiths, 2004; Peretz & Hyde, 2003). Thus, even if human pitch perception started as an exaptation of scene analysis, it seems plausible that later biological evolution could have fine-tuned this mechanism to its new use in musicality and group singing.

R5. Tests, extensions, and applications

R5.1 Explaining solo music-making

We agree with Fritz; Patel & von Rueden; Wald-Fuhrmann et al.; and Zentner that the role of solo music-making in our hypothesis requires explanation. But these commentators appear to overlook the crucial point we made in section 6.5 of our target article that music is often performed by a soloist or listened to by an individual in order to bond with others, to practice prior to group music-making, or to remember past social experiences. Most of the counter-examples cited fit this mold. For example, Patel & von Rueden follow their main counter-example that “Tsimané music-making was largely solo” with the explanation that these solo songs “conveyed traditional knowledge, reinforced cultural norms, and propitiated ancestors and the guardian spirits of forest animals.” Cultural evolutionary theories of religion, prosociality, and cultural transmission would treat all of these as crucial social functions facilitated by music (Norenzayan et al., 2016). Similarly, Fritz's counter-example of people selecting “Desert Island Discs” they would want to listen to if stranded alone highlights the social power of solo listening. In our qualitative experience listening to this (fantastic!) show, the vast majority of music is selected specifically to cherish the memories of the most important people in the listener's life – to feel their “solitude Peopled,” in the words of Browning's epigraph. Indeed, del Mastrao et al. emphasize that musical memories are often among the last connections to others preserved by patients with Alzheimer's or other forms of dementia. We thus disagree with Zentner's claim that “if music had a social purpose, this purpose seems to have largely vanished.” This social purpose is alive and well in solo listening, although it takes new forms.

Clearly, however, cultural evolution can have strong effects on the frequency of group music-making (cf. Scott-Phillips et al.). The recent prevalence of recorded music and headphones (Thompson & Olsen, 2021) is a case in point, as we discussed in section 2.5 of our target article. Although we agree with Wald-Fuhrmann et al.'s observation that solitary musicking is “extremely common,” cross-cultural analyses show that group music-making is much more common once the effects of recent expansion of Western music and culture have been controlled for (Lomax, 1968; Mehr et al., 2019; Savage et al., 2015). Moreover, cross-cultural variation in the relative frequency of virtuosic “presentational” versus communal “participatory” musicking provides useful testing grounds for the mechanisms and predictions we outlined in section 5.2 of our target article. We welcome proposals by Benítez-Burraco; Patel & von Rueden; and others to expand and refine these predictions, including co-relationships between music and language.

We disagree with Wald-Fuhrmann et al. that “solitary musicking” is “not predicted by any of the proposed evolutionary explanations.” For instance, solitary song is typical of songbirds as they acquire and perfect their song, and there is no difficulty explaining at least some solitary human music-making in the same way (“practice makes perfect”). Young birds engage in solo “subsong” and young sac-winged bats “babble” as they develop their local group's song (Knörnschild, Behr, & von Helversen, 2006; Marler & Peters, 1982). Note that a “solo” performance to an audience can also provide a group bonding experience for those attending, particularly if they dance, clap along, or are otherwise engaged. Nonetheless, we agree with Patel & von Rueden that the evolution of musicality could have proceeded from originally solo/presentational performance, or that solo music today may be an offshoot of musicality originally evolved in a group/participatory context.

R5.2 Cross-species testing

A number of the most interesting commentaries suggested ways to extend and test the cross-species predictions we listed in section 5.3 of our target article. Given that music does not itself fossilize (Honing) and that intra-species evidence for genetic variation in humans explicitly linked to musicality is notoriously difficult to identify (Pfordresher; Tichko, Bird, & Parker [Tichko et al.]), comparisons with extant nonhuman species may be the most promising way to test many of our predictions.

The most forceful empirical challenge came from Verpooten & Eens, who offered a qualitative analysis of avian vocalizations, suggesting that species with complex social systems (e.g., the fission/fusion lifestyle typifying many parrots) tend to feature short “unmusical” calls, whereas subjectively “music-like” songs are found in many birds with simpler (e.g., monogamous) social systems. We welcome this potential comparative test, but note two distinctions important in evaluating the social bonding hypothesis. First, social complexity is difficult to measure (Bergman & Beehner, 2015; Turchin et al., 2018), and monogamy and joint parental care pose considerable cognitive challenges relative to solitary living (Burley & Kristine, 2002; Lukas & Clutton-Brock, 2013; Shultz et al., 2011). Second, virtually all bird species have calls – typically mostly unlearned – and these are indeed often shorter and simpler than display vocalizations such as song. Calls serve a wide variety of specific functions – food, alarm, and mobbing calls are common – and their brevity and simplicity often reflect these clear adaptive functions (Marler, 1955). Comparing calls with songs requires caution, because they are neither homologous vocalization types, nor analogous in function (cf. Lorenz, 1971; Peters, 2002).

The social bonding hypothesis predicts that learned song should be more complex than unlearned song (e.g., in songbirds and suboscines), that learned calls should be more complex than unlearned calls (cf. Sewall, Young, & Wright, 2016), and that acoustic complexity in either case should increase with social complexity. By Fitch's (2006) definition, learned contact calls, such as the signature whistles of dolphins, parrot contact calls, or the rhythmic codas of sperm whales, are “songs,” and they indeed appear considerably more complex than typical unlearned calls, although their brevity perhaps makes the musical term “riffs” more appropriate than “songs.” Finally, comparisons of the same vocal type within a species would be valuable; for instance, Freeberg (2006) found that chickadees living in larger groups use more complex (learned) calls than those in smaller groups. We strongly agree with Bowling et al.; Hattori; Ravignani; Rendell et al.; Snyder & Creanza; Tichko et al.; and Verpooten & Eens that comparative data are crucial for testing the social bonding hypothesis, but care is required in executing such analyses, as is avoiding human subjective evaluations of how “music-like” a particular vocalization is. We think the qualitative proposals by these commentators are excellent starting points for future quantitative tests of the social bonding hypothesis and alternative hypotheses.

R5.3 Extending the neurobiological mechanistic model

Several commentators pointed out potential extensions to our proposed neurobiological model regarding the mechanisms underlying musicality's social bonding functions. The multiple neuroanatomical regions highlighted in Figure 3 of our target article were not meant to provide an exhaustive list of brain regions involved in music processing, or of brain regions that relate music to social behavior, and we agree with Fritz that future iterations of this model should add more specific areas and networks. Our neuroanatomical model was meant as a starting list of candidate neurobiological systems and pathways that we know to underlie certain components of social bonding (such as identity fusion or coalition formation; cf. Sachs, FeldmanHall, & Tamir [Sachs et al.]) and the processing of musical features. We agree with Belfi that simultaneous disruption of two cognitive processes from damage to the same region (e.g., vmPFC damage) does not necessarily imply that the processes are related or the same. We also agree with Juslin that a productive way forward would be to reconcile the contributions of discrete components of the BRECVEMA framework of musical emotions (Juslin, 2019) with neurobiological systems such as the perception and action network, the dopaminergic reward system, and the endogenous opioid system.

Atzil & Abramson and Kraus & Hesselmann noted the importance of prediction, which Figure 2 of our target article emphasized plays a central mechanistic role in our model. We argued that prediction is key for its proximate ties to reward and learning, but agree that it also ties in with allostasis (Atzil & Abramson) and neural entrainment (Grahn et al.). However, we view the ultimate functions (enhanced within-group bonds, improved group coordination, and group membership cues) as a different level of analysis from the proximate mechanisms of prediction and reward, and the neurobiological systems outlined in Figure 3 of our target article. In our view, musicality evolved with and for social bonding via enhanced predictions; there is no need to “question the implied causality” (cf. Kraus & Hesselmann).

R5.4 Extensions and applications

A large number of commentators expressed general support for the social bonding and/or credible signaling hypotheses, and detailed how these hypotheses could be extended/applied in various ways. Such applications/extensions include: clinical applications in patients with amnesia/Alzheimer's disease (del Mastrao et al.) and neurodevelopmental disorders (Kasdan et al.); applications to music education (Morrison) and sleep research (Akkermann et al.); proposing additional behavioral experiments to explore relationships between specific musical features and specific psychological mechanisms (Sachs et al.); proposing additional cultural transmission experiments to explore mechanisms of cultural evolution (Lumaca et al.; Scott-Phillips et al.); theoretical extension to the evolution of dance (Brown), gesture (Gardiner), play (Ashley), and story-telling (Trevor & Frühholz); exploring coevolution of music and language (Benítez-Burraco); incorporation of the role of knowledge songs (Levitin); cross-cultural extensions to Chinese music (Wang & Zou); capturing variation in musicality at the levels of development (Hannon et al.), vocal production (Pfordresher), and genomes (Tichko et al.); and further details of neurobiological mechanisms including the roles of ventromedial prefrontal cortex (Belfi), the cerebellum (Fritz), oxytocin (Hansen & Keller; cf. Harvey, 2020), entrainment (Grahn et al.), and emotion (Gingras; Juslin). We do not have space to address each of these proposals in detail, but we are delighted that our hypotheses have stimulated such productive extensions, and we look forward to seeing the results.

R6. Conclusion: Understanding the value of music

Why has the evolution of musicality elicited such vigorous interdisciplinary debate? Harrison & Seale; Iyer; Margulis; Pfordresher; and Pinker all mentioned the underlying role that evolutionary theory plays in value judgments about music (and the arts, more generally). Value judgments have dogged music precisely because, as Darwin observed, its practical survival value seems so “mysterious.” As a result, funding for teaching and performing music is often the first to be cut. It also results in drives by supporters to find evidence for practical, quantifiable values for music, such as benefits of music on individual health or intelligence (Biancolli, 2021). However, such efforts can sometimes be overzealous or counter-productive, as in the infamously debunked “Mozart effect” (Mehr, Schachner, Katz, & Spelke, 2013; Thompson, Schellenberg, & Husain, 2001).

We suggest that the social bonding hypothesis provides a promising framework for scientific investigation of the value of music more in terms of its social benefits, rather than individual ones. As Schellenberg put it, music is “the thing that brings people together and creates social bonding and makes us feel fantastic….If that's not enough, then I don't know what is” (Leung, 2019). We are excited by the constructive proposals of commentators to explore these questions, and hope that our hypothesis stimulates collection of additional data to help us better understand why the authors of our epigraphs all agree on the power of music to bring people together.

Acknowledgments

We thank Sam Passmore for writing the script to create Figure R1. We thank Sam Passmore, Aniruddh Patel, Peter Harrison, Dor Shilton, Jonathan de Souza, Jessica Grahn, and the other members of the University of Western Ontario music cognition reading group for comments on earlier versions of this manuscript. We also wish to acknowledge the important contributions to this topic by Bruno Nettl (1930–2020) and Iain Morley (1975–2021).

Financial support

PES was supported by Grant-in-Aid no. 19KK0064 from the Japan Society for the Promotion of Science and startup grants from Keio University (Keio Global Research Institute, Keio Research Institute at SFC, and Keio Gijuku Academic Development Fund). PL was supported by the National Science Foundation NSF-STTR no. 1720698, NSF-CAREER no. 1945436, NSF-STTR no. 2014870, the Grammy Foundation, and startup funds from Northeastern University. BT was supported by funding from the French Agence Nationale de la Recherche (under the Investissement d'Avenir program, ANR-17-EURE-0010) while on a Visiting Fellowship at the Institute of Advanced Study Toulouse. AS was supported by the National Science Foundation under NSF-BCS no. 1749551. WTF was supported by Austrian Science Fund (FWF) DK Grant “Cognition & Communication” (W1262-B29).

Conflict of interest

None.

Footnotes

1. Mehr et al.'s primary arguments against the social bonding hypothesis were that: (1) “A ‘stress-reducing’ social bonding mechanism is superfluous,” (2) “The social bonding hypothesis conflates proximate- and ultimate-level reasoning,” and (3) “Music is poorly designed to coordinate groups.”

2. Note that this male predominance (56 songs sung by only males vs. 44 sung by only females in Mehr et al.'s Discography; 1,152 vs. 751, respectively, in their Ethnography) would be even stronger if Mehr et al. included instrumental music in addition to vocal songs (biases toward male performance are much stronger for instrumental performance than for singing; Savage et al., 2015). The male bias would also be stronger if Mehr et al. sampled lullabies (which are predominantly sung by women) for their Discography at rates comparable to the rates they appeared in their Ethnography (i.e., ~7% [89/1,273 song texts coded for function] lullabies found in their Ethnography vs. 25% lullabies sampled in their Discography).

3. Such restrictions may also extend to the process of documenting performance, e.g., male ethnographers may be prevented from documenting music performed by females. However, male biases were also found even for music recorded by female ethnographers (Savage et al., 2015).

References

Bergman, T. J., & Beehner, J. C. (2015). Measuring social complexity. Animal Behaviour, 103, 203–209.
Bergstrom, C. T., & Lachmann, M. (1998). Signaling among relatives. III. Talk is cheap. Proceedings of the National Academy of Sciences, 95(9), 5100–5105.
Biancolli, A. (2021). Music aids mental health: Science shows why. Mad in America. https://www.madinamerica.com/2021/01/music-aids-mental-health-science-shows-why/.
Boyd, R., & Richerson, P. J. (1985). Culture and the evolutionary process. University of Chicago Press.
Brown, G. R., & Richerson, P. J. (2014). Applying evolutionary theory to human behaviour: Past differences and current debates. Journal of Bioeconomics, 16, 105–128.
Browning, R. (1871). Balaustion's adventure: Including a transcript from Euripides. Smith, Elder and Co.
Burley, N. T., & Kristine, J. (2002). The evolution of avian parental care. Philosophical Transactions of the Royal Society B, 357, 241–250.
Cousineau, M., Oxenham, A. J., & Peretz, I. (2015). Congenital amusia: A cognitive disorder limited to resolved harmonics and with no peripheral basis. Neuropsychologia, 66, 293–301.
Darwin, C. (1859). On the origin of species. John Murray.
Darwin, C. (1871). The descent of man, and selection in relation to sex. John Murray.
Degen, R. [@DegenRolf]. (2020). Great article. It seems you are having Pinker's cheesecake and eating it too. [Tweet]. Twitter. https://twitter.com/DegenRolf/status/1290191006768304130.
D'Errico, F., Villa, P., Llona, A. C. P., & Idarraga, R. R. (1998). A middle palaeolithic origin of music? Using cave-bear bone accumulations to assess the Divje Babe I bone “flute.” Antiquity, 72, 65–76.
Eldakar, O. T., & Wilson, D. S. (2011). Eight criticisms not to make about group selection. Evolution, 65, 1523–1526.
Fitch, W. T., & Hauser, M. D. (2002). Unpacking “honesty”: Vertebrate vocal production and the evolution of acoustic signals. In Simmons, A. M., Popper, A. N. & Fay, R. R. (Eds.), Acoustic communication (pp. 65–137). Springer.
Fitch, W. T. (2006). The biology and evolution of music: A comparative perspective. Cognition, 100(1), 173–215.
Fitch, W. T. (2010). The evolution of language. Cambridge University Press.
Foxton, J. M., Dean, J. L., Gee, R., Peretz, I., & Griffiths, T. D. (2004). Characterization of deficits in pitch perception underlying “tone deafness.” Brain, 127(4), 801–810.
Freeberg, T. M. (2006). Social complexity can drive vocal complexity: Group size influences vocal information in Carolina chickadees. Psychological Science, 17, 557–561.
Gould, S. J. (1991). Exaptation: A crucial tool for evolutionary psychology. Journal of Social Issues, 47, 43–65.
Gould, S. J., & Vrba, E. S. (1982). Exaptation – a missing term in the science of form. Paleobiology, 8, 4–15.
Greenfield, M. D. (2005). Mechanisms and evolution of communal sexual displays in arthropods and anurans. Advances in the Study of Behavior, 35, 1–62.
Hannon, E. E., & Trehub, S. E. (2005). Metrical categories in infancy and adulthood. Psychological Science, 16, 48–55.
Harvey, A. R. (2020). Links between the neurobiology of oxytocin and human musicality. Frontiers in Human Neuroscience, 14, 1–19. https://doi.org/10.3389/fnhum.2020.00350.
Hesse, H. (1943). Das Glasperlenspiel. Suhrkamp.
Hoeschele, M., Merchant, H., Kikuchi, Y., Hattori, Y., & ten Cate, C. (2018). Searching for the origins of musicality across species. In Honing, H. (Ed.), The origins of musicality (pp. 149–170). MIT Press.
Honing, H. (Ed.). (2018). The origins of musicality. MIT Press.
Honing, H., Cate, C., Peretz, I., & Trehub, S. E. (2015). Without it no music: Cognition, biology and evolution of musicality. Philosophical Transactions of the Royal Society B: Biological Sciences, 370, 20140088. http://doi.org/10.1098/rstb.2014.0088.
Juslin, P. N. (2019). Musical emotions explained: Unlocking the secrets of musical affect. Oxford University Press.
Knörnschild, M., Behr, O., & von Helversen, O. (2006). Babbling behavior in the sac-winged bat (Saccopteryx bilineata). Naturwissenschaften, 93, 451–545.
Kunej, D., & Turk, I. (2000). New perspectives on the beginnings of music: Archaeological and musicological analysis of a middle Paleolithic bone “flute.” In Wallin, N. L., Merker, B. & Brown, S. (Eds.), The origins of music (pp. 235–268). The MIT Press.
Leung, W. (2019). B.C. students who took music classes scored higher than peers in math, science and English: study. The Globe and Mail. https://www.theglobeandmail.com/canada/article-bc-students-who-took-music-classes-scored-higher-than-peers-in-math/.
Lomax, A. (Ed.). (1968). Folk song style and culture. American Association for the Advancement of Science.
Lorenz, K. (1971). Comparative studies on the behaviour of the Anatinae. In Lorenz, K. (Ed.), Studies in animal and human behaviour vol. II (pp. 14–114). Harvard University Press.
Lukas, D., & Clutton-Brock, T. H. (2013). The evolution of social monogamy in mammals. Science, 341, 526–530.
Marler, P. (1955). Characteristics of some animal calls. Nature, 176, 6–7.
Marler, P., & Peters, S. (1982). Subsong and plastic song: Their role in the vocal learning process. In Kroodsma, D. E., Miller, E. H. & Ouellet, H. (Eds.), Acoustic communication in birds, Vol. 2: Song learning and its consequences (pp. 25–50). Academic Press.
Maynard Smith, J. (1976). Sexual selection and the handicap principle. Journal of Theoretical Biology, 57, 239–242.
Maynard Smith, J., & Harper, D. (2003). Animal signals. Oxford University Press.
McGregor, P. K., & Dabelsteen, T. (1996). Communication networks. In Kroodsma, D. E. & Miller, E. H. (Eds.), Ecology and evolution of acoustic communication in birds (pp. 409–425). Cornell University Press.
Mehr, S. A., Schachner, A., Katz, R. C., & Spelke, E. S. (2013). Two randomized trials provide no consistent evidence for nonmusical cognitive benefits of brief preschool music enrichment. PLoS ONE, 8(12), e82007. https://doi.org/10.1371/journal.pone.0082007.
Mehr, S. A., Singh, M., York, H., Glowacki, L., & Krasnow, M. M. (2018). Form and function in human song. Current Biology, 28, 356–368.
Mehr, S. A., Singh, M., Knox, D., Ketter, D. M., Pickens-Jones, D., Atwood, S., … Glowacki, L. (2019). Universality and diversity in human song. Science, 366(6468), 957–970. https://doi.org/10.1126/science.aax0868.
Miller, G. F. (2000). Evolution of human music through sexual selection. In Wallin, N. L., Merker, B. & Brown, S. (Eds.), The origins of music (pp. 329–360). MIT Press.
Moon, S.-J., & Lindblom, B. (2003). Two experiments on oxygen consumption during speech production: vocal effort and speaking tempo. Proceedings of the 15th International Congress of the Phonetic Sciences, Barcelona, pp. 3129–3132.
Norenzayan, A., Shariff, A. F., Gervais, W. M., Willard, A. K., McNamara, R. A., Slingerland, E., & Henrich, J. (2016). The cultural evolution of prosocial religions. Behavioral and Brain Sciences, 39(e1). https://doi.org/10.1017/S0140525X14001356.
Oberweger, K., & Goller, F. (2001). The metabolic cost of birdsong production. Journal of Experimental Biology, 204, 3379–3388.
Patel, A. D. (2010). Music, biological evolution, and the brain. In Bailar, M. (Ed.), Emerging disciplines (pp. 91–144). Rice University Press.
Patel, A. D. (2008). Music, language and the brain. Oxford University Press.
Patel, A. D. (2018). Music as a transformative technology of the mind: An update. In Honing, H. (Ed.), The origins of musicality (pp. 113–126). MIT Press.
Penn, D. J., & Számadó, S. (2020). The handicap principle: How an erroneous hypothesis became a scientific principle. Biological Reviews, 95, 267–290.
Peretz, I., Ayotte, J., Zatorre, R. J., Mehler, J., Ahad, P., Penhune, V. B., & Jutras, B. (2002). Congenital amusia: A disorder of fine-grained pitch discrimination. Neuron, 33(2), 185–191.
Peretz, I., & Hyde, K. L. (2003). What is specific to music processing? Insights from congenital amusia. Trends in Cognitive Sciences, 7, 362–336.
Peters, G. (2002). Purring and similar vocalizations in mammals. Mammal Review, 32, 245–271.
Pinker, S. (1997). How the mind works. Norton.
Poggi, J. (2013). Interview: Sean “Diddy” Combs says you'll soon tune into Revolt TV, and so will brands. AdAge. https://adage.com/article/special-report-music-and-marketing/interview-sean-combs-revolt-tv/244350.
Richerson, P., Baldini, R., Bell, A., Demps, K., Frost, K., Hillis, V., … Zefferman, M. (2016). Cultural group selection plays an essential role in explaining human cooperation: A sketch of the evidence. Behavioral and Brain Sciences, 39, e30. http://doi.org/10.1017/S0140525X1400106X.
Rousseau, J.-J. (1760/1998). Essay on the origin of languages and writings related to music (J. T. Scott [ed. & trans.]). University Press of New England.
Ryan, M. J. (1985). The túngara frog: A study in sexual selection and communication. University of Chicago Press.
Savage, P. E. (2019b). Universals. In Sturman, J. L. (Ed.), The Sage international encyclopedia of music and culture (pp. 2282–2285). Sage Publications. http://doi.org/10.4135/9781483317731.n759.
Savage, P. E., Brown, S., Sakai, E., & Currie, T. E. (2015). Statistical universals reveal the structures and functions of human music. Proceedings of the National Academy of Sciences of the USA, 112(29), 8987–8992.
Sewall, K. B., Young, A. M., & Wright, T. F. (2016). Social calls provide novel insights into the evolution of vocal learning. Animal Behaviour, 120, 163–172.
Shultz, S., Opie, C., & Atkinson, Q. D. (2011). Stepwise evolution of stable sociality in primates. Nature, 479(7372), 219–222.
Sober, E., & Wilson, D. S. (1998). Unto others: The evolution and psychology of unselfish behavior. Harvard University Press.
Számadó, S. (2011). The cost of honesty and the fallacy of the handicap principle. Animal Behaviour, 81, 3–10.
Tinbergen, N. (1963). On aims and methods of ethology. Zeitschrift Für Tierpsychologie, 20, 410–433.
Thompson, W. F., & Olsen, K. N. (Eds.). (2021). The science and psychology of music: From Beethoven at the office to Beyoncé at the gym. Greenwood.
Thompson, W. F., Schellenberg, E. G., & Husain, G. (2001). Arousal, mood, and the Mozart effect. Psychological Science, 12(3), 248–251.
Titze, I. R. (1994). Principles of voice production. Prentice Hall.
Tomlinson, G. (2015). A million years of music: The emergence of human modernity. MIT Press.
Trainor, L. J. (2018). The origins of music: Auditory scene analysis, evolution, and culture in music creation. In Honing, H. (Ed.), The origins of musicality (pp. 81–112). MIT Press.
Turchin, P., Currie, T. E., Whitehouse, H., François, P., Feeney, K., Mullins, D., … Spencer, C. (2018). Quantitative historical analysis uncovers a single dimension of complexity that structures global variation in human social organization. Proceedings of the National Academy of Sciences of the USA, 115(2), E144–E151.
Wallin, N. L., Merker, B., & Brown, S. (Eds.). (2000). The origins of music. MIT Press.
Ward, S., Speakman, J. R., & Slater, P. J. B. (2003). The energy cost of song in the canary, Serinus canaria. Animal Behaviour, 66, 893–890.
Washington State University. (2020). War songs and lullabies behind origins of music. Science Daily. www.sciencedaily.com/releases/2020/10/201026095422.htm.
West, S. A., Griffin, A. S., & Gardner, A. (2008). Social semantics: How useful has group selection been? Journal of Evolutionary Biology, 21, 374–385.

Figure R1. Visual comparison of the 60 commentaries responding to the pair of target articles, based on our subjective evaluation of the degree to which they are supportive or critical of each target article. Figure R1 plots the average of subjective ratings by PES and PL on a scale from −10 (“strongly critical”) to 10 (“strongly supportive”). Agreement between the two raters was high (intraclass correlation coefficient (ICC) = 0.89). See github.com/comp-music-lab/social-bonding for full data and code. Responses published with our target article are ordered using numbers (1–35; colored blue), whereas those published with Mehr et al. are ordered using letters (A–Y; colored red). Key commentaries discussed in detail in our response are highlighted in bold.
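The rating procedure described in the caption (two raters, scores from −10 to 10, averaged per commentary, agreement summarized by an ICC) can be illustrated with a minimal sketch. The caption does not say which ICC form was used, so the sketch assumes the two-way random-effects, absolute-agreement, single-rater form, often written ICC(2,1). The function name `icc_2_1` and the example ratings are hypothetical; the authors' actual data and code are in the repository cited in the caption.

```python
from statistics import mean


def icc_2_1(ratings):
    """Assumed ICC form: two-way random effects, absolute agreement, single rater.

    ratings: one row per rated item (e.g., a commentary), one score per rater.
    """
    n = len(ratings)      # number of rated items
    k = len(ratings[0])   # number of raters
    grand = mean(x for row in ratings for x in row)
    row_means = [mean(row) for row in ratings]
    col_means = [mean(row[j] for row in ratings) for j in range(k)]

    # Two-way ANOVA sums of squares: items (rows), raters (columns), residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical (rater A, rater B) scores on the -10..10 scale; NOT the published data.
example_ratings = [(8, 7), (-3, -5), (2, 4), (-9, -8), (5, 5), (0, -2)]
averaged = [mean(pair) for pair in example_ratings]  # value plotted per commentary
agreement = icc_2_1(example_ratings)
print(averaged)
print(round(agreement, 2))
```

An absolute-agreement form is chosen here because the plotted value is the mean of the two raw scores, so a systematic offset between raters would matter; a consistency-type ICC would ignore such an offset.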


Table R1. List of the 60 commentaries accompanying target articles by Mehr et al. and ourselves