Musicality consists of the (neuro)biological underpinnings needed to perceive and produce music. Research on the evolution of musicality needs cross-species evidence. As a parallel, to understand the evolution of bat wings, one asks why all other mammals lack wings and why other flying animals evolved them. Similarly, our species constitutes only one data point for constructing evolutionary hypotheses about musicality. Comparisons with other species are necessary to avoid post hoc explanations of evolutionary traits.
Four concepts discussed in Savage et al. are key for understanding musicality, both in humans and in other animals (Fig. 1). Isochrony describes metronomic temporal regularity, similar to the ticking of a clock (Merker, Madison, & Eckerdal, 2009; Ravignani & Madison, 2017). Synchrony is the perfect co-occurrence in time of two series of events, with no strong teleological or mechanistic focus (Kotz, Ravignani, & Fitch, 2018; Ravignani, 2017). Vocal learning is the ability to learn and modify non-innate vocalizations, including melodies (Lattenkamp & Vernes, 2018). Beat induction denotes a top-down capacity to induce a regular pulse from music and to move in synchrony with it (Grahn & Brett, 2007; Honing, 2012).
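To keep these definitions concrete in comparative work, they can be operationalized directly from event-onset times. The sketch below is a minimal illustration under my own assumptions – isochrony as a low coefficient of variation of inter-onset intervals, synchrony as the proportion of events in one series co-occurring with events in another within a small tolerance window – rather than measures prescribed by the target articles.

```python
# Illustrative operationalizations (my own assumptions, not the target articles' measures)
# of isochrony and synchrony from event-onset times, in seconds.
import numpy as np

def isochrony_index(onsets):
    """Coefficient of variation of inter-onset intervals: 0 means metronomic regularity."""
    intervals = np.diff(np.sort(onsets))
    return intervals.std() / intervals.mean()

def synchrony_rate(onsets_a, onsets_b, tolerance=0.05):
    """Proportion of events in series A co-occurring (within `tolerance` s) with an event in B."""
    onsets_b = np.sort(onsets_b)
    hits = sum(np.min(np.abs(onsets_b - t)) <= tolerance for t in onsets_a)
    return hits / len(onsets_a)

# Toy example: a clock-like series and a slightly jittered partner series
clock = np.arange(0.0, 10.0, 0.5)                                   # isochronous onsets
partner = clock + np.random.default_rng(1).normal(0.0, 0.02, clock.size)

print(isochrony_index(clock))          # 0.0: perfectly regular
print(isochrony_index(partner))        # small, but no longer exactly 0
print(synchrony_rate(partner, clock))  # close to 1: the two series co-occur in time
```

Note that the synchrony measure is agnostic about isochrony: two irregular but co-occurring series would still score high, in line with the definition above.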
Figure 1. Conceptualization of the four abilities partly explored in the target articles, plus a fifth one, vocal rhythms, which deserves to enter the discussion. Isochrony, when present in acoustic or motoric behaviors, may provide a clear, extremely predictable temporal grid, much like the squared notebooks that guide children learning to write. An isochronous pattern is, per se, neither musical nor demanding to produce or perceive. Isochrony has low entropy, markedly lower than expected for “musical” patterns (Milne & Herff, 2020; Ravignani & Madison, 2017). Production of isochrony can result from a motoric behavior entraining to a neural oscillator. Perception of isochrony requires, at least, comparing pairs of temporal intervals, an ability found in several species (e.g., Church & Lacourse, 1998; Heinrich, Ravignani, & Hanke, 2020; Ng, Garcia, Dyer, & Stuart-Fox, 2020). Whereas isochrony is characterized by equal timing within one series of events, synchrony requires pairwise coincidence of events from two series, neither of which needs to be isochronous (Ravignani, 2017). Given an acoustic sequence (black), beat induction consists of inferring an isochronous pulse (gray), which need not physically exist in the sequence (Honing, 2012; Kotz et al., 2018). Synchronization differs from beat induction in being independent of isochrony, relatively inflexible, achievable only for a narrow range of tempi, and unimodal (Patel, Iversen, Bregman, & Schulz, 2009). Vocal learning – here with emphasis on its spectral domain – includes, among other things, the capacity to copy (gray) a vocal signal (black) (Lattenkamp & Vernes, 2018; Wirthlin et al., 2019). A vocal rhythm (black) is a temporal pattern of events that conveys most of its information in the temporal domain (Ravignani et al., 2019) and can also be learnt or imitated (gray).
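The low-entropy claim in the caption above can be illustrated with a minimal sketch: the inter-onset intervals of a clock-like sequence fall into a single histogram bin, whereas the intervals of a more variable sequence spread across many bins and thus yield higher Shannon entropy. The binning and the toy sequences are arbitrary illustrative choices, not the analyses of Milne and Herff (2020) or Ravignani and Madison (2017).

```python
# A toy illustration of why an isochronous pattern has low entropy: its interval
# distribution collapses into one bin, while a variable sequence spreads over many.
import numpy as np

def interval_entropy(onsets, n_bins=10):
    """Shannon entropy (bits) of the binned inter-onset-interval distribution."""
    intervals = np.diff(np.sort(onsets))
    counts, _ = np.histogram(intervals, bins=n_bins)
    p = counts[counts > 0] / counts.sum()
    return float(-(p * np.log2(p)).sum())

rng = np.random.default_rng(0)
isochronous = np.arange(0.0, 20.0, 0.5)                # clock-like onsets (s)
variable = np.cumsum(rng.uniform(0.2, 0.8, size=40))   # irregularly timed onsets (s)

print(interval_entropy(isochronous))  # 0.0 bits: every interval is identical
print(interval_entropy(variable))     # roughly 3 bits: intervals spread across bins
```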
Do other animals have these capacities supporting musicality? Isochrony appears in many species' communication (e.g., from lobster rattles to sea lion barks: Patek & Caldwell, 2006; Schusterman, 1977), autonomously regulated behavior, or (neuro)physiology. Synchrony is widespread but scattered across taxonomic groups (Ravignani, Bowling, & Fitch, 2014; Wilson & Cook, 2016). Vocal learning is rare but potentially arose multiple times in evolution because of different pressures across species (Garcia & Ravignani, 2020; Martins & Boeckx, 2020; Nowicki & Searcy, 2014). Beat induction has only been found in a few animals, as acknowledged by Savage and colleagues (Kotz et al., 2018; cf. Mehr et al., who claim its presence in many species).
Savage and colleagues briefly characterize these four abilities; this invites discussion of cross-species implications and of predictions as to how they evolved to support musicality. I add a fifth, still largely unexplored capacity: vocal rhythms, which consist of producing, perceiving, learning, or imitating signals with accuracy in the temporal – as opposed to the spectral – domain. Although this capacity to precisely time one's vocalizations is related to its spectral counterpart, vocal rhythms also have their own mechanistic and communicative value (Wirthlin et al., 2019). I argue that, across species, these five capacities are linked, and I map them onto Savage et al.'s framework.
The core of Savage et al.'s idea of melodic and rhythmic musicality features vocal learning and beat induction. These are also at the core of an influential hypothesis in evolutionary neuroscience (Patel, 2006), which in some cases predicts their joint co-occurrence across species. However, a few outlier species point to a mismatch between the current data and the hypothesis's predictions (Cook, Rouse, Wilson, & Reichmuth, 2013), requiring an updated theoretical framework.
Within Savage et al.'s framework, I argue that rhythm and melody may have gradually bootstrapped each other in humans and other species, especially in social interactions such as chorusing, turn-taking, and so forth (Christophe, Millotte, Bernal, & Lidz, 2008; Hannon & Johnson, 2005; Höhle, 2009; Ravignani et al., 2014). An isochronous sequence, such as the repetitive bark of a sea lion, provides a temporal grid of predictable sound events. Both the producer of an isochronous rhythm and its conspecifics can rely on this periodicity to learn and experiment in the spectral, hence melodic, domain during vocal learning: vocal emissions could be anchored to the onsets of the isochronous sequence (Merker et al., 2009). Hence, rhythmic isochrony may function as a temporal grid for rehearsing learnt vocalizations (and possibly for orienting attention; Bolger, Coull, & Schön, 2014; Cason, Astésano, & Schön, 2015; Jones, 2010; Norton, 2019). In turn, learnt, consolidated vocalizations may serve as a "spectral anchor" to segment conspecifics' temporal sequences (Hyland Bruno, 2017; Lipkind et al., 2013), also generating vocal rhythms. Therefore, melodic templates acquired via vocal learning can allow increased attentional or cognitive resources to be spent on the rhythmic domain, including temporal segmentation and regularization. This provides a bootstrapping mechanism for Savage et al.'s co-evolutionary dynamics to work, and a testbench for some signaling hypotheses in Mehr and colleagues.
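This verbal hypothesis can be caricatured in a toy simulation. In the hypothetical sketch below, vocal onsets locked to the pulses of an isochronous grid (plus some motor noise) become highly predictable relative to that grid, whereas freely timed onsets do not; phase concentration relative to the grid period serves as the predictability measure. All parameters – grid period, noise level, number of emissions – are invented for illustration.

```python
# Hypothetical toy simulation of the anchoring idea: grid-locked vocal onsets (with motor
# noise) are predictable relative to the isochronous grid; freely timed onsets are not.
import numpy as np

rng = np.random.default_rng(42)
period = 0.5  # grid period in seconds, e.g., the inter-bark interval of a sea lion

# Freely timed vocal emissions: onsets drift with no relation to the grid
free = np.cumsum(rng.uniform(0.2, 0.8, 200))

# Anchored emissions: one vocalization on every second grid pulse, plus small motor noise
anchored = np.arange(0.0, 100.0, 2 * period) + rng.normal(0.0, 0.03, 100)

def phase_concentration(onsets, period):
    """Mean resultant length of onset phases relative to the grid period:
    ~1 = locked to the grid, ~0 = timing unrelated to the grid."""
    phases = 2 * np.pi * (onsets % period) / period
    return float(np.abs(np.exp(1j * phases).mean()))

print(phase_concentration(free, period))      # low: onsets unpredictable from the grid
print(phase_concentration(anchored, period))  # close to 1: onsets inherit the grid's regularity
```

Measuring predictability as phase concentration relative to the grid, rather than as regularity of the vocal onsets themselves, reflects the claim that the isochronous sequence, not the learner's own output, provides the scaffold.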
This hypothesis generates several testable predictions. First, testing species along the vocal learning continuum (Martins & Boeckx, 2020), and extending this continuum to beat induction, should reveal that species with a stronger sense of beat are found among those with more developed vocal learning capacities. Chickens, great apes, parrots, and humans are examples of species predicted to show, in this order, increasing abilities in both domains. Second, isochrony should go hand in hand with synchrony but not with beat induction, so that species with developed isochrony should also synchronize. Third, empirical evidence for the rhythm–melody scaffolding process (Cason & Schön, 2012; Emmendorfer, Correia, Jansma, Kotz, & Bonte, 2020) could be obtained from large-scale developmental datasets, which should feature both humans and nonhuman animals and contain data on as many capacities as possible from Figure 1. As ontogeny sometimes recapitulates phylogeny (e.g., Heldstab, Isler, Schuppli, & van Schaik, 2020), one would test whether the same stepwise processes hypothesized above appear in the first years of human life (Höhle, 2009). Fourth, a partial neural dissociation between rhythm and melody may occur early in life and become less severe over development; the dynamics of this dissociation could be tested via longitudinal neuroimaging studies (Bengtsson & Ullén, 2006; Salami, Wåhlin, Kaboodvand, Lundquist, & Nyberg, 2016). Fifth, within Savage et al.'s framework, physiological evidence for the gradual rhythm–melody interplay could come from measurements or manipulations of the dopaminergic reward system and the endogenous opioid system, testing whether they provide complementary, alternating effects. Finally, most of these putative links can, following Savage et al., be modulated by species-specific social factors, such as group density and social networks. Similarly, their value as honest signals can be tested, providing empirical support for Mehr et al., using, among other approaches, methods from cultural evolution research (e.g., Lumaca et al., commentary on the target article by Mehr et al.; Miton, Wolf, Vesper, Knoblich, & Sperber, 2020).
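As a sketch of how the first prediction could eventually be tested once comparative data exist, the snippet below computes a rank correlation between ordinal scores for vocal learning and beat induction across the four example species. The scores are purely hypothetical placeholders, not data, and a real analysis would additionally have to correct for phylogenetic non-independence.

```python
# A sketch of a statistical test for the first prediction; species scores are hypothetical.
from scipy.stats import spearmanr

species = ["chicken", "great ape", "parrot", "human"]
vocal_learning_score = [1, 2, 3, 4]   # hypothetical ranks along the vocal learning continuum
beat_induction_score = [1, 3, 2, 4]   # hypothetical ranks for beat induction ability

for sp, vl, bi in zip(species, vocal_learning_score, beat_induction_score):
    print(f"{sp:>10}: vocal learning = {vl}, beat induction = {bi}")

rho, p = spearmanr(vocal_learning_score, beat_induction_score)
print(f"Spearman rho = {rho:.2f}, p = {p:.3f}")
# Under the hypothesis, rho should be strongly positive; a real comparative analysis
# would also use phylogenetically informed methods rather than a plain rank correlation.
```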
To conclude, the frameworks proposed in both target articles can benefit from a finer dissection of core abilities for musicality (Fig. 1 and Honing, commentary on the target article by Savage et al.). These must then be tested across species to infer plausible evolutionary scenarios.
Acknowledgments
I am grateful to Henkjan Honing, Koen de Reus, Laura Verga, Massimo Lumaca, and Sonja Kotz for helpful discussion and feedback.
Financial support
Andrea Ravignani is supported by the Max Planck Society via an Independent Research Group Leader position.
Conflict of interest
None.