Analytical Approaches to Video Game Music

Part III - Analytical Approaches to Video Game Music

Introduction

Published online by Cambridge University Press: 15 April 2021

Melanie Fritsch and Tim Summers

Edited by Melanie Fritsch and Tim Summers

Melanie Fritsch: Heinrich-Heine-Universität Düsseldorf
Tim Summers: Royal Holloway, University of London

Type: Chapter
Information: The Cambridge Companion to Video Game Music, pp. 131–262

DOI: https://doi.org/10.1017/9781108670289

Publisher: Cambridge University Press

Print publication year: 2021

One aspect of video game music that is both compelling and challenging is the question of how video game music should be studied. Game music is often sonically similar to classical music, popular musical styles and music in other media like film. Techniques from these other established fields of study can be applied to game music. Yet at the same time, game music exists as part of a medium with its own particular qualities. Indeed, many such aspects of games, including their interactivity, complicate assumptions that are normally made about how we study and analyse music.

Relationship with Film Music

Some scholars have examined game music by making fruitful comparisons with film music, noting the similarities and differences.Footnote ¹ Neil Lerner has shown the similarities between musical techniques and materials of early cinema and those of early video game music like Donkey Kong (1981) and Super Mario Bros. (1985).Footnote ² William Gibbons has further suggested that the transition of silent film to ‘talkies’ provides a useful lens for understanding how some video games like Grandia II (2000) and the Lunar (1992, 1996) games dealt with integrating voice into their soundtracks.Footnote ³ Such parallels recognized, we must be wary of suggesting that films and games can be treated identically, even when they seem most similar: Giles Hooper has considered the varieties of different kinds of cutscenes in games, and the complex role(s) that music plays in such sequences.Footnote ⁴

Film music scholars routinely differentiate between music that is part of the world of the characters (diegetic) and music that the characters cannot hear, such as the typical Hollywood orchestral underscore (non-diegetic). While discussing Resident Evil 4 (2005) and Guitar Hero (2005), Isabella van Elferen has described how games complicate film models of music and diegesis. Van Elferen notes that, even if we assume an avatar-character cannot hear non-diegetic music, the player can hear that music, which influences how their avatar-character acts.Footnote ⁵ These kinds of observations reveal how powerful and influential music is in the video game medium, perhaps even more so than in film. Further, the distinct genre of music games poses a particular challenge to approaches from other fields, and reveals the necessity of a more tailored approach.

Borrowing Art Music Techniques

Just as thematic/motivic analysis has been useful in art music and film music studies, so it can reveal important insights in game music. Jason Brame analysed the recurrence of themes across Legend of Zelda games to show how the series creates connections and a sense of familiarity in the various instalments,Footnote ⁶ while Guillaume Laroche and Andrew Schartmann have conducted extensive motivic and melodic analyses of the music of the Super Mario games.Footnote ⁷ Frank Lehman has applied Neo-Riemannian theory, a method of analysing harmonic movement, to the example of Portal 2 (2011).Footnote ⁸ Recently, topic theory – a way of considering the role of styles and types of musical materials – has been applied to games. Thomas Yee has used topic theory to investigate how games like Xenoblade Chronicles (2010) (amongst others) use religious and rock music for certain types of boss themes,Footnote ⁹ while Sean Atkinson has used a similar approach to reveal the musical approaches to flying sequences in Final Fantasy IV (1991) and The Legend of Zelda: Skyward Sword (2011).Footnote ¹⁰

Peter Shultz has inverted such studies, and instead suggests that some video games themselves are a form of musical analysis.Footnote ¹¹ He illustrates how Guitar Hero represents songs in gameplay, with varying degrees of simplification as the difficulty changes.

The Experience of Interactivity

Musical analysis has traditionally been able to rely on the assumption that each time a particular piece of music is played, the musical events will occur in the same order and last for approximately the same duration. Though analysts have always been aware of issues such as optional repeats, different editions, substituted parts and varying performance practices, art music analysis has often tended to treat a piece of music as highly consistent between performances. The interactive nature of games, however, means that the music accompanying a particular section of a game can sound radically different from one play session to the next. The degree of variation depends on the music programming of the game, but the way that game music is prompted by, and responds to, player action asks us to reconsider how we understand our relationships with the music in an interactive setting.

Elizabeth Medina-Gray has advocated for a modular understanding of video game music. She writes,

modularity provides a fundamental basis for the dynamic music in video games. Real-time soundtracks usually arise from a collection of distinct musical modules stored in a game’s code – each module being anywhere from a fraction of a second to several minutes in length – that become triggered and modified during gameplay. … [M]usical modularity requires, first of all, a collection of modules and a set of rules that dictate how the modules may combine.Footnote ¹²

Medina-Gray’s approach allows us to examine how these musical modules interact with each other. This includes how simultaneously sounding materials fit together, like the background cues and performed music in The Legend of Zelda games.Footnote ¹³ Medina-Gray analyses the musical ‘seams’ between one module of music and another. By comparing the metre, timbre, pitch and volume of one cue and another, as well as the abruptness of the transition between the two, we can assess the ‘smoothness’ of the seams.Footnote ¹⁴ Games deploy smooth and disjunct musical seams to achieve a variety of different effects, just like musical presence and musical silence are meaningful to players (see, for example, William Gibbons’ exploration of the careful use of silence in Shadow of the Colossus (2005)).Footnote ¹⁵
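The idea of comparing two modules across a seam can be illustrated with a small sketch. It is not Medina-Gray's own method: the `Module` fields, the equal weighting of the five checks and the 0.2 volume tolerance are all assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Module:
    """One stored chunk of music; every field is an illustrative assumption."""
    name: str
    metre: str         # e.g. "4/4"
    timbre: str        # coarse instrumentation label
    pitch_centre: int  # tonal centre as a pitch class, 0-11 (0 = C)
    volume: float      # nominal loudness, 0.0-1.0


def seam_smoothness(a: Module, b: Module, crossfade: bool) -> float:
    """Return a toy smoothness score in [0, 1]; higher means a smoother seam."""
    checks = [
        a.metre == b.metre,                # same metre on both sides
        a.timbre == b.timbre,              # same broad instrumentation
        a.pitch_centre == b.pitch_centre,  # same tonal centre
        abs(a.volume - b.volume) <= 0.2,   # similar loudness (assumed tolerance)
        crossfade,                         # gradual transition rather than a hard cut
    ]
    return sum(checks) / len(checks)


# Hypothetical modules, not taken from any actual game.
field = Module("field", "4/4", "strings", 0, 0.6)
battle = Module("battle", "4/4", "brass", 2, 0.9)
```

Under these assumptions, a hard cut from `field` to `battle` shares only its metre and so scores low, whereas a module crossfading into itself scores 1.0. A real analysis would weigh these dimensions by ear rather than by equal parts.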

Another approach to dealing with the indeterminacy of games has come from scholars who adapt techniques originally used to discuss real-world sonic environments. Just as we might analyse the sound world of a village, countryside or city, we can discuss the virtual sonic worlds of games.Footnote ¹⁶ These approaches have two distinct advantages: they account for player agency to move around the world, and they contextualize music within the other non-musical elements of the soundtrack.

The interactive nature of games has made scholars fascinated by the experiences of engaging with music in the context of the video game. William Cheng’s influential volume Sound Play draws on detailed interrogation of the author’s personal experience to illuminate modes of interacting with music in games.Footnote ¹⁷ The book deals with player agency and ethical and aesthetic engagement with music in games. His case studies include morality and music in Fallout 3 (2008), as well as how players exert a huge amount of interpretive effort when they engage with game music, such as in the case of the opera scene in Final Fantasy VI (1994). Small wonder that gamers should so frequently feel passionate about the music of games.

Michiel Kamp has described the experience of listening to video game music in a different way, by outlining what he calls ‘four ways of hearing video game music’. They are:

1. Semiotic, when music is heard as providing the player ‘with information about gameplay states or events’;
2. Ludic, when we pay attention to the music and play ‘to the music or along with the music, such as running to the beat, or following a crescendo up a mountain’;
3. Aesthetic, when ‘we stop whatever we are doing to attend to and reflect on the music, or on a situation accompanied by music’;
4. Background, which refers to when music ‘does not attract our attention, but still affects us somehow’.Footnote ¹⁸

Kamp emphasizes the multidimensional qualities of listening to music in games, and, by implication, the variety of ways of analysing it.

‘Music Games’

In games, we become listeners, creators and performers all at once, so we might ask, what kind of musical experiences do games afford? One of the most useful places to start answering this question is in the context of so-called ‘music games’ that foreground players’ interaction with music in one way or another.

Unsurprisingly, Guitar Hero and other music games have attracted much attention from musicians. It is obvious that playing instrument-performance games like Guitar Hero is not the same as playing the instrument in the traditional way, yet these games still represent important musical experiences. Kiri Miller has conducted extensive ethnographic research into Guitar Hero and Rock Band (2007). As well as revealing that these games appealed to gamers who also played a musical instrument, she reported that players ‘emphasized that their musical experiences with Guitar Hero and Rock Band feel as “real” as the other musical experiences in their lives’.Footnote ¹⁹ With work by Henry Svec, Dominic Arsenault and David Roesner,Footnote ²⁰ the scholarly consensus is that, within the musical constraints and possibilities represented by the games, Guitar Hero and Rock Band emphasize the performative aspects of playing music, especially in rock culture, and open up questions of music making as well as of the liveness of music.Footnote ²¹ As Miller puts it, the games ‘let players put the performance back into recorded music, reanimating it with their physical engagement and adrenaline’.Footnote ²² Most importantly, then, the games provide a new way to listen to, perform and engage with the music through the architecture of the games (not to mention the fan communities that surround them).

Beyond instrument-based games, a number of scholars have devised different methods of categorizing the kinds of musical interactivity afforded players. Martin Pichlmair and Fares Kayali outline common qualities of music games, including the simulation of synaesthesia and interaction with other elements in the game which affect the musical output.Footnote ²³ Anahid Kassabian and Freya Jarman emphasize musical choices and the different kinds of play in music games (from goal-based structures to creative free play).Footnote ²⁴ Melanie Fritsch has discussed different approaches to world-building through the active engagement of players with music in several ways,Footnote ²⁵ as well as the adoption of a musician’s ‘musical persona’Footnote ²⁶ for game design by analysing games, turning on Michael Jackson as a case study.Footnote ²⁷

Opportunities to perform music in games are not limited to those specifically dedicated to music; see, for example, Stephanie Lind’s discussion of The Legend of Zelda: Ocarina of Time (1998).Footnote ²⁸ William Cheng and Mark Sweeney have described how musical communities have formed in multiplayer online games like Lord of the Rings Online (2007) and Star Wars: Galaxies (2003).Footnote ²⁹ In the case of Lord of the Rings Online, players have the opportunity to perform on a musical instrument, which has given rise to bands and even in-game virtual musical festivals like Weatherstock, where players from across the world gather together to perform and listen to each other. These festivals are social and aesthetic experiences. These communities also exist in tandem with online fan culture beyond the boundary of the game (of which more elsewhere in the book; see Chapter 23 by Ryan Thompson).

Play Beyond Music Games

Music is important to many games that are not explicitly ‘music games’. It is indicative of the significance of music in the medium of the video game that the boundaries of the ‘music game’ genre are ambiguous. For instance, Steven Reale provocatively asks whether the crime thriller game L.A. Noire (2011) is a music game. He writes,

Guitar-shaped peripherals are not required for a game’s music to be intractable from its gameplay … the interaction of the game world with its audio invites the possibility that playing the game is playing its music.Footnote ³⁰

If games respond to action with musical material, the levels of games can become like musical scores which are performed by the gamer when they play. We may even begin to hear the game in terms of its music.

Reale’s comments echo a broader theme in game music studies: the role of play and playfulness as an important connection between playing games and playing music. Perhaps the most famous advocate for this perspective is Roger Moseley, who writes that

Like a Mario game, the playing of a Mozart concerto primarily involves interactive digital input: in prompting both linear and looping motions through time and space, it responds to imaginative engagement … [It makes] stringent yet negotiable demands of performers while affording them ample opportunity to display their virtuosity and ingenuity.Footnote ³¹

Moseley argues that ‘Music and the techniques that shape it simultaneously trace and are traced by the materials, technologies and metaphors of play.’Footnote ³² In games, musical play and game play are fused, through the player’s interaction with both. In doing so, game music emphasizes the fun, playful aspects of music in human activity more generally. That, then, is part of the significance of video game music – not only is it important for gaming and its associated contexts, but video games reveal the all-too-often-ignored playful qualities of music.

Of course, there is no single way that video game music should be analysed. Rather, a huge variety of approaches are open to anyone seeking to investigate game music in depth. This section of the Companion presents several different methods and perspectives on understanding game music.

9 Music Games

Michael L. Austin

Within the field of game studies, much ink has been spilt in the quest to define and classify games based on their genre, that is, to determine in which category they belong based on the type of interaction each game affords. Some games lie clearly within an established genre; for example, it is rather difficult to mistake a racing game for a first-person shooter. Other games, however, can fall outside the boundaries of any particular genre, or lie within the perimeters of several genres at once. Such is the case with music games. While some may argue that a game can be considered to be a music game only if its formal elements, such as theme, mechanics or objectives, centre on music, musicians, music making or another music-related activity, in practice the defining characteristics of a music game are much less clear – or rather, are much broader – than with other genres. Many game publishers, players, scholars and critics classify a game as musical simply because it features a particular genre of music in its soundtrack or musicians make appearances as playable characters, despite the fact that little-to-no musical activity is taking place within the game.

In this chapter, I outline a number of types of music games. I also discuss the various physical interfaces and controllers that facilitate musical interaction within games, and briefly highlight a number of cultural issues related to music video games. As a conclusion, I will suggest a model for categorizing music games based on the kind of musical engagement they provide.

Music Game (Sub)Genres

Rhythm Games

Rhythm games, or rhythm action games, are titles in which the core game mechanics require players to match their actions to a given rhythm or musical beat. The ways in which players interact with the game can vary widely, ranging from manually ‘playing’ the rhythm with an instrument-shaped controller, to dancing, to moving an in-game character to the beat of a game’s overworld music. Although proto-rhythm games have existed in some form since the 1970s, such as the handheld game Simon (1978), the genre became very successful in Japan during the late 1990s and early 2000s with titles such as Beatmania (1997), Dance Dance Revolution (1998) and Taiko no Tatsujin (2001). With Guitar Hero (2005), the genre came to the West and had its first worldwide smash hit. Guitar Hero achieved enormous commercial success in the mid-2000s before waning in popularity in the 2010s.Footnote ¹

Peripheral-based Rhythm Games

Perhaps the most well-known of all types of rhythm games, peripheral-based rhythm games are those in which the primary interactive mechanics rely on a physical controller (called peripheral because it is usually an extra, external component used to control the game console). While peripheral controllers can take any number of shapes, those that control rhythm games are often shaped like guitars, drums, microphones, turntables or other musical instruments or equipment.

Quest for Fame (1995) was one of the first music games to utilize a strum-able peripheral controller – a plastic wedge called a ‘VPick’. After plugging its cord into the second PlayStation controller port, players held the VPick like a guitar pick, strumming anything they had available – a tennis racket, their own thigh, and so on – like they would a guitar. These actions would be converted into electrical impulses that registered as soundwaves within the game. Players would progress through the game’s various levels depending on their success in matching their soundwaves with the soundwaves of the in-game band.

Games in the Guitar Hero and Rock Band series are almost certainly the most well-known peripheral-based rhythm games (and arguably the most well-known music video games). Guitar Hero features a guitar-shaped peripheral controller with coloured buttons along its fretboard. As coloured gems that correspond to the buttons on the controller (and ostensibly to notes in the song being performed) scroll down the screen, players must press the matching coloured buttons in time to the music. The original Rock Band (2007) combined several plastic instrument controllers, such as a guitar, drum kit and microphone, to form a complete virtual band. Titles in the Guitar Hero and Rock Band series also allow players to play a wide variety of songs and genres of music through a large number of available expansion packs.
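The core mechanic described here, pressing the matching coloured button in time to the music, can be sketched as a timing-window check. This is a minimal illustration and not the logic of any actual game: the function names, the 0.10-second window and the rule that each press counts for at most one note are assumptions.

```python
def judge_press(note_time, note_colour, press_time, press_colour, window=0.10):
    """True if the right colour was pressed within `window` seconds of the note."""
    return press_colour == note_colour and abs(press_time - note_time) <= window


def count_hits(notes, presses, window=0.10):
    """Count notes matched by a press; each press may satisfy only one note.

    `notes` and `presses` are lists of (time_in_seconds, colour) tuples.
    """
    remaining = list(presses)
    hits = 0
    for note_time, note_colour in notes:
        for press in remaining:
            if judge_press(note_time, note_colour, press[0], press[1], window):
                remaining.remove(press)  # consume the press so it is not reused
                hits += 1
                break
    return hits


# Hypothetical chart and player input.
notes = [(1.0, "green"), (1.5, "red"), (2.0, "yellow")]
presses = [(1.03, "green"), (1.48, "blue"), (2.3, "yellow")]
```

With this input only the first note counts as a hit: the second press is the wrong colour and the third arrives too late. Widening `window` makes the sketch more forgiving, which is one plausible way to think about difficulty settings.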

Non-Peripheral Rhythm Games

PaRappa the Rapper (1996), considered by many to be one of the first popular music games, was a rhythm game that used the PlayStation controller to accomplish the rhythm-matching tasks set forth in the game, rather than a specialized peripheral controller.Footnote ² In-game characters would teach PaRappa, a rapping dog, how to rap, providing him with lyrics and corresponding PlayStation controller buttons. Players were judged on their ability to press the correct controller button in rhythm to the rap. Similar non-peripheral rhythm games include Vib-Ribbon (1999).

Other rhythm games aim for a middle ground between peripheral and non-peripheral rhythm games. These rely on players using physical gestures to respond to the musical rhythm. Games in this category include Wii Music (2008), which uses the Wii’s motion controls so that players’ movements are synchronized with the musical materials. Rhythm Heaven/Rhythm Paradise (2008) was released for the Nintendo DS, and to play the game’s fifty rhythm mini-games, players held the DS like a book, tapping the touch screen with their thumbs or using the stylus to tap, flick or drag objects on the screen to the game’s music.

Dance-based Rhythm Games

Dance-based rhythm games can also be subdivided into two categories. Corporeal dance-based rhythm games require players to use their bodies to dance, either by using dance pads on the floor, or dancing in the presence of sensors. Manual dance games achieve the required beat-matching through button-mashing that is synchronized with the dance moves of on-screen avatars using a traditional controller in the player’s hands. These manual dance games include titles such as Spice World (1998) and Space Channel 5 (1999).

Corporeal dance games were amongst the first rhythm games. Dance Aerobics (1987), called Aerobics Studio in Japan, utilized the Nintendo Entertainment System’s Power Pad controller, a plastic floor-mat controller activated when players stepped on red and blue circles, triggering the sensors imbedded inside. Players matched the motion of the 8-bit instructor, losing points for every misstep or rhythmic mistake. As players advanced through each class, the difficulty of matching the rhythmic movements of the instructor also increased. The ‘Pad Antics Mode’ of Dance Aerobics included a free-form musical mode called ‘Tune Up’ in which players could use the NES Power Pad to compose a melody, with each of the ten spots on the periphery of the Power Pad assigned a diatonic pitch; in ‘Mat Melodies’, players used these same spots to play tunes such as ‘Twinkle, Twinkle Little Star’ with their feet while following notes that appeared on a musical staff at the top of the screen. ‘Ditto’ mode featured a musical memory game similar to Simon.

Games in Konami’s Bemani series include many of the best known in the rhythm games genre. The term ‘Bemani’ originated as a portmanteau in broken English of Konami’s first rhythm game, Beatmania, and it stuck as a nickname for all music games produced by the publisher; the company even adopted the nickname as a brand name, replacing what was previously known as the Games & Music Division (GMD). Dance Dance Revolution, or DDR, quickly became one of the most well-known games in this dance-based rhythm game genre. Usually located in public arcades, these games became spectacles as crowds would gather to watch players dance to match the speed and accuracy required to succeed within the game. DDR players stand on a special 3 × 3 square platform controller which features four buttons, each with arrows pointing forward, back, left or right. As arrows scroll upward from the bottom of the screen and pass over a stationary set of ‘guide arrows’, players must step on the corresponding arrow on the platform in rhythm with the game, and in doing so, they essentially dance to the music of the game.Footnote ³

Although the first dance-based games required players to match dance steps alone, without regard to what they did with the rest of their body, motion-based games required players to involve their entire body. For instance, Just Dance (2009) is a dance-based rhythm game for the motion-controlled Wii console that requires players to match the full-body choreography that accompanies each song on the game’s track list. As of the writing of this chapter in 2020, games were still being published in this incredibly popular series. When Dance Central (2010) was released for Microsoft Kinect, players of dance games were finally rid of the need to hold any controllers or to dance on a mat or pad controller; rather than simply stepping in a certain place at a certain time, or waving a controller in the air to the music, players were actually asked to dance, as the console’s motion-tracking sensors were able to detect the players’ motion and choreography, assessing their ability to match the positions of on-screen dancing avatars.Footnote ⁴

Musical Rail-Shooter Games

Rail-shooter games are those in which a player moves on a fixed path (which could literally be the rail of a train or rollercoaster) and cannot control the path of the in-game character/avatar or vehicle throughout the course of the level. From this path, the player is required to perform specific tasks, usually to shoot enemies while speeding towards the finish line. In musical rail-shooters, a player scores points or proves to be more successful in completing the level if they shoot enemies or execute other tasks in rhythm with music. Amongst the first of this type of game was Frequency (2001), which asked players to slide along an octagonal track, hitting controller buttons in rhythm in order to activate gems that represented small chunks of music. Other musical rail-shooters include Rez (2001), Amplitude (2003) and Audiosurf (2008).

Sampling/Sequencing and Sandbox Games

Some music games provide players with the ability to create music of their own. These games often began as music notation software or sequencing software programmes that were first marketed as toys, rather than serious music-creation software, such as Will Harvey’s Music Construction Set (1984), C.P.U. Bach (1994) and Music (1998). Otocky (1987) was designed as a music-themed side-scrolling shoot-’em-up in which players were able to fire their weapon in eight directions. Shots fired in each direction produced a different note, allowing players a small bit of freedom to compose music depending on the direction in which they shot. Mario Paint (1992) included an in-game tool that allowed players to compose music to accompany the artistic works they created within the game; this tool became so popular, it spurred on an online culture and spin-off software called Mario Paint Composer (Unfungames.com, 1992).Footnote ⁵

Electroplankton (2005) was released for the Nintendo DS console and designed as a sequencing tool/game in which players interacted with ten species of plankton (represented as various shapes) through the use of the DS’s microphone, stylus and touchscreen in the game’s ‘Performance Mode’. The species of a plankton indicated the action it performed in the game and the sound it produced. In ‘Audience Mode’, players could simply listen to music from the game. Despite its popularity, Electroplankton was not a successful music-creating platform due to its lack of a ‘save’ feature, which would have allowed players to keep a record of the music they created and share it.

KORG DS-10 (2008) was a fully fledged KORG synthesizer emulator designed to function like a physical synth in KORG’s MS range, and was created for use on the Nintendo DS platform. Other iterations, such as KORG DS-10 Plus (2009) and KORG iDS-10 (2015), were released for the Nintendo DSi and the iPhone, respectively. While it was released as a synth emulator, KORG DS-10 received positive reviews by game critics because it inspired playful exploration and music creation.Footnote ⁶

Karaoke Music Games

Music games can also test a player’s ability to perform using musical parameters beyond rhythm alone. Games such as Karaoke Revolution (2003) and others in the series, as well as those in the SingStar series (2004–2017) have similar game mechanics to rhythm games, but rather than requiring players to match a steady beat, these games require players to match pitch with their voices.

Def Jam Rapstar (2010), perhaps the most critically acclaimed hip-hop-themed karaoke game, included a microphone controller in which players would rap along to a popular track’s music video (or a graphic visualization when no video was available), matching the words and rhythms of the original song’s lyrics; in instances where singing was required, a player’s ability to match pitch was also graded.

Mnemonic Music Games and Musical Puzzle Games

Many video games in this subgenre of music games have their roots in pre-digital games, and they rely on a player’s ability to memorize a sequence of pitches or to recall information from their personal experience with music. Simon, considered by many to be the first electronic music game, tested a player’s ability to recall a progressively complex sequence of tones and blinking lights by recreating the pattern using coloured buttons. Later games, such as Loom (1990) and The Legend of Zelda: Ocarina of Time (1998) relied, at least in part, on a similar game mechanic in which players needed to recall musical patterns in gameplay.
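The memory mechanic attributed to Simon above can be sketched as a loop that lengthens a pattern by one step per round and checks the player's reproduction of it. The four colour names, the `play` helper and the fixed random seed are assumptions made for illustration; the sketch models only the recall task, not the tones or lights.

```python
import random

COLOURS = ("green", "red", "yellow", "blue")  # assumed button colours


def extend_sequence(sequence, rng):
    """Return a new sequence with one more randomly chosen colour appended."""
    return sequence + [rng.choice(COLOURS)]


def is_correct(sequence, attempt):
    """The player succeeds only by reproducing the whole pattern in order."""
    return list(attempt) == list(sequence)


def play(respond, max_rounds=5, seed=0):
    """Play until a wrong answer or `max_rounds`; return the rounds cleared.

    `respond` is a callable that receives the current sequence and returns
    the player's attempt (here it stands in for real button presses).
    """
    rng = random.Random(seed)
    sequence = []
    for round_number in range(1, max_rounds + 1):
        sequence = extend_sequence(sequence, rng)  # pattern grows each round
        if not is_correct(sequence, respond(sequence)):
            return round_number - 1
    return max_rounds
```

A player with perfect recall, `play(lambda s: list(s))`, clears every round, while one who can only hold two items, `play(lambda s: s[:2])`, fails on the third.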

A variation on the ‘name that tune’ subgenre, Musika (2007) tapped into the library of song files players had loaded onto their iPod Touch. As they listen, players must quickly decide whether or not the letter slowly being uncovered on their screen appears in the title of that particular song. The faster they decide, the higher a player’s potential score. SongPop (2012) affords players the chance to challenge one another to a race in naming the title or performer of popular tunes, albeit asynchronously, from a multiple-choice list.
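The Musika rule described above, deciding whether a letter appears in the song title with faster answers earning more, can be sketched as a single scoring function. The linear time bonus, the ten-second limit and the 100-point maximum are hypothetical; the source states only that quicker decisions raise the potential score.

```python
def musika_round(title, letter, answer_yes, seconds_taken,
                 time_limit=10.0, max_points=100):
    """Score one decision: is `letter` in `title`? Faster correct answers earn more.

    The scoring formula is an assumed linear decay, not the game's actual rule.
    """
    letter_in_title = letter.lower() in title.lower()
    if answer_yes != letter_in_title or seconds_taken >= time_limit:
        return 0  # wrong answer or too slow
    return round(max_points * (1 - seconds_taken / time_limit))
```

For a hypothetical track called "Example Song", answering "yes" to the letter "x" after two seconds would score 80 under these assumptions, and a wrong answer would score nothing however fast it came.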

Musician Video Games

The subgenre of what we might term ‘musician video games’ comprises those in which musicians (usually well-known musicians that perform in popular genres) or music industry insiders become heavily involved in the creation of a video game as a vehicle to promote themselves and their work beyond the video game itself. Games might even serve as the means of distribution for their music. Musicians or bands featured in a game may go on music-themed quests, perform at in-game concerts or do myriad other things that have nothing at all to do with music. A wide variety of musicians are featured in this type of game, which include Journey (1983), Frankie Goes to Hollywood (1985), Michael Jackson’s Moonwalker (1990), Peter Gabriel: EVE (1996) and Devo Presents Adventures of the Smart Patrol (1996).

Music Industry Games

In much the same way many musician video games allowed players to live the life of a rock star through an avatar, music industry-themed games put players in charge of musical empires, gamifying real-world, industry-related issues and tasks, such as budgets, publicity and promotion, and music video production. Games in this genre include Rock Star Ate My Hamster (1988), Make My Video (1992), Power Factory Featuring C+C Music Factory (1992), Virtual VCR: The Colors of Modern Rock (1992), Rock Manager (2001), Recordshop Tycoon (2010) and TastemakerX (2011).

Edutainment Music Games and Musical Gamification

Educators and video game publishers have long been keen on using video games as both a source of entertainment and as a potential tool for teaching various subjects, including music. Early edutainment games, such as Miracle Piano Teaching System (1990), sought to make learning music fun by incorporating video-game-style gameplay on a video game console or personal computer. Such endeavours must deal with the issue of how to incorporate instruments into the learning process. Some opt for MIDI interfaces (like Miracle Piano), while others use replica instruments. Power Gig: Rise of the SixString (2010) included an instrument very similar to an authentic guitar. This peripheral controller was designed to help teach players how to play an actual guitar, although it was two-thirds the size of a standard guitar and of poor quality. Besides teaching players various simplified power chords, the game did not really teach players how to play the guitar; it simply mimicked the mechanics of Guitar Hero and other similar games. Power Gig also included a set of sensors that monitored the action of players who were air-drumming along with songs in the game. The game that has arguably come the closest to teaching game players to become actual instrument players is Ubisoft’s Rocksmith (2011) and its sequels, which allowed players to plug in an actual electric guitar of their own (or acoustic guitar with a pickup), via a USB-to-1/4 inch TRS cable, into their Xbox 360 console to be used as a game controller. The game’s mechanics are similar to those of other guitar-themed rhythm games: players place their fingers in particular locations on a fretboard to match pitch and strum in rhythm when notes are supposed to occur, rather than mashing one of four coloured buttons as in Guitar Hero-style rhythm games.

Similarly, and taking the concept a step further, BandFuse: Rock Legends (2013) allows up to four players to plug in real electric guitars, bass guitars and microphones as a means of interacting with the video game by playing real music. In fact, the ‘legends’ referenced in the game’s title are rock legends who, with the aid of other virtual instructors, teach players how to play their hit songs through interactive video lessons in the game’s ‘Practice’ and ‘Shred-U’ modes.

Music Game Technology

Peripheral Controllers

Music games frequently rely on specialized controllers or other accessories to give players the sense that they are actually making music. As previously discussed, these controllers are called ‘peripheral’ because they are often additional controllers used to provide input for a particular music game, not the standard controllers with which most other games are played. While some peripheral controllers look and react much like real, functioning instruments and give players the haptic sensation of playing an actual instrument, other music games rely on standard console controllers or other forms of player input.

Guitars

Games such as those in the Guitar Hero and Rock Band series rely on now-iconic guitar-shaped controllers to help players simulate the strumming and fretwork of a real guitar during gameplay. While pressing coloured buttons on the fretboard that correspond to notes appearing on the screen, players must flick a strum bar (button) as the notes pass through the cursor or target area. Some guitar controllers also feature whammy bars for pitch bends, additional effect switches and buttons and additional fret buttons or pads. These controllers can often resemble popular models of electric guitars, as with the standard guitar controller for Guitar Hero, which resembles a Gibson SG.

Drums and Other Percussion

Taiko: Drum Master (2004), known as Taiko no Tatsujin in Japan, is an arcade game that features two Japanese Taiko drums mounted to the console, allowing for a two-player mode; the home version employs the Taiko Tapping Controller, or ‘TaTaCon’, which is a small mounted drum with two bachi (i.e., the sticks used to play a taiko drum). Games in the Donkey Konga series (2003–2005) and Donkey Kong Jungle Beat (2004) require a peripheral set of bongo drums, called DK Bongos, to play on the GameCube console for which the games were created. Samba de Amigo (1999) is an arcade game, later developed for the Dreamcast console, played with a pair of peripheral controllers fashioned after maracas. Players shake the maracas at various heights to the beat of the music, positioning their maracas as indicated by coloured dots on the screen.

Turntables

Some hip-hop-themed music games based on turntablism, such as DJ Hero (2009), use turntable peripheral controllers to simulate the actions of a disc jockey. These controllers usually include moveable turntables, crossfaders and additional buttons that control various parameters within the game.

Microphones

Karaoke music games usually require a microphone peripheral in order to play them, and these controllers are notorious for their incompatibility with other games or consoles. Games such as Karaoke Revolution initially included headset microphones, but later editions featured handheld models. LIPS (2008) featured a glowing microphone with light that pulsed to the music.

Mats, Pads and Platforms

These controllers are flat mats or pads, usually placed on the ground, and in most cases, players use their feet to activate various buttons embedded in the mat. Now somewhat standardized, these mats are customarily found as 3 x 3 square grids with directional arrows printed on each square. As was previously mentioned, players use the NES Power Pad to play Dance Aerobics; this controller is a soft pad made of vinyl or plastic that can easily be rolled up and stored. Arcade music games like Dance Dance Revolution use hard pads, often accompanied by a rail behind the player to give them something to grab for stability during especially difficult dance moves and to prevent them from falling over. Some of these games now use solid-state pads that utilize a proximity sensor to detect a player’s movement, rather than relying on the pressure of the player’s step to activate the controller. DropMix (2017) is a music-mixing game which combines the use of a tabletop game plastic platform, cards with embedded microchips and a companion smartphone application to allow players to create mashups of popular songs. Using near-field communication and a smartphone’s Bluetooth capabilities, players lay the cards on particular spots on the game’s platform. Each card is colour-coded to represent a particular musical element, such as a vocal or drum track, and depending upon its power level, it will mix in or overtake the current mix playing from the smartphone. Players can also share their mixes on social media through the app.

Motion Controls

Taiko Drum Master: Drum ‘n’ Fun (2018) was released for Nintendo Switch and relies on the player using the console’s motion controls. Taking a controller in each hand, the player swings the handheld controllers downwards for red notes and diagonally for the blue notes that scroll across the screen. In Fantasia: Music Evolved (2014), based on the ‘Sorcerer’s Apprentice’ section of the film Fantasia (1940), players act as virtual conductors, moving their arms to trace arrows in time with music to accomplish goals within the game. These motions are registered by the Xbox Kinect’s motion sensors, allowing players to move freely without needing to touch a physical controller.

Wii Nunchuks

Using the Wii’s nunchuk controllers, players of Wii Music (2008) could cordlessly conduct an orchestra or play a number of musical instruments. Likewise, players of Ultimate Band (2008) played notes on a virtual guitar by pressing various combinations of buttons on the Wii nunchuk while strumming up and down with the Wii remote.

Smartphone or Portable Listening Device Touchscreens

Some games mimic the mechanics of a peripheral-based rhythm game, but without the need for the instrument controller. Phase: Your Music Is the Game (2007) is a touchscreen-based rhythm game for the Apple iPod Touch; using the music found on the iPod in the player’s song library as the playable soundtrack, players tap the iPod’s touchscreen, rather than colour-coded buttons on a plastic guitar. Likewise, Tap Tap Revenge (2008) also utilized the iPhone’s touchscreen to simulate controller-based rhythm games such as Guitar Hero.

Wider Culture and Music Games

When popular music is included within a music game, copyright and licensing can often be the source of controversy and fodder for lawsuits. In its early days, players of Def Jam Rapstar were able to record videos of themselves performing the hip-hop songs featured in the game using their consoles or computers, and could upload these videos to an online community for recognition. In 2012, EMI and other rights holders brought up charges of copyright infringement, suing the game’s makers for sampling large portions of hip-hop songs for which they owned the rights.Footnote ⁷ Because the game was a karaoke-style music game and players could further distribute the copyrighted songs in question through the online community, EMI sought even more damages, and the online community was subsequently shut down.

At the height of their popularity, rhythm games such as Rock Band and Guitar Hero featured frequently in popular culture; for example, in Season 2, Episode 15 (2009) of the popular television show The Big Bang Theory, characters are shown playing the Red Hot Chili Peppers’ song ‘Under the Bridge’ on Rock Band. As these games gained popularity in their heyday, naysayers and musical purists insisted that these games had no inherent musical value since they did not seem to encourage anyone to actually play an instrument. As such, the games were often parodied in popular culture: the premise of the South Park episode ‘Guitar Queer-O’ (Season 11, Episode 13, 2007) revolves around the supposition that games such as Guitar Hero require no real musical skills. But these kinds of music games have also enjoyed a surge in popularity in educational arenas, and have been successfully put to instructive uses in classrooms, albeit not as a replacement for instrumental tuition.Footnote ⁸ There is some evidence to suggest that music games have actually inspired players to learn to play musical instruments outside of video games. In fact, seeing game characters play instruments in music games, such as the male protagonist Link who plays the ocarina in games in the Legend of Zelda series (Nintendo), has inspired male students to study the flute.Footnote ⁹ Also related to music, gender and performance, Kiri Miller writes in Playable Bodies that music games such as Dance Central provide opportunities for players who choose to play as avatars that do not correspond with their own gender expression to engage in ‘generic, stylized gender performances that may pose little risk or challenge to their own identities’,Footnote ¹⁰ and in doing so, may denaturalize gender binaries.

Types of Music Games

As we have seen, the term ‘music games’ covers a wide spectrum of games and subgenres. To conclude this chapter, I wish to introduce a model for categorization. While this type of analysis will never provide a definitive classification of music games, it does present a framework within which analysts can discuss what types of musical activities or opportunities a game affords its players and what types of musical play might be occurring within the course of a player’s interaction with a game.

One useful way to describe music games is by asking whether, and to what extent, the player’s musical engagement through the game is procedural (interacting with musical materials and procedures) and/or conceptual (explicitly themed around music-making contexts). These two aspects form a pair of axes which allow us to describe the musical experiences of games. This framing also allows us to recognize musical-interactive qualities of games that do not announce themselves as explicitly ‘music games’.

Procedural and Conceptual Musical Aspects of Games

The procedural rhetoric of a music game denotes the rules, mechanics and objectives within a game – or beyond it – that encourage, facilitate or require musical activity or interaction. For example, in the rhythm game Guitar Hero, players must perform a musical activity in order to succeed in the game; in this case, a player must press coloured buttons on the game’s peripheral guitar-shaped controller in time with the music, and with the notes that are scrolling by on the screen. Other games, such as SongPop, require players to select a themed playlist, listen to music and race an opponent to select the name of the song or the artist performing the song they are hearing. It should be noted that the term procedural is used here to describe the elements of a game that facilitate a particular means or process of interaction. This is different from procedural generation, a method of game design in which algorithms are used to automatically create visual and sonic elements of a video game based on player action and interaction within a game.

Amongst procedural games, some are strictly procedural, in that they rely on clear objectives, fixed rules and/or right or wrong answers. One such category of strictly procedural music games is that of rhythm- or pitch-matching games. Players of strictly procedural music games score points based on the accuracy with which they can match pitch or rhythm, synchronize their actions to music within the game, or otherwise comply with the explicit or implicit musical rules of the game. Loosely procedural music games rely less on rigorous adherence to rules or correct answers but rather facilitate improvisation and free-form exploration of music, or bring to bear a player’s personal experience with the game’s featured music. Games such as Mario Paint Composer or My Singing Monsters (2012) function as sandbox-type music-making or music-mixing games that allow players more freedom to create music by combining pre-recorded sonic elements in inventive ways. Other loosely procedural games take the form of quiz, puzzle or memory games that rely on a player’s individual ability to recall musical material or match lyrics or recorded clips of songs to their titles (as is the case with the previously mentioned SongPop); even though the procedural rhetoric of these games requires players to accurately match music with information about it (that is to say, there is a right and a wrong answer), players bring their own memories and affective experiences to the game, and players without these experiences are likely much less successful when playing them.

Highly conceptual music games are those in which theme, genre, narrative, setting and other conceptual elements of the game are related to music or music making. Here, content and the context provide the musical materials for a music game. This contrasts with the ‘procedural’ axis which is concerned with the way the game is controlled. Conceptually musical aspects of games recognize how extra-ludic, real-world music experiences and affective memories create or influence the musical nature of the game. This is often accomplished by featuring a particular genre of music in the soundtrack or including famous musicians as playable characters within the game. We can think of procedural and conceptual musical qualities as the differences between ‘inside-out’ (procedural) or ‘outside-in’ (conceptual) relationships with music.

Games may be predominantly procedurally musical, predominantly conceptually musical or some combination of the two. Many music games employ both logics – they are not only procedurally musical, as they facilitate musical activity, but the games’ music-related themes or narratives render them conceptually musical as well. We can also observe examples of games that are highly conceptually musical, but not procedural (like the artist games named above), and games that are procedural, but not conceptual (like games where players can attend to musical materials to help them win the game, but which are not explicitly themed around music, such as L.A. Noire (2011)).Footnote ¹¹

Types of Conceptual Musical Content

Conceptual music games rely on rhetorical devices similar to those employed in rhetorical language, such as metonyms/synecdoches, which are used to describe strong, closely related conceptual ties, and epithets, for those with looser conceptual connections. A metonym is a figure of speech in which a thing or concept is used to represent another closely related thing or concept. For example, ‘the White House’ and ‘Downing Street’ are often used to represent the entire Executive Branches of the United States and British governments respectively. Similarly, the ‘Ivy League’ is literally a sports conference comprising eight universities in the Northeastern United States, but the term is more often used when discussing the academic endeavours or elite reputations of these universities. There are also musical metonyms, such as noting that someone has ‘an ear for music’ to mean not only that they hear or listen to music well, but that they are also able to understand and/or perform it well. Further, we often use the names of composers or performers to represent a particular style or genre of music as a synecdoche (a type of metonym in which a part represents the whole). For instance, Beethoven’s name is often invoked to represent all Classical and Romantic music, both or either of these style periods, all symphonic music, or all ‘art music’ in general. Similarly, Britney Spears or Madonna can stand in for the genre of post-1970s pop music or sometimes even all popular music.

Metonymic music games are often considered musical because a prominent element of the game represents music writ large to the player or the general public, even if the procedural logic or mechanics of the game are not necessarily musical, or only include a small bit of musical interactivity. For example, The Legend of Zelda: Ocarina of Time is, by almost all accounts, an action-adventure game. Link, the game’s main character, traverses the enormous gameworld of Hyrule to prevent Ganondorf from capturing the Triforce, battling various enemies along the way. Link plays an ocarina in the game, and, as one might be able to conclude based on the game’s title, the ocarina plays a central role in the game’s plot; therefore, the game could be considered a music game. Amongst the many other varied tasks Link completes, lands he explores, items he collects, and so on, he also learns twelve melodies (and writes one melody) to solve a few music-based puzzles, allowing him to teleport to other locations. For many, the amount of musical material in the game sufficiently substantiates an argument for labelling the game as a music game. Yet most of the game does not involve direct interaction with music; such interaction is limited to isolated (albeit narratively important) moments. We can therefore describe Ocarina of Time as conceptually musical in a metonymic way, but with limited procedural musical content.

Music games that are even further removed from music and music-making than metonymic music games are epithetic music games. An epithet is a rhetorical device in which an adjective or adjectival phrase is used as a byname or nickname of sorts to characterize the person, place or thing being described. This descriptive nickname can be based on real or perceived characteristics, and may disparage or abuse the person being described. In the case of Richard the Lionheart, the epithet ‘the Lionheart’ is used both to distinguish Richard I from Richards II and III, and to serve as an honorific title based on the perceived personality trait of bravery. In The Odyssey, Homer writes about sailing across the ‘wine-dark sea’, whereas James Joyce describes the sea in Ulysses using epithetical descriptions such as ‘the snot-green sea’ and ‘the scrotum-tightening sea’; in these instances, the authors chose to name the sea by focusing closely on only one of its many characteristics (in these cases, colour or temperature), and one could argue that colour or temperature are not even the most prominent or important characteristic of the sea being described.

Epithetic music games are video games classified by scholars, players and fans as music games, despite their obvious lack of musical elements or music making, due to a loose or tangential association with music through their characters, setting, visual elements or other non-aural game assets. These games differ from metonymic games in that music is even further from the centre of thematic focus and gameplay, despite the musical nickname or label associated with them; in other words, the interaction with music within these games is only passive, with no direct musical action from the player required to play the game or interact with the game’s plot.

These epithetic connections are often made with hip-hop games. Hip-hop games are sometimes classified as music games because, aside from the obvious utilization of hip-hop music in the game’s score, many of the non-musical elements of hip-hop culture can be seen, controlled, or acted out within the game, even if music is not performed by the player, per se, through rapping/MC-ing or turntablism. Making music is not the primary (or secondary, or usually even tertiary) object of gameplay. For example, breakdancing is a para-musical activity central to hip-hop culture that can be found in many hip-hop games, but in non-rhythm hip-hop games, a player almost never directs the movements of the dancer. Boom boxes, or ‘ghetto blasters’, are also a marker of hip-hop culture and can be seen carried in a video game scene by non-player characters, or resting on urban apartment building stoops or basketball court sidelines in the backgrounds of many hip-hop-themed video games, but rarely are they controlled by the player. Games such as Def Jam Vendetta (2003), 50 Cent: Bulletproof (2005) and Wu-Tang: Shaolin Style (1999) are sometimes classified as music games due to the overt and substantial depictions of famous hip-hop artists, despite the lack of musical objectives or themes within these fighting and adventure games. Here, particular rappers are used as icons for hip-hop music and musical culture. For example, Def Jam Vendetta is a fighting game wherein hip-hop artists such as DMX, Ghostface Killah and Ludacris face off in professional wrestling matches. Similarly, the NBA Street series features hip-hop artists such as Nelly, the St. Lunatics, the Beastie Boys and others as playable characters, but since the artists are seen playing basketball, rather than engaging in musical activities, they are epithetic and loosely conceptual music games.

While it might be tempting to classify games as either procedurally or conceptually musical (and often games do tend to emphasize one or the other), this does not allow for the complexity of the situation. Some games are especially musical both procedurally and conceptually, as is the case with games such as those in the Rock Band and Guitar Hero series which require players to perform musical activities (rhythm matching/performing) in a conceptually musical gameworld (as in a rock concert setting, playing with rock star avatars, etc.). It is perhaps more helpful to consider the two elements as different aspects of a game, rather than as mutually exclusive categories. The procedural–conceptual axes (Figure 9.1) can be used to analyse and compare various music games, plotting titles according to which traits are more prominent in each game. We can also note that the more a game includes one or both of these features, the more likely it is to be considered a ‘music game’ in popular discourse.

Figure 9.1 Graphical representation of procedural–conceptual axes of music games

Using this model, Rocksmith would be plotted in the upper right-hand corner of the graph since the game is both especially procedurally musical (a rhythm-matching game that uses a real electric guitar as a controller) and conceptually musical (the plot of the game revolves around the player’s in-game career as a musician). Rayman Origins (2011) would be plotted closer to the middle of the graph since it is somewhat, but not especially, procedurally or conceptually musical; on the one hand, players who synchronize their movements to the overworld music tend to do better, and the game’s plot does have some musical elements (such as a magical microphone and dancing non-playable characters); on the other hand, this game is a platform action game in which the object is to reach the end of each level, avoid enemies and pick up helpful items along the way, not necessarily to make music per se. Games in the Call of Duty or Mortal Kombat franchises, for example, would be plotted in the bottom-left corner of the graph because neither the games’ mechanics nor the plot, setting, themes and so on, are musical in nature; thus, it is much less likely that these games would be considered by anyone to be ‘music games’ compared to those plotted nearer the opposite corner of the graph.

Conclusions

Even if music video games never regain the blockbusting status they enjoyed in the mid-to-late 2000s, it is hard to imagine that game creators and publishers will ever discontinue production of games with musical mechanics or themes. While the categorical classification of the music game and its myriad forms and (sub)genres remains a perennial issue in the study of music video games, there exists great potential for research to emerge in the field of ludomusicology that dives deeper into the various types of play afforded by music games and the impact such play can have on the music industry, the academy and both Eastern and Western cultures more broadly.

10 Autoethnography, Phenomenology and Hermeneutics

Michiel Kamp

When studying a video game’s musical soundtrack, how do we account for the experience of hearing the music while playing the game? Let us pretend for a moment that a recording of Bastion (2011) is not from a game at all, but a clip from perhaps a cartoon series or an animated film.Footnote ¹ We would immediately be struck by the peculiar camera angle. At first, when ‘The Kid’ is lying in bed, what we see could be an establishing shot of some sort (see Figure 10.1). The high-angle long shot captures the isolated mote of land that The Kid finds himself on through a contrast in focus between the bright and colourful ruins and the blurry ground far beneath him. As soon as he gets up and starts running, however, the camera starts tracking him, maintaining the isometric angle (from 0’02” in the clip). While tracking shots of characters are not uncommon in cinema, this particular angle is unusual, as is the rigidity with which the camera follows The Kid. Whereas the rigidity is reminiscent of the iconic tricycle shots from The Shining (1980), the angle is more similar to crane shots in Westerns like High Noon (1952). It would seem easy to argue that the high angle and the camera distance render The Kid diminutive and vulnerable, but David Bordwell and Kristin Thompson warn against interpreting such aspects of cinematography in absolute terms.Footnote ² The major difference between traditional action sequences in film and the clip from Bastion is that the latter is essentially one long take, whereas typical (Hollywood) action sequences are composed of fast cuts between shots of varying camera distances and angles. The camerawork of this short sequence, then, creates a certain tension that remains unresolved, because there is no ‘cut’ to a next shot.

Figure 10.1 Bastion’s ‘opening shot’

From this film-analysis standpoint, the music in the clip is less problematic. The drone with which the scene opens creates the same air of expectancy as the camera angle, but it almost immediately fulfils that expectation, as The Kid rises from his bed, a moment accentuated by the narrator’s comment, ‘He gets up’. The Kid’s frantic pace through the assembling landscape is echoed in the snare drum and bass that form a shaky, staccato ground under the masculine reassurance of the narrator’s gravelly voice (whose southern drawl sounds like a seasoned cowboy, perhaps a Sam Elliott character). As The Kid picks up his weapon, a giant hammer (which the narrator calls ‘his lifelong friend’), a woodwind melody starts playing (0’19” in the clip). This common progression in the musical structure – a single drone leading into rhythmic accompaniment, in turn leading into melody – follows the progression in the narrative: The Kid gets his belongings and his journey is underway. But why does the camera not follow suit by cutting to the next ‘shot’?

Consider now a clip I made when playing the same sequence in Bastion on another occasion.Footnote ³ We see The Kid lying in bed, a single drone accompanying his resting state. The drone creates an air of tension, in which we discern the distant sounds of gushing wind. These sounds draw our attention to the rising embers around The Kid’s island high in the sky. Before we can start asking questions about the tension the drone creates – does it signify the emptiness of the sky, or the aftermath of the destruction that is implied in the ruined bedroom in the centre of the shot? – The Kid gets up, and a fast rhythm starts playing. As The Kid cautiously, haltingly, makes his way down a strip of land that starts forming around him in the sky (0’04” in the clip) – the camera following him step by step from afar – the frantic music suggests the confusion of the situation. The music heard in the first clip as underscoring the thrill and excitement of adventure now seems full of the anxiety and halting uncertainty of the protagonist. Yet the narrator suggests that ‘he don’t stop [sic] to wonder why’. So why does The Kid keep stopping?

Hearing Video Game Music

The questions that my analyses of Bastion prompted come from a misunderstanding of the medium of video games. The Kid stopped in the second clip because as my avatar, he was responding to my hesitant movements, probing the game for the source and logic of the appearing ground. I was trying to figure out if the ground would disappear if ‘I’ went back, and if it would stop appearing if ‘I’ stopped moving (‘I’ referring here to my avatar, an extension of myself in the gameworld, but when discussing the experience of playing a game, the two often become blurred). The camera does not cut to close-ups or other angles in the first clip because of the logic of the genre: this is an action-based role-playing game (RPG) that provides the player with an isometric, top-down view of the action, so that they have the best overview in combat situations, when enemies are appearing from all sides. So in order to accurately interpret the meaning of the camera angle, or of the actions of the characters, we need to have an understanding of what it is like to play a video game, rather than analyse it in terms of its audiovisual presentation. But what does this mean for our interpretation of the music?

In my example, I gave two slightly different accounts of the music in Bastion. In both accounts, it was subservient to the narrative and to the images, ‘underscoring’ the narrative arc of the first clip, and the confused movements of The Kid in the second. But do I actually hear these relationships when I am playing the game? What is it like to hear music when I am performing actions in a goal-oriented, rule-bound medium? Do I hear the musical ‘underscoring’ in Bastion as a suggestion of what actions to perform? I could, for instance, take the continuous drone at the beginning of the clip – a piece of dynamic music that triggers a musical transition when the player moves their avatar – as a sign to get up and do something. Or do I reflect on the music’s relationship to my actions and to the game’s visuals and narrative? While running along the pathway that the appearing ground makes, I could ask myself what the woodwind melody means in relation to the narrator’s voice. Or, again, do I decide to play along to the music? As soon as I get up, I can hear the music as an invitation to run along with the frantic pace of the cue, taking me wherever I need to be to progress in the game. Or, finally, do I pay attention to the music at all? It could, after all, be no more than ‘elevator music’, the woodwinds having nothing particularly worthwhile to add to the narrator’s words and the path leading me to where I need to be.

The questions I asked in the previous paragraph are all related to the broader question of ‘what is it like to hear video game music while playing a game?’ The three approaches that make up the title of this chapter – autoethnography, phenomenology and hermeneutics – revolve around this question. Each of the approaches can facilitate an account based in first-hand experience, of ‘what it is like’ to hear music while playing a video game, but each has its own methods and aims. With each also comes a different kind of knowledge, following Wilhelm Windelband’s classic distinction between the nomothetic and idiographic.Footnote ⁴ Whereas the nomothetic aims to generalize from individual cases – or experiences in this case – to say something about any kind of experience, the idiographic is interested in the particularities of a case, what makes it unique. As we shall see, whereas hermeneutics tends towards the idiographic and phenomenology towards the nomothetic, autoethnography sits somewhere in-between the two. The three approaches can be categorized in another manner as well. To loosely paraphrase a central tenet of phenomenology, every experience consists of an experiencer, an experiential act and an experienced object.Footnote ⁵ Autoethnography focuses on the unique view of the experiencer, phenomenology on the essence of the act and hermeneutics on the idiosyncrasies of the object. These are very broad distinctions that come with a lot of caveats, but the chapter’s aim is to show what each of these approaches can tell us about video game music. Since this chapter is about methodology, I will also compare a number of instances of existing scholarship on video game music in addition to returning to the example of Bastion. As it is currently the most common of the three approaches in the field, I will start with a discussion of hermeneutics. Autoethnography has not, until now, been employed as explicitly in video game music studies as in other disciplines, but there are clear examples of authors employing autoethnographic methods. Finally, phenomenology has been explored the least in the field, which is why I will end this chapter with a discussion of the potential of this approach as a complement to the other two.

Hermeneutics

Of the three approaches, hermeneutics is the one with the longest history within the humanities and the most applications in the fields of music, video games and video game music. As an approach, it is virtually synonymous with the idiographic, with its interest in the singular, idiosyncratic and unique. Hermeneutics is, simply put, both the practice and the study of interpretation. Interpretation is different from explanation, or even analysis, in that it aims to change or enhance the subject’s understanding of the object. We can describe three forms of interpretation: functional, textual and artistic. First, interpretation, in the broadest sense of the word, is functional and ubiquitous. Drivers interpret traffic signs on a busy junction, orchestra musicians interpret the gestures of a conductor and video game players interpret gameplay mechanics to determine the best course of action. Second, in the more specialist, academic sense of the word, interpretation is first and foremost the interpretation of texts, and it derives from a long history going back to biblical exegesis.Footnote ⁶ This practice of textual interpretation involves a certain submission to the authority of a text or its author, and we can include the historian’s interpretation of primary sources and the lawyer’s interpretation of legal texts in this form as well. In contrast, there is a third mode of interpretation, a more creative, artistic form of interpreting artworks. The New Criticism in literary studies is an important part of this tradition in the twentieth century,Footnote ⁷ but the practice of ekphrasis – interpreting visual artworks through poetry or literary texts – is often seen to go much further back, to classical antiquity.Footnote ⁸ A hermeneutics of video game music involves navigating the differences between these three forms of interpretation.

Of particular importance in the case of video games is the difference between functional and artistic interpretation. Players interpret video game music for a number of practical or functional purposes.Footnote ⁹ The most often discussed example is Zach Whalen’s idea of ‘danger state music’, or what Isabella van Elferen more broadly calls ‘ludic music’: music acts like a signpost, warning the player of the presence of enemies or other important events.Footnote ¹⁰ But the drone in Bastion warrants functional interpretation as well: as a player, I can understand its looping and uneventful qualities as signifying a temporary, waiting state, for me to break out of by pressing buttons. Artistic interpretation can certainly begin from such functional interpretation (I will return to this later), but it moves beyond that. It is not content with actions – pressing buttons – as a resolution of a hermeneutic issue but wants to understand the meaning of such musical material, in such a situation, in such a video game, in such a historical context and so on. This process of alternating focus on the musically specific and the contexts in which it sits is what is usually referred to as the hermeneutic circle, which is alternatively described as a going back and forth between parts and whole, between text and context, or between textual authority and the interpreter’s prejudices.Footnote ¹¹

One of the most explicit proponents of artistic interpretation in musicology is Lawrence Kramer. First of all, what I referred to as a ‘hermeneutic issue’ is what Kramer calls a ‘hermeneutic window’: the notion that something in a piece stands out to the listener, that something is ‘off’ that requires shifting one’s existential position or perspective in regard to it. In other words, something deviates from generic conventions.Footnote ¹² Through this window, we step into a hermeneutic circle of interpretation, which navigates between two poles: ‘ekphrastic fear’ and ‘ekphrastic hope’.Footnote ¹³ In every artistic interpretation, there is the fear that one’s verbal paraphrase of a piece overtakes or supplants it. The interpretation then becomes more of a translation, missing out on the idiosyncrasies of the original and driving the interpreter and their readers further away from the piece, rather than towards an understanding of it. Ekphrastic fear means that one’s prejudices fully overtake the authority of the work: it no longer speaks to us, but we speak for it. Ekphrastic hope, on the other hand, is the hope that a paraphrase triggers a spark of understanding, of seeing something new in the artwork, of letting it speak to us.

Game music hermeneutics involves another kind of fear that is often held by researchers and that is at the heart of this chapter: are we still interpreting from the perspective of a player of the game, or are we viewing the soundtrack as an outsider? This fear is best articulated by Whalen when he suggests that

[o]ne could imagine a player ‘performing’ by playing the game while an ‘audience’ listens in on headphones. By considering the musical content of a game as a kind of output, the critic has pre-empted analysis of the game itself. In other words, taking literally the implications of applying narrative structure to video-game music, one closes off the gameness of the game by making an arbitrary determination of its expressive content.Footnote ¹⁴

In a sense, Whalen’s ‘audience’ perspective is exactly what I took on in my introduction of Bastion. This kind of ‘phenomenological fear’ can be better understood and related to ekphrastic fear through an interesting commonality in the histories of video game and music hermeneutics. Both disciplines feature an infamous interpretation of a canonical work that is often used as an example of the dangers of hermeneutics by detractors. In the case of video games, it is Janet Murray describing Tetris as ‘a perfect enactment of the overtasked lives of Americans in the 1990s – of the constant bombardment of tasks that demand our attention and that we must somehow fit into our overcrowded schedules and clear off our desks in order to make room for the next onslaught’.Footnote ¹⁵ In the case of music, this is Susan McClary’s conception of a particular moment in Beethoven’s Ninth Symphony as the ‘unparalleled fusion of murderous rage and yet a kind of pleasure in its fulfilment of formal demands’.Footnote ¹⁶ These interpretations were made an example of by those unsympathetic to hermeneutic interpretation.Footnote ¹⁷ Murray’s work was used by ludologists to defend the player experience of video games from narratological encroachment.Footnote ¹⁸ McClary’s work was used by formalist musicologists (amongst others) to defend the listener’s experience from the interpretations of New Musicology – a movement also influenced by literary studies. We might say that there is a certain formalism at play, then, in both the idea of phenomenological fear and ekphrastic fear: are we not going too far in our interpretations; and do both the experiencing of gameplay ‘itself’ and music ‘itself’ really involve much imaginative ekphrasis or critical analogizing at all?Footnote ¹⁹

There are two remedies to hermeneutics’ fears of overinterpretation and misrepresentation. First, there are questions surrounding what exactly the player’s perspective pertains to. Is this just the experience of gameplay, or of a broader field of experiences pertaining to gaming? That might involve examining a game’s paratexts (associated materials), the player and critical discourse surrounding a game and ultimately understanding the place of a game in culture. For instance, it is difficult to interpret a game like Fortnite Battle Royale (2017) or Minecraft (2011) without taking into account its huge cultural footprint and historical context. A music-related example would be K. J. Donnelly’s case study of the soundtrack to Plants vs. Zombies (2009).Footnote ²⁰ Like Kramer, Donnelly opens with a question that the game raises, a hermeneutic window. In this case, Plants vs. Zombies is a game with a ‘simpler’ non-dynamic soundtrack in an era in which game soundtracks are usually praised for and judged by their dynamicity, yet the soundtrack has received positive critical and popular reception. The question of why is not solely born out of the player’s experience of music in relation to gameplay. Rather, it contraposes a deviation from compositional norms of the period. Donnelly then proceeds to interpret the soundtrack’s non-dynamicity, not as lacking, but as an integral part of the game’s meaning: a kind of indifference that ‘seems particularly fitting to the relentless forward movement of zombies in Plants vs. Zombies’.Footnote ²¹ In other words, the soundtrack’s indifference to the gameplay becomes an important part of the game’s meaning, which Donnelly then places in the context of a long history of arcade game soundtracks. By framing the interpretation through historical contextualization, Donnelly lends an authority to his account that a mere analogical insight (‘musical indifference matches the indifference of the zombies in the game’) might not have had: experiencing Plants vs. Zombies in this manner sheds new light on a tradition of arcade game playing.

The second remedy involves keeping the player’s experience in mind in one’s interpretations. Context is essential to the understanding of every phenomenon, and it is difficult to ascertain where gameplay ends and context begins.Footnote ²² Even something as phenomenally simple as hearing the opening drone in Bastion as ‘expectant’ is based on a long tradition of musical conventions in other media, from Also Sprach Zarathustra in the opening scene of 2001: A Space Odyssey (1968) to the beginning of Wagner’s opera Das Rheingold. However, it is important to note that an explicit awareness of this tradition is not at all necessary for a player’s understanding of the drone in that manner, for their functional interpretation of it. In fact, the player’s functional interpretation relies on the conventionality of the expectant opening drone: without it, it would have formed a hermeneutic window that drove the player away from playing, and towards more artistic or textual forms of interpretation. These examples suggest that while the kind of functional interpreting that the experience of playing a game involves and artistic interpretation are to some extent complementary – the hermeneutic windows of artistic interpretation can certainly be rooted in musical experiences during gameplay – they can be antithetical as well. If the player’s experience is often based in their understanding of game-musical conventions, it is only when a score breaks significantly with these conventions that a hermeneutics of the object comes into play. In other situations, the idiosyncrasies of the player or their experience of playing a game might be more interesting to the researcher, and it is this kind of interpretation that autoethnography and phenomenology allow for. Playing a game and paying special attention to the ways in which one is invited to interpret the game as a player might reveal opportunities for interpretation that steers clear of generalization or mischaracterization.

Autoethnography

If the three approaches discussed in this chapter are about verbalizing musical experience in games, the most obvious but perhaps also the most controversial of the three is autoethnography. It contends that the scholarly explication of experiences can be similar to the way in which we relate many of our daily experiences: by recounting them. This renders the method vulnerable to criticisms of introspection: what value can a personal account have in scholarly discourse? Questions dealing with experience take the form of ‘what is it like to … ?’ When considering a question like this, it is always useful to ask ‘why?’ and ‘who wants to know?’ When I ask you what something is like, I usually do so because I have no (easy) way of finding out for myself. It could be that you have different physiological features (‘what is it like to be 7-foot tall?’; ‘what is it like to have synaesthesia?’), or have a different life history (‘what is it like to have grown up in Japan?’; ‘what is it like to be a veteran from the Iraq war?’). But why would I want to hear your description of what it is like to play a video game and hear the music, when I can find out for myself? What kind of privileged knowledge does video game music autoethnography give access to? This is one of the problems of autoethnography, with which I will deal first.

Carolyn Ellis, one of the pioneers of the method, describes autoethnography as involving ‘systematic sociological introspection’ and ‘emotional recall’, communicated through storytelling.Footnote ²³ The kinds of stories told, then, are as much about the storyteller as they are about the stories’ subjects. Indeed, Deborah Reed-Danahay suggests that the interest in autoethnography in the late 1990s came from a combination of anthropologists being ‘increasingly explicit in their exploration of links between their own autobiographies and their ethnographic practices’, and of ‘“natives” telling their own stories and [having] become ethnographers of their own cultures’.Footnote ²⁴ She characterizes the autoethnographer as a ‘boundary crosser’, and this double role can be found in the case of the game music researcher as well: they are both player and scholar. As Tim Summers argues, ‘[i]n a situation where the analyst is intrinsically linked to the sounded incarnation of the text, it is impossible to differentiate the listener, analyst, and gamer’.Footnote ²⁵

If the video game music analyser is already inextricably connected to their object, what does autoethnography add to analysis? Autoethnography makes explicit this connectedness by focusing the argument on the analyst. My opening description of Bastion was not explicitly autoethnographic, but it could be written as a more personal, autobiographic narrative. Writing as a researcher who is familiar with neoformalist approaches to film analysis, with the discourse on interactivity in video game music and with the game Bastion, I was able to ‘feign’ a perspective in which Bastion is not an interactive game but an animated film. If I were to have written about a less experimental approach to the game, one that was closer to my ‘normal’ mode of engagement with it, I could have remarked on how the transition between cues registers for me, as an experienced gamer who is familiar with the genre conventions of dynamic music systems. In other words, autoethnography would have revealed as much about me as a player as it would have about the soundtrack of the game.

This brings us to the second problem with autoethnography, namely the question of representation. Of course, my position as a gamer-cum-researcher is relatively idiosyncratic, but to what extent are all positions of gamers idiosyncratic? And to what extent is my position relevant at all to those interested in video game music? In other words: who cares what the musical experience of a game music researcher is like? Autoethnography occupies a somewhat ambiguous place in Windelband’s distinction between nomothetic and idiographic knowledge. Most methodologies of autoethnography to a degree argue that the method is not merely idiographic: my account does not just represent my own experience, but to some extent that of a larger group, and from there it can derive some of its value. This is where the autoethnographic method, that of systematic introspection, plays an important role. Consider William Cheng’s account of researching Fallout 3 (2008).Footnote ²⁶ While recording his playthrough, he finds himself pausing his progress through the game in order to sit back and enjoy a virtual sunrise, underscored by a Bach partita playing on the game’s diegetic radio. This prompts him to wonder not just the extent to which the music influenced his actions, but the extent to which the fact that he was being recorded did as well. This reveals both his insider/outsider perspective as a gamer/researcher and the idea that playing along to music is a form of role-playing. While the former revelation is perhaps idiosyncratic, the latter is something relatable to other, or perhaps all, forms of player engagement with musical soundtracks.

One argument why autoethnography lends itself well to the study of video game music is the length and scope of some video games. In particular, classic RPGs like the Final Fantasy series take many dozens of hours to complete. Although a busy researcher might opt for less ‘costly’ approaches, such as analysing a cue, looking at textual, audiovisual or ethnographic sources of the games’ reception or focusing on aspects of production, they would be missing out on the experiential aspects of devoting a not insubstantial part of one’s life to these games. The biographical connotations of RPG soundtracks – when and where players were in their lives when they played through an RPG – are the lifeblood of their reception. Relatively small soundtracks for games of sprawling lengths ensure that melodic and repetitive cues lodge themselves in the brains and memories of players and inspire all manner of reminiscing, from YouTube comments to concert performances. An autoethnographic account based on the researcher’s own reminiscing would then straddle the nomothetic/idiographic and insider/outsider divides that are central to the perspective. Not only does an approach like this recognize both the player’s and researcher’s role in the construction of the musical experience, but it provides access to the essential role that lived experience plays in the historical, musical significance of these games.

Phenomenology

Both hermeneutic and autoethnographic approaches can benefit from a more detailed and systematic account of not just the experiencer or the experienced music, but of the experience itself. Phenomenology has not been employed extensively in the field, so this final section should then be seen as an exploration of what this approach might offer the study of video game music, rather than a survey of existing studies. Whereas autoethnography takes the charge of introspection and wears it proudly, the origins of phenomenology lie in a scholarly context in which it was considered a dirty word. Edmund Husserl, generally considered the father of phenomenology, strenuously distinguished his approach from introspection.Footnote ²⁷ Rather than an attempt at finding empirical aspects of experience by investigating one’s own consciousness, phenomenology involves a reflection on conscious experience in order to find logical preconditions for those experiences. In other words, phenomenology deals not in empirical facts, but theoretical essences. It therefore aims to be closer in nature to logic and mathematics than to psychology and anthropology. It is unabashedly nomothetic, even if the experiential ‘data’ from which it starts are idiosyncratic to the experiencer. Husserl’s intent was to follow through the line of philosophical thought that started with Descartes and continued through Kant, of finding absolute truths in non-empirical knowledge: if the existence of the world beyond its appearance to me is in doubt, then all I can do is study appearances or phenomena. This project neatly lines up with the problem of interpretation: if the meaning of a video game (score) as intended by its creators is in doubt, then I have to focus on my experience thereof. Where a phenomenological approach differs from hermeneutics is that it is ultimately not interested in the object of experience, but in the (player) experience itself – that is, what one might call ‘hearing’ or ‘listening’ to game music.

In order to study phenomena, one needs to suspend one’s ‘natural attitude’, in which one assumes the existence of the world beyond our experiences of it. This is what is called the phenomenological epoché, transcendental reduction or simply ‘bracketing’.Footnote ²⁸ Our commonsensical, ‘natural’ ways of being in the world are so taken for granted that they ‘pass by unnoticed’, and so ‘we must abstain from them for a moment in order to awaken them and make them appear’ in a way as to understand them better.Footnote ²⁹ In this mode, this epoché, we can begin to distinguish certain phenomena. For instance, in Husserlian terminology, all phenomena that we experience as existing outside our immediate consciousness (e.g., things that we perceive with our senses) are ‘transcendent’ phenomena; those phenomena that only exist in experience, such as imagined or remembered things, are ‘immanent’ phenomena. This leads to the insight that hearing a melody – a case of temporal perception – involves both transcendent objects, like a note C that I hear right now, and immanent objects, like a note D I remember hearing just a moment ago, and which informs my understanding of note C as being part of a descending motif.

After adopting the attitude of the epoché, the phenomenological method involves intuiting essences through imaginative variation, also known as the ‘eidetic reduction’.Footnote ³⁰ By imagining variations on a phenomenon, and considering at what point those variations would cease to be instances of that phenomenon, we can identify its essential characteristics. Consider, for instance, the opening drone in the Bastion sequence, which I suggested carried with it an air of expectancy in my experience as a player. As long, held notes, drones in general might be seen to have a static quality about them; after all, they are melodically directionless, and often harmonically as well. The attribute of expectancy in this particular experience of a drone is constantly at odds with this static quality: it seems to make the music want to go somewhere else. By imagining the Bastion opening drone as sounding or appearing different, it is possible to work out the way in which this ‘expectantness’ is an essential quality of this particular experienced drone. For instance, I can imagine the drone being higher or lower in volume or pitch, but it would still carry with it this same attribute in the context of my experience in Bastion. Only when I imagine hearing certain very specific other musical events with very specific musical and cultural contexts against the drone – for example, the Celtic folk melodies that are often accompanied throughout by bagpipe drones, or the drone-like early polyphony of Pérotin – is this attribute lost. This suggests that in this experience, qualities like ‘static’ and ‘expectant’ have more to do with context than with musical parameters of the Bastion drone itself. Taken as an essential quality of my experience of the drone, ‘expectancy’ reveals this context, and further imaginative variation might reveal more about its nature: the way in which audiovisual impressions or game-generic expectations are involved as preconditions for the experience, for instance.

Based on subjective experience, the phenomenological approach is ultimately theoretical rather than empirical. I can never say anything in general about actual experiences of the opening drone in Bastion – those had by other players – based on an examination of my own experiences, but I can say something about possible experiences of the opening drone. This means that as an approach, phenomenology lends itself best to experiences widely shared, but not thoroughly understood. This is why it has mostly been employed in investigations into some of the most basic and universal concepts: perception, art, technology and even existence and being itself.Footnote ³¹ Husserl did discuss music, but only as a means of elucidating our consciousness of time.Footnote ³² Throughout the twentieth century, there have been more applied, sporadic attempts at investigating music in a phenomenological manner.Footnote ³³ Scholars such as Alfred Schutz and Thomas Clifton have offered insights on music’s relationship to time, from the experience of a musical work as an ideal, Platonic object, to the way a musical performance allows us to enter a ‘flux’ of inner time instead of outer ‘clock’ time.Footnote ³⁴ All of these studies, however, are concerned with music as the exclusive object of attention, whether it be in the concert hall or at home on the listener’s couch as they listen to a recording.Footnote ³⁵ Video games offer varied modes of engagement with music, whether they be more attentive (such as in Guitar Hero, 2005) or inattentive (such as in Bastion).Footnote ³⁶ While earlier phenomenologies of music therefore are not necessarily directly applicable to video games, they do offer useful starting points for interrogating and refining existing theories of game music.

For example, Elizabeth Medina-Gray, in her analysis of modularity in game composition, makes a distinction between musical and non-musical choices.Footnote ³⁷ For instance, pressing a button in Guitar Hero to play a note or phrase is a musical choice, based on rhythmic timing; pressing a button in Bastion to ‘get up’ is a non-musical choice, based on our desire to get our avatar moving. In both instances, the music responds to our actions, but the qualitative difference between the ways in which we hear that musical response can be described phenomenologically. A cursory glance would suggest that in the case of Guitar Hero, we are firmly in the ‘inner time’ of a song, whereas in the case of Bastion, this temporal experience is at the very least a function of musical and non-musical expectations. However, looking closer at the music in Bastion, what exactly is the inner time of the musical drone with which the soundtrack opens? Jonathan Kramer might suggest that this is a form of ‘vertical music’ that has no clear directionality to it,Footnote ³⁸ but then a drone can be expectant as well, depending on its context (cf. the opening to Also Sprach Zarathustra). While it is undoubtedly the case that the expectancy created by the drone is a soundtrack convention – in part a non-musical expectation – it is also a musical convention going back before Strauss’ symphonic poem to, for instance, bagpipe playing. Moreover, to suggest that expectancy is an inessential attribute of Bastion’s opening drone is a misconstrual of my experience, of the phenomenon in question. Is ‘getting up’ in Bastion then a completely non-musical choice, if soundtrack conventions are so closely intertwined with video game and audiovisual narrative conventions?

To some extent, this kind of applied phenomenology resembles music theory in nature. It too attempts to abstract from empirical data – experiences as opposed to pieces of music – to find theoretical rules and patterns.Footnote ³⁹ But as in music theory, these rules and patterns are historically and culturally contingent. And here lies the main challenge with phenomenologies of cultural phenomena such as video game music: it is hard to deduce when they stray from the universal and become ‘too applied’, because they are ultimately and inescapably rooted in subjective experience. As a critical complement to approaches such as hermeneutics and autoethnography, however, they can be an invaluable resource that helps us to unpack the specifics of what it is like to play a game and experience its music.

* * *

The three methods outlined in this chapter can be related to each other in a circular manner. Although an autoethnographic account of a game soundtrack can open up phenomenological questions, and these can be interpreted in a cultural-historical context, it is often a hermeneutic question or window that functions as the starting point of autoethnography: what is idiosyncratic about this particular experience, by this player, of this game? Although the player experience has been the central point of concern for this chapter, it is by no means the exclusive object of investigation for these methods. David Bessell, for instance, autoethnographically approaches the creative process involved in designing the soundtrack to the unreleased horror game Deal With the Devil.Footnote ⁴⁰ Moreover, in recent years, the lines between creation and consumption, between artists and audiences, have been blurred. Video game music is very much a part of participatory culture, as evidenced in the thousands of arrangements, covers and appropriations of popular soundtracks like that of Super Mario Bros. on platforms such as YouTube.Footnote ⁴¹ Not only should an account of the player experience involve this complicated web of material beyond the game, but this material itself could be construed as a modern form of ekphrasis. Academic approaches to the understanding of player experience might then be considered as just another strand of this wider web of interpretative practices.

11 Interacting with Soundscapes: Music, Sound Effects and Dialogue in Video Games

Elizabeth Medina-Gray

Video games often incorporate a wide variety of sounds while presenting interactive virtual environments to players. Music is one critical element of video game audio – to which the bulk of the current volume attests – but it is not alone: sound effects (e.g., ambient sounds, interface sounds and sounds tied to gameworld actions) and dialogue (speaking voices) are also common and important elements of interactive video game soundscapes. Often, music, sound effects and dialogue together accompany and impact a player’s experience playing a video game, and as Karen Collins points out, ‘theorizing about a single auditory aspect without including the others would be to miss out on an important element of this experience, particularly since there is often considerable overlap between them’.Footnote ¹ A broad approach that considers all components of video game soundscapes – and that acknowledges relationships and potentially blurred boundaries between these components – opens the way for a greater understanding not only of video game music but also of game audio more broadly and its effects for players.

This chapter focuses on music, sound effects and dialogue as three main categories of video game audio. In some ways, and in certain contexts, these three categories can be considered as clearly distinct and discrete. Indeed this chapter begins from the general premise – inherited from film audio – that enough of video game audio can be readily described as either ‘music’, ‘sound effects’ or ‘dialogue’ to warrant the existence of these separate categories. Such distinctions are reflected in various practical aspects of video game design: some games – many first-person shooter games, for instance – provide separate volume controls for music and sound effects,Footnote ² and the Game Audio Network Guild’s annual awards for video game audio achievement include categories for ‘Music of the Year’ and ‘Best Dialogue’ (amongst others).Footnote ³ While music, sound effects and dialogue are generally recognizable categories of video game audio, the interactive, innovative and technologically grounded aspects of video games often lead to situations – throughout the history of video game audio – where the distinctions between these categories are not actually very clear: a particular sound may seem to fit into multiple categories, or sounds seemingly from different categories may interact in surprising ways. Such blurred boundaries between sonic elements raise intriguing questions about the nature of game audio, and they afford interpretive and affective layers beyond those available for more readily definable audio elements. These issues are not wholly unique to video games – John Richardson and Claudia Gorbman point to increasingly blurred boundaries and interrelationships across soundtrack elements in cinema, for exampleFootnote ⁴ – although the interactive aspect of games allows for certain effects and sonic situations that do not necessarily have parallels in other types of multimedia. In using the term ‘soundscape’ here, moreover, I acknowledge a tradition of soundscape studies, or acoustic ecology, built on the work of R. Murray Schafer and others, which encourages consideration of all perceptible sounds in an environment, including but by no means limited to music. Some authors have found it useful to apply a concept of soundscapes (or acoustic ecologies) specifically to those sounds that video games present as being within a virtual world (i.e., diegetic sounds).Footnote ⁵ I use the term more broadly here – in this chapter, ‘soundscape’ refers to any and all sounds produced by a game; these sounds comprise the audible component of a gameplay environment for a player. (An even broader consideration of gameplay soundscapes would also include sounds produced within a player’s physical environment, but such sounds are outside the scope of the current chapter.)

This chapter first considers music, sound effects and dialogue in turn, and highlights the particular effects and benefits that each of these three audio categories may contribute to players’ experiences with games. During the discussion of sound effects, this chapter further distinguishes between earcons and auditory icons (following definitions from the field of human–computer interaction, which this chapter will reference in some detail) in order to help nuance the qualities and utilities of sounds in this larger audio category. The remaining portion of this chapter examines ways in which music, sound effects and dialogue sometimes interact or blur together, first in games that appeared relatively early in the history of video game technology (when game audio primarily relied on waveforms generated in real time via sound chips, rather than sampled or prerecorded audio), and then in some more recent games. This chapter especially considers the frequently blurred boundaries between music and sound effects, and suggests a framework through which to consider the musicality of non-score sounds in the larger context of the soundscape. Overall, this chapter encourages consideration across and beyond the apparent divides between music, sound effects and dialogue in games.

Individual Contributions of Music, Sound Effects and Dialogue in Video Games

Amongst the wide variety of audio elements in video game soundscapes, I find it useful to define the category of music in terms of content: I here consider music to be a category of sound that features some ongoing organization across time in terms of rhythm and/or pitch. In video games, music typically includes score that accompanies cutscenes (cinematics) or gameplay in particular environments, as well as, occasionally, music produced by characters within a game’s virtual world. A sound might be considered at least semi-musical if it contains isolated components like pitch, or timbre from a musical instrument, but for the purposes of this chapter, I will reserve the term ‘music’ for elements of a soundscape that are organized in rhythm and/or pitch over time. (The time span for such organization can be brief: a two-second-long victory fanfare would be considered music, for example, as long as it contains enough onsets to convey some organization across its two seconds. By contrast, a sound consisting of only two pitched onsets is – at least by itself – semi-musical rather than music.) Music receives thorough treatment throughout this volume, so only a brief summary of its effects is necessary here. Amongst its many and multifaceted functions in video games, music can suggest emotional content, and provide information to players about environments and characters; moreover, if the extensive amount of fan activity (YouTube covers, etc.) and public concerts centred around game music are any indication, music inspires close and imaginative engagement with video games, even beyond the time of immediate gameplay.

While I define music here in terms of content, I define sound effects in terms of their gameplay context: Sound effects are the (usually brief) sounds that are tied to the actions of players and in-game characters and objects; these are the sounds of footsteps, sounds of obtaining items, sounds of selecting elements in a menu screen and so on, as well as ambient sounds (which are apparently produced within the virtual world of the game, but which may not have specific, visible sources in the environment). As functional elements of video game soundscapes, sound effects have significant capability to enrich virtual environments and impact players’ experiences with games. As in film, sound effects in games can perceptually adhere to their apparent visual sources on the screen – an effect which Michel Chion terms synchresis – despite the fact that the visual source is not actually producing the sound (i.e., the visuals and sounds have different technical origins).Footnote ⁶ Beyond a visual connection, sound effects frequently take part in a process that Karen Collins calls kinesonic synchresis, where sounds fuse with corresponding actions or events.Footnote ⁷ For example, if a game produces a sound when a player obtains a particular in-game item, the sound becomes connected to that action, at least as much as (if not more so than) the same sound adheres to corresponding visual data (e.g., a visible object disappearing). Sound effects can adhere to actions initiated by a computer as well, as for example, in the case of a sound that plays when an in-game object appears or disappears without a player’s input. Through repetition of sound together with action, kinesonically synchretic sounds can become familiar and predictable, and can reinforce the fact that a particular action has taken place (or, if a sound is absent when it is anticipated, that an expected action hasn’t occurred). 
In this way, sound effects often serve as critical components of interactive gameplay: they provide an immediate channel of communication between computer and player, and players can use the feedback from such sounds to more easily and efficiently play a game.Footnote ⁸ Even when sound effects are neither connected to an apparent visible source nor obviously tied to a corresponding action (e.g., the first time a player hears a particular sound), they may serve important roles in gameplay: acousmatic sounds (sounds without visible sources), for instance, might help players to visualize the gameworld outside of what is immediately visible on the screen, by drawing from earlier experiences both playing the game and interacting with the real world.Footnote ⁹

Within the context-defined category of sound effects, the contents of these sounds vary widely, from synthesized to prerecorded (similar to Foley in film sound), for example, and from realistic to abstract. To further consider the contributions of sound effects in video games, it can be helpful to delineate this larger category with respect to content. For this purpose, I here borrow concepts of auditory icons and earcons – types of non-speech audio feedback – from the field of Human–Computer Interaction (HCI, a field concerned with studying and designing interfaces between computers and their human users). In the HCI literature, auditory icons are ‘natural, everyday sounds that can be used to represent actions and objects within an interface’, while earcons ‘use abstract, synthetic tones in structured combinations to create auditory messages’.Footnote ¹⁰ (The word ‘earcon’ is a play on the idea of an icon perceived via the ear rather than the eye.) Auditory icons capitalize on a relatively intuitive connection between a naturalistic sound and the action or object it represents in order to convey feedback or meaning to a user; for example, the sound of clinking glass when I empty my computer’s trash folder is similar to the sounds one might hear when emptying an actual trash can, and this auditory icon tells me that the files in that folder have been successfully deleted. In video games, auditory icons might include footstep-like sounds when characters move, sword-like clanging when characters attack, material clicks or taps when selecting options in a menu, and so on; and all of these naturalistic sounds can suggest meanings or connections with gameplay even upon first listening. 
Earcons, by contrast, are constructed of pitches and/or rhythms, making these informative sounds abstract rather than naturalistic; the connection between an earcon and action, therefore, is not intuitive, and it may take more repetitions for a player to learn an earcon’s meaning than it might to learn the meaning of an auditory icon. Earcons can bring additional functional capabilities, however. For instance, earcons may group into families through similarities in particular sonic aspects (e.g., timbre), so that similar sounds might correspond to logically similar actions or objects. (The HCI literature suggests various methods for constructing ‘compound earcons’ and ‘hierarchical earcons’, methods which capitalize on earcons’ capacity to form relational structures,Footnote ¹¹ but these more complex concepts are not strictly necessary for this chapter’s purposes.) Moreover, the abstract qualities of earcons – and their pitched and/or rhythmic content – may open up these sound effects to further interpretation and potentially musical treatment, as later portions of this chapter explore. In video games, earcons might appear as brief tones or motives when players obtain particular items, navigate through text boxes and so on. The jumping and coin-collecting sounds in Super Mario Bros. (1985) are earcons, as are many of the sound effects in early video games (since early games often lacked the capacity to produce more realistic sounds).

Note that other authors also use the terms ‘auditory icons’ and ‘earcons’ with respect to video game audio, with similar but not necessarily identical applications to how I use the terms here. For example, earcons in Kristine Jørgensen’s treatment include any abstract and informative sound in video games, including score (when this music conveys information to players).Footnote ¹² Here, I restrict my treatment of earcons to brief fragments, as these sounds typically appear in the HCI literature, and I do not extend the concept to include continuous music. In other words, for the purposes of this chapter, earcons and auditory icons are most useful as more specific descriptors for those sounds that I have already defined as sound effects.

In video game soundscapes, dialogue yields a further array of effects particular to this type of sound. Dialogue here refers to the sounds of characters speaking, including the voices of in-game characters as well as narration or voice-over. Typically, such dialogue equates to recordings of human speech (i.e., voice acting), although the technological limitations and affordances of video games can raise other possibilities for sounds that might also fit into this category. In video games, the semantic content of dialogue can provide direct information about character states, gameplay goals and narrative/story, while the sounds of human voices can carry additional affective and emotional content. In particular, a link between a sounding voice and the material body that produced it (e.g., the voice’s ‘grain’, after Roland Barthes) might allow voices in games to reference physical bodies, potentially enhancing players’ senses of identification with avatars.Footnote ¹³ A voice’s affect might even augment a game’s visuals by suggesting facial expressions.Footnote ¹⁴ When players add their own dialogue to game soundscapes, moreover – as in voice chat during online multiplayer games, for instance – these voices bring additional complex issues relating to players’ bodies, identities and engagement with the game.Footnote ¹⁵

Blurring Audio Categories in Early Video Game Soundscapes

As a result of extreme technological limitations, synthesized tones served as a main sonic resource in early video games, a feature which often led to a blurring of audio categories. The soundscape in Pong (1972), for example, consists entirely of pitched square-wave tones that play when a ball collides with a paddle or wall, and when a point is scored; Tim Summers notes that Pong’s sound ‘sits at the boundary between sound effect and music – undeniably pitched, but synchronized to the on-screen action in a way more similar to a sound effect’.Footnote ¹⁶ In terms of the current chapter’s definitions, the sounds in Pong indeed belong in the category of sound effects – more specifically, these abstract sounds are earcons – here made semi-musical because of their pitched content. But with continuous gameplay, multiple pitched earcons can string together across time to form their own gameplay-derived rhythms that are predictable in their timing (because of the associated visual cues), are at times repeated and regular, and can potentially result in a sounding phenomenon that can be defined as music. In the absence of any more obviously musical element in the soundscape – like, for example, a continuous score – these recurring sound effects may reasonably step in to fill that role.

Technological advances in the decades after Pong allowed 8-bit games of the mid-to-late 1980s, for example, to readily incorporate both a continuous score and sound effects, as well as a somewhat wider variety of possible timbres. Even so, distinctions between audio types in 8-bit games are frequently fuzzy, since this audio still relies mainly on synthesized sound (either regular/pitched waveforms or white noise), and since a single sound chip with limited capabilities typically produces all the sounds in a given game. The Nintendo Entertainment System (NES, known as the Family Computer or Famicom in Japan), for example, features a sound chip with five channels: two pulse-wave channels (each with four duty cycles allowing for some timbral variation), one triangle-wave channel, one noise channel and one channel that can play samples (which only some games use). In many NES/Famicom games, music and sound effects are designated to some of the same channels, so that gameplay yields a soundscape in which certain channels switch immediately, back and forth, between producing music and producing sound effects. 
This is the case in the opening level of Super Mario Bros., for example, where various sound effects are produced by either of the two pulse-wave channels and/or the noise channel, and all of these channels (when they are not playing a sound effect) are otherwise engaged in playing a part in the score; certain sound effects (e.g., coin sounds) even use the pulse-wave channel designated for the melodic part in the score, which means that the melody drops out of the soundscape whenever those sound effects occur.Footnote ¹⁷ When a single channel’s resources become divided between music (score) and sound effects in this way, I can hear two possible ramifications for these sounds: On the one hand, the score may invite the sound effects into a more musical space, highlighting the already semi-musical quality of many of these effects; on the other hand, the interjections of sound effects may help to reveal the music itself as a collection of discrete sounds, downplaying the musicality of the score. Either way, such a direct juxtaposition of music and sound effects reasonably destabilizes and blurs these sonic categories.
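
The channel sharing described above can be illustrated with a short sketch. This is a simplified model for illustration only, not a description of the NES hardware or of any game's actual sound driver: the `Mixer` class, the channel names and the assignment of score parts to channels are all hypothetical. It shows only the general principle that a sound effect temporarily takes over a channel and the score part on that channel drops out until the effect ends.

```python
from dataclasses import dataclass, field

# Five channels, following the description of the NES/Famicom sound chip above.
CHANNELS = ("pulse1", "pulse2", "triangle", "noise", "sample")


@dataclass
class Mixer:
    # What the score asks each channel to play (hypothetical labels).
    score: dict
    # Sound effects currently holding a channel: channel -> (label, frames_left).
    effects: dict = field(default_factory=dict)

    def trigger(self, channel, label, frames):
        """Start a sound effect that occupies `channel` for `frames` frames."""
        self.effects[channel] = (label, frames)

    def tick(self):
        """Return what each channel outputs this frame, then age the effects."""
        out = {}
        for ch in CHANNELS:
            if ch in self.effects:
                label, left = self.effects[ch]
                out[ch] = "sfx:" + label  # the score part on this channel is silenced
                if left <= 1:
                    del self.effects[ch]  # the score resumes on the next frame
                else:
                    self.effects[ch] = (label, left - 1)
            else:
                out[ch] = self.score.get(ch)  # None means the channel is silent
        return out


# Hypothetical assignment of score parts to channels.
mixer = Mixer(score={"pulse1": "melody", "pulse2": "harmony",
                     "triangle": "bass", "noise": "percussion"})
mixer.trigger("pulse1", "coin", frames=2)
print([mixer.tick()["pulse1"] for _ in range(3)])
# prints ['sfx:coin', 'sfx:coin', 'melody']
```

In this toy model the melody disappears for exactly as long as the coin sound lasts, which is the kind of direct juxtaposition of music and sound effect within a single channel that the paragraph above describes.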

With only limited sampling capabilities (and only on some systems), early video games generally were not able to include recordings of voices with a high degree of regularity or fidelity, so the audio category of dialogue is often absent in early games. Even without directly recognizable reproductions of human voices, however, certain sounds in early games can sometimes be seen as evoking voices in surprising ways; Tom Langhorst, for example, has suggested that the pitch glides and decreasing volume in the failure sound in Pac-Man (1980) make this brief sound perhaps especially like speech, or even laughter.Footnote ¹⁸ Moreover, dialogue as an audio category can sometimes exist in early games, even without the use of recorded voices. Dragon Quest (1986) provides an intriguing case study in which a particular sound bears productive examination in terms of all three audio categories: sound effect, music and dialogue. This case study closes the current examination of early game soundscapes, and introduces an approach to the boundaries between music and other sounds that remains useful beyond treatments of early games.

Dragon Quest (with music composed by Koichi Sugiyama) was originally published in 1986 in Japan on the Nintendo Famicom; in 1989, the game was published in North America on the NES with the title Dragon Warrior. An early and influential entry in the genre of Japanese Role-Playing Games (JRPGs), Dragon Quest casts the player in the role of a hero who must explore a fantastical world, battle monsters and interact with computer-controlled characters in order to progress through the game’s quest. The game’s sound is designed such that – unlike in Super Mario Bros. – simultaneous music and sound effects generally utilize different channels on the sound chip. For example, the music that accompanies exploration in castle, town, overworld and underworld areas and in battles always uses only the first pulse-wave channel and the triangle-wave channel, while the various sound effects that can occur in these areas (including the sounds of menu selections, battle actions and so on) use only the second pulse-wave channel and/or the noise channel. This sound design separates music and sound effects by giving these elements distinct spaces in the sound chip, but many of the sound effects are still at least semi-musical in that they consist of pitched tones produced by a pulse-wave channel. When a player talks with other inhabitants of the game’s virtual world, the resulting dialogue appears – gradually, symbol by symbol – as text within single quotation marks in a box on the screen. As long as the visual components of this speech are appearing, a very brief but repeated tone (on the pitch A5) plays on the second pulse-wave channel. Primarily, this repeated sound is a sound effect (an earcon): it is the sound of dialogue text appearing, and so can be reasonably described as adhering to this visual, textual source. Functionally, the sound confirms and reinforces for players the fact that dialogue text is appearing on screen.
The sound’s consistent tone, repeated at irregular time intervals, moreover, may evoke a larger category of sounds that accompany transmission of text through technological means, for example, the sounds of activated typewriter or computer keys, or Morse code.

At the same time, the dialogue-text sound can be understood in terms of music, first and most basically because of its semi-musical content (the pitch A5), but also especially because this sound frequently agrees (or fits together well) with simultaneous music in at least one critical aspect: pitch. Elsewhere, I have argued that sonic agreement – a quality that I call ‘smoothness’ – between simultaneous elements of video game audio arises through various aspects, in particular, through consonance between pitches, alignment between metres or onsets, shared timbres and similarities between volume levels.Footnote ¹⁹ Now, I suggest that any straightforward music component like a score – that is, any sonic material that clearly falls into the category of music – can absorb other simultaneous sounds into that same music classification as long as those simultaneous sonic elements agree with the score, especially in terms of both pitch and metre. (Timbre and volume seem less critical for these specifically musical situations; consider that it is fairly common for two instruments in an ensemble to contribute to the same piece of music while disagreeing in timbre or volume.) When a sound agrees with simultaneous score in only one aspect out of pitch or metre, and disagrees in the other aspect, the score can still emphasize at least a semi-musical quality in the sound. For instance, Example 11.1 provides an excerpt from the looping score that accompanies gameplay in the first area of Dragon Quest, the castle’s throne room, where it is possible to speak at some length with several characters. 
While in the throne room, the dialogue-text sound (with the pitch A) can happen over any point in the sixteen-bar looping score for this area; in 27 per cent of those possible points of simultaneous combination, the score also contains a pitch A and so the two audio elements are related by octave or unison, making the combination especially smooth in terms of pitch; in 41 per cent of all possible combinations, the pitch A is not present in the score, but the combination of the dialogue-text sound with the score produces only consonant intervals (perfect fourths or fifths, or major or minor thirds or sixths, including octave equivalents). In short, 68 per cent of the time, while a player reads dialogue text in the throne room, the accompanying sound agrees with the simultaneous score in terms of pitch. More broadly, almost all of the dialogue in this game happens in the castle and town locations, and around two-thirds of the time, the addition of an A on top of the music in any of these locations produces a smooth combination in terms of pitch. The dialogue-text sounds’ onsets sometimes briefly align with the score’s metre, but irregular pauses between the sound’s repetitions keep the sound from fully or obviously becoming music, and keep it mainly distinct from the score. Even so, agreement in pitch frequently emphasizes this sound’s semi-musical quality.
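
The kind of tally reported above (27 per cent unison or octave, plus 41 per cent other consonances, giving 68 per cent agreement in pitch) can be sketched as a small classification routine. The sketch below is an assumption about how such a count might be automated, not the procedure actually used for this chapter; the function names are hypothetical, and the four sample 'moments' are invented placeholders rather than a transcription of the throne room score. Each moment is the set of pitch names sounding in the score at one point in the loop.

```python
from collections import Counter

NOTE_TO_PC = {"C": 0, "C#": 1, "D": 2, "D#": 3, "E": 4, "F": 5,
              "F#": 6, "G": 7, "G#": 8, "A": 9, "A#": 10, "B": 11}

# Minor/major thirds, perfect fourth and fifth, minor/major sixths,
# measured in semitones with octave equivalence.
CONSONANT_CLASSES = {3, 4, 5, 7, 8, 9}


def classify(sound, score_pitches):
    """Classify how one sound-effect pitch relates to the score pitches at a moment."""
    s = NOTE_TO_PC[sound]
    intervals = {(NOTE_TO_PC[p] - s) % 12 for p in score_pitches}
    if 0 in intervals:
        return "unison/octave"  # the score already contains the same pitch class
    if intervals and intervals <= CONSONANT_CLASSES:
        return "consonant"      # only consonant intervals against the score
    return "other"              # some dissonance (or, by assumption, a silent score)


def tally(sound, moments):
    """Return the percentage of moments that fall into each category."""
    counts = Counter(classify(sound, m) for m in moments)
    return {label: round(100 * n / len(moments)) for label, n in counts.items()}


# Invented placeholder moments, NOT the actual throne room score.
moments = [{"A", "E"}, {"C", "E"}, {"B", "D"}, {"D", "F"}]
print(tally("A", moments))
# prints {'unison/octave': 25, 'consonant': 50, 'other': 25}
```

Adding the first two categories together gives the overall proportion of moments at which the sound agrees with the score in pitch, which is the sum reported for the throne room above.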

Example 11.1 Castle throne room (excerpt), Dragon Quest

Further musical connections also exist between the dialogue-text sound and score. The throne room’s score (see Example 11.1) establishes an A-minor tonality, as does the similar score for the castle’s neighbouring courtyard area; in this tonal context, the pitch A gains special status as musically central and stable, and the repeated sounds of lengthy dialogue text can form a tonic drone (an upper-register pedal tone) against which the score’s shifting harmonies push and pull, depart and return. The towns in Dragon Quest feature an accompanying score in F major, which casts the dialogue-text sound in a different – but still tonal – role, as the third scale degree (the mediant).

Finally, although the sound that accompanies dialogue text in Dragon Quest bears little resemblance to a human voice on its own, the game consistently attaches this sound to characters’ speech, and so encourages an understanding of this sound as dialogue. Indeed, other boxes with textual messages for players (describing an item found in a chest, giving battle information, and so on), although visually very similar to boxes with dialogue text, do not trigger the repeated A5 tone with the appearing text, leading this dialogue-text sound to associate specifically with speech rather than the appearance of just any text. With such a context, this sound may reasonably be construed as adhering to the visible people as well as to their text – in other words, the sound can be understood as produced by the gameworld’s inhabitants as much as it is produced by the appearance of dialogue text (and players’ triggering of this text). In a game in which the bodies of human characters are visually represented by blocky pixels, only a few colours and limited motions, it does not seem much of a stretch to imagine that a single tone could represent the sounds produced by human characters’ vocal cords.
As William Cheng points out, in the context of early (technologically limited) games, acculturated listeners ‘grew ears to extract maximum significance from minimal sounds’.Footnote ²⁰ And if we consider the Famicom/NES sound chip to be similar to a musical instrument, then Dragon Quest continues a long tradition in which musical instruments attempt to represent human voices; Theo van Leeuwen suggests that such instrumental representations ‘have always been relatively abstract, perhaps in the first place seeking to provide a kind of discourse about the human voice, rather than seeking to be heard as realistic representations of human voices’.Footnote ²¹ Although the dialogue-text sound in Dragon Quest does not directly tap into the effects that recorded human voices can bring to games (see the discussion of the benefits of dialogue earlier in this chapter), Dragon Quest’s representational dialogue sound may at least reference such effects and open space for players to make imaginative connections between sound and body. Finally, in the moments when this sound’s pitch (A5) fits well with the simultaneous score (and perhaps especially when it coincides with a tonic A-minor harmony in the castle’s music), the score highlights the semi-musical qualities in this dialogue sound, so that characters may seem to be intoning or chanting, rather than merely speaking. In this game’s fantastical world of magic and high adventure, I have little trouble imagining a populace that speaks in an affected tone that occasionally approaches song.

Blurring Sound Effects and Music in More Recent Games

As video game technology has advanced in recent decades, a much greater range and fidelity of sounds have become possible. Yet many video games continue to challenge the distinctions between music, sound effects and dialogue in a variety of ways. Even the practice of using abstract pitched sounds to accompany dialogue text – expanded beyond the single pulse-wave pitch of Dragon Quest – has continued in some recent games, despite the increased capability of games to include recorded voices; examples include Undertale (2015) and games in the Ace Attorney series (2001–2017).

The remainder of this chapter focuses primarily on ways in which the categories of sound effects and music sometimes blur in more recent games. Certain developers and certain genres of games blur these two elements of soundscapes with some frequency. Nintendo, for instance, has a tendency to musicalize sound effects in the company’s flagship Mario series, tying in with aesthetics of playfulness and joyful fun that are central to these games.Footnote ²² Horror games such as Silent Hill (1999) sometimes incorporate into their scores sounds that were apparently produced by non-musical objects (e.g., sounds of squealing metal, etc.) and so suggest themselves initially as sound effects (in the manner of auditory icons); William Cheng points out that such sounds in Silent Hill frequently blur the boundaries between diegetic and non-diegetic sounds, and can even confuse the issue of whether particular sounds are coming from within the game or from within players’ real-world spaces, potentially leading to especially unsettling/horrifying experiences for players.Footnote ²³ Overall, beyond any particular genre or company, an optional design aesthetic seems to exist in which sound effects can be considered at least potentially musical, and are worth considering together with the score; for instance, in a 2015 book about composing music for video games, Michael Sweet suggests that sound designers and composers should work together to ‘make sure that the SFX don’t clash with the music by, for example, being in the wrong key’.Footnote ²⁴

The distinction between auditory icons and earcons provides a further lens through which to help illuminate the capabilities of sound effects to act as music. The naturalistic sounds of auditory icons do not typically suggest themselves as musical, while the pitched and/or rhythmic content of earcons casts these sounds as at least semi-musical from the start; unlike auditory icons, earcons are poised to extend across the boundary into music, if a game’s context allows them to do so.

Following the framework introduced during this chapter’s analysis of sounds in Dragon Quest, a score provides an important context that can critically influence the musicality of simultaneous sound effects. Some games use repeated metric agreement with a simultaneous score to pull even auditory icons into the realm of music; these naturalistic sounds typically lack pitch with which to agree or disagree with a score, so metric agreement is enough to cast these sounds as musical. For example, in the Shy Guy Falls racing course in Mario Kart 8 (2014), a regular pulse of mining sounds (hammers hitting rock, etc.) enters the soundscape as a player drives past certain areas, and this pulse aligns with the metre of the score, imbuing the game’s music with an element of both action and location.

Frequently, though, when games of recent decades bring sound effects especially closely into the realm of music, those sound effects are earcons, and they contain pitch as well as rhythmic content (i.e., at least one onset). This is the case, for example, in Bit.Trip Runner (2010), a side-scrolling rhythm game in which the main character runs at a constant speed from left to right, and a player must perform various manoeuvres (jump, duck, etc.) to avoid obstacles, collect items and so on. Earcons accompany most of these actions, and these sound effects readily integrate into the ongoing simultaneous score through likely consonance between pitches as well as careful level design so that the various actions (when successful) and their corresponding earcons align with pulses in the score’s metre (this is possible because the main character always runs at a set speed).Footnote ²⁵ In this game, a player’s actions – at least the successful ones – create earcons that are also music, and these physical motions and the gameworld actions they produce may become organized and dance-like.

In games without such strong ludic restrictions on the rhythms of players’ actions and their resulting sounds, a musical treatment of earcons can allow for – or even encourage – a special kind of experimentation in gameplay. Elsewhere, I have examined this type of situation in the game Flower (2009), where the act of causing flowers to bloom produces pitched earcons that weave in and out of the score (to different degrees in different levels/environments of the game, and for various effects).Footnote ²⁶ Many of Nintendo’s recent games also include earcons as part of a flexible – yet distinctly musical – sound design; a detailed example from Super Mario Galaxy (2007) provides a final case study on the capability of sound effects to become music when their context allows it.

Much of Super Mario Galaxy involves gameplay in various galaxies, and players reach these galaxies by selecting them from maps within domes in the central Comet Observatory area. When a player points at a galaxy on one of these maps (with the Wii remote), the galaxy’s name and additional information about the galaxy appear on the screen, the galaxy enlarges slightly, the controller vibrates briefly and an earcon consisting of a single high-register tone with a glockenspiel-like timbre plays; although this earcon always has the same timbre, high register and only one tone, the pitch of this tone is able to vary widely, depending on the current context of the simultaneous score that plays during gameplay in this environment. For every harmony in the looping score (i.e., every bar, with this music’s harmonic rhythm), four or five pitches are available for the galaxy-pointing earcon; whenever a player points at a galaxy, the computer refers to the current position of the score and plays one tone (apparently at random) from the associated set. Example 11.2 shows an excerpt from the beginning of the score that accompanies the galaxy maps; the pitches available for the galaxy-pointing earcon appear above the notated score, aligned with the sections of the score over which they can occur. (The system in Example 11.2 applies to the typical dome map environments; in some special cases during gameplay, the domes use different scores and correspondingly different systems of earcon pitches.) All of these possible sounds can be considered instances of a single earcon with flexible pitch, or else several different earcons that all belong to a single close-knit family. Either way, the consistent components of these sounds (timbre, register and rhythm) allow them to adhere to the single action and provide consistent feedback (i.e., that a player has successfully highlighted a galaxy with the cursor), while distinguishing these sounds from other elements of the game’s soundscape.
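
The selection behaviour described above can be sketched as a simple lookup followed by a random choice. This is a minimal illustration under stated assumptions, not the game's actual code, which is not documented here: the pitch sets, the number of beats per bar and the function names are all hypothetical placeholders, and uniform random selection is assumed only because the choice appears to be random.

```python
import random

# Hypothetical pitch sets, one per bar of a looping score (four or five pitches each).
# These are placeholders, not a transcription of Example 11.2.
EARCON_SETS = [
    ["Ab", "C", "Eb", "G"],
    ["Bb", "D", "F", "Ab", "C"],
]


def current_bar(position_beats, beats_per_bar, n_bars):
    """Map a playback position (in beats) to a bar index within the loop."""
    return int(position_beats // beats_per_bar) % n_bars


def galaxy_pointing_pitch(position_beats, rng, sets=EARCON_SETS, beats_per_bar=4):
    """Pick one earcon pitch from the set tied to the bar currently playing.

    beats_per_bar=4 is an assumed value chosen for illustration.
    """
    bar = current_bar(position_beats, beats_per_bar, len(sets))
    return rng.choice(sets[bar])


rng = random.Random()
# Pointing at a galaxy while the score is half a beat into its first bar:
pitch = galaxy_pointing_pitch(0.5, rng)
print(pitch in EARCON_SETS[0])
# prints True
```

Because timbre, register and rhythm are held constant and only the pitch is drawn from the current set, every result remains recognizable as the same earcon (or as a member of the same family) while still following the score's harmony.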

Example 11.2 The galaxy maps (excerpt), Super Mario Galaxy

All of the available pitches for the galaxy-pointing earcons are members of the underlying harmony for the corresponding measure – this harmony appears in the most rhythmically consistent part of the score, the bottom two staves in Example 11.2. With this design, whenever a player points at a galaxy, it is extremely likely that the pitch of the resulting earcon will agree with the score, either because this pitch is already sounding at a lower register in the score (77 per cent of the time), or because the pitch is not already present but forms only consonant intervals with the score’s pitches at that moment (11 per cent of the time). Since the pitches of the score’s underlying harmony do not themselves sound throughout each full measure in the score, there is no guarantee that earcons that use pitches from this harmony will be consonant with the score; for example, if a player points at a galaxy during the first beat of the first bar of the score and the A♭ (or C) earcon sounds, this pitch will form a mild dissonance (a minor seventh) with the B♭ in the score at that moment. Such moments of dissonance are relatively rare (12 per cent of the time), however; and in any case, the decaying tail of such an earcon may still then become consonant once the rest of the pitches in the bar’s harmony enter on the second beat.
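The three outcomes distinguished in this paragraph (the earcon doubles a sounding pitch, forms only consonant intervals, or forms a dissonance) can be expressed as a small classifier. The sketch below is an assumption-laden illustration: it works with pitch classes only, ignores register, and treats thirds, sixths, perfect fourths and perfect fifths as consonant, which is one common convention rather than anything stated in the source.

```python
NOTE_TO_PC = {"C": 0, "Db": 1, "D": 2, "Eb": 3, "E": 4, "F": 5,
              "Gb": 6, "G": 7, "Ab": 8, "A": 9, "Bb": 10, "B": 11}

# Assumption: interval classes 3, 4 and 5 count as consonant;
# classes 1, 2 and 6 (seconds, sevenths, tritone) count as dissonant.
CONSONANT_INTERVAL_CLASSES = {3, 4, 5}


def interval_class(a: str, b: str) -> int:
    """Smallest distance in semitones between two pitch classes (0 to 6)."""
    d = abs(NOTE_TO_PC[a] - NOTE_TO_PC[b]) % 12
    return min(d, 12 - d)


def classify_earcon(earcon: str, sounding: list[str]) -> str:
    """Classify an earcon pitch against the score pitches sounding at that moment."""
    if any(interval_class(earcon, s) == 0 for s in sounding):
        return "doubles the score"
    if all(interval_class(earcon, s) in CONSONANT_INTERVAL_CLASSES for s in sounding):
        return "consonant"
    return "dissonant"
```

Under these assumptions, an A♭ or C earcon heard against a lone B♭ is classified as dissonant, matching the first-beat case described above.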

The score’s context (and careful programming) is thus very likely to reaffirm the semi-musical quality of each individual galaxy-pointing earcon through agreement in pitch. When a player points at additional galaxies after the first (or when the cursor slips off one galaxy and back on again – Wii sensors can be finicky), the variety of pitches in this flexible earcon system can start to come into play. Such a variety of musical sounds can open up a new way of interacting with what is functionally a level-selection menu; instead of a bare-bones and entirely functional static sound, the variety of musical pitches elicited by this single and otherwise mundane action might elevate the action to a more playful or joyful mode of engagement, and encourage a player to further experiment with the system (both the actions and the sounds).Footnote ²⁷ A player could even decide to ‘play’ these sound effects by timing their pointing-at-galaxy actions in such a way that the earcons’ onsets agree with the score’s metre, thus bringing the sound effects fully into the realm of music. Flexible earcon systems similar to the one examined here occur in several other places in Super Mario Galaxy (for example, the sound when a player selects a galaxy in the map after pointing at it, and when a second player interacts with an enemy). In short, the sound design in Super Mario Galaxy allows for – but does not insist on – a particular type of musical play that complements and deepens this game’s playful aesthetic.

Music, sound effects and dialogue each bring a wide variety of effects to video games, and the potential for innovative and engaging experiences with players only expands when the boundaries between these three sonic categories blur. This chapter provides an entry point into the complex issues of interactive soundscapes, and suggests some frameworks through which to examine the musicality of various non-score sounds, but the treatment of video game soundscapes here is by no means exhaustive. In the end, whether the aim is to analyse game audio or produce it, considering the multiple – and sometimes ambiguous – components of soundscapes leads to the discovery of new ways of understanding game music and sound.

12 Analytical Traditions and Game Music: Super Mario Galaxy as a Case Study

Steven Reale ^*

Ludomusicologists generally agree that cinema and television represent the nearest siblings to video games, and therefore adopt many methodologies familiar to film music scholarship in their work. For example, the influential concepts of diegetic and non-diegetic, which respectively describe sounds that exist either within or outside a narrative frame,Footnote ¹ feature prominently in many accounts of game audio, and represent one axis of Karen Collins’s model for the uses of game audio, the other being dynamic and non-dynamic, where dynamic audio can be further subcategorized as adaptive or interactive.Footnote ² Ludomusicologists generally also agree that the interactive nature of video games marks their primary distinction from other forms of multimedia, and so a fundamental point of entry into studying game audio is to examine how composers and sound designers create scores and soundtracks that can adapt to indeterminate player actions.

Indeterminacy, though, creates a challenge for modern music theory, which is founded on the close study of musical scores, subjecting to detailed analysis the dots, curves and lines inscribed on their pages to reveal the musical structures and organizational patterns that drive their composition. For music theorists, who may be most comfortable examining music that has been fixed into notation in the form of a musical score, indeterminacy raises additional questions and problems: how does one analyse music for which there is no single agreed-upon structure, and which may be realized in a fundamentally different way every time it is heard? Because of this problem, music theory, alongside historical musicology, has been vulnerable to criticism for its tendency to privilege the notated score above other forms and artefacts of musical creativity. Don Michael Randel has observed, for example, that in popular music, ‘“the work itself” is not so easily defined and certainly not in terms of musical notation’.Footnote ³ In that vein, Lydia Goehr’s study, The Imaginary Museum of Musical Works, has become a gold-standard critical handbook of philosophical approaches to the concept of ‘the musical work’, and one of her first tasks is to dispense with the notion that the work (if such a thing exists at all) can be fully encapsulated by the musical score – even in the case of a common-practice composition for which an authoritative score exists.Footnote ⁴ Other scholarship, such as Jean-Jacques Nattiez’s Music and Discourse,Footnote ⁵ has criticized the manner by which privileging the musical score presumes a direct communicative act between composer and listener that treats the role of the performer as ancillary. At worst, score-centric analysis may even suggest – or explicitly claim – that the need for musical performance reflects an unfortunate real-world compromise to an ideal situation in which musical ideas could be communicated from composer to listener without mediation. 
Heinrich Schenker’s opening line of The Art of Performance makes this argument in a particularly bold and provocative way: ‘Basically, a composition does not require a performance to exist.’Footnote ⁶ And what is more, some mid-twentieth-century composers embraced electronic music specifically for its potential to eliminate the need for human performance.Footnote ⁷

At best, music theorists might isolate single improvised performances as case studies, freezing them into musical notation to subject them to conventional analytical methodologies. But such strategies fail when applied to the analysis of ever-changing video game music. While a single performance of Charlie Parker and Dizzy Gillespie performing Tadd Dameron’s Hot House might achieve canonical heights worthy of transcription and close study, no single playthrough of a video game could ever be understood as definitive, and it is typically very difficult, if not impossible, to play a video game exactly the same way twice. A compelling analysis of a video game score, then, requires methodologies that are a bit alien to some assumptions that currently govern music-theoretical practice. We must heed David Lewin, who once described his analytic method ‘as a space of theoretical potentialities, rather than a compendium of musical practicalities’,Footnote ⁸ and yet go further, developing as we do approaches that can analyse music as both a set of theoretical potentialities and an adjoining set of practical musical potentialities, introduced whenever a video game score implements algorithmic solutions that provide satisfactory continuations to moments of musical indeterminacy. In short, many theoretical methodologies are not readily equipped to analyse game audio; existing toolsets must be reworked and new ones devised to grapple with this protean music.Footnote ⁹ This is a good thing: analytic methodologies are not destroyed but tempered and strengthened when reforged for use with new materials. This essay examines excerpts from Super Mario Galaxy (2007) through the lens of three music-theoretical methodologies in current practice – formal, reductive and transformational – demonstrating how the idiomatic nature of video game storytelling has a substantial impact on the utility of the analytical tools used to study its music.

The central conceit of Super Mario Galaxy is that the perennial villain Bowser has once again kidnapped Mario’s beloved Princess Peach and is travelling through the universe with her as his prisoner. After some introductory levels, Mario finds himself on a space observatory, which, due to its low power levels, is only able to observe a small set of galaxies, the game’s basic platforming stages. Each galaxy is associated with one or more missions; through completing them, Mario accumulates ‘power stars’, which re-energize the Observatory so that it is able to reach further out into space, allowing Mario access to more galaxies with more power stars, and ultimately Bowser’s hideout.

Methodology 1: Formal Analysis – Theme and Variation

Elaine Sisman, in her article on variation in The New Grove Dictionary of Music and Musicians, defines it as ‘A form founded on repetition, and as such an outgrowth of a fundamental musical and rhetorical principle, in which a discrete theme is repeated several or many times with various modifications’, and she shows that one purpose of the form is epideixis, a rhetoric of ceremony and demonstration, where the purpose of the repetitions is to amplify, ‘revealing in ever stronger terms the importance of the subject’.Footnote ¹⁰ In a classical set, the listener is often presented with variations of greater and greater complexity with the final variation serving as a climactic whirlwind of virtuosity that Roger Moseley has described as a musical ilinx, a ‘dizzying, unruly play of motion’.Footnote ¹¹ Musical variations appear in the score to Super Mario Galaxy, which features compositions that are readily understood as variations of others. I will consider two examples here: the developing orchestration to the Observatory Waltz, and the fragmenting of the Battlerock music in subsequent galaxies.

The waltz accompanies Mario’s exploration of the Observatory, during which it becomes evident that because the Observatory is not fully powered, there are many dark, inaccessible sections in it (see Figure 12.1). The waltz is sparsely orchestrated: after an introductory standing-on-the-dominant, a vibraphone carries the melody with a simple bass accompaniment played by a cello; the harmonic progression of the passage is thus implied by a two-voice contrapuntal framework (see Example 12.1). As Mario explores the galaxy and accumulates power stars, the Observatory is re-energized, and more areas become illuminated and accessible (see Figure 12.2). As this happens, the waltz becomes more lushly orchestrated and new countermelodies are added. Example 12.2 presents an excerpt from the final stage of musical development: the bassline, still articulated by the cello, is joined by horns that fill out the chords in an oom-pah-pah rhythm. Violins now carry the basic melody, adding an expressivity not present in the vibraphone, and the flute, which in the earlier iteration simply doubled the vibraphone upon repetition, now sounds a new melody in counterpoint with the violins. The rich variations of the basic waltz theme thus become an aural metaphor for the enlivened Observatory and the wide array of areas that are now open for play and exploration. Moreover, the variations do not appear simply as an exercise in progressive musical elaborations that we, the players, passively hear. Rather, the player, through Mario, is afforded a degree of agency in causing the variations’ development: it is through our actions that the Observatory is re-energized, and so, within the fiction of the gameworld, it is we who bring about the subsequent variations.

Example 12.1 Comet Observatory waltz, early (excerpt). All transcriptions by the author

Example 12.2 Comet Observatory waltz, late (excerpt)

Figure 12.1 Comet Observatory, early; note darkened, inaccessible area in background*

*The author thanks Ryan Thompson and Dana and Joseph Plank for their assistance creating the gameplay stills in this chapter.

Figure 12.2 Comet Observatory, late; background now illuminated and accessible

In a case opposite to the first example, a theme is varied by stripping its elements down – reduction, rather than elaboration: in an early stage, the Battlerock Galaxy, Mario leaps between platforms and dodges bullets and force fields beside and through an enormous battle station built into an asteroid. The level’s music, a brief reduction of which is excerpted in Example 12.3, features a martial and memorably tuneful scoring that alternates between two primary melodies with a march-like accompaniment, resembling countless science-fiction space epics, like Star Wars (1977), or the original Battlestar Galactica television series (1978–1979).

Example 12.3 Reduction of Battlerock Galaxy music, first theme (excerpt)

The music appears in an unaltered form in the later ‘Dreadnought Galaxy’, and the repetition of the theme connects the military qualities of the battleship with the earlier battle station. However, the initial introduction to this galaxy takes place in a sequence entitled ‘Infiltrating the Dreadnought’, in which Mario sneaks onto the ship by way of a pipe hidden on its side. During this portion of the level, the music that plays is a nearly unrecognizable, minimalistic version of the Battlerock tune: entitled ‘Space Fantasy’ on the official soundtrack release,Footnote ¹² the string melody has disappeared, and the basic chord progression is implied only by the frantic, unpredictably leaping arpeggios that appear deep in the mix of the Battlerock music (see Example 12.4). If we imagine the process of variation to be one of further elaboration of a basic structure, then the ‘Space Fantasy’ music serves as a kind of anti-variation, having stripped the Battlerock music of its elaborations to provide only its basic structure. There is a clever thematic parallel to the story taking place on screen: Mario’s ‘infiltration’ of the Dreadnought implies that he stealthily evades detection during his entry – how fitting, then, that the musical accompaniment is a furtive echo of the bombastic battle tune, whose presence is so subtle that it may even elude detection by the player.

Example 12.4 Reduction of ‘Space Fantasy’ (excerpt), same passage as that associated with Example 12.3. Although the surface chords change in the sixth bar of Example 12.3, the arpeggios that play beneath them at that point are the same as they appear here

‘Space Fantasy’ underscores several other galaxies as well, including the ‘Matter Splatter Galaxy’. Though this level mostly consists of a nebulous void, surfaces appear when droplets of matter fall in the manner of raindrops splattering on pavement, and as these ‘dry up’, the platforms that they comprise vanish until new ones fall in their place (see Figure 12.3). Once again, the appearance of the Battlerock tune invites the observant listener to conceptually connect these ostensibly disparate levels and to imagine a narrative thread that connects them. Perhaps the matter-splatter phenomenon is the result of a devastating battle that once took place, and just as the ‘Space Fantasy’ presents a disjointed, nearly unrecognizable anti-variation of the Battlerock theme, so too has all of the matter of the galaxy been fragmented, with only the sparsest bits being recognizable as traversable ground.

Figure 12.3 Matter Splatter Galaxy

Elsewhere, I have examined the interaction of theme and variations with gameplay in Portal 2 (Valve Corporation, 2011).Footnote ¹³ There, I showed that there was a distinct correlation of the game’s procedures for training the player in its mechanics with a development in the musical accompaniments: that the gameplay itself is organized according to the musical principle of theme and variations. Portal 2 is not unique in this regard; video games are often designed around iterative storytelling and gameplay mechanics: video game stories are often told and retold.Footnote ¹⁴ In the world of Super Mario Bros. this could mean an iterative story in which Mario explores three stages plus a fortress, only to find that ‘OUR PRINCESS IS IN ANOTHER CASTLE’, meaning that he must repeat the exercise seven more times in increasingly difficult challenges. It could also mean the iterative process by which almost all entries in the franchise involve Bowser kidnapping Peach and Mario exploring grander and grander spaces (Land … World … Galaxy) to rescue her, with each title introducing and requiring of the player more and more complex gameplay techniques to do so.Footnote ¹⁵ Given this narratological structure, which is fundamental to the series, the similarly iterative musical principle of theme and variations offers a particularly well suited means of accompanying Mario’s adventures.

Methodology 2: Reductive Analysis

Twentieth-century tonal music theory in North America was dominated by Schenkerian analysis, a methodology developed in the first half of the twentieth century by pianist, composer and editor Heinrich Schenker (1868–1935). Schenkerian analysis is a form of reductive analysis with two essential features: first, it examines a piece with the goal of determining which of its pitches are structural and which are ornamental. A structural pitch is emblematic of a governing harmony at a particular moment, and may be decorated, or prolonged, by ornamental ones. The process occurs on several levels of resolution, referred to as the foreground, middleground and background. A foreground analysis accounts for nearly every note of a work’s outer voices (bass and soprano). Notes determined to be structural at the foreground level are then reproduced at a middleground level, and are once again examined to determine their relative structural or ornamental significance in that higher-order context. The process can continue through several layers of middleground, but eventually the analysis will identify a specific set of tones that cannot be reduced further: these pitches constitute the background. One of the greatest insights of Schenker’s theory is the idea that tonal processes are self-similar – that is, those that govern the musical surface also govern the deepest structure of a work, up to and including the background.

The background level also highlights the second essential feature of the methodology: for Schenker, tonal music is organized around a teleological principle whereby the pitch representing either scale degree 3 or scale degree 5 undertakes a descent to arrive at scale degree 1, accompanied by a bass arpeggiation outlining a I–V–I progression – the combination of the melodic descent and the bass arpeggiation is called the work’s Ursatz, or fundamental structure (see Figure 12.4).Footnote ¹⁶ Tonal processes being self-similar, as noted above, a Schenkerian analysis is likely to identify many analogous teleological descents taking place in short form over the course of an entire work.

Figure 12.4 The two archetypal Schenkerian backgrounds (Ursätze). Schenker also allows for the theoretical possibility of a background that descends by octave, but these are rare in practice

To some degree, Schenkerian methodologies have fallen out of favour, in part due to criticism from musicology for their ideological privileging of the Austro-Germanic musical tradition and assumptions about the relationship between organicism and genius, amongst others.Footnote ¹⁷ Indeed, a principal criticism of the theory is its circularity, in that it prioritizes a certain kind of musical organization, and then deems as ‘good’ works that conform to its desired criteria, such works having been carefully selected from the common-practice canon.Footnote ¹⁸ But perhaps even more importantly, the late twentieth century saw a general disciplinary widening of American music theory, from the purviews of both methodology and repertoire. On the one hand, this meant that music theorists began analysing works from outside the canon of Western art music, the tonal compositions in which Schenkerian elucidation finds its richest rewards. On the other hand, new analytical methodologies arose that, if not entirely supplanted, then at least displaced Schenkerian analysis from its position as the dominant toolset for studying triadic music – one of these, neo-Riemannian analysis, we will consider below.Footnote ¹⁹

But while many criticisms of the theory as a whole are legitimate, that does not mean that it needs to be rejected wholesale: the techniques of reductive analysis that Schenker pioneered are powerful and can be employed without being oriented towards its problematic ends.Footnote ²⁰ It is, for example, perfectly reasonable to assign structural significance to certain musical tones without needing to implicate them as part of a controversial Ursatz, and when I teach Schenkerian analysis to my own students, I emphasize how identifying the relative structural weight of specific notes in a melody can contribute to their vision for an artful performance regardless of how well those tones group on the background level in the abstract. Certainly, other productive ends are possible, too, and this section will focus on one sequence in Super Mario Galaxy – the battle with the boss character King Kaliente – to show how the musical logic of the encounter proposes a hearing that is captured particularly well through a reductive reading that departs from many of the central tenets of conventional Schenkerian theory.Footnote ²¹

King Kaliente is encountered as part of a mission in the early Good Egg Galaxy. After navigating through the level, the player arrives on a planetoid on which is a pool of lava surrounded by a small ring of land. King Kaliente spits out two kinds of projectiles at Mario: fireballs, which must be dodged, and coconuts, which Mario must volley back at him (see Figure 12.5). The pattern of the battle is typical for Mario games: the first volley strikes the boss; on the second volley, the boss returns the coconut, which Mario must strike a second time before it connects; and on the third volley, the boss returns the coconut a second time, so Mario must strike it three consecutive times before it hits. Once it does, the encounter ends. In sum, on the first volley there are two hits – the first when Mario returns the coconut, and the second when it strikes King Kaliente; on the second volley there are four hits, and on the third volley six. Each hit is accompanied by a different tone, so on each successive volley two new tones are heard. The music that accompanies the battle alternates between two principal themes, and the pitches that sound with each hit change depending on which theme is playing. Table 12.1 presents the pitches that sound during the first theme only, but it is worth noting that both versions end on the same B♭.

Figure 12.5 Mario volleys a coconut at King Kaliente

Table 12.1 Number of hits and pitches sounded during each volley of the King Kaliente battle

Volley #	Number of hits	Pitches sounded
1	2	C, G
2	4	C, E♭, F, G
3	6	C, E♭, F, F♯, G, B♭

Figure 12.6 presents each of the three volleys in a rudimentary Schenkerian notation. In so doing, it suggests a reading wherein each of the successive volleys ‘composes out’ the preceding. Hence, the E♭ in the second volley is seen as a consonant skip from the opening C, and the F as an incomplete lower neighbour to the closing G: these additional tones thus ornament the structural perfect fifth. In the third volley, the notes from the second are given stems to indicate their relative structural weight as compared to the newly added tones: the F♯, a chromatic passing tone between the incomplete neighbour F and the structural G, and the B♭, a consonant skip from the structural G.
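The nesting of the three volleys can also be checked mechanically. The short Python sketch below takes its data from Table 12.1 and the ornament labels from the paragraph above; the function and variable names are my own, and accidentals are spelled in ASCII (Eb for E♭, F# for F♯). It is offered only as an illustration of the 'composing out' relationship.

```python
# Pitches sounded on each volley (first theme), as given in Table 12.1.
VOLLEYS = {
    1: ["C", "G"],
    2: ["C", "Eb", "F", "G"],
    3: ["C", "Eb", "F", "F#", "G", "Bb"],
}

# Ornamental roles assigned to the added tones in the reductive reading.
ROLES = {
    "Eb": "consonant skip from C",
    "F": "incomplete lower neighbour to G",
    "F#": "chromatic passing tone between F and G",
    "Bb": "consonant skip from G",
}


def expected_hits(volley: int) -> int:
    """Each volley has two hits more than the last: 2, 4, 6."""
    return 2 * volley


def added_tones(volley: int) -> list[str]:
    """Tones heard on this volley that were absent from the previous one."""
    previous = VOLLEYS.get(volley - 1, [])
    return [p for p in VOLLEYS[volley] if p not in previous]


for n in VOLLEYS:
    print(n, expected_hits(n), [(p, ROLES.get(p, "structural")) for p in added_tones(n)])
```

Every volley retains all the tones of its predecessor and adds exactly two, so each earlier volley is literally contained in the next.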

Figure 12.6 Reductive analysis of the King Kaliente hits

Although the techniques used to create Figure 12.6 are heavily drawn from Schenkerian practices, its interpretive implications are quite different. First, a conventional Schenkerian reading presupposes that the various levels examine the same passage of music at different degrees of resolution: the background level presents only its most structurally significant tones, the middleground level presents all tones above a certain threshold of structural significance and the foreground level replicates most tones, if not every tone of the outer voices. In typical practice, Schenkerian backgrounds and middlegrounds are not intended to reflect passages that are literally heard in performance, let alone in serial presentation,Footnote ²² yet here, the three successive volleys present, in order, background, middleground and foreground readings of a singular musical idea.

Figure 12.6 thus suggests that each successive volley is a more musically detailed version of the preceding one – that the tones accompanying the initial volley present only the background structure of the attack’s musical logic, and that the successive attacks come closer and closer to its musical surface. That musical observation may cause us to recast our experience of the gameplay challenge presented to us. In principle, the structure of the encounter will be familiar to most gamers: once players master a simple gameplay mechanic, they are then asked to contend with more difficult and complicated variations of it. In this view, the first volley may be understood as a prototype that is developed and varied in the successive volleys. The Schenkerian reading offers an alternative: rather than seeing later iterations of the attack as complications of a basic process, it allows us to read the final volley as the fundamental process. Rather than the encounter representing the growth and development of a simple technique, it can be understood as an enacted actualization of a complex task through an iterative presentation of versions that begin simply and become progressively more complex. This way of thinking is not only native to gameplay design;Footnote ²³ Jeffrey Swinkin has also shown that such a ‘thematic actualization’ is a productive lens for understanding Romantic variation sets, where ‘variations retroactively define what the theme is’ and ‘themes often do not reside at a determinate point in musical space [i.e., at the beginning] but rather come into being gradually as the piece unfolds’.Footnote ²⁴

Second, Schenkerian logic takes the position that the move from foreground to background reveals the basic tonal structure of a composition through the revelation that each tone of the foreground in some way participates in the prolongation of the tones of the Ursatz, which features a descending scale in either the major or minor mode – a suitable enough distinction for the common-practice tonal works for which Schenkerian analysis is most effectively deployed.Footnote ²⁵ Notably, the background presented in Figure 12.6 does not resemble the conventional backgrounds of Figure 12.4. Furthermore, the King Kaliente battle music is composed in a modal idiom that is not well captured by a major/minor binary: specifically, the C-minor blues scale, the complete collection of which not only appears in the third volley, but also constitutes the basic piano vamp accompanying the encounter (see Example 12.5). Thus, it is the foreground, rather than the background level, that clarifies the tune’s basic tonal language.
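The claim that the third volley spells out the complete C-minor blues scale, and that this collection sits outside a major/minor binary, can be verified with pitch-class arithmetic. In the sketch below the scale patterns are the standard semitone offsets from the tonic; choosing the natural minor as the representative minor scale is my own simplifying assumption, as are all names.

```python
PITCH_CLASS = {"C": 0, "Eb": 3, "F": 5, "F#": 6, "G": 7, "Bb": 10}

# Semitone offsets above the tonic for each scale type.
BLUES_STEPS = (0, 3, 5, 6, 7, 10)
NATURAL_MINOR_STEPS = (0, 2, 3, 5, 7, 8, 10)  # assumption: natural minor stands in for 'minor'
MAJOR_STEPS = (0, 2, 4, 5, 7, 9, 11)


def scale(tonic_pc: int, steps: tuple[int, ...]) -> set[int]:
    """Build a scale as a set of pitch classes above the given tonic."""
    return {(tonic_pc + s) % 12 for s in steps}


THIRD_VOLLEY = ["C", "Eb", "F", "F#", "G", "Bb"]  # from Table 12.1
third_volley_pcs = {PITCH_CLASS[p] for p in THIRD_VOLLEY}
tonic = PITCH_CLASS["C"]

matches_blues = third_volley_pcs == scale(tonic, BLUES_STEPS)
fits_minor = third_volley_pcs <= scale(tonic, NATURAL_MINOR_STEPS)
fits_major = third_volley_pcs <= scale(tonic, MAJOR_STEPS)
```

The F♯ is the tone that belongs to neither diatonic collection, which is why the full foreground is needed to identify the idiom.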

Example 12.5 King Kaliente battle music (A theme only) (excerpt)

Crucially, notwithstanding the ways in which the present reductive reading diverges from conventional Schenkerian practice, it is still able to engage with the first of the central Schenkerian insights: that being the relationship between background structures and foreground gestures. Specifically, as already noted, the foreground version of the hits exactly replicates the piano vamp; moreover, as shown in Figure 12.7, the background of Figure 12.6 is identical to the background of the synth melody, which also features structural motion from C up to G – also note the resemblance of the foreground of Figure 12.6 to that of Figure 12.7: each features an initial consonant skip from C to E♭, an F♯ approach to G and registral and melodic prominence of a high B♭.

Figure 12.7 Reductive analysis of first half of synth melody

The nuances elucidated by a reductive analysis of the King Kaliente theme further provide an explanation for something that long ago caught my attention, and this relates to the second feature of Schenkerian methodology. Video games and Schenkerian analysis are similarly goal-oriented: in this example, the player’s goal is to defeat the enemy, while in the Ursätze of Figure 12.4, the music’s goal is to descend to C. When first encountering this boss fight and experiencing the musical coordination of pitch hits with Mario’s attacks, I had expected that the final note would be the high C, thereby allowing the six-note figure of the third volley to cadence, thus co-ordinating musical closure with the end of the battle. Such a resolution would align well with other teleological theories, like Eugene Narmour’s implication-realization model,Footnote ²⁶ but a high C here would not serve a reductive reading well. First, as a cadential pitch, it would be tempting to grant it structural significance, but since it would only appear in the foreground version of the motive, there would be no compelling way to add it to the background version. Second, the presence of the B♭ is not only necessary to flesh out the complete blues scale, but it also ensures complete pitch variety for the third volley attack: a high C would replicate the pitch class of the first note. Finally, a high C does not participate at all in the musical surface of Example 12.5, except for a brief appearance as an incomplete upper neighbour to the B♭ in the second half of the melody – and so again, it would be a mistake to afford the high C the structural significance it would need to be a goal tone for the passage.

All told, then, the reductive reading presents a more compelling explanation for the pitch selection of the third volley than might a competing theory based on realized implications. In my modified Schenkerian view, the structural pitches of the encounter are C and G – a motive established in the first volley and confirmed in the second. But, and I thank Scott Murphy for phrasing it thusly, ‘permanently vanquishing is categorically different than temporarily stunning’,Footnote ²⁷ and the third volley makes the distinction audibly clear. While the hits for each volley feature the same musical structure, it is only the final, successful volley that transcends the structural G to land on the vanquishing upper B♭.

The criticisms of the late twentieth century certainly impacted Schenkerian analysis’s dominance within the field of North American music theory, but a thoughtful analysis can still carefully tailor the undeniably powerful and unique insights that the methodology is capable of elucidating while avoiding the larger ideological and contextual problems that hazard its practice. In my view, this is to be embraced: on the one hand, the jazzy King Kaliente music, like much music of the post-common-practice era, is not composed according to the assumptions that Schenkerian practice is designed to accommodate, and any attempt to ‘Schenkerize’ it according to them would be doomed to fail. On the other hand, the Observatory waltz of Examples 12.1 and 12.2 could, with relative ease, submit to a conventional Schenkerian reading with a normative Ursatz, but it is hard to imagine what the purpose of such an exercise would be, beyond making a claim that the Observatory waltz fulfils the tonal expectations of the common-practice idiom after which it is modelled, a fact that should already be obvious to anyone with enough background in music theory to ‘Schenkerize’ it. By contrast, reading the King Kaliente hits through a more flexible reductive instrument can certainly amplify and enhance our understanding of their specific idiomatic construction.

Methodology 3: Transformational Analysis

Current trends in transformational music theories began with the publication of David Lewin’s Generalized Musical Intervals and Transformations (GMIT), and although Lewin’s adaptation of mathematical group theory is designed to be applicable to any kind of musical structure, its most common current application is known as neo-Riemannian theory, which explores relationships between major and minor triads not from the standpoint of function (as did Hugo Riemann, the fin-de-siècle music theorist for whom it is named), but rather from the standpoint of voice-leading. As a theory that does not require that the tonal organizations it analyses orient chords with respect to a central tonic, neo-Riemannian theory is well equipped to handle triadic music of the late Romantic period and neo-Romantic music, where the latter enjoys a particular prominence in film scores.Footnote ²⁸

A central technique of neo-Riemannian analysis – as early as its inception in GMIT – is the creation of visual networks that describe transformational relationships amongst chords. Figure 12.8 provides a sample neo-Riemannian network for a hypothetical musical passage with the chords A major, A minor and F♯ minor.

Figure 12.8 A neo-Riemannian network

The diagram indicates that the closest relationship between A major and A minor is a P (Parallel) transformation, which holds the perfect fifth of a triad fixed and moves the remaining note by a half step. It also indicates that the closest relationship between A major and F♯ minor is an R (Relative) transformation, which holds the major third of a triad fixed and moves the remaining note by a whole step. Finally, it indicates that the closest relationship between A minor and F♯ minor is a composite move where P is followed by R. The arrow between these latter two nodes indicates that the composite transformation PR will only transform A minor to F♯ minor; reversing the move requires reversing the transformations, so getting from F♯ minor to A minor requires the composite transformation RP.

There are three atomic transformations in neo-Riemannian theory: in addition to P and R, there is an L (Leading-tone exchange) transformation, which holds the minor third of a triad fixed and moves the remaining note by a half step. Any triad may be transformed into any other triad using some combination of these three basic moves. These basic transformations can be combined to form compound transformations, where, for example, S (Slide) = LPR or RPL, H (Hexatonic pole) = PLP or LPL, and N (Neighbour) = PLR or RLP. Some accounts of neo-Riemannian theory treat these compound transformations as unary in their own right.Footnote ²⁹ For the purposes of this chapter, I will be counting individual P, L and R moves when considering tonal distance, so that the compound transformations will be treated as three individual moves.
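The three atomic moves and their compounds can be sketched computationally. The following Python fragment is an illustration only, under assumptions of my own: triads are encoded as a (root pitch-class, quality) pair with C = 0, and the helper names `P`, `L`, `R`, `apply_word` and `pitches` are hypothetical. Transformation words are read left to right, so that "PR" means P followed by R, as in the discussion of Figure 12.8.

```python
# Pitch classes: C=0, C#=1, ..., B=11. A triad is (root, quality),
# where quality is "maj" or "min". This encoding is an assumption
# made for the sketch, not a notation used in the chapter.

def P(triad):
    """Parallel: keep root and fifth, move the third by a half step."""
    root, quality = triad
    return (root, "min" if quality == "maj" else "maj")

def R(triad):
    """Relative: keep the major third, move the remaining note by a whole step."""
    root, quality = triad
    if quality == "maj":
        return ((root + 9) % 12, "min")  # e.g. A major -> F# minor
    return ((root + 3) % 12, "maj")      # e.g. F# minor -> A major

def L(triad):
    """Leading-tone exchange: keep the minor third, move the remaining note by a half step."""
    root, quality = triad
    if quality == "maj":
        return ((root + 4) % 12, "min")  # e.g. C major -> E minor
    return ((root + 8) % 12, "maj")      # e.g. E minor -> C major

def apply_word(word, triad):
    """Apply a word such as "PR" left to right (P first, then R)."""
    moves = {"P": P, "L": L, "R": R}
    for letter in word:
        triad = moves[letter](triad)
    return triad

def pitches(triad):
    """Return the set of three pitch classes in the triad."""
    root, quality = triad
    third = 4 if quality == "maj" else 3
    return {root, (root + third) % 12, (root + 7) % 12}

A_MAJOR, A_MINOR, FS_MINOR = (9, "maj"), (9, "min"), (6, "min")
C_MAJOR = (0, "maj")

print(apply_word("PR", A_MINOR) == FS_MINOR)   # the arrow in Figure 12.8
print(apply_word("RP", FS_MINOR) == A_MINOR)   # the reversed route
print(apply_word("LPR", C_MAJOR), apply_word("RPL", C_MAJOR))  # two spellings of S
```

Counting the letters in a word gives the tonal distance used in this chapter, so a compound such as S, H or N counts as three moves.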

By itself, Figure 12.8 is of limited usefulness: ideally, networks will be deployed in the interest of a larger analytical point; say, drawing a hermeneutic meaning from its arrangement, like Frank Lehman’s discussion of the Argonath sequence in The Fellowship of the Ring (Peter Jackson, 2001),Footnote ³⁰ or showing that two disparate passages operate under the same transformational logic, such as David Lewin’s comparison of the Tarnhelm and Valhalla motives in Das Rheingold.Footnote ³¹ Often such networks are treated according to the presumptions of mathematical graph theory – in such cases, all that matters is the specific relationship amongst the network’s nodes and edges (here, chords and transformations) and its actual geometrical arrangement on the page is ancillary. Other times, theorists use a graph’s visual appearance to suggest a specific analytical interpretation – for example, as is often the case in Lewin’s diagrams, there could be a presumed temporal dimension that reads from left to right;Footnote ³² in Figure 12.8, perhaps the triads are heard in the order A major, F♯ minor, A minor.

In addition to analytic networks, another common way of visualizing harmonic motion is through a Tonnetz (see Figure 12.9). In such diagrams, pitch-classes are arranged along three axes – the horizontals represent perfect fifths; the ascending diagonals major thirds; and the descending diagonals minor thirds. Triads are then represented by the triangles established by the axes: upward-pointing triangles represent major triads while downward-pointing triangles represent minor triads. Neo-Riemannian thinking presumes enharmonic equivalence; so, resembling the logic of Pac-Man (Namco, 1980), the diagram wraps around itself, establishing the doughnut-shaped geometric figure called a torus. To clarify: continuing along the lower horizontal past G leads to the D on the middle left; continuing down along the descending diagonal past the G leads to the B♭ on the lower left; and continuing down along the ascending diagonal from the G leads to the D♯ in the upper right, which is enharmonically equivalent to E♭. The Tonnetz makes visually clear how combinations of the atomic transformations create greater tonal distances – how, for example, the PR transformation is more tonally remote than any of the atomic transformations alone.Footnote ³³
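The geometry of the Tonnetz can likewise be sketched in a few lines of Python. The coordinate system here is my own assumption and is not taken from Figure 12.9: `x` counts steps along a horizontal (perfect fifths) and `y` counts steps along an ascending diagonal (major thirds), so that one step down a descending diagonal is a minor third. The function names are hypothetical.

```python
def pitch_class(x, y, origin=0):
    """Pitch class at lattice point (x, y); `origin` is the pitch class at (0, 0).

    Each step in x adds a perfect fifth (7 semitones); each step in y
    adds a major third (4 semitones). Arithmetic is modulo 12, which
    encodes enharmonic equivalence.
    """
    return (origin + 7 * x + 4 * y) % 12

def up_triangle(x, y, origin=0):
    """Upward-pointing triangle: root, fifth and major third (a major triad)."""
    return frozenset({
        pitch_class(x, y, origin),
        pitch_class(x + 1, y, origin),
        pitch_class(x, y + 1, origin),
    })

def down_triangle(x, y, origin=0):
    """Downward-pointing triangle: root, fifth and minor third (a minor triad)."""
    return frozenset({
        pitch_class(x, y, origin),
        pitch_class(x + 1, y, origin),
        pitch_class(x + 1, y - 1, origin),
    })

A = 9  # pitch class of A, placed at the lattice origin for this example
a_major = up_triangle(0, 0, origin=A)
a_minor = down_triangle(0, 0, origin=A)
fs_minor = down_triangle(-1, 1, origin=A)

# Triangles that share an edge share two pitch classes: one atomic move.
print(len(a_major & a_minor), len(a_major & fs_minor))
# A minor and F# minor share only a corner, reflecting the composite PR.
print(len(a_minor & fs_minor))
```

Because the arithmetic is modulo 12, travelling three steps along a major-third diagonal, four along a minor-third diagonal or twelve along a horizontal returns to the starting pitch class, which is the wrap-around behaviour that makes the surface a torus.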

Figure 12.9 Tonnetz representation of the network in Figure 12.8 (left), and of the third atomic transformation, L (right)

A potentially puzzling aspect of neo-Riemannian networks is that we often understand music to be a linear art form, by which I mean that its events are usually performed and experienced through time in one specific order, and yet the networks present music on a plane, suggesting that their nodes can be explored in any number of ways. According to this logic, the chords in Figures 12.8 and 12.9 could appear in any order, with any number of internal repetitions. How can we reconcile the planar, non-linear aspect of neo-Riemannian theory’s geometric representations with the conventionally temporal, linear manner in which music is experienced?

The problem is compounded when we consider that Lewin would prefer not to understand analytical networks from a bird’s-eye perspective (what he calls the ‘Cartesian view’), but rather wants to imagine a listener inhabiting the networks, experiencing them from a first-person perspective, which he calls the ‘transformational attitude’: ‘“If I am at [a point] s and wish to get to [a point] t”’, Lewin asks, ‘“what characteristic gesture … should I perform in order to arrive there?” … This attitude’, he continues, ‘is by and large the attitude of someone inside the music, as idealized dancer and/or singer. No external observer (analyst, listener) is needed’.Footnote ³⁴ Considering again Figure 12.8, if we imagine ourselves inside the network, hearing these three chords from a first-person perspective, then what exactly would it mean for our musical experience to move ‘clockwise’ or ‘counterclockwise’? And is there a conceptual difference between these two experiences that should dictate the placement of the network’s nodes?

In contrast to instrumental music, and even music that accompanies opera, film, musical theatre or television – perhaps game music’s closest relatives – video games are inherently spatial: when we play, we are very frequently tasked with directing a character – our in-game avatar – through a digital world.Footnote ³⁵ Given that distinction, it is comparatively quite easy to imagine moving through a musical space, if that musical space is somehow co-ordinated with the virtual space the player moves through.

In Super Mario Galaxy, the principal way that the game space is organized is that the Comet Observatory serves as a central hub for a handful of domes; in each dome is a portal to a galaxy cluster, allowing Mario to select a specific galaxy to travel to. Some of these galaxies feature boss battles or other one-off tasks, and in such cases, the musical material is usually borrowed from similar levels elsewhere in the game. Table 12.2, then, lists each primary galaxy for each dome and the key centre for its principal musical accompaniment; I here define ‘primary’ galaxy to be one for which the game offers more than one mission. Observing that the Engine Room and Garden employ a lot of reused music (and one atonal tune without a clear key centre), the following discussion restricts its purview to the first four domes, which also happen to be the four domes that are accessible from the first floor of the Observatory.

Table 12.2 Primary galaxies with their key centres in Super Mario Galaxy

Dome	Galaxy	Key Centre	Notes
Terrace	Good Egg	C major
Terrace	Honeyhive	C major
Fountain	Space Junk	A♭ major
Fountain	Battlerock	E major
Kitchen	Beach Bowl	A major
Kitchen	Ghostly	D minor
Bedroom	Gusty Garden	D♭ major
Bedroom	Freezeflame	D minor
Bedroom	Dusty Dune	E major
Engine Room	Gold Leaf	C major	Reuses ‘Honeyhive’ music
Engine Room	Sea Slide	A major	Reuses ‘Beach Bowl’ music
Engine Room	Toy Time	C major	Uses theme from very first Super Mario Bros. game (Nintendo, 1985)
Garden	Deep Dark	N/A	Atonal – no clear key centre
Garden	Dreadnought	E major	Reuses ‘Battlerock’ music
Garden	Melty Molten	D minor

Figure 12.10 reimagines the first four domes, along with the Comet Observatory as a central hub, as part of a transformational network. In so doing, it extends the logic of the neo-Riemannian transformations beyond labelling voice-leadings between adjacent chords in a local progression to encompass modulations between key areas, and describes them by the transformations that would be required to move from one tonic triad to the next.Footnote ³⁶ The diagram analyses the transformations needed to get from the key centre of the Observatory to any of the included galaxies, and also analyses transformations amongst galaxies that appear in a single dome. The length of each segment in the diagram is relative to the number of transformational moves it takes to get from key to key. Consider first the transformational connections within each dome: for domes 1, 2 and 3, there are only two primary galaxies, and therefore there is only one transformation: in the first dome, the Identity transformation maps C major onto C major; in the second dome, a composite PL transforms A♭ major into E major, for a total of two moves; and in the third dome, a composite transformation N, which requires three moves as shown in the figure, turns A major to D minor. The growth in transformational complexity is indicated by the relative lengths of the edges between the nodes, and as a result, the space of the tonal galaxy clusters gets progressively larger. The largest tonal galaxy cluster of the four, of course, is the Bedroom, owing to the multiplicity of transformational edges required to connect all of the nodes. Significantly, the player’s access to each dome is unlocked in precisely the order from smallest to largest: Terrace, Fountain, Kitchen and Bedroom. Thus, progress through the available galaxy clusters is correlated with their internal tonal complexity.

Figure 12.10 Transformational analysis of the first four domes of Super Mario Galaxy

Note, too, that in almost every case, the tonal distance between galaxies within a dome is smaller than the tonal distance between the Observatory and the galaxy cluster, where this distance is measured as the combined number of transformational moves to get from the Observatory to the galaxies in one cluster. Thus, it takes a total of four transformations to get from the Observatory to the galaxies in the Terrace, eight to get to the Fountain, three to the Kitchen and nine to the Bedroom. These distances nicely capture the implied vast astronomical space between the Observatory and each cluster, relative to the smaller astronomical distance between galaxies in any one cluster. Moreover, ordering the domes about the hub from shortest distance to longest replicates the order of the physical locations of each dome around the central hub of the Observatory. This implies a ready metaphor between the locations of each dome on the tonal map and their physical locations in the gameworld.
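The move counts quoted above can be checked with a short breadth-first search over the graph of all twenty-four triads, in which each triad is joined to its P, L and R neighbours. The Python sketch below rests on two assumptions that are mine rather than statements from this section: that the Observatory waltz is in D major, and that each distinct key within a dome is counted once (so the two C major galaxies of the Terrace count as a single node). The names `HUB`, `DOMES`, `distance` and `hub_total` are hypothetical.

```python
from collections import deque

PC = {"C": 0, "Db": 1, "D": 2, "Eb": 3, "E": 4, "F": 5,
      "Gb": 6, "G": 7, "Ab": 8, "A": 9, "Bb": 10, "B": 11}

def neighbours(triad):
    """Return the P, L and R neighbours of a (root, quality) triad."""
    root, quality = triad
    if quality == "maj":
        return [(root, "min"), ((root + 4) % 12, "min"), ((root + 9) % 12, "min")]
    return [(root, "maj"), ((root + 8) % 12, "maj"), ((root + 3) % 12, "maj")]

def distance(start, goal):
    """Smallest number of atomic P/L/R moves from start to goal."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if current == goal:
            return seen[current]
        for nxt in neighbours(current):
            if nxt not in seen:
                seen[nxt] = seen[current] + 1
                queue.append(nxt)
    raise ValueError("goal not reachable")  # not expected: the graph is connected

def key(name, quality):
    return (PC[name], quality)

HUB = key("D", "maj")  # ASSUMPTION: key of the Observatory waltz

# Key centres of the first four domes, from Table 12.2.
DOMES = {
    "Terrace": [key("C", "maj"), key("C", "maj")],
    "Fountain": [key("Ab", "maj"), key("E", "maj")],
    "Kitchen": [key("A", "maj"), key("D", "min")],
    "Bedroom": [key("Db", "maj"), key("D", "min"), key("E", "maj")],
}

def hub_total(dome):
    """Combined moves from the hub to each distinct key in a dome (ASSUMPTION)."""
    return sum(distance(HUB, k) for k in set(DOMES[dome]))

for dome in DOMES:
    print(dome, hub_total(dome))
```

Under these assumptions the search reproduces the totals of four, eight, three and nine, as well as the within-dome distances of zero, two and three moves for the Terrace, Fountain and Kitchen.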

Whereas the planar logic of transformational networks often chafes against a typically linear listening experience, in the case of Figure 12.10, the spatial, exploratory nature of Super Mario Galaxy’s design readily lends itself to thinking about its musical score in a similarly spatial way. It is not difficult to imagine ourselves as listeners moving through Figure 12.10 in a first-person perspective, going from key area to key area in a manner isomorphic to the way that Mario travels between the Observatory and the various galaxies, because there is a ready metaphor that connects tonal distances with segment length and another that correlates size with tonal complexity; these metaphors are at the heart of a powerful interpretative resonance between the geography of the game and the geometry of the diagram. The composite tonal distances between the Observatory waltz and the galaxies of each cluster are correlated with the physical layout of the Comet Observatory. Tonal complexity within a cluster is correlated with the order in which Mario gains access to these four domes, which further suggests a reading that tonal complexity is also correlated with the game’s power stars. This means that greater power to the Comet Observatory implies a capacity for listener-Mario to negotiate greater tonal complexity, just as character-Mario is being asked to negotiate more and more difficult platforming tasks. When transformational moves are correlated with astronomical distance, the organization of Figure 12.10 can imply how remote each cluster is from the Comet Observatory relative to how close they are to each other. And finally, there is a happy coincidence that Figure 12.10 itself resembles the star maps in the various domes that Mario uses to travel to the game’s galaxies (see Figure 12.11). In these ways, Figure 12.10 presents a particularly apt representation of the musical – but also physical – adventures of Mario.

Figure 12.11 Travelling to the Good Egg Galaxy from the Terrace Dome

Conclusions

I began my research for this chapter by taking seriously two existential problems with modern music theory: first, the notion that a musical work can exist as a singular entity frozen in time through musical notation treats the act of musical performance as a supplement to the ‘real’ musical work that takes place between composer and listener. The second is that the techniques and methodologies that underpin modern music theory are well suited to music frozen in time, but strain when asked to account for music that cannot be fixed in such a way. As a result, modern music theory is incentivized to privilege notated music and to downplay non-notated and improvised musics, even if these are vastly more characteristic of the grand scope of human music-making than the former.

As a practising music theorist, though, my experience has been that most analysts are sensitive musicians in their own right, who use theoretical tools to advance deeply musical insights that can inform performance. At the same time, it is certainly true that most analytical methodologies rely on treating some kind of fixed musical notation as an input and begin to break down when such notation can be neither acquired nor self-produced through transcription. But since there continue to exist vast amounts of music that can be fixed into notation, there has not generally been felt a widespread, urgent need to rework or revise existing methodologies to accommodate those that cannot. This has been true despite the experiments in the 1960s with chance-based, or aleatoric, music; or the rise of the cultural importance of jazz, rock, jam-band or dee-jayed electronic music, genres that may all rely on moment-by-moment indeterminacy. Furthermore, this is true despite the historical reality that the vast majority of human music-making has been improvised and non-notated: as Philip V. Bohlman has observed, ‘The fundamentally oral nature of music notwithstanding, musicology’s canons arise from the field’s penchant for working with texts.’Footnote ³⁷ In the end, the problem is a specific case of a general one, identified by literary theorists, that overlooks the reality that a present-day conception of artworks as unified, static entities is both recent and anomalous, stemming from a nineteenth-century European shift in patterns of cultural consumption.Footnote ³⁸

Each of the three methodologies in this chapter was intentionally selected to apply an analytical technique in current practice in the field – and, equivalently, a technique designed with fixed, notated music in mind – to a genre of music that is often dynamic and indeterminate. What excites me as a music theorist is that the theories did not break, but were instead refracted by the logic of the game’s design. To adopt a video-gaming metaphor, the theories ‘levelled-up’: more than just proven in battle, they are enhanced and ready to take on new, challenging analytical encounters.

But of even greater significance than our ability to apply music-theoretical approaches to a new, underrepresented musical medium, is the realization that the indeterminate nature of video game play perhaps aligns the medium more closely with other improvised artistic practices than do its more static relatives in film, television or opera. Video game musical analysis requires that music theories grapple with uncertainty, and the toolsets used to analyse video game music can be adapted to shed light on other kinds of improvised musics. We can and should invert the obvious question – ‘what can music theory tell us about game audio?’ – and instead ask how video game music can transform music theory, and in so doing, create for it opportunities to study musics that at first glance seem wholly unrelated to those that accompany Mario’s journeys through the universe.

13 Semiotics in Game Music

Iain Hart

Introduction: Semiotics Is Not Hard

Playing a video game is a very communicative activity. Set aside the ideas that communication only happens between humans and that communication only happens with words. We communicate with animals, machines and the built environment all the time, conveying our needs, aspirations, designs and emotions as we live in and shape our world. We do the same when we play video games, inhabiting a virtual space and forging our path through it. Understanding how we communicate with a video game, and how a video game communicates with us, helps us understand the fundamental elements of the video game text (like graphics, sound, narrative and music) and how they fit together. It also helps us to be able to shape and direct those communications, if we are in the business of constructing or composing for video games.

One of the best ways to understand communication is to analyse the signs that are shared back and forth in communication and what they mean. This is known as ‘semiotics’. There is an enduring and erroneous belief amongst the general population that semiotics is difficult and perhaps a little dry. The purpose of this chapter is to demonstrate firstly that semiotics is not all that complicated, and secondly, that semiotics can make sense of a lot of how video game music works. In fact, a lot of the core functions of game music are semiotic processes so simple we barely notice them, but so effective that they utterly transform gameplay.

Semiotics

What Are Signs?

Semiotics is the study of signs. Charles Sanders Peirce, who was the founder of one of the two main branches of semiotics (the more universal of the branches; the founder of the other, more linguistic branch was Ferdinand de Saussure), described a sign as ‘anything, of whatsoever mode of being, which mediates between an object and an interpretant’.Footnote ¹ The ‘object’ in Peirce’s definition is relatively straightforward; it is a thing that exists, whether as a tangible lump of matter or as an abstract concept. The ‘interpretant’, meanwhile, is the effect of the object on one’s mind – the idea that would typically pop into your mind in relation to that object. The ‘sign’, then, is the intermediary between a thing and an idea, or between the external world and the thought world. In fact, Peirce wrote that ‘every concept and every thought beyond immediate perception is a sign’.Footnote ² Furthermore, signs can be the likeness of the object, like a photograph; they can be an indication towards the object, like an emergency exit sign; or they can be a symbol that only represents the object by convention or by abstract relationship, like a word or the concept of tonality.Footnote ³

Adopting such a broad definition of the term and concept of the ‘sign’ presents us with a massive scope of investigation from the very outset, which the astute semiotician will find profoundly liberating. If anything that represents something to us is a sign, then we are not bound by analogies to language systems or by prescriptive semantics. We can address meanings on the terms in which they are transmitted – describing them with words as humans are wont to do, but not assuming they will map neatly or even consistently to a set of words or concepts. This is quite handy when we look at music, which is a largely non-verbal and abstract medium (even singers can convey abstract and non-verbal meanings, like emotions, alongside the words being sung).Footnote ⁴ It becomes especially useful when we look at music in video games, where music can be directly associated with events, characters and areas, even while maintaining some level of abstraction from these things.

How Signs Are Used to Communicate

However, semiotics is not necessarily the study of communication. Jean-Jacques Nattiez, who developed a Peircean semiotics for music, observed that while communication through music is possible, it cannot be guaranteed.Footnote ⁵ Communication depends on a meaning being inscribed by one person, then a corresponding meaning being interpreted by another person. It is possible for the other person to interpret a meaning that is completely different to the meaning inscribed by the first person – we might call this a mis-communication.Footnote ⁶ And music is generally very open to interpretation.

Music cannot prescribe its own meanings, except within cultural and textual conventions and relationships. It can ‘connote’ meanings by implying relationships outside of itself – evoking emotions like the sadness we feel in Beethoven’s ‘Moonlight Sonata’,Footnote ⁷ or suggesting scenes like the flowing waters of Smetana’s VltavaFootnote ⁸ – but it does not typically have a mechanism to ‘denote’ or exactly specify meanings. Any denotations we hear in music (and even many of its connotations) are contingent on external relationships. For example, Smetana’s Vltava may suggest to us the movement of a grand river, but to do so it relies on our familiarity with Western musical tropes (such as the relationship between conjunct motion and the concept of ‘flow’); to link the music to the eponymous Czech river itself requires the composer’s direct explication in the title of the work.

Like all human creative output, music exists within contexts that provide such conventions, and it moves between contexts as it is encountered by people, in the process forming relationships between contexts. Even musical analysis (sometimes described as an investigation of ‘the music itself’) is what Theo van Leeuwen calls a ‘recontextualization’; meanings tend to attach themselves to works as they are experienced and distributed, even (or especially?) when this is done with an analytical purpose.Footnote ⁹ The ever-present existence and never-ending negotiation of context allows music to be denotative, within specific bounds. For van Leeuwen, ‘musical systems (systems of melody, harmony, rhythm, timbre, etc.) do have “independent” meaning, in the sense that they constitute meaning potentials which specify what kinds of things can be “said”. … Meaning potentials delimit what can be said, the social context what will be said’.Footnote ¹⁰ So when we start playing a video game and hear its music, we are not accidentally listening to some tunes with no meaningful relationship to the images we see or the interactions we perform. Each element of a video game has an influence over the meanings we interpret in the other elements, resulting in a cohesive and semiotically rich experience filled with the potential, at least, for communication.

For a ‘meaning potential’ to turn into a ‘meaning’, the potential just needs to be fulfilled through experience, much like how specific decisions can affect the final stages of some games, giving you one ending from a set of possible endings. For a ‘potential communication’ to turn into a ‘communication’, however, there needs to be a correlation between the composer’s intended meanings and the meaning suggested by the interpretants in the player’s mind. This is a minefield. Game composers and developers can try to influence the interpretants that music will form in the player’s mind by creating specific contexts for the music, such as by co-ordinating or juxtaposing music with other game elements. However, there are other factors that affect how the player’s mind works (everything from their past gaming experience, to their personality, to what they had for dinner). Every experience the player has during gameplay has a meaning, but these meanings form part of a negotiation of communication between the player and the game – a negotiation that involves sounds, images, movements and inputs. Musical signs form part of this negotiation, and their meanings can depend as much on the player themselves as on the music’s initial composition. Tracking down the source of musical meanings, and tracing how these meanings move through and influence gameplay, are key to understanding game music’s signs and communications.

Semiotics in Game Music

Two Semiotic Domains

Next time you play a game, take notice of the things you are observing. Take notice of the characters: how they are dressed, how they move, how they interact with you. Take notice of the scenery: its colours, shapes and textures; its architecture, time period, planet of origin or state of repair. Take notice of the sounds of the concrete jungle, or the living forest, or the competitors’ race cars or the zombie horde. Take notice of the music: the shape of its tune, the texture of its harmonies and rhythms, how it alerts you to characters or objects, how its ebbs and flows guide your emotional responses. The video game is providing you with a constant stream of signs that signify many different things in many different ways; your entire perception of the gameworld is constructed by communications through visual, aural and/or textural signs.

While you are at it, take notice of the things that you are communicating to the game. You are making inputs through a keyboard, mouse, controller, joystick or motion detector. Your inputs may be directional (‘move my character to the left’), functional (‘change up a gear’) or metafunctional (‘open the inventory menu’). Many of your inputs are reactions, prompted by conscious needs (‘I need to see what is in my inventory’) or by semi-conscious or unconscious needs (flinching when you hear a monster right behind your character). Many are the result of your choices, whether to move towards a strange sound, to cast a particular spell or to turn up the music volume. Your actions are providing the game program with a stream of input signs that come from you and that represent how you want to interact with the gameworld; your interactions are constructed of signs that could take the form of keystrokes, button presses, movements, keywords, voice commands, configuration file changes and so on.

The game music’s communications to you, and your communications with the game and its music, come from two separate semiotic domains that have distinct (though related) sets of semiotic activity. The first semiotic domain is centred around the creation of the video game and the composition of the music: this is when musical (and other) elements of the game are put into the video game program based on creative and developmental choices. The second semiotic domain is centred on your actions in gameplay: this is when the musical (and other) elements of the game program are actualized into your experience of the game through your interactions. When you are playing a game, you are hearing musical signs from both semiotic domains – in the most basic sense, you hear the composer’s original musical signs (initial composition) in the way they are being configured by your interactions (gameplay). And while it often makes sense to look at these together (since games are made to be played), there is some valuable information about both the game and how you play it, and the music and how you hear it, that can be obtained from looking at them separately.

Initial Composition and Initial Meanings

There are two kinds of signs that are put into video game music when it is composed and put into the game program: musical signs and configurative signs. Musical signs are the things that we typically think of as music, like notes, melody, harmony, instrumentation, dynamics and so on. These typically function the same way in video games as they could be expected to in any other context. For example, if you remove a music file from a video game’s program folder and play it in an audio player, it will often sound like regular non-interactive music.Footnote ¹¹ You can also analyse these signs much as you would any other piece of music. Accordingly, these are often the first signs to be analysed when we look at video game music. These are the signs that bear most resemblance to film music, that are transposed into orchestral settings, that are played on the radio, and that you buy as part of an Official Soundtrack album. And, if adequately accounting for the cultural contexts of both composition and analysis, investigating the musical signs as at initial composition can reveal a lot about the music and its role within the game.

Configurative signs, meanwhile, are the signs that turn music into video game music. These are things that come into the music through the functions and designs of the video game program, and can include timing, repetition, layering and responses to interactivity. They take musical signs and configure them to work as part of the game. For example, the video game Halo 3 (2007) plays music as the player enters certain spaces filled with enemies. The timing of the start of the music, and the consequent correspondence between the music and the newly entered space, are configurative signs. Like earlier games in the Halo series, Halo 3 also turns off the music after the player has spent too long in the one space to encourage them along and to minimize repetition, and this is also a configurative sign.Footnote ¹² Configurative signs contextualize musical signs within the game program by determining how the music will fit in with the other game elements. They are, therefore, very important in determining how (and how effectively) music can denote meanings within the game. The musical signs in Halo 3’s soundtrack may independently connote a sense of action, but the configurative signs that line up the start of the music with entering a theatre of action are what allow the music to denote the experience of battle.

Unlike musical signs, configurative signs are hard to remove from the video game and analyse separately. The functions built into the game program can be difficult to observe independently without access to the video game source code and the requisite knowledge to interpret it. Likewise, the precise mechanisms used by audio subsystems or middleware require a certain amount of technical expertise to both implement and interpret. However, we can make educated deductions about these functions based on gameplay experience. Battles in Halo 3 tend to take place in open spaces that are joined by corridors, roads or canyons. After entering several of the open spaces and hearing the music start, we can infer that the game contains a function to start music when entering open spaces. Further gameplay can confirm or qualify the hypothesis, and then analysis can focus on when and why these configurative signs are implemented in the game and what effect they have on the music. This must be done with a little care, however, as there are additional semiotic processes taking place during gameplay.

Gameplay and the Player’s Influence

Gameplay simultaneously changes nothing and everything about video game music. Just as in the semiotic domain of initial composition, music within gameplay consists of musical signs and configurative signs. These are, for the most part, the same sets of signs that were present at initial composition. However, during gameplay, the player’s interactivity actuates the configurative signs, which then act on the musical signs to determine what music the player hears and how the player hears it. So, while the Halo 3 game program contains both the music cues to play in certain areas and the program functions to make the music stop playing after a specific amount of time to hurry the player along, it is only the player’s choice to linger that cements the cessation of music in that area as a fact of the musical experience. If the player chose to clear the area and move on quickly, they would hear a transition to another music cue, but the cessation of music in that area would not be part of their gameplay experience. The player’s choices add and exclude musical potentials in the played experience (an important thing to remember when analysing music from within gameplay); they can activate a different set of configurative signs than another player and end up with a subtly but observably different experience.

However, the player brings more to the game experience than just the power to activate configurative signs. They bring a set of signs that correspond to them personally. These can include the choices they make and the reasons they make them, their musical, gameplay or stylistic preferences, their mood, their cognitive associations or even artefacts of the world around them. These personal signs strongly influence which of the configurative signs built into the game the player will activate during gameplay. The activated configurative signs subsequently act on the musical signs of the game, crystallizing a musical experience from the potentials that were built into the game mixed with the effects of the player’s personal experiences. One of the simplest examples of the influence of personal signs is the way your mood affects how you drive in a racing game. If you are having a bad day, you are more likely to drive rashly, make errors, spin out and lose the race. In this circumstance, not hearing the victory music at the end of the race is a result of the negative signs floating around in your mind being expressed in your interactions.

Figure 13.1 demonstrates how each set of signs relates to other sets of signs within the semiotic domains of initial composition and gameplay; an arrow from X to Y in this diagram indicates an ‘X acts on Y’ relationship. The arrow from personal signs to musical signs may be unexpected; it indicates that there are circumstances in which the player can add musical signs directly to the game (such as by playing their own music instead of the soundtrack, or by providing a music file to be played within and as part of the game like in Audiosurf, 2008) or in which the player’s perception of certain musical signs is strongly influenced by their personal experience (such as dancing to ‘I Don’t Want to Set the World on Fire’ by the Ink Spots at their wedding, then hearing the song played in association with a post-apocalyptic wasteland in Fallout 3, 2008).

Figure 13.1 Actions of signs in the semiotic domains of interactive configuration and gameplay

Because the musical experience of gameplay is constructed from a combination of signs from the music’s initial composition and signs from the player themselves, every player’s experience of a game’s music is unique. Of course, all experiences of a game are similar, since they are built from the same musical and configurative potentials that were originally put into the game. However, it is important to be mindful that approaching video game music through gameplay changes the semiotic domain in which listening and analysis occur, and that there are more sources of signs in the semiotic domain of gameplay than just ‘the text itself’. This is one of the great differences between video games and other texts, such as films; while any viewer of a film brings a lot of their own thoughts, opinions and signs to the task of interpreting a film, they (usually) do not change the film itself in the process of viewing it. Espen Aarseth describes texts like video games as ‘ergodic’, meaning that work has to be done to them in order to interpret them.Footnote ¹³ Video games and their music are not the only kinds of ergodic texts, but they are arguably the most widespread (or, at least, commercially successful) examples in today’s world. Furthermore, an interesting corollary of doing work to a text is that, as Hanna Wirman points out, the player can be considered as an author of the text (especially when considering the ‘text’ to be the played experience of the game).Footnote ¹⁴ So video game music is simultaneously just ‘regular old music’, and a relatively ‘new’ kind of text with potentials for expression and creativity that we have barely started to explore. The following case studies of musical examples from the Elder Scrolls games demonstrate both aspects of this duality.

Case Studies: Music of the Elder Scrolls Series

The Elder Scrolls video games are a popular series of high fantasy role-playing games published by Bethesda Softworks. The series was cemented as a fixture of video game culture with the third instalment, The Elder Scrolls III: Morrowind (2002, hereafter Morrowind), a position reinforced by the subsequent two instalments, The Elder Scrolls IV: Oblivion (2006, hereafter Oblivion) and The Elder Scrolls V: Skyrim (2011, hereafter Skyrim). In each of these three games, the player character starts the story as a prisoner of unknown heritage and ends it as the hero of the world, having risen to glory through battle, magic, theft, assassination, vampirism, lycanthropy, alchemy, commerce and/or homemaking. The unrivalled stars of the games, however, are their expansive, explorable and exquisite open worlds. Each game is set in a different province on a continent called Tamriel, and each provides a vast landscape filled with forests and grasslands, lakes and rivers, settlements and towns, mountains and ruins, along with caves and underground structures that serve as the loci of many missions. Finding the location of some missions requires a good deal of exploration, and free exploration can allow the player to stumble upon side quests, so the player can spend a lot of time in the great digital outdoors of Tamriel.

Accompanying Tamriel in games from Morrowind onwards are scores by composer Jeremy Soule. The scores are orchestral, keeping quite close to a Western cinematic aesthetic, but with occasional influences from northern European folk music. They infuse gameplay within their gameworlds and they link the gameworlds to each other, assisting the tapestry of the games’ stories, sights and sounds to envelop the player’s senses. The first case study below examines themes that run through the title music of each game, demonstrating the signifying power of novelty and nostalgia in musical introductions. The second case study shows how game music often acts as a sign for the player to take notice, and how the player’s learned lessons translate into personal signs that affect further gameplay.

Linking Games Through Musical Composition: Elder Scrolls Title Music

The scores for the Elder Scrolls games maintain an aesthetic congruence with the story world of each game. For example, Skyrim’s score frequently makes use of a male choir to invoke an aesthetic aligned with modern imaginations of Vikings, an overt expression of Scandinavian inspirations that are also built into the game’s musical foundations in references to Scandinavian nationalist symphonic music.Footnote ¹⁵ However, the scores of each game also maintain congruence with the scores of the other games using variations on several shared melodic and rhythmic themes, creating the potential for familiarity. This is a powerful mechanism for drawing the player into a video game series, and the Elder Scrolls games begin this work even before gameplay proper, as the title themes of each game exhibit some of the clearest thematic parallels. When a player plays Oblivion for several hundred hours, for instance, and then begins to play Morrowind because they want to experience the earlier game’s story as well, there are musical signs in Morrowind that suggest they are in the same gameworld that they enjoyed in Oblivion.

The title music from Morrowind uses a melodic theme that can be separated into three melody fragments, marked A, B and C in Example 13.1. Fragment A initiates a sense of action by moving quickly from the tonic to an emphasized minor third (a note which, as tonic of the relative major, suggests optimism in this context), and then repeating the action with further upward progress. The roundabout return to the tonic is like heading home after a mission with a momentary detour to pick a flower that you need to make a potion. Each ‘action’ in Fragment A is made up of a rhythmic figure of a minim on the first beat of the bar preceded by two quavers (which may be counted out as ‘three-and-one’) – a figure that occurs frequently in this theme and features prominently in the themes of subsequent games. Fragment B repeats the first ‘actions’ of Fragment A but then achieves and affirms the higher tonic. Fragment C triumphantly pushes on upwards to the relative major’s higher tonic – the fulfilment of the optimism of Fragment A – and then descends progressively to where this journey all began. Morrowind’s title music repeats this sequence three times, each with increasing volume and orchestral richness, each time attempting to draw the player deeper into a sense of adventure with which to better enjoy the imminent gameplay.

Example 13.1 Excerpt of title music melody from Morrowind

The primary melodic theme from Oblivion is a variation on the first two parts of Morrowind’s theme (each variation marked in Example 13.2 as A2 and B2). There is a marked increase in the pace of Oblivion’s version of the theme, which is somewhat disguised in this transcription by the shift to 6/8 time, a shift which results in an aesthetic more akin to that of a mainstream pirate film than of a high fantasy epic. The ‘three-and-one’ rhythmic figure is carried over, though here as ‘three-and-four/six-and-one’ figures. The additional pace of this theme gives the figures a somewhat more adventurous feel, bringing to mind swordplay or the sound of a horse’s galloping hooves. Notably, the primary melodic theme of Oblivion does not use Morrowind’s Fragment C at all – following Fragment B2, the title music instead enters a series of dramatic variations on Fragments A2 and B2. However, Fragment C does make a (somewhat veiled) appearance later in Oblivion’s title theme. Though Fragment C3 in Example 13.3 is a very simplified version of Fragment C, its rhythm and downward shape are both suggestive of the Morrowind fragment and fulfil a similar homeward purpose. The return to the tonic is, however, attained with much less certainty. With the benefit of hindsight, this could be seen as an allegory for the uneasy peace achieved at the resolution of Oblivion’s main quest line, which is revealed in Skyrim to have been very much temporary.

Example 13.2 Excerpt of title music melody from Oblivion

Example 13.3 Later excerpt of title music melody from Oblivion

Skyrim’s primary melodic theme is varied even more substantially from the original Morrowind material than was Oblivion’s primary theme, to the point that it may be more accurate to describe Skyrim’s primary melodic theme as a variation on Oblivion’s theme rather than that of Morrowind. It continues and expounds upon some of the ideas behind Oblivion’s variations, such as pacing, a 6/8 time signature and an air of adventurousness. Like that of Oblivion, the Skyrim title music initially uses only variations of Fragments A and B (marked as A4 and B4 respectively in Example 13.4), which it then repeats with slight variations (Fragments A4* and B4* in Example 13.4). The theme as a whole also has a much smaller pitch range than earlier themes, though it makes up for this in loudness. The main element that links Skyrim’s primary melodic theme back to Morrowind is the rhythmic figures or ‘actions’, as these are again featured prominently. It may even be argued that, at this stage, the rhythmic figures are the most significant shared elements tying the title music of Morrowind, Oblivion and Skyrim together, and the main musical sign to the player of commonality between the games (prior to the start of gameplay, at least).

Example 13.4 Excerpt of title music melody from Skyrim

However, later in Skyrim’s theme the original Morrowind theme returns in an almost pure state. Example 13.5 shows an excerpt from Skyrim’s title music including the two most prominent vocal parts and a trumpet accompaniment. The three original Fragments from Morrowind are reproduced very closely in Fragments A5, B5 and C5, most prominently in the lower vocal part and the trumpet line. This is arguably more of a choral and orchestral arrangement of the Morrowind theme than a variation upon it.

Example 13.5 Later excerpt of title music melody from Skyrim (with trumpet accompaniment)

From a semiotic point of view, the melody that I have called the Morrowind theme above is a sign, as are the Fragments A, B and C that constitute the theme, and even the rhythmic figures or ‘actions’ that recur throughout the theme and its variations. All of these fragments are capable of bearing meaning, both as small elements (sometimes called ‘musemes’)Footnote ¹⁶ and combined into larger structures. At the time of Morrowind’s release, these were symbols of the game Morrowind and, after gameplay, of the experience of playing Morrowind. Through the variations on the theme presented in the scores of the subsequent games, they have been extended into symbols of the experience of playing an Elder Scrolls game. The particular instances of the theme still refer to the game in which that instance is found (that is, Skyrim’s versions of the theme as sung by a ‘Nordic’ choir are still very much symbols of Skyrim), but they also refer laterally to the sequels and/or prequels of that game and to the series as a whole.

The latter example is a good illustration of this twofold signification, as it is functionally a part of the Skyrim theme but is also clearly derived from Morrowind. Its inclusion is, in fact, rather fascinating. The excerpt in Example 13.5 begins at 2 minutes 8 seconds into Skyrim’s title music, but Skyrim’s title menu is remarkably simple, presenting new players with only three options (start a new game, view the credits or quit). It is entirely possible that a first-time player could spend less than 20 seconds in the menu before starting a game and so could miss the close reproduction of the Morrowind theme entirely. Its inclusion allows Skyrim’s title music to powerfully signify both nostalgia for past games and the novelty of a fresh variation to long-term fans of the series,Footnote ¹⁷ while its delayed start prevents it from interfering with the title music’s essential function of priming seasoned Elder Scrolls players and neophytes alike for the imminent quest through established musical tropes of adventure and place.Footnote ¹⁸ Different groups of people may experience a different balance of nostalgia, novelty and expectation, which is both testament to the richness of the score’s weave of musical and configurative signs, and an example of how the semiotic content of music can be altered by the semiotic processes of gameplay.

Linking Experiences Through Musical Gameplay: Skyrim’s Action Music

Playing Skyrim takes you on many journeys. Some of these journeys give you a chance to discover new parts of the gameworld, new sights and sounds and characters and challenges. Other journeys are strangely similar to journeys past (all dungeons start to look the same after a while). The journeys are all somewhat alike, of course, on account of being made of the same basic elements: the parts that make up the game Skyrim. Building familiarity with gameplay mechanisms and signifiers early in gameplay can help the player engage with the game’s world and story. As long as familiarity avoids crossing the line into boredom, it can be a useful tool in the player’s arsenal for success. In a game like Skyrim, success can be achieved in large and small ways: finishing quests (both major and minor) and winning battles.

The data files of Skyrim contain eight full-length (between one- and two-minute) cues of combat music that are played during battles (there are also some derivative and/or smaller versions that are not considered here). These cues share several commonalities, including fast tempi, minor keys, high-pitch repeated staccato or accented notes and dramatic percussion that emphasizes the first beat of each bar – traits that align with the typical role of combat music, which is to increase the sense of action and tension in the sonic environment during combat scenarios.Footnote ¹⁹ Skyrim and its prequels are open-world games and the player can enter combat scenarios inadvertently; the pleasant music that accompanies exploration of the countryside can be suddenly interrupted by combat music, and this is sometimes (if the enemy has not yet been seen or heard) the first thing that alerts the player to danger. Helpfully, Skyrim’s combat music has a sign at the start of each combat music cue to point out what is happening. Figure 13.2 shows waveforms of the first five seconds of each of the main combat music cues. Each cue has an abrupt start that is, in most cases, louder than the ensuing music. Loud strikes on deep drums emphasize the shift from bucolic exploration to dangerous combat and prompt the player to respond accordingly.

Figure 13.2 Graphical representation in Audacity of waveforms of combat music from Skyrim (mixed down to mono for simplicity; cues listed by filename with Original Game Soundtrack titles indicated) [author’s screenshot]

The first time a player encounters combat music in Skyrim is during the introductory scenes of the game, when the player’s newly minted character has their head on the executioner’s block in the town of Helgen. Proceedings are interrupted by the appearance of the dragon Alduin, the prophesied consumer of the world, who flies in and attacks the town. The musical accompaniment to this world-changing event is the cue ‘mus_combat_04’ (named ‘Blood and Steel’ on the Original Game Soundtrack).Footnote ²⁰ Immediately preceding the initial crash of percussion in the music is Alduin’s ominous roar and visual appearance (see Figure 13.3). The sign linking combat music and danger could scarcely be more obvious.

Figure 13.3 Alduin’s appearance at Helgen (silhouette between tower and mountain), and moment of first combat music in Skyrim [author’s screenshot]

Having been introduced to combat music in this way, the player can more easily recognize combat scenarios later in Skyrim. Down the road from Helgen, the player may come across a wild wolf and once again hear combat music. A wolf is a smaller threat than a dragon, though likely still a test of the new character’s mettle. On hearing the initial blast of drums, the player can whip out their war axe or sword and take care of the situation. Yet, as previously mentioned, Skyrim is an open-world game and threats may come from any direction. The next wolf to attack may come from behind a tree (the music begins when an enemy becomes aware of the character’s presence, not the other way around), so the music may be the first sign of danger. It can operate in this way because it was initialized as denoting threats from the very start.

As the player learns more about how gameplay in Skyrim works, they may choose a play style that emphasizes strategic combat over brawn. The skill of ‘sneaking’ involves both moving quietly and staying out of sight in order to move close enough to an enemy to strike a critical blow, ideally without the enemy becoming aware of the character’s presence until it is too late. The player has learned that combat music starts when an enemy becomes aware of the character’s presence and starts to attack, so the absence of combat music becomes the sign of successful sneaking. Combat music during an attempt to sneak up on an enemy indicates to the player that the game is up – they have to run or fight, and quickly. Sneaking is a semiotic negotiation, wherein the player is trying to convey the right combination of signs to the game to get the desired combination of signs in return. It depends both on the game initializing combat music as a sign of danger, and on the player learning and subsequently manipulating the game’s signs for their own ends.
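The rule that the player learns here can be sketched in a few lines. This is an illustrative model rather than Skyrim's implementation: the names `Enemy`, `combat_music_should_play` and `read_music_sign` are hypothetical, and the only behaviour assumed is the one described above, namely that combat music begins when an enemy becomes aware of the character.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Enemy:
    name: str
    aware_of_player: bool = False


def combat_music_should_play(enemies: List[Enemy]) -> bool:
    # The configurative sign: music depends on the enemy's awareness of the
    # character, not on the player's awareness of the enemy.
    return any(enemy.aware_of_player for enemy in enemies)


def read_music_sign(sneaking: bool, music_playing: bool) -> str:
    # The player's learned interpretation of the presence or absence of music.
    if music_playing:
        return "detected"
    return "undetected" if sneaking else "no threat signalled"


wolf = Enemy("wolf")
quiet = read_music_sign(sneaking=True, music_playing=combat_music_should_play([wolf]))
wolf.aware_of_player = True
alarm = read_music_sign(sneaking=True, music_playing=combat_music_should_play([wolf]))
```

Under these assumptions `quiet` is "undetected" and `alarm` is "detected". The first function belongs to the game; the second stands in for the personal sign the player has built up through earlier play.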

Conclusion

Semiotics is rather like mathematics: it describes the world rather well, and while you can live most of your life without it, it can give you a different, clearer, more precise and rather exciting perspective on the things that you see every day. The semiotics of video game music shows how communication between the player and the game involves everything from physical movements, to the mental effects of emotions, to the nostalgic pull of past experiences. There are meanings placed into game music at its creation, and there are meanings you hear in video game music that come not from the composer, but from yourself. We communicate with the games we play as they communicate with us, weaving meanings into musical experiences and throughout gameworlds, and we do it almost without thinking. Semiotics is not hard – it is just an extension of that meaning-making process into conscious and deliberate thought, a description of video game music that helps make sense of the communication that is already happening.

14 Game – Music – Performance: Introducing a Ludomusicological Theory and Framework

Melanie Fritsch

Introduction

Since the late 2000s, the distinct field of ludomusicology has gained momentum. Reportedly, the neologism ludomusicology was coined by Guillaume Laroche and his fellow student Nicholas Tam, with the prefix ‘ludo’ referring to ludology, the study of games.Footnote ¹ In early 2008, Roger Moseley also used this term and introduced an additional dimension to the meaning:

Whereas Laroche’s deployment of the term has reflected a primary interest in music within games, I am more concerned with the extent to which music might be understood as a mode of gameplay. … Bringing music and play into contact in this way offers access to the undocumented means by which composers, designers, programmers, performers, players, and audiences interact with music, games, and one another.Footnote ²

In this chapter, I will outline the approach, developed at full length in my 2018 book, towards a distinct ludomusicological theory that studies both games and music as playful performative practices and is based on that broader understanding. This approach is explicitly rooted both in performance theory and in the musicological discourse of music as performance.Footnote ³ The basic idea is that with the help of a subject-specific performance concept, a framework can be developed which provides a concrete method of analysis. Applying this framework further allows us to study music games as well as music as a design element in games, and performances of game music beyond the games themselves. It is therefore possible to address all three ludomusicological subject areas within the frame of an overarching theory.

Performance – Performanz – Performativity

Performance theory is a complicated matter, due, firstly, to the manifold uses of the term ‘performance’ and, secondly, to the multiple intersections with two other concepts, namely performativity and Performanz (usually translated to ‘performance’ in English, which makes it even more confusing). Due to these issues and a ramified terminological history, the three concepts as used today cannot be unambiguously traced back to one basic definition, but to three basic models as identified by Klaus Hempfer: the theatrical model (performance), the speech act model of linguistic philosophy (performativity) and the model of generative grammatics (Performanz).Footnote ⁴ Despite offering new and productive views, the evolution of the three concepts and their intersections has led to misreadings and misinterpretations that I have untangled elsewhere.Footnote ⁵ For the purpose of this chapter, it is sufficient to be only generally aware of this conceptual history; I will therefore confine the discussion to some aspects that are needed for our understanding.

The concept of performativity was originally introduced by linguistic philosopher John L. Austin during a lecture series entitled How to Do Things with Words in 1955.Footnote ⁶ He suggested a basic differentiation between a constative and a performative utterance: While a constative utterance can only be true or false and is used to state something, performatives are neither true nor false, but can perform an action when being used (for example, ‘I hereby declare you husband and wife’). This idea was taken up, further discussed and reshaped within philosophy of language and linguistics.Footnote ⁷ Additionally, it was adopted in the cultural and social sciences, where it sparked the ‘performative turn’, and the power of performatives to create social realities was particularly discussed (for example, the couple actually being wedded after an authorized person has uttered the respective phrase in the correct context such as in a church or registry office). On its way through the disciplines, it was mixed or used interchangeably with the other two widely debated concepts of performance (as derived, for example, from cultural anthropology including game and ritual studies as e.g. conducted by Victor Turner) and Performanz (as discussed in generative grammatics e.g. by Noam Chomsky). In performance research as conducted by gender studies, performance studies or theatre studies scholars, the term performativity was not only referred to, but further politicized. A major question was (and still is) the relation between the (non-)autonomous individual and society, between free individual behaviour and the acting out of patterns according to (social) rules. 
Another major issue is the relationship between live and mediatized performances, as scholars such as Peggy Phelan,Footnote ⁸ Erika Fischer-LichteFootnote ⁹ or Philip Auslander have emphasized.Footnote ¹⁰ The evanescence and ephemerality of live performances and the creation of an artistic space that was seen to be potentially free from underlying schemata of behaviour and their related politics (e.g. in art forms such as Happenings as opposed to art forms such as the bourgeois theatre of representation) were set in harsh opposition to recorded performances that were returned into an ‘economy of reproduction’.Footnote ¹¹

When trying to understand games and music as performances, it is therefore necessary to state clearly which concept one is referring to. There are two options: Either one uses an existent concept as introduced by a specific scholar, or one needs to develop a subject-specific approach that is informed by the respective terminological history, and takes the entanglements with the other two concepts into account.

The goal of this chapter is to approach the three subject areas of ludomusicology from the perspective of performance studies using performance theory.Footnote ¹²

Analysing Games as Performances

The concept of ‘performance’ that we chose as our starting point has two basic dimensions of meaning. To illustrate these, a classic example from the field of music games is helpful: Guitar Hero. In this so-called rhythm-action game, which is played using a guitar-shaped peripheral,Footnote ¹³ the player is challenged to push coloured buttons and strum a plastic bar in time with the respectively coloured ‘notes’ represented on screen to score points and make the corresponding sound event audible. As Kiri Miller has observed, two basic playing styles have been adopted by regular players: ‘The score-oriented [players] treat these games as well-defined rule-bound systems, in which the main challenge and satisfaction lies in determining how to exploit the scoring mechanism to best advantage’.Footnote ¹⁴ On the other hand, she describes

Rock-oriented players [who] recognize that rock authenticity is performative. They generally do value their videogame high scores, but they also believe creative performance is its own reward. As they play these games, they explore the implications of their role as live performers of prerecorded songs.Footnote ¹⁵

This second playing style has particularly been highlighted in advertisements, in which we see ‘typical’ rock star behaviours such as making the ‘guitar face’,Footnote ¹⁶ or even smashing the guitar, thereby inviting potential players to ‘unleash your inner rock star’. Beyond hitting buttons to score points, these rock-oriented players are demonstrating further competencies: the knowledge of the cultural frame of rock music, and their ability to mimic rock-star-ish behaviours. As Miller highlights, ‘Members of both groups are generally performance-oriented, but they employ different performance-evaluation criteria’.Footnote ¹⁷

With the performance concept formulated by theorist Marvin Carlson we can specify what these different evaluation criteria are: ‘If we … ask what makes performing arts performative, I imagine the answer would somehow suggest that these arts require the physical presence of trained or skilled human beings whose demonstration of their skills is the performance.’Footnote ¹⁸ Carlson concludes that ‘[w]e have two rather different concepts of performance, one involving the display of skills, the other also involving display, but less of particular skills than of a recognized and culturally coded pattern of behavior’.Footnote ¹⁹ This can be applied to the two playing styles Miller has identified, with the former relating to Miller’s score-oriented players, and the latter to the rock-oriented players. Carlson further argues that performance can also be understood as a form of efficiency or competence (Leistung) that is evaluated ‘in light of some standard of achievement that may not itself be precisely articulated’.Footnote ²⁰ It is not important whether there is an external audience present, because ‘all performance involves a consciousness of doubleness, through which the actual execution of an action is placed in mental comparison with a potential, an ideal, or a remembered original model of that action’.Footnote ²¹ In other words: someone has to make meaning of the performance, but it does not matter whether this someone is another person, or the performer themselves.

In summary, this dimension of performance in the sense of Leistung consists of three elements: firstly, a display of (playing) skills, that secondly happens within the frame and according to the behavioural rule sets of a referent (in our case musical) culture, and that, thirdly, someone has to make meaning out of, be it the performer themselves or an external audience.

Additionally, the term performance addresses the Aufführung, the aesthetic event. This understanding of performance as an evanescent, unique occurrence is a key concept in German Theaterwissenschaft.Footnote ²² At this point it is worthwhile noting that the debate on games and/as performances reveals linguistic tripwires: The English term ‘performance’ can be translated to both Leistung and Aufführung in German, which allows us to clearly separate these two dimensions. But due to the outlined focus of German Theaterwissenschaft, when used in the respective scholarly German-language discourse, the English word ‘performance’ usually refers to this understanding of Aufführung. As Erika Fischer-Lichte has described it:

The performance [in the sense of Aufführung, M. F.] obtains its artistic character – its aesthetics – not because of a work that it would create, but because of the event that takes place. Because in the performance, … there is a unique, unrepeatable, usually only partially influenceable and controllable constellation, during which something happens that can only occur once[.]Footnote ²³

While offering a clear-cut definition, the focus on this dimension of performance in the German-language discourse has at the same time led to a tendency to neglect or even antagonize the dimension of Leistung (also because of the aforementioned politicization of the term). This concept of performance that was also championed by Phelan leads to a problem that has been hotly debated: If the Aufführung aspect of performance (as an aesthetic event) is potentially freed from the societal context, external structures and frames of interpretation – how can we make meaning out of it?

Therefore, ‘performance’ in our framework describes not only the moment of the unique event; it is understood more broadly, in a way that purposely includes both dimensions. The term connects the Aufführung with the dimension of Leistung by including the work and training (competence acquisition) of the individual that is necessary to, firstly, enable a repeatable mastery of one’s own body and all other involved material during the moment of a performance (Aufführung) and, secondly, make the performance meaningful, either in the role of an actor/performer or as a recipient/audience, or as both in one person. The same applies to the recipient, who has to become competent in perceiving, decoding and understanding the perceived performance in terms of the respective cultural contexts. Furthermore, a performance can leave traces by being described or recorded in the form of artefacts or documents, which again can be perceived and interpreted.Footnote ²⁴

But how can we describe these two dimensions of performance (the aesthetic Aufführung and the systematic Leistung), when talking about digital games?

Dimensions of Game Performance: Leistung

In a comprehensive study of games from the perspective of competencies and media education, Christa Gebel, Michael Gurt and Ulrike Wagner distinguish between five competence and skill areas needed to successfully play a game:

1. cognitive competence,
2. social competence,
3. personality-related competence,
4. sensorimotor skills,
5. media competence.Footnote ²⁵

In this system, the area of cognitive competence comprises skills such as abstraction, drawing conclusions, understanding of structure, understanding of meaning, action planning and solving new tasks and problems, all of which can easily be identified in the process of playing Guitar Hero. In an article from 2010, Gebel further divides cognitive competence into formal cognitive and content-related cognitive competencies (Figure 14.1).Footnote ²⁶

Figure 14.1 The area of cognitive competency following Gebel (2010)

Formal cognitive competencies include skills such as attention, concentration or the competence of framing (which in the case of Guitar Hero, for example, means understanding the difference between playing a real guitar and the game).

Content-related cognitive competencies concern using and expanding one’s existing game- and gaming-specific knowledge, and are further subdivided into areas of instrumental, analytical and structural knowledge. Instrumental knowledge includes skills such as learning, recognizing and applying control principles. In short, instrumental knowledge concerns the use of the technical apparatus. For example, that might involve dealing with complex menu structures.

Analytical and structural knowledge means, firstly, that players quickly recognize the specific features and requirements of a game genre and are able to adjust their actions to succeed in the game. For example, experienced music game players will immediately understand what they need to do in a game of this genre for successful play, and how UI (User Interface) and controls will most likely be organized.

However, as we have seen in the Guitar Hero example, content-related cognitive competency does not only include game- and gaming-specific knowledge, but also knowledge from a wider cultural context (in our case rock music culture). This area of competence, the knowledge of the cultural contexts and practices to which a game refers or which it uses in any form, is not explicitly addressed in Gebel’s model, but can be vital for understanding a game. Therefore, we need to add such general cultural knowledge to the category of content-related cognitive skills (Figure 14.2).

Figure 14.2 Extended cognitive competencies model

The area of social competency includes social skills such as empathy, ambiguity tolerance or co-operation skills as well as moral judgement. This area is closely linked to personality-related competence, which includes self-observation, self-criticism/reflection, identity preservation and emotional self-control. Sensorimotor competence contains skills such as reaction speed and hand–eye co-ordination. The fifth area of competence relates to media literacy and explicitly addresses digital games as a digital medium. This comprises the components of media-specific knowledge, autonomous action, active communication and media design.

Regarding this last aspect, considerations from the discourse on game literacy particularly help to understand the different roles players adopt during gameplay.Footnote ²⁷ In this context, we have already stated that players can adopt either the role of an actor/performer or that of a recipient/audience of a game, or even both at the same time (Carlson’s double consciousness). In the case of Guitar Hero, players can watch each other play or do without any external audience at all. When playing alone at home, they can make meaning out of their performances themselves by placing the actual performance in mental comparison with their idea of the ‘guitar hero’ as a model.

But players do not always play given games as intended by the designers.Footnote ²⁸ In fact, oftentimes they either try to find out what is possible in the game, or even look for ways in which to break the rules and do something completely different than the intended manner of playing. In Ultima IX, for example, players found out that the game allowed for building bridges using material such as bread, corpses or brooms. Subsequently, some of them started bridge-building competitions, thereby playfully (mis)using the technology and the given game as the material basis for staging their own gameplay performances by following their own rule set.Footnote ²⁹ Eric Zimmerman addresses such practices when he includes ‘the ability to understand and create specific kinds of meaning’ in his concept of gaming literacy.Footnote ³⁰ From his point of view, players must be literate on three levels: systems, play and design, because ‘play is far more than just play within a structure. Play can play with structures. … [B]eing literate in play means being playful – having a ludic attitude that sees the world’s structures as opportunities for playful engagement’.Footnote ³¹ He emphasizes that he explicitly names his concept gaming literacy

because of the mischievous double-meaning of “gaming”, which can signify exploiting or taking clever advantage of something. Gaming a system means finding hidden shortcuts and cheats, and bending and modifying rules in order to move through the system more efficiently – perhaps to misbehave, but perhaps to change that system for the better.Footnote ³²

Hence, gaming literacy includes the competencies and knowledge to playfully explore and squeeze everything out of a given structure, be it successfully finishing a game by playing it as intended, or by exploiting emergent behaviour such as the creative use of bugs or other types of uncommon play to create one’s own game. It further entails breaking the system, and making creative use of singular components, for example by hacking. That way, players can become designers themselves. But in order to do so, it is necessary to be competent regarding design. In his reasoning, Zimmerman relies on his and Katie Salen’s definition of design as ‘the process by which a designer creates a context to be encountered by a participant, from which meaning emerges’.Footnote ³³ A designer might be one single person or a group, such as a professional design team, or in the case of folk or fan cultural gaming practices, ‘culture at large’.Footnote ³⁴ That way, players do not just inhabit the traditional roles of performer and/or audience, but can even themselves become designers of new contexts, ‘from which meaning emerges’ by playing with the given material. But this play does not occur randomly. Instead, the emerging participatory practices happen within the frames of players’ own cultural contexts according to the specific rule systems that are negotiated by the practitioners themselves. 
Breaking the rules of such a practice could, at worst, lead to dismissal by this expert audience, as I have outlined elsewhere.Footnote ³⁵ For example, when writing a fan song about a game, it is important to have reasonably good musical skills (regarding composing, writing lyrics and performing the song), and technological skills (knowing how to create a decent recording and post it online), but also to be literate both in the game’s lore and its respective fan culture as well as in gaming culture in general, which can for example be demonstrated by including witty references or puns in the music or the lyrics.Footnote ³⁶ A similarly competent audience on video platforms such as YouTube will most certainly give feedback and evaluate the performance regarding all these skills.

With this framework, the dimension of Leistung can be helpfully broken down into different areas of competence. In particular, the complex area of cognitive competencies and knowledge can be structured for analysis by helping to identify and address specific skill sets.

Dimensions of Game Performance: Aufführung

As game designer Jesse Schell has stated, ‘[t]he game is not the experience. The game enables the experience, but it is not the experience’.Footnote ³⁷ On a phenomenological level, a game (be it digital or otherwise) only manifests itself during the current act of play, the unique event, as a unique ephemeral structure (the Aufführung) that can be perceived and interpreted by a competent recipient. Considering games in play as a gestalt phenomenon against the backdrop of gestalt theory may be helpful.Footnote ³⁸ A useful concept that applies gestalt theory for a subject-specific description of this structure can be found in Craig Lindley’s concept of a gameplay gestalt: ‘[I]t is a particular way of thinking about the game state, together with a pattern of perceptual, cognitive, and motor operations. … [T]he gestalt is more of an interaction pattern involving both the in-game and out-of-game being of the player’.Footnote ³⁹ So, a gameplay gestalt is one founded on a process, centred on

the act of playing a game as performative activity based on the game as an object. On the part of the players, any information given by the game that is relevant for playing on the level of rules as well as on the level of narrative, need interpretation, before carrying out the appropriate bodily (re)action,Footnote ⁴⁰

thereby bringing the gestalt into existence and enabling the experience. But how can we describe experiences beyond personal impressions?

Regarding the aesthetic dimension of Aufführung, a slightly adjusted version of Steve Swink’s concept of game feel can be fruitfully applied. He defines game feel as ‘Real-time control of virtual objects in a simulated space, with interactions emphasized by polish’.Footnote ⁴¹ He emphasizes that playing a game allows for five different experiences:

The aesthetic sensation of control,
The pleasure of learning, practising and mastering a skill,
Extension of the senses,
Extension of identity,
Interaction with a unique physical reality within the game.Footnote ⁴²

In other words, game feel highlights aspects of embodiment and describes the current bodily and sensual experience of the player as a co-creator of the Aufführung (in the sense of Fischer-Lichte). Therefore, the concept can be used to describe the aesthetic experience of performing (aufführen) during the actual execution (ausführen) offered by the game as a concept of possibilities. Taking up the example of Guitar Hero: Players learn to handle the plastic guitar to expertly score points (the pleasure of learning, practising and mastering a skill), thereby controlling the musical output (the aesthetic sensation of control) and incorporating the guitar-shaped controller into their body scheme (extension of the senses). When playing the game, they do not just press buttons to score points, but also perform the ‘rock star fantasy’, embodying the guitar hero (extension of identity). This works in any case: No matter which playing style players choose regarding the dimension of Leistung, during the performance in the sense of Aufführung both can be matched with the ‘guitar hero’ as a cultural pattern of behaviour – the button-mashing virtuoso or the rocking-out show person. Even the practising mode is tied to the rock star fantasy, when players struggle through the journey from garage band to stadium-filling act by acquiring necessary playing skills (the pleasure of learning, practising and mastering a skill). In other words, during every possible performance of the game (including practising) the emergent music-based gameplay gestalt Footnote ⁴³ is in all cases tied to the promised subjective experience project of ‘unleash[ing] your inner rock star’, and is in this way charged with meaning. Even an ironic, exaggerated approach can work in this case, as this is also an accepted pattern in rock and heavy metal culture as comedy bands such as JBO or Tenacious D demonstrate.

In summary: Regarding digital game playing as performance, the dimension of Aufführung can helpfully be addressed with Swink’s concept of game feel, and broken down for description into his five experiential categories. It further becomes clear that the dimensions of Leistung and Aufführung are interlocked. The respective competencies and skill sets need to be acquired for the competent creation and understanding of an aesthetically pleasing gameplay gestalt.

As it is the goal of this chapter to introduce my basic ludomusicological framework that studies both games and music as playful performative practices, we need to address the two dimensions of performance regarding music in the next step.

Music as Performance: Leistung

As we have seen, for the analysis of a game such as Guitar Hero, a broader understanding of music beyond that of a written textual work is vital. In this context, the theorization about music as performance as conducted by researchers such as Nicholas Cook,Footnote ⁴⁴ Christopher Small,Footnote ⁴⁵ Carolyn AbbateFootnote ⁴⁶ and others comes to mind. In this discourse the idea of music being a performative art form and a social practice is emphasized. But how should this be described in analysis?

In his book Markenmusik, Stefan Strötgen proposes a terminological framework that approaches music as a sounding but also as a socio-cultural phenomenon.Footnote ⁴⁷ The basis for his argument is a five-tier model by music-semiotician Gino Stefani responding to the question ‘in what ways does our culture produce sense with music?’Footnote ⁴⁸ Stefani differentiates between the general codes (the psychoacoustic principles of auditory perception), and in a second step, the social practices on which basis ‘musical techniques’ are developed as organizational principles for sounds. Strötgen summarizes that with ‘social practices’ Stefani describes a ‘network of sense’ on which ‘the relationships between music and society or, rather, between the various social practices of a culture’ are built.Footnote ⁴⁹ He further concretizes Stefani’s model: Different musical styles do not just sound different. Music also becomes meaningful through its use in society by linking it with other content such as pictorial worlds, shared ideas about the philosophy behind it, performer and fan behaviours, body images, or social rule sets regarding who is ‘allowed’ to perform which music in what context and so on, thus becoming part of a social network of meaning. These meanings have to be learned and are not inherent in the music itself as text or aural event. These three factors (general codes, social practices, musical techniques) then provide the basis for the emergence of ‘styles’, and in the end of a concrete-sounding ‘opus’ performed by competent performers and understood by competent recipients.Footnote ⁵⁰

At this point, one aspect needs to be highlighted that we have addressed regarding performance theory in general and that is also vital for the debate on music as performance: the politics of (musical) play and reception. This aspect is a core facet of Small’s concept of musicking:

A theory of musicking, like the act itself, is not just an affair for intellectuals and ‘cultured’ people but an important component of our understanding of ourselves and of our relationships with other people and the other creatures with which we share our planet. It is a political matter in the widest sense. If everyone is born musical, then everyone’s musical experience is valid.Footnote ⁵¹

Musicking, defined as taking part ‘in any capacity, in a musical performance, whether by performing, by listening, by rehearsing or practising, by providing material for performance (what is called composing), or by dancing’Footnote ⁵² demands a radical de-hierarchization of musical participation not only in terms of performers, but also in terms of recipients and designers. As Abbate and Strötgen also stress, audiences contribute their own meanings and interpretive paths when attending a musical performance. Subsequently, not only is everyone allowed to musick, but everyone is also allowed to talk about it and step into the role of designer by contributing in whatever capacity to the social discourse, for example, around a certain piece and how it has to be performed. Musicking means moving away from the idea of the one correct meaning that a composer has laid down in the musical text, which a passive recipient understands either correctly or not at all. This aspect of musical participation and music as a (recorded) commodity is also discussed by Cook and Abbate. According to Cook, a constant social (and therefore political) renegotiation of what, for example, a string quartet is supposed to be, which instruments are allowed in it, how to talk about it, how a specific one is to be performed and so on is the very core of how music-related social practices are established.Footnote ⁵³ But similar to social practices, when and where we have the right to musick and play games is constantly being renegotiated against the backdrop of the respective cultural discourses.Footnote ⁵⁴ Regarding music and the right to play it and play with it, we have seen this prominently in the sometimes harsh debates around Guitar Hero, when musicians such as Jack White or Jimmy Page strongly opposed such games and advocated ‘kids’ learning to play a real guitar instead of playing the game. 
Looking at the social practices of rock and its discourses of authenticity and liveness, it is quite obvious why they rejected the game: They accused game players of finagling an experience (‘Unleash your inner rock star’) they have no right to experience, since they have avoided the years of blood, sweat and tears of learning to play a real guitar.Footnote ⁵⁵

Music as Performance: Aufführung

Regarding the dimension of Aufführung, the aesthetic dimension, again the game feel concept as explained above comes in handy. In her seminal text Carolyn Abbate makes the case for a ‘drastic musicology’ that also includes personal experiences, as well as bodily and sensual experiences that occur during performances.Footnote ⁵⁶ In our context, Abbate’s example, in which she describes her experience of playing ‘Non temer, amato bene’ from Mozart’s Idomeneo at the very moment of performance, is quite interesting: ‘doing this really fast is fun’ is not just a description of her momentary playing experience, but also refers to the fact that she has acquired the necessary skills, firstly, to play this piece of music at all and, secondly, to play it fast.Footnote ⁵⁷ This points to the fact that both dimensions of performance are not only linked in terms of knowledge but also in terms of embodiment. In other words: In order to play a piece of music or, more generally speaking, create a musical performance, not only the acquisition of the respective knowledge is required, but equally the above-mentioned sensorimotor and other playing competencies in order to create a satisfying Aufführung in the form of a coherent gestalt, the sounding ‘opus’. Her description of musical movement with ‘here comes a big jump’Footnote ⁵⁸ also follows this logic: on the one hand, she hereby refers to an option in the written work understood as a concept of possibilities that allows for this moment, which she can anticipate and master (the aesthetic sensation of control) thanks to her already acquired skills and knowledge (the pleasure of learning, practising and mastering a skill). On the other hand, she also describes her playing experience regarding the dimension of aesthetics using a metaphor of embodiment (extension of identity).

Table 14.1 summarizes the terminological framework we have developed so far, both for games and music understood as playful performative practices.

Table 14.1 Overview of the ludomusicological framework

Competence (Leistung)

1. Concepts of media education
Five areas of competence and skills^a
   Social skills
   Personal competence
   Sensorimotor skills
   Media competence
   Cognitive competence, following Gebel, subdivided into:^c
      1. Formal cognitive skills
      2. Content-related cognitive competencies
         a. Instrumental knowledge
         b. Analytical knowledge
         c. Structural knowledge
         d. + Cultural knowledge

2. Game Literacy/Gaming Literacy
Literacy model, after Zagal and Zimmerman.^e Areas of competence enable the player to perform in several roles
   1. As actor
   2. As recipient
   3. As designer

Presentation (Aufführung)

Slightly modified game-feel model, after Steve Swink^b
= the aesthetic experience of performing (aufführen) during the execution (ausführen) offered by the game as concept of possibilities; it highlights the aspect of embodiment and describes the current bodily and sensual experience of the player as a co-creator of the performance (in the sense of Aufführung, after Fischer-Lichte)^d

Five experiential dimensions
   1. The aesthetic sensation of control
   2. The pleasure of learning, practising and mastering a skill
   3. Extension of the senses
   4. Extension of identity
   5. Interaction with a unique physical reality within the game

a Gebel, Gurt and Wagner, Kompetenzförderliche Potenziale populärer Computerspiele.

b Swink, Game Feel.

c Gebel, ‘Kompetenz Erspielen – Kompetent Spielen?’

d Fischer-Lichte, Ästhetik des Performativen.

e Zagal, Ludoliteracy and Zimmerman, ‘Gaming Literacy.’

With the Guitar Hero example, we have already seen how this model can be usefully applied when studying music games. But what about the other two areas of ludomusicology, namely music as a design element in games, and game music beyond games?

Music as a Design Element in Games: Super Mario Bros.

Super Mario Bros. (1985) is one of the best-selling games of all time and started a long-running franchise that has been sold and played worldwide. Further, the title has been extensively studied in ludomusicological writing.Footnote ⁵⁹ The game was designed under the lead of Shigeru Miyamoto and was distributed in a bundle with the Nintendo Entertainment System in North America and Europe, making it one of the first game experiences many players had on this console.

The staging of Super Mario Bros. is not intended to be realistic, either in terms of its visual design, which presents a colourful comic-styled 8-bit diegetic environment, or in terms of what the player can do. As I have stated before,

I describe the performative space in which all actions induced by the game take place, including those in front of the screen, as the gameworld. In this gameworld, the game’s narrative, the specific and unique sequence of fictional and non-fictional events happening while playing the game, unfolds. In order to address the world, which can be seen on screen, and set this apart from the gameworld, I will henceforth refer to this as the diegetic environment. I use the term diegesis here in the sense of Genette: ‘diegesis is not the story, but the universe in which it takes place’[.]Footnote ⁶⁰

Super Mario Bros.’ diegetic environment presents the player with manifold challenges such as abysses, multifarious enemies, moving platforms and other traps and obstacles. In the words of Swink,

In general, the creatures and objects represented have very little grounding in meaning or reality. Their meaning is conveyed by their functionality in the game, which is to present danger and to dominate areas of space. … [W]e’re not grounded in expectations about how these things should behave.Footnote ⁶¹

The game does not offer an introduction or a tutorial; instead, gameplay starts immediately. That way, players have to understand the specific rules of the abstract diegetic environment, and either already possess or acquire all the necessary cognitive competencies and instrumental knowledge required to master all challenges. The avatar Mario also possesses specific features, such as his three states (Mario, Super Mario, Fire Mario) or the ability to execute high and wide jumps, which never change through the entire game. Regarding the dimension of Leistung, in addition to developing sensorimotor skills such as reaction speed and hand–eye co-ordination, learning and applying a whole range of cognitive skills and game-related knowledge is therefore necessary to successfully master the game. Formal cognitive (for example, attention, concentration, memory) and personality-related skills (coping with frustration and emotional self-control) are vital. The experience offered during the Aufführung while manoeuvring Mario through the diegetic environment featuring its own rules (interacting with a unique physical reality within the game) can be understood as the player discovering and mastering challenges by increasing not the avatar’s skill set, but their own playing skills (the pleasure of learning, practising and mastering a skill). This is done through embodiment by getting a feel for Mario (extending the player’s own identity) and the surrounding world (extending their senses). For example, as Steve Swink notes, ‘Super Mario Brothers has something resembling a simulation of physical forces … there are in fact stored values for acceleration, velocity and position. … If Mario’s running forward and suddenly I stop touching the controller, Mario will slide gently to a halt’.Footnote ⁶² A skilled player who has mastered the controls (experiencing the aesthetic sensation of control) and developed a feel for Mario’s movements (extending the player’s identity) can make use of this, for example by running towards a low-hanging block, ducking and gently sliding under it, thereby getting to places that are normally only accessible with a small Mario.

But how do music and gameplay come together? Composer Koji Kondo created six basic pieces for the game, namely the Overworld, Underworld, Starman and Castle themes, the Underwater Waltz and the ending theme. As detailed musicological analyses and explanations of the technological preconditions and subsequent choices can be found elsewhere, as noted above, I will confine myself here to highlighting how the music and sound effects are perfectly designed to create a coherent gameplay gestalt.

In an interview Kondo explains that, based on the specifications initially communicated to him, the Underwater Waltz was the first melody he wrote for the game, since it was easy for him to imagine how underwater music must sound.Footnote ⁶³ This indicates that for certain representations on the visual level such as water, swimming and weightlessness, cultural ideas exist on how these should be underscored, namely in the form of a waltz. Kondo further states that a first version of the Overworld theme, which he had also written on the basis of the specifications (formulated as ‘Above ground, Western-sounding, percussion and a sound like a whip’), did not work. He explains that he had tried to underscore the visual appearance of the game, namely the light blue skies and the green bushes. But when Kondo played the prototype, the melody did not fit Mario’s jumping and running movements, and therefore felt wrong. As players have only a specific amount of time to succeed in a level, the music as well as the level design encourage them to not play carefully, but rather to adopt a ‘Fortune Favours the Bold’Footnote ⁶⁴ playing style. Kondo’s second approach, now based on the actual embodied experience of gameplay, reflected this very game feel. As Schartmann puts it:

Instead of using music to incite a particular emotional response (e. g., using faster notes to increase tension), he tried to anticipate the physical experience of the gamer based on the rhythm and movement of gameplay. … In essence, if music does not reflect the rhythm of the game, and, by extension, that of the gamer, it becomes background music.Footnote ⁶⁵

That way, the distinct laws and rhythms of the diegetic world, and the way in which the player should move Mario through them using their own learned skills and the subsequent ‘extension of the senses’, are both conveyed acoustically by the music and by the sound effects, including through techniques such as movement analogy, better known as Mickey-Mousing (the ‘ka-ching’ of the coins, the rising jumping sound, the booming sound of Bullet Bills, etc.). In addition to game-specific competencies, general cultural knowledge from other media types, namely of how cartoons and movies are underscored, helps to develop a quick understanding of the inner workings of the abstract world presented on screen.

Kondo conceptualized the soundtrack in close connection with the game feel evoked by the game performance, understood as the player’s sensual and physical experience as they follow the intended scheme of play that is encouraged by the level design. The game’s music sounds for as long as the player performs the game, which is determined by their skill level (unskilled players may have to restart many times over) and by their actions. The sounding musical performance is therefore directed by their gameplay and their (game-) playing skills. Playing Super Mario Bros. can be understood as a musical performance; firstly, when understood in terms of Leistung, the generation of this performance requires not only knowledge but also the physical skills needed to play the game at all; and secondly, it requires the ability to adopt the intended speed and playing style encouraged by the level design, play time limit and music.Footnote ⁶⁶ All elements of the staging (gameplay, diegetic world, level design, music, etc.) are from a functionalist point of view designed to support game feel in the sense of an interlocking of both dimensions of performance, so that a coherent gameplay gestalt can be created during the current act of play, which involves the player in the gameworld as defined above. The player constantly shifts between the role of performer and that of recipient and continuously develops their respective skills in the course of the game’s performance.

Music Beyond Games: Chip Music as a ‘Gaming-a-System’ Practice

Since the late 1970s, game sounds and music have found an audience outside the games themselves. They can be heard in TV shows, commercials, films and other musical genres. In Japan the first original soundtrack album of Namco titles was released in 1984, and the first dedicated label for chip music – G.M.O. Records, as an imprint of Alfa Records – was founded in 1986.Footnote ⁶⁷

As explained in the introduction to Part I of this book, thanks to the coming of affordable home computers and video game systems, a participatory music culture emerged at the same time. Instead of simply playing computer games, groups banded together to remove the copy protection of commercial titles. Successfully ‘cracked’ games were put into circulation within the scene and marked by the respective cracker group with an intro so that everyone knew who had cracked the title first. Due to the playful competition behind this, the intros became more and more complex and the groups invented programming tricks to push the existing hardware beyond its boundaries. As Anders Carlsson describes:

Just as a theatre director can play with the rules of what theatre is supposed to be, the demosceners and chip music composers found ways to work beyond technological boundaries thought to be unbreakable. … These tricks were not rational mathematical programming, but rather trial and error based on in-depth knowledge of the computer used. Demosceners managed to accomplish things not intended by the designers of the computers.Footnote ⁶⁸

Besides competition, a further goal can be described as the independent accumulation of knowledge and competencies through playful exploration and ‘gaming a system’ in the sense of Zimmerman, but according to a specific rule set negotiated in the community. The goal of this self-invented meta-game, with its personal aim of ‘technology mastery’ (the pleasure of learning and mastering a skill, seamlessly merging into an aesthetic sensation of control), is to optimize the staging – in the sense of an exhibition – of skills during the actual performance of the songs. These skills include knowing the specific sound characteristics and technological features of the respective sound chip, as well as optimizing and generating sounds that had been thought to be impossible. The finished composition, the concrete opus, sounds as a musical performance when the practitioner plays their work themselves or when someone else performs their composition based on the staging of the technical apparatus (that is, the computer technology). Such a media-mediated music performance can in turn be evaluated by a competent recipient who is versed in the evaluation rules negotiated in the chip music scene. Aesthetic notions relate not only to the composition and the sounding opus, but also to the elegance or extravagance of the staging, that is, the use of hardware and software. Further, practitioners constantly renegotiate how a well-made artefact is to be created, which affordances it must fulfil, how it should be judged and so on.

Additionally, demosceners and chip musicians do not keep the knowledge they have gained through ‘gaming a system’ to themselves, but share their insights with the community, for example by publishing their programming routines, or self-created accessible software such as Hülsbeck’s Soundmonitor or the Soundtracker 2, a hacked, improved and freely distributed version of Karsten Obarski’s Soundtracker, created by the Dutch hacker Exterminator.Footnote ⁶⁹ As the songs produced with these trackers were saved in the MOD format, they are effectively open books for competent users, as Anders Carlsson describes:

[I]t was possible to look into the memory to see how the songs worked in terms of the routine that defines how to access the sound chip, and the key was to tweak the chip to its fullest. … [I]t was possible for users to access the player routines, along with instruments, note data, effects and samples. Even though these elements could be hidden with various techniques, there would usually be a way to hack into the code[.] As long as a user has the tracker that was used for making the song, it is possible to load the song into an editor and gain total control of the song.Footnote ⁷⁰

By creating and openly sharing their tools and knowledge, chip musicians actively practised the democratization of musical participation advocated by Small. The resulting music performances were also shared within the community, increasingly distributed free of charge via the Internet and offered to others for use or further processing, thereby undermining the idea of music as a commodity to be paid for. ‘Gaming a system’ therefore takes place not only in relation to the hardware and software; it also circumvents the usual distribution and performance rules of Western music, thereby creating new genres, giving musicians access to musical production that they would otherwise have been denied, and allowing everyone to musick. Bizzy B., for example, the ‘godfather of breakbeat hardcore and drum ‘n’ bass, responsible for overseeing hardcore’s transition into jungle’ (as the interviewer Mat Ombler describes him), explains the role of the Amiga and the tracker OctaMED, written by Teijo Kinnunen, in an interview:

It allowed me to have a doorway into the music business. … I wouldn’t have been able to afford the money for a recording studio; I wouldn’t have been able to practice music production and take my music to the next level if it wasn’t for the Commodore Amiga.Footnote ⁷¹

Since the early days, the community has diversified into what Leonard J. Paul calls the ‘old school’ and the ‘new school’.Footnote ⁷² Whereas the old school practitioners focus on using the original systems or instruments to create their performances, the new school evolved with the new technological possibilities of 1990s personal computers, as well as the trackers and other tools developed by the scene itself, and is interested in live performances in which the performers themselves appear on stage. They further explore new hybrid forms by combining the chip sound – which does not have to be created with the original systems; instead new synthesizers are used – with other instruments,Footnote ⁷³ thereby gaming a third system by freely combining the systems of different musical genres.

Chip musicians invented their own meta-game with the goal of creating music performances with the given materials. Participation is only restricted in that they must have access to the respective technology and must be literate, firstly in using the technology and secondly in the socially negotiated rules of the meta-game; this demands all five above-mentioned areas of competence outlined by Gebel, Gurt and Wagner,Footnote ⁷⁴ and the acquisition of these respective skills to create chip music performances. Over time, these rules have changed, and the new school have developed their own social practices and rules for ‘gaming a system’.

Conclusions

Using this terminological framework, we can describe and analyse the design-based use of music as an element of staging, and the resulting relationship between players, music and game, in the context of play on the two dimensions of performance. It further enables a nuanced distinction between the performance dimensions of Leistung and Aufführung, as well as a differentiation between music- and game-specific competencies regarding the Leistung dimension. Understood as forms of performance, both music and games can also be addressed as social acts, and aspects of embodiment can be taken into account thanks to the adapted game feel concept. Used in addition to other approaches introduced in this book, this framework can help to describe specific relationships and highlight aspects of these in order to further our understanding of games and music.

Footnotes

9 Music Games

¹ Mario Dozal, ‘Consumerism Hero: The “Selling Out” of Guitar Hero and Rock Band’, in Music Video Games: Performance, Politics, and Play, ed. Michael Austin (New York: Bloomsbury, 2016), 127–52.

² Andrew Webster, ‘Roots of Rhythm: A Brief History of the Music Game Genre’, Ars Technica, 3 March 2009, accessed 8 April 2020, http://arstechnica.com/gaming/2009/03/ne-music-game-feature.

³ Joanna Demers, ‘Dancing Machines: “Dance Dance Revolution”, Cybernetic Dance, and Musical Taste’, Popular Music 25, no. 3 (2006): 401–14; Jacob Smith, ‘I Can See Tomorrow in Your Dance: A Study of Dance Dance Revolution and Music Video Games’, Journal of Popular Music Studies 16, no. 1 (2004): 58–84.

⁴ Kiri Miller, ‘Gaming the System: Gender Performance in Dance Central’, New Media and Society 17, no. 5 (2015): 939–57 and Miller, Playable Bodies: Dance Games and Intimate Media (New York: Oxford University Press, 2017).

⁵ Dana Plank, ‘Mario Paint Composer and Musical (Re)Play on YouTube’, in Music Video Games: Performance, Politics, and Play, ed. Michael Austin (New York: Bloomsbury, 2016), 43–82.

⁶ James Mielke, ‘Korg DS-10 Review’, 1up.com, 2008, accessed 8 April 2020, https://web.archive.org/web/20160617012539/http://www.1up.com/reviews/korg-ds-10; Christopher Ewen, ‘KORG DS-10 Review’, GameZone, 2009, accessed 8 April 2020, https://web.archive.org/web/20100225131407/http://nds.gamezone.com/gzreviews/r35919.htm.

⁷ Eriq Gardner, ‘EMI Sues Over Def Jam Rapstar Video Game’, The Hollywood Reporter, 2012, accessed 8 April 2020, www.hollywoodreporter.com/thr-esq/emi-def-jam-rapstar-video-game-lawsuit-305434.

⁸ See David Roesner, Anna Paisley and Gianna Cassidy, ‘Guitar Heroes in the Classroom: The Creative Potential of Music Games’, in Music Video Games: Performance, Politics, and Play, ed. Michael Austin (New York: Bloomsbury, 2016), 197–228.

⁹ Donald M. Taylor, ‘Support Structures Contributing to Instrument Choice and Achievement Among Texas All-State Male Flutists’, Bulletin of the Council for Research in Music Education 179 (2009): 45–60.

¹⁰ Miller, Playable Bodies, 84.

¹¹ See Steven B. Reale, ‘Transcribing Musical Worlds; or, Is L.A. Noire a Music Game?’, in Music in Video Games: Studying Play, ed. K. J. Donnelly, William Gibbons and Neil Lerner (New York: Routledge, 2014), 77–103.

10 Autoethnography, Phenomenology and Hermeneutics

¹ Author’s video, Hearing VGM, ‘Bastion Clip 1’, 19 September 2014, www.youtube.com/watch?v=pnqRWbJQIy8.

² David Bordwell and Kristin Thompson, Film Art: An Introduction, 8th ed. (New York: McGraw-Hill, 2008), 192.

³ Author’s video, Hearing VGM, 19 September 2014, ‘Bastion Clip 2’, www.youtube.com/watch?v=7bkIExk4PGI.

⁴ Wilhelm Windelband and Guy Oakes, ‘History and Natural Science’, History and Theory 19, no. 2 (1980): 165–8 at 167.

⁵ ‘The noetic-noematic structure of consciousness’, in Husserl’s terms. See, for example, Dermot Moran, Introduction to Phenomenology (London; New York: Routledge, 2000), 16.

⁶ See, for example, Rudolf A. Makkreel, ‘Wilhelm Dilthey’, in The Blackwell Companion to Hermeneutics, ed. Niall Keane and Chris Lawn (Chichester: Wiley, 2016), 378–82 at 378; Paul H. Fry, Theory of Literature (New Haven; London: Yale University Press, 2012), 27.

⁷ In this tradition, ‘The Intentional Fallacy’ by Wimsatt and Beardsley is usually seen as an important manifesto for a kind of interpretation that denies the authority of an author. See W. K. Wimsatt and M. C. Beardsley, ‘The Intentional Fallacy’, The Sewanee Review 54, no. 3 (1946): 468–88.

⁸ James A. W. Heffernan, ‘Ekphrasis and Representation’, New Literary History 22, no. 2 (Spring 1991): 297–316 at 297.

⁹ See also Tim Summers, Understanding Video Game Music (Cambridge, UK: Cambridge University Press, 2016), 41.

¹⁰ Zach Whalen, ‘Play Along – An Approach to Videogame Music’, Game Studies 4, no. 1 (2004), www.gamestudies.org/0401/whalen/; Isabella van Elferen, ‘Un Forastero! Issues of Virtuality and Diegesis in Videogame Music’, Music and the Moving Image 4, no. 2 (2011): 30–9.

¹¹ The idea of interpretation as a circular process between parts and whole is usually associated with Friedrich Schleiermacher; the idea that interpretation involves working through one’s preconceptions or prejudices of a text or object comes from Hans-Georg Gadamer, see Hans-Georg Gadamer, Truth and Method, trans. Joel Weinsheimer and Donald G. Marshall, 2nd ed. (London; New York: Continuum, 2004).

¹² Lawrence Kramer, Interpreting Music (Berkeley: University of California Press, 2011), 25.

¹³ Lawrence Kramer, Musical Meaning: Toward a Critical History (Berkeley: University of California Press, 2002), 17–18.

¹⁴ Zach Whalen, ‘Case Study: Film Music vs. Video-Game Music: The Case of Silent Hill’, in Music, Sound and Multimedia: From the Live to the Virtual, ed. Jamie Sexton (Edinburgh: Edinburgh University Press, 2007), 68–81 at 74.

¹⁵ Janet Murray, Hamlet on the Holodeck: The Future of Narrative in Cyberspace, Updated edition (Cambridge, MA: The MIT Press, 2017), 178.

¹⁶ Susan McClary, Feminine Endings: Music, Gender, and Sexuality (Minneapolis: University of Minnesota Press, 1991), 128.

¹⁷ It is telling that one of McClary’s critics, Pieter van den Toorn, referred to an earlier version of McClary’s text which contains the even more controversial characterization ‘throttling, murderous rage of a rapist incapable of attaining release’, instead of the more nuanced, edited phrasing from Feminine Endings. See Pieter C. van den Toorn, ‘Politics, Feminism, and Contemporary Music Theory’, The Journal of Musicology 9, no. 3 (1991): 275–99.

¹⁸ Markku Eskelinen, ‘The Gaming Situation’, Game Studies 1, no. 1 (2001), accessed 20 October 2020, http://gamestudies.org/0101/eskelinen/.

¹⁹ Here, I am referring to another critique of hermeneutics in musicology by Carolyn Abbate, who gives the phrase ‘doing this really fast is fun’ as an example of what is in our minds when ‘dealing with real music in real time’. See Carolyn Abbate, ‘Music – Drastic or Gnostic?’, Critical Inquiry 30, no. 3 (2004): 505–36 at 511.

²⁰ K. J. Donnelly, ‘Lawn of the Dead: The Indifference of Musical Destiny in Plants vs. Zombies’, in Music in Video Games: Studying Play, ed. K. J. Donnelly, William Gibbons, and Neil Lerner (New York: Routledge, 2014), 151–65.

²¹ Donnelly, ‘Lawn of the Dead’, 163.

²² This is essentially the same argument that Jacques Derrida makes in the case of text and context in his infamous aphorism ‘Il n’y a pas de hors-texte’ – ‘there is no outside-text’; see Jacques Derrida, Of Grammatology, trans. Gayatri Chakravorty Spivak (Baltimore, MD; London: Johns Hopkins University Press, 1976), 158.

²³ Carolyn Ellis and Art Bochner, ‘Autoethnography, Personal Narrative, Reflexivity: Researcher as Subject’, in Handbook of Qualitative Research, ed. Norman K. Denzin and Yvonna S. Lincoln (Thousand Oaks, CA: Sage, 2000), 733–68 at 737.

²⁴ Deborah E. Reed-Danahay, ‘Introduction’, in Auto/Ethnography: Rewriting the Self and the Social, ed. Deborah E. Reed-Danahay (Oxford; New York: Berg, 1997), 1–17 at 2.

²⁵ Summers, Understanding Video Game Music, 30–1.

²⁶ William Cheng, Sound Play: Video Games and the Musical Imagination (New York: Oxford University Press, 2014), 52–3.

²⁷ Edmund Husserl, Phenomenology and the Crisis of Philosophy, trans. Quentin Lauer (New York: Harper & Row, 1965), 115. See also David R. Cerbone, ‘Phenomenological Method: Reflection, Introspection, and Skepticism’, in The Oxford Handbook of Contemporary Phenomenology, ed. Dan Zahavi (Oxford; New York: Oxford University Press, 2013), 7–24.

²⁸ Moran, Introduction to Phenomenology, 11–12.

²⁹ Maurice Merleau-Ponty, Phenomenology of Perception, trans. Donald A. Landes (London: Routledge, 2012), lxxvii.

³⁰ Moran, Introduction to Phenomenology, 11–12.

³¹ See Merleau-Ponty, Phenomenology of Perception. See also Martin Heidegger, ‘The Origin of the Work of Art’, in Off the Beaten Track, trans. Julian Young and Kenneth Haynes (Cambridge: Cambridge University Press, 2002), 1–56; Martin Heidegger, ‘The Question Concerning Technology’, in The Question Concerning Technology and Other Essays, trans. William Lovitt (New York; London: Garland, 1977), 3–35; Martin Heidegger, Being and Time, trans. John Macquarrie and Edward Robinson (Oxford: Blackwell, 1962).

³² Edmund Husserl, On the Phenomenology of the Consciousness of Internal Time (1893–1917), trans. John Barnett Brough (Dordrecht; Boston; London: Kluwer, 1991).

³³ See Thomas Clifton, Music as Heard: A Study in Applied Phenomenology (New Haven; London: Yale University Press, 1983); Alfred Schutz, ‘Fragments on the Phenomenology of Music’, trans. Fred Kersten, Music and Man 2, no. 1–2 (January 1, 1976): 5–71; David Lewin, ‘Music Theory, Phenomenology, and Modes of Perception’, Music Perception: An Interdisciplinary Journal 3, no. 4 (1986): 327–92.

³⁴ Schutz, ‘Fragments on the Phenomenology of Music’, 43. See also Jerrold Levinson, Music in the Moment (Ithaca, NY: Cornell University Press, 1997).

³⁵ See, for example, Schutz, ‘Fragments on the Phenomenology of Music’, 43. Schutz refers to this as ‘pure’ listening.

³⁶ Anahid Kassabian, Ubiquitous Listening: Affect, Attention, and Distributed Subjectivity (Berkeley: University of California Press, 2013).

³⁷ Elizabeth Medina-Gray, ‘Modular Structure and Function in Early 21st-Century Video Game Music’ (PhD dissertation, Yale University, 2014), 31–2.

³⁸ Jonathan D. Kramer, The Time of Music: New Meanings, New Temporalities, New Listening Strategies (New York; London: Schirmer Books, 1988), 375.

³⁹ Jonathan de Souza, for instance, also links phenomenology to music theory through the work of David Lewin. See Jonathan de Souza, Music at Hand: Instruments, Bodies, and Cognition (Oxford: Oxford University Press, 2017), 4.

⁴⁰ David Bessell, ‘An Auto-Ethnographic Approach to Creating the Emotional Content of Horror Game Soundtracking’, in Emotion in Video Game Soundtracking, ed. Duncan Williams and Newton Lee (Cham: Springer International Publishing, 2018), 39–50.

⁴¹ See, for example, Melanie Fritsch, ‘“It’s a-Me, Mario!” Playing with Video Game Music’, in Ludomusicology: Approaches to Video Game Music, ed. Michiel Kamp, Tim Summers and Mark Sweeney (Sheffield: Equinox, 2016), 92–115.

11 Interacting with Soundscapes: Music, Sound Effects and Dialogue in Video Games

¹ Karen Collins, Playing with Sound: A Theory of Interacting with Sound and Music in Video Games (Cambridge, MA: The MIT Press, 2013), 3.

² Mark Grimshaw, ‘Sound and Player Immersion in Digital Games’, in The Oxford Handbook of Sound Studies, ed. Trevor Pinch and Karin Bijsterveld (Oxford: Oxford University Press, 2011), 347–66 at 350.

³ Game Audio Network Guild, ‘2018 Awards’, www.audiogang.org/2018-awards/ (accessed 14 October 2018).

⁴ John Richardson and Claudia Gorbman, ‘Introduction’, in The Oxford Handbook of New Audiovisual Aesthetics, ed. John Richardson, Claudia Gorbman and Carol Vernallis (New York: Oxford University Press, 2013), 3–35 at 29–30.

⁵ Grimshaw, ‘Sound and Player Immersion’, 349–50.

⁶ Grimshaw, ‘Sound and Player Immersion’, 351–2; Michel Chion, Audio-Vision: Sound on Screen, ed. and trans. Claudia Gorbman (New York: Columbia University Press, 1994), 63.

⁷ Collins, Playing with Sound, 32.

⁸ Collins, Playing with Sound, 33.

⁹ Grimshaw, ‘Sound and Player Immersion’, 360.

¹⁰ Eve Hoggan and Stephen Brewster, ‘Nonspeech Auditory and Crossmodal Output’, in The Human-Computer Interaction Handbook: Fundamentals, Evolving Technologies, and Emerging Applications, ed. Julie A. Jacko, 3rd ed. (Boca Raton: CRC Press, 2012), 211–35 at 220, 222. See also William W. Gaver, ‘Auditory Icons: Using Sound in Computer Interfaces’, Human-Computer Interaction 2, no. 2 (1986): 167–77; Meera M. Blattner, Denise A. Sumikawa and Robert M. Greenberg, ‘Earcons and Icons: Their Structure and Common Design Principles’, Human-Computer Interaction 4, no. 1 (1989): 11–44.

¹¹ Hoggan and Brewster, ‘Nonspeech Auditory and Crossmodal Output’, 222–3.

¹² Kristine Jørgensen, A Comprehensive Study of Sound in Computer Games: How Audio Affects Player Action (Lewiston, NY: Edwin Mellen Press, 2009), 84–90.

¹³ Axel Stockburger, ‘The Play of the Voice: The Role of the Voice in Contemporary Video and Computer Games’, in Voice: Vocal Aesthetics in Digital Arts and Media, ed. Norie Neumark, Ross Gibson and Theo van Leeuwen (Cambridge, MA: The MIT Press, 2010), 281–99 at 285–7; Roland Barthes, ‘The Grain of the Voice’, in Image-Music-Text, trans. Stephen Heath (New York: Hill and Wang, 1977), 179–89.

¹⁴ Mark Ward, ‘Voice, Videogames, and the Technologies of Immersion’, in Voice: Vocal Aesthetics in Digital Arts and Media, ed. Norie Neumark, Ross Gibson and Theo van Leeuwen (Cambridge, MA: The MIT Press, 2010), 267–79 at 272.

¹⁵ For more on players’ voices in games, see Collins, Playing with Sound, 79–82; William Cheng, Sound Play: Video Games and the Musical Imagination (New York: Oxford University Press, 2014), 139–66; Stockburger, ‘The Play of the Voice’.

¹⁶ Tim Summers, ‘Dimensions of Game Music History’, in The Routledge Companion to Screen Music and Sound, ed. Miguel Mera, Ronald Sadoff and Ben Winters (New York: Routledge, 2017), 139–52 at 140.

¹⁷ See James Newman’s chapter (Chapter 1) in this volume for further discussion of this game.

¹⁸ Tom Langhorst, ‘The Unanswered Question of Musical Meaning: A Cross-Domain Approach’, in The Oxford Handbook of Interactive Audio, ed. Karen Collins, Bill Kapralos and Holly Tessler (Oxford: Oxford University Press, 2014), 95–116 at 109–13.

¹⁹ Elizabeth Medina-Gray, ‘Analyzing Modular Smoothness in Video Game Music’, Music Theory Online 25, no. 3 (2019), accessed 20 October 2020, www.mtosmt.org/issues/mto.19.25.3/mto.19.25.3.medina.gray.html.

²⁰ Cheng, Sound Play, 58.

²¹ Theo van Leeuwen, ‘Vox Humana: The Instrumental Representation of the Human Voice’, in Voice: Vocal Aesthetics in Digital Arts and Media, ed. Norie Neumark, Ross Gibson and Theo van Leeuwen (Cambridge, MA: The MIT Press, 2010), 5–15 at 5.

²² Tim Summers, Understanding Video Game Music (Cambridge, UK: Cambridge University Press, 2016), 193–7; Roger Moseley, Keys to Play: Music as a Ludic Medium from Apollo to Nintendo (Berkeley: University of California Press, 2016), 236–74.

²³ Cheng, Sound Play, 99. For more on unsettling experiences arising from the use of sound-effect-like sounds in the music of Silent Hill, see Summers, Understanding Video Game Music, 127–30.

²⁴ Michael Sweet, Writing Interactive Music for Video Games: A Composer’s Guide (Upper Saddle River, NJ: Addison-Wesley, 2015), 188.

²⁵ For more details, see Steven Reale’s transcription of portions of Bit.Trip Runner’s score and earcons: Steven B. Reale, ‘Transcribing Musical Worlds; or, Is L.A. Noire a Music Game?’, in Music in Video Games: Studying Play, ed. K. J. Donnelly, William Gibbons and Neil Lerner (New York: Routledge, 2014), 77–103 at 82–9.

²⁶ Elizabeth Medina-Gray, ‘Musical Dreams and Nightmares: An Analysis of Flower’, in The Routledge Companion to Screen Music and Sound, ed. Miguel Mera, Ronald Sadoff and Ben Winters (New York: Routledge, 2017), 562–76.

²⁷ For more on the sound design of Super Mario Galaxy and the developers’ intentions, and an account of one author’s experience of delight when encountering musical interactivity in this game, see Summers, Understanding Video Game Music, 193–7.

12 Analytical Traditions and Game Music: Super Mario Galaxy as a Case Study

* The author wishes to thank the editors of this volume, as well as Scott Murphy, for their considerable feedback and suggestions during the process of preparing this chapter.

¹ Claudia Gorbman, Unheard Melodies: Narrative Film Music (Bloomington: Indiana University Press, 1987), 22.

² Collins also notes that the interactive nature of video games creates further complications of the diegetic/non-diegetic divide. Karen Collins, Game Sound: An Introduction to the History, Theory and Practice of Video Game Music and Sound Design (Cambridge, MA: The MIT Press, 2008), 125–7.

³ Don Michael Randel, ‘The Canons in the Musicological Toolbox’, in Disciplining Music: Musicology and Its Canons, ed. Katherine Bergeron and Philip V. Bohlman (Chicago: The University of Chicago Press, 1992), 10–22 at 15.

⁴ Lydia Goehr, The Imaginary Museum of Musical Works (Oxford: Clarendon, 1992).

⁵ Jean-Jacques Nattiez, Music and Discourse: Toward a Semiology of Music, trans. Carolyn Abbate (Princeton: Princeton University Press, 1990).

⁶ Heinrich Schenker, The Art of Performance, ed. Heribert Esser, trans. Irene Schreier Scott (New York: Oxford University Press, 2000), 3. Nicholas Cook, Beyond the Score (New York: Oxford University Press, 2013) provides a detailed account of the discursive problems attending music theory’s emphasis of the score and relegation of performance.

⁷ See Milton Babbitt, ‘The Revolution in Sound: Electronic Music (1960)’, in The Collected Essays of Milton Babbitt, ed. Stephen Peles with Stephen Dembski, Andrew Mead and Joseph N. Straus (Princeton: Princeton University Press, 2003), 70–7; also Frank Zappa with Peter Occhiogrosso, The Real Frank Zappa Book (New York: Touchstone, 1989), 172–4.

⁸ David Lewin, Generalized Musical Intervals and Transformations (New Haven: Yale University Press, 1987), 27.

⁹ For some existing approaches, see Elizabeth Medina-Gray, ‘Modular Structure and Function in Early 21st-Century Video Game Music’ (PhD dissertation, Yale University, 2014); Tim Summers, ‘Analysing Video Game Music: Sources, Methods and a Case Study’, in Ludomusicology: Approaches to Video Game Music, ed. Michiel Kamp, Tim Summers and Mark Sweeney (Sheffield: Equinox Publishing, 2016), 8–31; and Steven Reale, ‘Transcribing Musical Worlds; or, Is L.A. Noire a Music Game?’, in Music in Video Games: Studying Play, ed. K. J. Donnelly, William Gibbons and Neil Lerner (New York: Routledge, 2014), 77–103.

¹⁰ Elaine Sisman, ‘Variation’, in Grove Music Online, http://oxfordmusiconline.com (accessed 26 November 2020).

¹¹ Roger Moseley, Keys to Play: Music as a Ludic Medium from Apollo to Nintendo (Oakland, CA: University of California Press, 2016), 27.

¹² Mario Galaxy Orchestra, Super Mario Galaxy: Official Soundtrack Platinum Edition (Nintendo, 2007).

¹³ Steven Reale, ‘Variations on a Theme by a Rogue AI: Music, Gameplay, and Storytelling in Portal 2’, SMT-V: The Videocast Journal of the Society for Music Theory 2, no. 2 and 2, no. 3 (August and December 2016), www.smt-v.org/archives/volume2.html#variations-on-a-theme-by-a-rogue-ai-music-gameplay-and-storytelling-in-portal-2-part-1-of-2 (accessed 26 November 2020).

¹⁴ On ‘iterative narration’, see Gérard Genette, Narrative Discourse: An Essay in Method, trans. Jane Levin (Ithaca, NY: Cornell University Press, 1980), 113–60.

¹⁵ Video game reviewer Ben (‘Yahtzee’) Croshaw humorously observes that the world design of most games in the Super Mario franchise has an iterative structure as well; from his review of Super Mario Odyssey (2017): ‘You guessed it: it’s the classic ditty: “Grasslands, Desert, Ocean, Jungle, Ice World, Fire World, Boss”’. Ben Croshaw, ‘Zero Punctuation: Super Mario Odyssey’, Escapistmagazine.com (8 November 2017; accessed 26 November 2020) https://v1.escapistmagazine.com/videos/view/zero-punctuation/117154-Yahtzee-Reviews-Super-Mario-Odyssey. In his own meta-iterative way, Croshaw self-referentially echoes the same joke he has made in other reviews of franchise entries, such as ‘Zero Punctuation: Mario & Luigi Paper Jam’, Escapistmagazine.com (20 January 2016; accessed 26 November 2020) http://v1.escapistmagazine.com/videos/view/zero-punctuation/116678-Mario-Luigi-Paper-Jam-Review.

¹⁶ For detailed overviews of Schenkerian techniques, see Allen Cadwallader, David Gagné and Frank Samarotto, Analysis of Tonal Music: A Schenkerian Approach, 4th ed. (New York: Oxford University Press, 2019); and Allen Forte and Steven E. Gilbert, Introduction to Schenkerian Analysis (New York: Norton, 1982).

¹⁷ Joseph Kerman, ‘How We Got into Analysis, and How to Get Out’, Critical Inquiry 7, no. 2 (1980): 311–31.

¹⁸ Eugene Narmour, Beyond Schenkerism: The Need for Alternatives in Music Analysis (Chicago: University of Chicago Press, 1977), 28–30.

¹⁹ Patrick McCreless offers a detailed and compelling account of the history of music theory as a twentieth-century North American academic discipline, including the rise and subsequent criticism of Schenkerian analysis. See his ‘Rethinking Contemporary Music Theory’, in Keeping Score: Music, Disciplinarity, Culture, ed. David Schwarz, Anahid Kassabian and Lawrence Siegel (Charlottesville: University Press of Virginia, 1997), 13–53.

²⁰ As this book went to press, a new controversy arose that, as a consequence, severely undermines this position. During the plenary session of the 2019 meeting of the Society for Music Theory, Philip Ewell offered a presentation showing that a ‘white racial frame’ continues to permeate the discipline of music theory (‘Music Theory’s White Racial Frame’, Society for Music Theory, Columbus, OH, 9 November 2019). As part of the presentation, Ewell included excerpts of Heinrich Schenker’s writing in which he celebrates whiteness and castigates ‘inferior races’, and argued that Schenker’s racist views not only cannot be isolated from his theory of music, but that they are in fact fundamental to it. In response, the Journal of Schenkerian Studies published a special issue responding to Ewell’s paper (Volume 12, 2019). The issue drew considerable criticism, as the journal included an essay by an anonymous author, published ad hominem attacks against Ewell while failing to invite him to submit a response or any other kind of contribution and appeared to sidestep the peer-review process. On 7 July 2020, the Executive Board of the Society for Music Theory released a statement condemning the issue: ‘The conception of this symposium failed to meet the ethical, professional, and scholarly standards of our discipline.’ As Schenkerian analysis is a methodology at the core of American music theory (see McCreless, ‘Rethinking Contemporary Music Theory’), the discipline can no longer wait to reckon with the ugly nature of its past, and Schenkerian practitioners can no longer bulwark by claiming as their purview the purely musical, believing it to be a realm cloistered from the messy humanity of music making. Ewell published an expanded version of his keynote talk in Philip Ewell, ‘Music Theory and the White Racial Frame’, Music Theory Online 26, no. 2 (September 2020).

²¹ Two other modified Schenkerian applications to video game music include Jason Brame, ‘Thematic Unity in a Video Game Series’, Act: Zeitschrift für Musik und Performance 2 (2011), accessed 26 November 2020, www.act.uni-bayreuth.de/de/archiv/2011-02/03_Brame_Thematic_Unity/; and Peter Shultz, ‘Music Theory in Music Games’, in From Pac-Man to Pop Music: Interactive Audio in Games and New Media, ed. Karen Collins (Brookfield, VT: Ashgate Publishing Company, 2008), 177–88.

²² A conventional scenario in which this might occur would be in a set of variations on a theme, but even there it would be unusual for successive variations to do nothing more than add additional ornamental tones to the version that came immediately before it.

²³ Frank L. Greitzer, Olga Anna Kuchar and Kristy Huston, ‘Cognitive Science Implications for Enhancing Training Effectiveness in a Serious Gaming Context’, Journal on Educational Resources in Computing 7, no. 3 (November, 2007), Article 2, accessed 26 November 2020, https://doi.org/10.1145/1281320.1281322.

²⁴ Jeffrey Swinkin, ‘Variation as Thematic Actualisation: The Case of Brahms’s Op. 9’, Music Analysis 31, no. 1 (2012): 38–9.

²⁵ Later theorists, however, proposed other kinds of background structures besides those presented in Figure 12.4. See Walter Everett, ‘Deep-Level Portrayals of Directed and Misdirected Motions in Nineteenth-Century Lyric Song’, Journal of Music Theory 48, no. 1 (2004): 25–68.

²⁶ Eugene Narmour, The Analysis and Cognition of Melodic Complexity: The Implication-Realization Model (Chicago: University of Chicago Press, 1992).

²⁷ Personal email to the author, 27 January 2019.

²⁸ See, for example, Frank Lehman, ‘Transformational Analysis and the Representation of Genius in Film Music’, Music Theory Spectrum 35, no. 1 (Spring 2013): 1–22; Frank Lehman, Hollywood Harmony: Musical Wonder and the Sound of Cinema (New York: Oxford University Press, 2018); Scott Murphy, ‘Transformational Theory and the Analysis of Film Music’, in The Oxford Handbook of Film Music Studies, ed. David Neumeyer (New York: Oxford University Press, 2013), 471–99; and Guy Capuzzo, ‘Neo-Riemannian Theory and the Analysis of Pop-Rock Music’, Music Theory Spectrum 26, no. 2 (2004): 196–7.

²⁹ See, for example, Richard Cohn, Audacious Euphony (New York: Oxford University Press, 2012), 30.

³⁰ Lehman, Hollywood Harmony, 192–7.

³¹ David Lewin, ‘Some Notes on Analyzing Wagner: “The Ring” and “Parsifal”’, 19th Century Music 16, no. 1 (Summer 1992): 49–58.

³² Julian Hook, ‘David Lewin and the Complexity of the Beautiful’, Intégral 21 (2007): 155–90.

³³ Amongst music theorists of a neo-Riemannian bent, it is not universally agreed that measuring tonal distance on the basis of counting individual neo-Riemannian transformations provides an unproblematic account of the voice-leading procedures it is intended to model. Tymoczko, for example, has shown that transforming F minor to C major (PLR or RLP) requires one more transformation than does transforming F major to C major (LR), whereas comparing the half-step between A♭ and G on the one hand and the whole-step between A and G on the other implies that F minor is tonally closer to C major than is F major. Dmitri Tymoczko, ‘Three Conceptions of Musical Distance’, in Mathematics and Computation and Music, ed. Elaine Chew, Adrian Childs and Ching-Hua Chuan (Heidelberg: Springer, 2009), 258–72 at 264–7. For a detailed discussion of ways that the discrepancy may be reconciled, see Scott Murphy, ‘Review of Richard Cohn, Audacious Euphony: Chromaticism and the Triad’s Second Nature’, Journal of Music Theory 58, no. 1 (2014): 79–101.

³⁴ David Lewin, Generalized Musical Intervals and Transformations, 159. Emphasis original.

³⁵ Although theorizing a distinction between concepts like ‘digital world’, ‘virtual space’ and ‘game space’ is outside the scope of the present work, it is worth noting Melanie Fritsch’s careful distinction between the concepts of ‘gameworld’ and ‘diegetic environment’. See Melanie Fritsch, ‘Worlds of Music: Strategies for Creating Music-based Experiences in Videogames’, in The Oxford Handbook of Interactive Audio, ed. Karen Collins, Bill Kapralos and Holly Tessler (New York: Oxford University Press, 2014), 167–78 at 170.

³⁶ Dmitri Tymoczko has argued that voice-leading operations apply just as well to scales and modulations as they do to chords and progressions. See A Geometry of Music: Harmony and Counterpoint in the Extended Common Practice (New York: Oxford University Press, 2011), 17–19. Similarly, Steven Rings introduces the concept of ‘pivot intervals’ in his system of transformational analysis to describe modulations between keys. See Tonality and Transformation (New York: Oxford University Press, 2011), 58–66. It should be noted that both authors consider voice-leading operations as they operate on scales rather than tonic triads. By contrast, René Rusch proposes a ‘Toveyian’ blending of Schenkerian and neo-Riemannian procedures that examines modulations through Tovey’s conception of key relations, which relates triads through modal mixture in a diatonic universe, and suggests the plausibility of the present approach. See René Rusch, ‘Schenkerian Theory, Neo-Riemannian Theory and Late Schubert: A Lesson from Tovey’, Journal of the Society for Musicology in Ireland 8 (2012–2013): 3–20.

³⁷ Philip V. Bohlman, ‘Epilogue: Musics and Canons’, in Disciplining Music: Musicology and Its Canons, ed. Katherine Bergeron and Philip V. Bohlman (Chicago: The University of Chicago Press, 1992), 197–210 at 202.

³⁸ See Terry Eagleton, Literary Theory: An Introduction (Minneapolis: University of Minnesota Press, 1983) and Walter J. Ong, Orality and Literacy: 30th Anniversary Edition, 3rd ed. (New York: Routledge, 2013).

13 Semiotics in Game Music

¹ Charles Sanders Peirce, The Essential Peirce: Selected Philosophical Writings, ed. Peirce Edition Project, vol. 2 (Bloomington, IN: Indiana University Press, 1998), 410.

² Peirce, The Essential Peirce, 402.

³ Peirce, The Essential Peirce, 5.

⁴ Jean-Jacques Nattiez, Music and Discourse: Toward a Semiology of Music, trans. Carolyn Abbate (Princeton, NJ: Princeton University Press, 1990), 37; Philip Tagg, Music’s Meanings: A Modern Musicology for Non-Musos (New York: The Mass Media Music Scholars’ Press, 2013), 171.

⁵ Nattiez, Music and Discourse, 17.

⁶ Tagg, Music’s Meanings, 177–8.

⁷ Ludwig van Beethoven, Piano Sonata No. 14, Opus 27, No. 2, 1802.

⁸ Bedřich Smetana, Vltava (or The Moldau), JB 1:112/2, 1874.

⁹ Theo van Leeuwen, ‘Music and Ideology: Notes Toward a Sociosemiotics of Mass Media Music’, Popular Music and Society 22, no. 4 (1998): 25–6.

¹⁰ Van Leeuwen, ‘Music and Ideology’, 26.

¹¹ Tim Summers, Understanding Video Game Music (Cambridge, UK: Cambridge University Press, 2016), 38–9.

¹² Karen Collins, Game Sound: An Introduction to the History, Theory, and Practice of Video Game Music and Sound Design (Cambridge, MA: The MIT Press, 2008), 141.

¹³ Espen Aarseth, Cybertext: Perspectives on Ergodic Literature (Baltimore, MD: Johns Hopkins University Press, 1997), 1.

¹⁴ Hanna Wirman, ‘On Productivity and Game Fandom’, Transformative Works and Cultures 3 (2009), ¶ 2.4, http://journal.transformativeworks.org/index.php/twc/article/view/145/115.

¹⁵ Sun-ha Hong, ‘When Life Mattered: The Politics of the Real in Video Games’ Reappropriation of History, Myth, and Ritual’, Games and Culture 10, no. 1 (2015): 42–3; Mark Sweeney, ‘The Aesthetics of Video Game Music’ (PhD thesis, University of Oxford, 2014), 219.

¹⁶ Tagg, Music’s Meanings, 232–8.

¹⁷ On nostalgia in video game music, see Sarah Pozderac-Chenevey, ‘A Direct Link to the Past: Nostalgia and Semiotics in Video Game Music’, Divergence Press, no. 2 (2014), http://divergencepress.net/2014/06/02/2016-11-3-a-direct-link-to-the-past-nostalgia-and-semiotics-in-video-game-music/

¹⁸ See Isabella van Elferen, ‘Analysing Game Musical Immersion: The ALI Model’, in Ludomusicology: Approaches to Video Game Music, ed. Michiel Kamp, Tim Summers and Mark Sweeney (Sheffield: Equinox, 2016), 34–7.

¹⁹ Winifred Phillips, A Composer’s Guide to Game Music (Cambridge, MA: The MIT Press, 2014), 151–2.

²⁰ Jeremy Soule, The Elder Scrolls V: Skyrim (Original Game Soundtrack) (DirectSong, 2011), digital audio.

14 Game – Music – Performance: Introducing a Ludomusicological Theory and Framework

¹ On Guillaume Laroche and Nicholas Tam’s use of the term see Tasneem Karbani, ‘Music to a Gamer’s Ears’, University of Alberta Faculty of Arts News (22 August 2007), accessed 10 April 2020, https://web.archive.org/web/20070915071528/http://www.uofaweb.ualberta.ca/arts/news.cfm?story=63769 and Nicholas Tam, ‘Ludomorphballogy’, Ntuple Indemnity (7 September 2007), accessed 10 April 2020, www.nicholastam.ca/2007/09/07/ludomorphballogy. Unlike narratologists, who proposed to study games as a form of text or narrative using established methods from other fields, ludologists favoured the idea of finding subject-specific approaches. On this debate in game studies and more information on the field in general see Frans Mäyrä, An Introduction to Game Studies: Games in Culture (London: Sage, 2008).

² Roger Moseley, ‘Playing Games with Music (and Vice Versa): Ludomusicological Perspectives on Guitar Hero and Rock Band’, in Taking It to the Bridge: Music as Performance, ed. Nicholas Cook and Richard Pettengill (Ann Arbor: University of Michigan Press, 2013), 279–318 at 283.

³ The full development of the approach summarized here can be found in Melanie Fritsch, Performing Bytes: Musikperformances der Computerspielkultur (Würzburg: Königshausen & Neumann, 2018), chs. 2, 3 and 4.

⁴ Klaus W. Hempfer, ‘Performance, Performanz, Performativität. Einige Unterscheidungen zur Ausdifferenzierung eines Theoriefeldes’, in Theorien des Performativen: Sprache – Wissen – Praxis. Eine kritische Bestandsaufnahme, ed. Klaus W. Hempfer and Jörg Volbers (Bielefeld: transcript, 2011), 13–41, here 13.

⁵ Fritsch, Performing Bytes, 18–35.

⁶ John L. Austin, How to Do Things with Words: The William James Lectures Delivered at Harvard University in 1955, 2nd ed. (Oxford: Clarendon, 1975).

⁷ See also James Loxley, Performativity (London: Routledge, 2007).

⁸ Peggy Phelan, Unmarked: The Politics of Performance (New York: Routledge, 1993).

⁹ Erika Fischer-Lichte, Ästhetik des Performativen (Frankfurt am Main: Suhrkamp, 2004).

¹⁰ Philip Auslander, Liveness: Performance in a Mediatized Culture (New York: Routledge, 2008); see also Melanie Fritsch and Stefan Strötgen, ‘Relatively Live: How to Identify Live Music Performances?’, Music and the Moving Image 5, no. 1 (2012): 47–66.

¹¹ Phelan, Unmarked, 146.

¹² For an approach informed by performativity, see Iain Hart’s chapter, Chapter 13 in this volume.

¹³ For more information on this genre see Michael Austin’s chapter, Chapter 9, in this volume.

¹⁴ Kiri Miller, ‘Schizophonic Performance: Guitar Hero, Rock Band, and Virtual Virtuosity’, Journal of the Society for American Music 3, no. 4 (2009): 395–429 at 418.

¹⁵ Ibid.

¹⁶ David Roesner, ‘The Guitar Hero’s Performance’, Contemporary Theatre Review 21, no. 3 (2011): 276–85 at 278.

¹⁷ Miller, ‘Schizophonic Performance’, 418.

¹⁸ Marvin Carlson, Performance: A Critical Introduction (New York: Routledge, 1996), 3.

¹⁹ Carlson, Performance, 4–5.

²⁰ Carlson, Performance, 5.

²¹ Ibid.

²² Regarding the differences between German Theaterwissenschaft and the Anglo-American disciplines of theatre studies and performance studies see Marvin Carlson, ‘Introduction. Perspectives on Performance: Germany and America’, in Erika Fischer-Lichte, The Transformative Power of Performance: A New Aesthetics, ed. Erika Fischer-Lichte, trans. Saskya Iris Jain (New York: Routledge, 2008), 1–11.

²³ Fischer-Lichte, Ästhetik des Performativen, 53–4. Translation by the author.

²⁴ See Fritsch, Performing Bytes, 34–5.

²⁵ Christa Gebel, Michael Gurt and Ulrike Wagner, Kompetenzförderliche Potenziale populärer Computerspiele: Kurzfassung der Ergebnisse des Projekts ‘Kompetenzförderliche und kompetenzhemmende Faktoren in Computerspielen’ (Berlin: JFF – Institut für Medienpädagogik in Forschung und Praxis, 2004), www.jff.de/fileadmin/user_upload/jff/veroeffentlichungen/vor_2015/2004_kompetenzfoerderliche_potenziale_von_computerspielen/2004_Kurzfassung_computerspiele_jff_website.pdf (accessed 11 April 2020).

²⁶ Christa Gebel, ‘Kompetenz Erspielen – Kompetent Spielen?’, merz. medien + erziehung 54, no. 4 (2010): 45–50.

²⁷ See José Zagal, Ludoliteracy: Defining, Understanding, and Supporting Games Education (Pittsburgh: Figshare, 2010) and Eric Zimmerman, ‘Gaming Literacy: Game Design as a Model for Literacy in the Twenty-First Century’, in The Video Game Theory Reader 2, ed. Bernard Perron and Mark J. P. Wolf (New York: Routledge, 2009), 23–31.

²⁸ See James Newman, Playing with Videogames (New York: Routledge, 2008); Karen Collins, Playing with Sound: A Theory of Interacting with Music and Sound in Video Games (Cambridge, MA: The MIT Press, 2013) and Fritsch, Performing Bytes, chapter 3.

²⁹ Danny Kringiel, ‘Spielen gegen jede Regel: Wahnsinn mit Methode’, Spiegel Online, 30 September 2005, www.spiegel.de/netzwelt/web/spielen-gegen-jede-regel-wahnsinn-mit-methode-a-377417.html (last accessed 15 September 2016).

³⁰ Zimmerman, ‘Gaming Literacy’, 25.

³¹ Zimmerman, ‘Gaming Literacy’, 27.

³² Zimmerman, ‘Gaming Literacy’, 25.

³³ Katie Salen and Eric Zimmerman, Rules of Play: Game Design Fundamentals (Cambridge, MA: The MIT Press, 2004), 41.

³⁴ Ibid.

³⁵ See Melanie Fritsch, ‘“It’s a-me, Mario!” – Playing with Video Game Music’, in Ludomusicology: Approaches to Video Game Music, ed. Michiel Kamp, Tim Summers and Mark Sweeney (Sheffield: Equinox, 2016), 92–115.

³⁶ See Fritsch, Performing Bytes, 268–90.

³⁷ Jesse Schell, The Art of Game Design: A Book of Lenses (San Francisco: Taylor & Francis, 2008), 10.

³⁸ Gestalt theory originally derives from an area of psychological theory introduced in the early twentieth century, which underwent changes in the different research areas by which it was adopted, such as musicology. See, for example, Mark Reybrouck, ‘Gestalt Concepts and Music: Limitations and Possibilities’, in Music, Gestalt, and Computing: Studies in Cognitive and Systematic Musicology, ed. Marc Leman (Berlin: Springer, 1997), 57–69.

³⁹ Craig A. Lindley, ‘The Gameplay Gestalt, Narrative, and Interactive Storytelling’, in Computer Games and Digital Cultures Conference Proceedings, Tampere 2002, 203–15 at 207, www.digra.org/wp-content/uploads/digital-library/05164.54179.pdf (last accessed 5 March 2020).

⁴⁰ Melanie Fritsch, ‘Worlds of Music: Strategies for Creating Music-based Experiences in Videogames’, in The Oxford Handbook of Interactive Audio, ed. Karen Collins, Bill Kapralos and Holly Tessler (New York: Oxford University Press, 2014), 167–78 at 170.

⁴¹ Steve Swink, Game Feel: A Game Designer’s Guide to Virtual Sensation (Burlington: Morgan Kaufmann Publishing, 2009), 6.

⁴² Swink, Game Feel, 10.

⁴³ See Fritsch, ‘Worlds of Music.’

⁴⁴ See for example Nicholas Cook, Beyond the Score: Music as Performance (New York: Oxford University Press, 2013).

⁴⁵ Christopher Small, Musicking: The Meanings of Performing and Listening (Hanover, NH: Wesleyan University Press, 1998).

⁴⁶ Carolyn Abbate, ‘Music – Drastic or Gnostic?’, Critical Inquiry 30, no. 3 (2004): 505–36.

⁴⁷ Stefan Strötgen, Markenmusik (Würzburg: Königshausen & Neumann, 2014).

⁴⁸ Gino Stefani, ‘A Theory of Musical Competence’, Semiotica 66, no. 1–3 (1987): 7.

⁴⁹ Strötgen, Markenmusik, 119–50. Here Stefani as quoted and summarized by Strötgen on 129.

⁵⁰ For the full discussion see Strötgen, Markenmusik, 119–37.

⁵¹ Small, Musicking, 13.

⁵² Small, Musicking, 9.

⁵³ Nicholas Cook, ‘Between Process and Product: Music and/as Performance’, Music Theory Online 7, no. 2 (2001), accessed 16 October 2020, https://mtosmt.org/issues/mto.01.7.2/mto.01.7.2.cook.html

⁵⁴ This can either happen in a constructive or destructive way. A common phenomenon is gatekeeping; for example when game music started to enter the concert halls, not every classical music fan or critic was in favour of this development. Game culture itself has cut a bad figure in terms of inclusivity, with certain mostly straight-white-male-dominated groups not accepting non-white, non-male-identifying people as gamers. As the Gamergate debate has proven, this is not only fan culture being discussed, but a complex and highly political controversy, which cannot be laid out here in detail.

⁵⁵ See also Fritsch and Strötgen, ‘Relatively Live’; and Moseley, ‘Playing Games with Music.’

⁵⁶ Abbate, ‘Music – Drastic or Gnostic?’, 505–36. For a ludomusicological reading of Abbate’s approach see Isabella van Elferen, ‘Ludomusicology and the New Drastic’, Journal of Sound and Music in Games 1, no. 1 (2020): 103–12.

⁵⁷ Abbate, ‘Music – Drastic or Gnostic?’, 511.

⁵⁸ Ibid.

⁵⁹ A full version of this case study can be found in Fritsch, Performing Bytes, 130–44. See also Zach Whalen, ‘Play Along – An Approach to Videogame Music’, Game Studies 4, no. 1 (2004); Andrew Schartmann, Koji Kondo’s Super Mario Bros. Soundtrack (New York: Bloomsbury, 2015).

⁶⁰ Fritsch, ‘Worlds of Music’, 170.

⁶¹ Swink, Game Feel, 225.

⁶² Swink, Game Feel, 206–9.

⁶³ Satoru Iwata, ‘Iwata Asks: Volume 5: Original Super Mario Developers’, Iwata Asks (2010), accessed 11 April 2020, http://iwataasks.nintendo.com/interviews/#/wii/mario25th/4/4.

⁶⁴ Swink, Game Feel, 221.

⁶⁵ Schartmann, Super Mario Bros., 32–3.

⁶⁶ The creators of another game in the franchise, namely Super Mario Galaxy, mentioned this very philosophy as the basic idea behind their game’s soundtrack. See Satoru Iwata, ‘Mario Is Like a Musical Instrument’, Iwata Asks (2010), accessed 11 April 2020, http://iwataasks.nintendo.com/interviews/#/wii/supermariogalaxy2/1/4.

⁶⁷ See, for example, Melanie Fritsch, ‘Heroines Unsung: The (Mostly) Untold History of Female Japanese Composers’, in Women’s Music for the Screen: Diverse Narratives in Sound, ed. Felicity Wilcox (New York: Routledge, in press). A full version of this case study can be found in Fritsch, Performing Bytes, 252–67.

⁶⁸ Anders Carlsson, ‘Chip Music: Low-Tech Data Music Sharing’, in From Pac-Man to Pop Music: Interactive Audio in Games and New Media, ed. Karen Collins (Aldershot: Ashgate, 2008), 153–62 at 155.

⁶⁹ Regarding hacker ethics and ideas, see Kenneth B. McAlpine’s chapter (Chapter 2) in this volume.

⁷⁰ Carlsson, ‘Chip Music’, 158.

⁷¹ Mat Ombler, ‘Megadrive to Mega Hit: Why Video Games Are So Tied to Club Music’, Vice (2018), accessed 11 April 2020, https://www.vice.com/en_uk/article/a3pb45/video-games-90s-club-music-commodore-amiga.

⁷² Leonard J. Paul, ‘For the Love of Chiptune’, in The Oxford Handbook of Interactive Audio, ed. Karen Collins, Bill Kapralos and Holly Tessler (New York: Oxford University Press, 2014), 507–30 at 510.

⁷³ See also Matthias Pasdzierny, ‘Geeks on Stage? Investigations in the World of (Live) Chipmusic’, in Music and Game: Perspectives on a Popular Alliance, ed. Peter Moormann (Wiesbaden: Springer, 2013), 171–90 and Chris Tonelli, ‘The Chiptuning of the World: Game Boys, Imagined Travel, and Musical Meaning’, in The Oxford Handbook of Mobile Music Studies, Volume 2, ed. Sumanth Gopinath and Jason Stanyek (New York: Oxford University Press, 2014), 402–26.

⁷⁴ Gebel, Gurt and Wagner, Kompetenzförderliche Potenziale populärer Computerspiele.

¹ K. J. Donnelly, ‘Emotional Sound Effects and Metal Machine Music: Soundworlds in Silent Hill Games and Films’, in The Palgrave Handbook of Sound Design and Music in Screen Media, ed. Liz Greene and Danijela Kulezic-Wilson (London: Palgrave, 2016), 73–88; Miguel Mera, ‘Invention/Re-invention’, Music, Sound and the Moving Image 3, no. 1 (2009): 1–20; Florian Mundhenke, ‘Resourceful Frames and Sensory Functions – Musical Transformations from Game to Film in Silent Hill’, in Music and Game: Perspectives on a Popular Alliance, ed. Peter Moormann (Wiesbaden: Springer, 2013), 107–24; Zach Whalen, ‘Case Study: Film Music vs. Video-Game Music: The Case of Silent Hill’, in Music, Sound and Multimedia: From the Live to the Virtual, ed. Jamie Sexton (Edinburgh: Edinburgh University Press, 2007), 68–81.

² Neil Lerner, ‘Mario’s Dynamic Leaps: Musical Innovations (and the Specter of Early Cinema) in Donkey Kong and Super Mario Bros.’, in Music in Video Games: Studying Play, ed. K. J. Donnelly, William Gibbons and Neil Lerner (New York: Routledge, 2014), 1–29.

³ William Gibbons, ‘Song and the Transition to “Part-Talkie” Japanese Role-Playing Games’, in Music in the Role-Playing Game: Heroes and Harmonies, ed. William Gibbons and Steven Reale (New York: Routledge, 2019), 9–20.

⁴ Giles Hooper, ‘Sounding the Story: Videogame Cutscenes’, in Emotion in Video Game Soundtracking, ed. Duncan Williams and Newton Lee (Cham: Springer, 2018), 115–42.

⁵ Isabella van Elferen, ‘Un Forastero! Issues of Virtuality and Diegesis in Videogame Music’, Music and the Moving Image 4, no. 2 (2011): 30–9.

⁶ Jason Brame, ‘Thematic Unity Across a Video Game Series’, ACT: Zeitschrift für Musik & Performance 2 (2011), accessed 29 October 2020, www.act.uni-bayreuth.de/de/archiv/2011-02/03_Brame_Thematic_Unity/index.html.

⁷ Guillaume Laroche, ‘Analyzing Musical Mario-Media: Variations in the Music of Super Mario Video Games’ (MA thesis, McGill University 2012); Andrew Schartmann, Koji Kondo’s Super Mario Bros. Soundtrack (New York: Bloomsbury, 2015).

⁸ Frank Lehman, ‘Methods and Challenges of Analyzing Screen Media’, in The Routledge Companion to Screen Music and Sound, ed. Miguel Mera, Ronald Sadoff and Ben Winters (New York: Routledge, 2017), 497–516.

⁹ Thomas B. Yee, ‘Battle Hymn of the God-Slayers: Troping Rock and Sacred Music Topics in Xenoblade Chronicles’, Journal of Sound and Music in Games 1, no. 1 (2020): 2–19.

¹⁰ Sean E. Atkinson, ‘Soaring Through the Sky: Topics and Tropes in Video Game Music’, Music Theory Online 25, no. 2 (2019), accessed 29 October 2020, https://mtosmt.org/issues/mto.19.25.2/mto.19.25.2.atkinson.html.

¹¹ Peter Shultz, ‘Music Theory in Music Games’, in From Pac-Man to Pop Music: Interactive Audio in Games and New Media, ed. Karen Collins (Aldershot: Ashgate, 2008), 177–88.

¹² Elizabeth Medina-Gray, ‘Modularity in Video Game Music’, in Ludomusicology: Approaches to Video Game Music, ed. Michiel Kamp, Tim Summers and Mark Sweeney (Sheffield: Equinox, 2016), 53–72 at 53, 55.

¹³ Elizabeth Medina-Gray, ‘Meaningful Modular Combinations: Simultaneous Harp and Environmental Music in Two Legend of Zelda Games’, in Music in Video Games, ed. K. J. Donnelly, William Gibbons and Neil Lerner (New York: Routledge, 2014), 104–21 and Elizabeth Medina-Gray, ‘Musical Dreams and Nightmares: An Analysis of Flower’, in The Routledge Companion to Screen Music and Sound, ed. Miguel Mera, Ronald Sadoff and Ben Winters (New York: Routledge, 2017), 562–76.

¹⁴ Elizabeth Medina-Gray, ‘Analyzing Modular Smoothness in Video Game Music’, Music Theory Online 25, no. 3 (2019).

¹⁵ William Gibbons, ‘Wandering Tonalities: Silence, Sound, and Morality in Shadow of the Colossus’, in Music in Video Games, ed. K. J. Donnelly, William Gibbons and Neil Lerner (New York: Routledge, 2014), 122–37.

¹⁶ Kate Galloway, ‘Soundwalking and the Aurality of Stardew Valley: An Ethnography of Listening to and Interacting with Environmental Game Audio’, in Music in the Role-Playing Game, ed. William Gibbons and Steven Reale (New York: Routledge, 2019), 159–78; Elizabeth Hambleton, ‘Gray Areas: Analyzing Navigable Narratives in the Not-So-Uncanny Valley Between Soundwalks, Video Games, and Literary Computer Games’, Journal of Sound and Music in Games 1, no. 1 (2020): 20–43.

¹⁷ William Cheng, Sound Play: Video Games and the Musical Imagination (New York: Oxford University Press, 2014).

¹⁸ Michiel Kamp, ‘Four Ways of Hearing Video Game Music’ (PhD thesis, Cambridge University, 2014), 15, 89, 131.

¹⁹ Kiri Miller, ‘Schizophonic Performance: Guitar Hero, Rock Band, and Virtual Virtuosity’, Journal of the Society for American Music 3, no. 4 (2009): 395–429 at 408.

²⁰ Henry Adam Svec, ‘Becoming Machinic Virtuosos: Guitar Hero, Rez, and Multitudinous Aesthetics’, Loading … 2, no. 2 (2008), accessed 29 October 2020, https://journals.sfu.ca/loading/index.php/loading/article/view/30/28; Dominic Arsenault, ‘Guitar Hero: “Not Like Playing Guitar at All”?’, Loading … 2, no. 2 (2008), accessed 29 October 2020, https://journals.sfu.ca/loading/index.php/loading/article/view/32/29; and David Roesner, ‘The Guitar Hero’s Performance’, Contemporary Theatre Review 21, no. 3 (2011): 276–85.

²¹ Melanie Fritsch and Stefan Strötgen, ‘Relatively Live: How to Identify Live Music Performances’, Music and the Moving Image 5, no. 1 (2012): 47–66.

²² Kiri Miller, Playing Along: Digital Games, YouTube, and Virtual Performance (New York: Oxford University Press, 2011), 15.

²³ Martin Pichlmair and Fares Kayali, ‘Levels of Sound: On the Principles of Interactivity in Music Video Games’, DiGRA ’07 – Proceedings of the 2007 DiGRA International Conference: Situated Play, University of Tokyo, September 2007, 424–30.

²⁴ Anahid Kassabian and Freya Jarman, ‘Game and Play in Music Video Games’, in Ludomusicology: Approaches to Video Game Music, ed. Michiel Kamp, Tim Summers and Mark Sweeney (Sheffield: Equinox, 2016), 116–32.

²⁵ Melanie Fritsch, ‘Worlds of Music: Strategies for Creating Music-based Experiences in Videogames’, in The Oxford Handbook of Interactive Audio, ed. Karen Collins, Bill Kapralos and Holly Tessler (New York: Oxford University Press, 2014), 167–78.

²⁶ Philip Auslander, ‘Musical Personae’, TDR: The Drama Review 50, no. 1 (2006): 100–19.

²⁷ Melanie Fritsch, ‘Beat It! Playing the “King of Pop” in Video Games’, in Music Video Games: Performance, Politics, and Play, ed. Michael Austin (New York: Bloomsbury, 2016), 153–76.

²⁸ Stephanie Lind, ‘Active Interfaces and Thematic Events in The Legend of Zelda: Ocarina of Time’, in Music Video Games: Performance, Politics, and Play, ed. Michael Austin (New York: Bloomsbury, 2016), 83–106.

²⁹ Cheng, Sound Play; and Mark Sweeney, ‘Aesthetics and Social Interactions in MMOs: The Gamification of Music in Lord of the Rings Online and Star Wars: Galaxies’, The Soundtrack 8, no. 1–2 (2015): 25–40.

³⁰ Steven B. Reale, ‘Transcribing Musical Worlds; or, Is L.A. Noire a Music Game?’, in Music in Video Games, ed. K. J. Donnelly, William Gibbons and Neil Lerner (New York: Routledge, 2014), 77–103 at 100.

³¹ Roger Moseley, Keys to Play: Music as a Ludic Medium from Apollo to Nintendo (Berkeley: University of California Press, 2016), 216–17.

³² Moseley, Keys to Play, 22.
