Introduction
If, as suggested by Donald T. Campbell,Footnote 1 the result of our particular abilities to sense and perceive is that we are distanced from a fundamental reality,Footnote 2 then what precisely is the nature and role of presence with respect to that reality? Furthermore, given the theme of this companion, what is the role of sound in relation to presence in virtual gameworlds? These are the two questions that underpin this chapter and to which I provide some answers. One question that might be asked, but which I do not attempt to answer, is: what is the role, if any, of music in presence in virtual gameworlds? The answer to this particular question I leave to the reader to attempt once the companion has been read. Other chapters in this companion deal more directly with music and its relationship to narrative and ludic processes or its abilities to provoke emotion in the game player and to establish meaning. These are areas, I suggest, that might be helpfully informed by answering questions about music and presence. Here, I content myself merely with providing some of the groundwork that will help the reader attempt the question. Before moving on to deal with my two questions, I must first clarify some terminology in order to furnish a framework from within which I can then debate them. I begin with a definition of sound.
The Framework
What Is Sound?
In agreement with Pasnau,Footnote 3 I would describe the standard definition of sound, various forms of which are to be found in dictionaries and acoustics textbooks – namely that sound is an audible pressure wave propagating in a medium – as incoherent. It is incoherent in that the definition, and its use to explain our relationship to sound, hold up neither to scrutiny nor to experience. Furthermore, I would describe the definition’s use within physics and acoustics as inconsistent and imprecise. Such instances abound and are, perhaps, manifestations of the incoherency of the standard definition. Elsewhere,Footnote 4 I have given a more detailed exegesis of the problems with the definition and its use, so here, a few examples of inconsistency and imprecision will suffice to support my contention; the matter of incoherency I also deal with in the following discussion.
Consider the following quotations from John Tyndall: ‘It is the motion imparted to this, the auditory nerve, which, in the brain, is translated into sound’ (so sound arises in the brain), but ‘thus is sound conveyed from particle to particle through the air’ (so sound is a physical phenomenon in the air), and yet it is in the brain that ‘the vibrations are translated into sound’.Footnote 5 Here, sound seems to be at once a physical acoustic phenomenon and a neurological phenomenon. Admittedly, these are from a book published in 1867 when the new scientific discipline of acoustics was beginning to find its feet, so perhaps I should not be so harsh. Yet, such muddled thinking persists, as demonstrated by the following description from the 2010s: ‘The [hearing] implant is placed … in the area [in the brain] where the axons (nerve fibres) and cochlear nucleus (synapses) – which transport sounds picked up by the ear to the cerebral cortex – are found’.Footnote 6 What is wrong with the statements is that the brain comprises electrical energy, while sound (i.e., a sound wave) comprises acoustic energy – sound waves cannot be phenomena both in the air and in the brain, or indeed, as suggested by Tyndall’s last statement, something only to be found in the brain. The American National Standards Institute (ANSI) documentation provides, in fact, two definitions of sound:
a. Oscillation in pressure, stress, particle displacement, particle velocity etc., propagated in a medium with internal forces (e.g. elastic or viscous) or the superposition of such propagated oscillation …
b. Auditory sensation evoked by the oscillation described in (a).Footnote 7
The second of these could, with great latitude in the definition of the word ‘sensation’, be taken to be sound in the brain, but they should not be used interchangeably in the same sentence (otherwise one can easily end up with the amusing but absurd statement that sound is evoked by sound).Footnote 8
The standard definition of sound is more precisely and usefully a definition of sound waves, and, in this, it is a perfectly adequate definition, if used a little inconsistently in the literature. It is not, though, a sufficient or coherent definition of sound, and for this reason, in the book Sonic Virtuality, Tom Garner and I devised a new definition to account for various sonic anomalies and inconsistencies that are found when regarding sound solely as the physical phenomenon that is a sound wave: ‘Sound is an emergent perception arising primarily in the auditory cortex and that is formed through spatio-temporal processes in an embodied system’.Footnote 9
One advantage of this new definition is that it accounts for the multimodalityFootnote 10 of our hearing in a way that the standard definition does not. This multimodality is clearly evidenced by the McGurk effect. In this demonstration, two films are shot of someone repeatedly enunciating ‘baa’ and ‘faa’ respectively. The audio from the ‘baa’ video is then superimposed and synchronized to the ‘faa’ video. One hears ‘baa’ on the ‘baa’ video (as would be expected), but one hears ‘faa’ on the ‘faa’ video: in the latter case, the sight of a mouth articulating ‘faa’ overrides the sound wave that is ‘baa’. How can it be that one perceives different sounds even though the sound waves are identical?
In psychoacoustics, the McGurk effect is described as an illusion, a perceptual error if you like, because it does not square with the standard definition of sound. However, since every one of the hundreds of people to whom I have now demonstrated the effect experiences that ‘error’, I prefer instead to question the coherency of the definition: how can sound be just a sound wave if everyone perceives two different sounds while sensing the same sound wave?Footnote 11
Before moving on to an exposition of presence, I will present one more aspect of the sonic virtuality definition of sound that will prove particularly useful when it comes to discussing presence in the context of reality. This aspect is best illustrated by asking the question: where is sound? In acoustics and psychoacoustics, explanations for this come under the banner of ‘sound localization’. In this regard, we can understand sound as a sound wave as being at a point distant from us, hence distal: sound is thus where the object or objects that produce the sound wave are, and our stereophonic hearing (i.e., the use of two physically separated ears) is the means to such localization. Sound seems to be happening ‘over there’. Yet sound, in the standard conception of it, is travelling through the medium (typically air for humans) to us, and so the location of sound is a moving, medial location somewhere between the sound wave source and our ears. So, sound seems to be ‘coming towards us’. A third theory is that the location of sound is proximal, at our ears. Here, sound is ‘what we are hearing’ at our particular point of audition. As noted above, the ANSI acoustics documentation provides a second definition of sound that, while contradicting the first ANSI definition of sound, supports the proximal notion. This proximal-based definition of sound as sensation contrasts with the medial-based definition of sound as a sound wave, and both contrast with the distal-based concept of the localization of sound as a sound wave. In summary, sound is variously described as being located a) at a particular point (distal), b) somewhere between the source and us (medial), or c) at our ears (proximal). Inconsistency, indeed.Footnote 12
In Sonic Virtuality, Garner and I proposed a different theory of the localization of sound. In this case, sound as a perception is actively localized by us through superimposing it mentally on various artefacts from the world. The sonic virtuality definition of sound defines aural imaging, the imagining of sound, as sound no less than sound perceived in the presence of a sound wave. Should the sound be perceived in the presence of sound waves, very often the location of the sound is quite distinct from the location of the sound wave source. Imagine yourself in a cinema, watching a dialogue scene. Where do you perceive the sound to be: at one of the loudspeakers ranged along the side or back walls, or somewhere between those loudspeakers and your ears, or even at your ears, or is it located on the mouth of the screen character now talking? Most cinemagoers would suggest the last option: the sound has been superimposed on the artefact in question (here, the character). This is an example of what Chion calls ‘synchresis’.Footnote 13 This common perceptual phenomenon demonstrates the problems with the acoustics and psychoacoustics conceptualization of sound localization. While the sonic virtuality concept of sound localization suffices to explain synchresis, also known as the ventriloquism effect and related to the binding problem, it can also be used to explain how we fashion a perceptual model of reality.
In recent work,Footnote 14 I have explored the idea of the environment as being a perception rather than something physical, and have shown how the sonic virtuality notion of sound localization aids in explaining the process of constructing this model of the physical world. Briefly, Walther-Hansen and I reduced reality (the world of sensory things) to a salient world (the set of sensory things of which we are aware at any one time), then to the environment that, in our conceptualization, is a perceptual model of the salient world. The environment is that model of the salient world chosen from a number of alternate models that are tested and refined as we are subjected to sensory stimuli, and this process continues unabated as the salient world changes. In this, I base my conception on work by Clark, who developed further the notion of perceptual models of the world in order to account for knowledge of that world.Footnote 15 The theory behind such hypothetical modelling of the world has been used by others in connection with presence, such as Slater and Brenton et al.Footnote 16 Clark explains it as a rolling generation and testing (according to experience) of hypotheses that proceed under time pressure until a best-fit model is arrived at. In my conception, this hypothetical model of the salient world is the environment. That is, our perceptual ‘environment’ is a particular constructed version of reality, based on our sensory experiences.
I come back to this environment in relation to presence and sound, and so, for now, I content myself with suggesting that we construct, test and refine our model environment in large part through the localization of sound. I now turn my attention to presence, for it is in the environment, at a remove from reality, that we are present.
A Brief Exposition of Presence
It is not my intention here to enumerate all the extant definitions of and explanations for the hotly debated topic of presence. I and others, to various extents, have already undertaken this task.Footnote 17 Rather, I prefer to draw attention to the main threads common to many discussions on presence, the main bones of contention in the debates and the relationship of presence to immersion (the term that is more widely used in computer games literature but which is not necessarily synonymous with presence).
The concept of presence is typically used in the field of Virtual Reality (VR), a field and an industry that is now (once again) converging with that of computer games. Here, presence is usually defined as the sense or feeling of being there, where ‘there’ is a place in which one might be able to act (and which one might typically have an effect on). The definition arises from the concept of telepresence and betrays the origin of presence research, which was originally concerned with remote control of robots (moon rovers and so on). Slater provides a succinct definition of presence that, in its broad sweep, encapsulates many others: ‘[presence] is the extent to which the unification of simulated sensory data and perceptual processing produces a coherent “place” that you are “in” and in which there may be the potential for you to act’.Footnote 18 Leaving aside the question of what ‘simulated data’ are,Footnote 19 the definition is notable for its limiting of presence only to virtual worlds (hence the simulation). Presence, in this definition, only occurs through the mediating effects of the type of technology to be found in VR systems: it is thus not possible to be present outside such systems, such as when one is in the real world. This is problematic to say the least, for how precisely does presence in virtual worlds (being there, potential to act and so forth) differ from the same sense or feeling when we experience it (as we do) in the real world? Although Slater goes on to suggest that there is such a thing as presence in the real world (in which case his definition is imprecise), much of the presence literature deals with presence in virtual worlds with little effort to explain what presence is in the real world (if indeed there is such a thing).
There is a debate in presence theory as to the relationship between attaining presence and the level of fidelity of the virtual world’s sensory simulation to sensations in the real world. Similarly, there is a debate as to whether presence arises solely from the fidelity of the sensations or whether other factors need to be taken into account. Slater’s definition implies a directly proportional relationship between fidelity to reality in the production of sensory data and the level of presence. IJsselsteijn, Freeman and de Ridder are more explicit in stating that ‘more accurate reproductions of and/or simulations of reality’ are the means to enhance presence.Footnote 20 Slater further states that ‘[p]resence is about form’,Footnote 21 where that form is dictated by the level of fidelity of the VR system to sensory reality – the content of the virtual world can be engaging or not, but this has nothing to do with presence. Yet, as I and others have noted,Footnote 22 it is possible to be absent even in the real world, and presence requires attention (is one present when one sleeps or is presence only possible when one is awake and alert?).Footnote 23
Immersion is a term that is widely treated as synonymous with presence when used in computer game research and industry marketing, but in presence research it means something else. Here, most follow Slater’s view that immersion is ‘what the technology delivers from an objective point of view [while presence] is a human reaction to immersion’.Footnote 24 Immersion can thus be objectively measured as a property of the technology – such-and-such a piece of equipment provides 91.5 per cent immersion in terms of its fidelity to reality – while presence is a subjective experience that lends itself less readily to precise measurement. This seems a reasonable distinction to me, but, as noted above, one should not make the mistake of assuming (a) that there is a direct proportionality between immersion and presence or (b) that immersive technology, and its level of simulative fidelity, is all that is required for presence.
There is surprisingly little empirical research on the role of sound in presence in virtual worlds. Much of it is to do with testing the fidelity-to-the-real-world (i.e., immersiveness) of the audio technology – particularly in regard to the production and function of ‘realistic’ soundsFootnote 25 – and spatiality and/or localization of sound wave sources.Footnote 26 A large part of this research implicitly assumes that an increase in fidelity of real-world simulation equals an increase in presence: this might well be true for spatial positioning of audio in the virtual world, but is doubtful when it comes to the use of ‘realistic’ sounds. Most of this research neglects to discuss what ‘realistic’ means but, assuming ‘authentic’ is meant, then one might well wonder at the authenticity of a computer game’s inherently unrealistic, but carefully crafted dinosaur roar or the explosion of a plasma rifle, even if we do not doubt their power to contribute to presence.Footnote 27 Verisimilitude would be the better word for such a quality, but this often has little to do with reality or authenticity, so is of little concern to those designing VR audio technology, for whom the mantra tends to be one of realism and objectivity over experience and subjectivity.
Some empirical research on sound and presence has taken place in the field of computer games under the banner of immersion,Footnote 28 and there has been a fair bit of philosophical or otherwise theoretical research on the role of sound in the formation of presence/immersion.Footnote 29 Of particular interest are those few works championing the necessity of sound to presence that base their ideas on studies of hearing loss. An especially notable example is a study of World War II veterans in which it is argued that hearing is the primary means of ‘coupling’ people to the world.Footnote 30 Such ‘coupling’ has been suggested to be a synonym for presence by researchers who argue that the use of background or ambient sounds is crucial to presence.Footnote 31
Stepping Back from Reality
I am now in a position to return to the suggestion with which I began this chapter – that the end result of sensation and perception is to distance us from reality – and I separate the discussion first into a section dealing with presence and reality in the context of virtual worlds and then, second, a section embedding sound into that thinking.
Presence and Reality
As Lee notes, while there has been much research on the mechanism of presence and factors contributing to it, there is little research on why humans are capable of feeling presence.Footnote 32 Lee was writing in 2004, and the situation he describes is much the same today: most research conducted on presence is of the empirical type, experimentally attempting to find the factors causing presence, the belief being that results will lead to improvements in the efficacy of VR technology in inducing presence (namely, the immersiveness of the technology). Lee, using research from the field of evolutionary psychology, suggests that humans cannot help but be present in virtual worlds because humans have a natural tendency to accept sensory stimuli first as being sourced from reality, and then to reject this instinctive assumption, if necessary, following longer assessment (and, I might add, should such time for reflection be available): ‘Humans are psychologically compelled to believe in relatively stable cause-effect structures in the world, even though they are not a perfect reflection of reality’.Footnote 33 Put another way, humans naturally tend to the logical post hoc fallacy (after this, therefore because of this), where we tend to assume causality between events, one occurring after the other. Additionally, despite knowing that virtual objects and effects are not real, ‘people keep using their old brains’,Footnote 34 and so their first reaction is to treat virtuality as real, and this is why we feel presence in virtual worlds.Footnote 35 Lee’s suggestion is a worthy one (and one that allows for the feeling of presence outside virtual worlds) but he implies that we already know what reality definitively is, which conflicts with the ideas of Campbell with which I began this chapter.
Like Lee, Campbell is inspired by theories of evolution for his ideas concerning the development of knowledge. Both authors deal with the purpose or effect of sensation and perception, and converge on what it is to know reality when virtuality (Lee) or illusion (Campbell) comes into play. Campbell’s view, where it differs from Lee’s, is best expressed by an example he provides: ‘Perceived solidity is not illusory for its ordinary uses: what it diagnoses is one of the “surfaces” modern physics also describes. But when reified as exclusive, when creating expectations of opaqueness and impermeability to all types of probes, it becomes illusory’.Footnote 36 In other words, the sensations we receive from our eyes and our fingertips might persuade us that the table top in front of us is solid and smooth, and, on a day-to-day level, this works perfectly well, but, when the table top is subjected to closer inspection (either through technological means or through the sensory apparatus of smaller organisms), its impermeability and opaqueness are shown to be illusions. This neatly encapsulates a broad philosophical framework extending back through Kant (the impossibility of knowing the true nature of an object, the thing-in-itselfFootnote 37) to the hoary Platonic Allegory of the Cave. Campbell extends this further through the framework of evolution: ‘Biological theories of evolution … are profoundly committed to an organism-environment dualism, which when extended into the evolution of sense organ, perceptual and learning functions, becomes a dualism of an organism’s knowledge of the environment versus the environment itself’.Footnote 38 Thus, the evolution of sensation and perception as the basis for cognition (learning and knowing) goes hand in hand with a distancing from the reality of the world. We have to summarize, simplify and filter reality to function and survive.
In the case of humans, I would be more explicit in stating that conceptual and abstract thinking have as their purpose a distancing from reality, the better (a) to safeguard us from the all-too-real dangers of that reality and (b) to enable us to rearrange reality’s building blocks by constructing a more commodious reality to suit our species’ trajectory.Footnote 39
Sound and the Feeling of Presence
From the above, if the purpose of sensation (and all that follows) is indeed to distance us from a, by now, unknowable reality, then the sensory data of virtual worlds, being at best somewhat poor simulations of what we experience outside such worlds, merely ease this task and this is why we can experience presence in such worlds. Having used the preceding section to lead up to this suggestion, I now turn my attention to the role of sound in the attainment of presence.
In order to do this, I concentrate on the function of sound in constructing an environment. As I have previously noted, where we are present is in the environment,Footnote 40 and the environment is a perception that is the result of the evolutionary imperative to distinguish self from everything that is not the self (the non-self). In other words, sensation, as the experience of the boundary between self and non-self, is the initial means to impose a distance between ourselves and reality.
The environment, as a perception, is a metonym of (stands for) the non-self that is the salient world, and, like all metonyms, it encapsulates a conceptualization that is lacking in its details (thus distancing from what it represents) but that is perfectly functional for what we require (presence, a place in which to act). There are many sensations that contribute to the construction of the environment, but, in terms of sound, I will concentrate on the role of sound localization. The localization of sound, as stated above, is the mental projection of sound onto perceptions of artefacts from the world. What we perceive forms part of the environment as a model of the world, and thus the localization of sound onto other perceptions (e.g., visual percepts) constructs in large part the spatio-temporality of that model.
Although there are sounds that I can locate as being of me – breathing, my pen scratching the surface of the paper as I write – most sounds of which I am aware, that are within my saliency horizon, are not of me. I make this distinction from experience and thus this basic distinction becomes part of the process of distinguishing self from non-self, the perceptual carving out of a space in which I can be present. By saliency, I mean not only conscious attending to sound waves but also subconscious awareness. There are many sound waves in the world that can be sensed, and, though I do not necessarily attend to and focus on them, my perceptual apparatus is aware of them and is processing them. Such sound waves become conspicuous by their absence. For example, in an anechoic chamber (a room without reverberation),Footnote 41 bar the auditory modality, there is as much richness of sensory stimuli as in any other room. Here though, there are no background or ambient sound waves due to the soundproofing, and the absorption and lack of reflection of sound waves in the room further contribute to the lack of ambience – these are sound waves one is not typically consciously aware of. Ramsdell defines three levels of hearing of which the ‘primitive level’ comprises ambient sounds that
maintain our feeling of being part of a living world and contribute to our own sense of being alive. We are not conscious of the important role that these background sounds play in our comfortable merging of ourselves with the life around us because we are not aware that we hear them. Nor is the deaf person [i.e. the adult who has become deaf] aware that he has lost these sounds; he only knows that he feels as if the world were dead … By far the most efficient and indispensable mechanism for ‘coupling’ the constant activity of the human organism to nature’s activity is the primitive function of hearing.Footnote 42
Thus it is with the anechoic chamber; lacking the ambient sound waves, one loses one’s coupling to the world. One is unable to fully distinguish self from non-self because one is accustomed to making use of a full complement of familiar sensory stimuli in order to do so. At the very least, presence begins to slip away and the world closes in on the self.
Ambient sound thus has a role in presence in establishing a basic environmental spatiality – other selves and events, and distances from self to nearby surfaces – in which the self can be present. But those sound waves of which we are consciously aware also contribute to distinguishing between self and non-self by means of sound (as an emergent perception) that is then localized on other perceptions drawn from the world (mainly through vision) to be combined into percepts (objects that are perceived). Additionally, the spatiality of the environment is further developed by this sound localization as a topography of the salient world. Unlike vision, auditory sensation takes place omnidirectionally, and so the topography and artefactuality of the unseen world can be modelled through hearing – my feet under the table shifting position on the floor, birds singing in the garden behind me, dogs barking in the distance. The temporality of the environment is again in large part contributed to by the localization of sound. In this case, percepts provide a form of interface to the reality that cannot be fully apprehended by our senses, and interfaces have potential to initiate actions and events, an unfolding future, hence the basic temporality of the environment. But temporality also derives from the knowledge of the relationship between moving object (vision) and hearing of the event (audition) – we know that, compared to light, sound waves take a perceptible amount of time to travel between their source and our ears. The localization of sound (in the active, sonic virtuality sense) makes use of the benefit of experience in its probing of the salient world, in its bindingFootnote 43 of different modalities (e.g., vision and audition) together into multimodal percepts (objects and events – the substrate of the spatio-temporality of the environment).
Although the above description is of the role of sound in establishing presence in an environment modelled on the real world, it can equally be applied to presence in virtual worlds, particularly if the assumption is that the mechanisms of achieving presence are the same in both realities. One significant difference should be briefly noted, though, especially where it brings me back to my contention that conceptual and abstract thinking, being founded on sensation and perception, have as their purpose a distancing from reality. The difference is that the sensations provided by a virtual world represent a particular model of reality (in the form of the sensations if not always the content); the environment we make from this virtual world is, then, perceptually poorer than one made from the real world (not only because the number of sensations available is smaller and their dispositions more primitive but also because the number of modalities used is smaller) – the abstraction that is the environment is itself based on an abstraction. The ease with which we appear to be present in certain forms of computer games might then be not because of the virtual world’s attempted ‘realism’ but, perversely, because of the virtual world’s imposition of an even greater distance from reality than that we normally experience.
In Summary
I have argued for the role of sound in constructing the fundamental spatio-temporality of the perceptual model of the salient world that is the environment. I have further argued that it is in the environment that we feel present, and that the environment arises under time constraints and the evolutionary requirement to distinguish self from non-self that aids in the need to survive. Thus, the modelling of the environment, as triggered by sensation (and refined by cognition), is a means to distance one from a reality that by now is not, if indeed it ever was, knowable. As a model, the environment is an abstraction (with all the loss of detail that implies); it is a topographically arranged container of interfaces to an inscrutable reality. The interfaces contained within are the percepts of which sounds are one form. Sound, in its active localization, provides spatiality to the environment, and the experience of the relative speeds of light and sound waves creates temporality, while the potential for action inherent in the interfaces provides a futurity and therefore further temporality to the environment. Thus, one is present in the environment, and, though one is buffered from reality by it, one is able to act within and upon reality through the interfaces of that environment.