I can’t breathe!
I can’t even hear. He just slammed my f**king head into the ground … Thank you for recording!
Stay with me! We got pulled over for a busted taillight in the back. And the police, just – he’s covered … They killed my boyfriend! [Discussion with police officer; phone falls to ground as speaker is handcuffed.] They threw my phone, Facebook!
In March 1991, Rodney King was pulled from his car and beaten by officers of the Los Angeles Police Department. A nearby resident, George Holliday, shot a homemade video of the event that would become one of the most important pieces of American forensic media since the Zapruder film made at the time of John F. Kennedy’s assassination. That importance was recognised immediately by police administrators, news broadcasters and academics. In popular media, the video was played on a seemingly endless loop in the immediate aftermath as well as during the trial of the police officers. For instance, a 7 March 1991 broadcast by ABC News opened with anchor Peter Jennings introducing the case and video as follows: ‘Now the story that might never have surfaced if somebody had not picked up his home video camera. We’ve all seen the pictures of Los Angeles police officers beating a man they had just pulled over. The city’s police chief said today he will support criminal charges against some of the men.’1 In one of the most provocative academic articles written on video as a medium, Avital Ronell argues that the video clip functioned as a form of truth-telling testimonial relative to the mythologies of television. In particular, she sees the depiction of police (especially the Los Angeles Police Department) as the epitome of television programming: ‘the Rodney King show is about television watching the law watching video’ (1994, 295). Writing shortly before the officers’ acquittal, Ronell presciently describes a judicial and cultural apparatus that would likely acquit them anyway, highlighting how easily such video can be ignored because it records everything indiscriminately – a kind of machinic excess that is ‘simply present while at the same time devoid of presence’ (297).
Yet for all the commentary about the Holliday video from so many quarters, the tape was considered largely self-explanatory, save for the question of whether King took a step toward the police or charged them. (This same debate has been central in several recent police shootings of African-Americans, especially that of Michael Brown in Ferguson, Missouri, in 2014.) Ronell underscores that video serves a testimonial function that television – literally, a distant viewing – never can, even if it is often disregarded in legal proceedings. But the Holliday video is also a distant viewing: despite the appearance of close proximity, the video is actually shot at a distance, with Holliday zooming in on King and the police officers. I remember seeing the Holliday clip on the news when I was growing up – always a kind of mute presentation with voiceover interpretation by newscasters. In the clip I mention above, for example, Peter Jennings turns to a reporter for ABC, Gary Shepard, who then simply speaks over the Holliday video. This video was frequently dredged up again in spring 2017 as part of twenty-five-year commemorations of the Los Angeles riots. Watching it again – and more closely – I am struck by the effect of the zoom on the video. And more to the point: for the first time in my life, I listened to the audio recorded with it. The actual sound of violence in the moment is relatively minimal. At certain points, I hear the voices of police officers yelling at King, but language is indistinct. It’s simply the sound of authoritative commands. I never hear any sound of impact, despite the revulsive image of police repeatedly hitting King’s body with batons and kicking him.
Instead, I hear something less obvious, but perhaps more systemically ominous: a helicopter hovering just overhead. In the aftermath of this event, apologists for the police force (including Los Angeles Police Chief Daryl Gates) argued that this event was an aberration. Yet the inescapable chopping of the helicopter’s rotor blades evokes a much broader assemblage of police machinery, in this case hovering overhead audibly but not visibly. State-sanctioned violence, it seems to imply, is not accidental but by design. (All that’s missing is Wagner’s ‘Ride of the Valkyries’ in the spirit of Apocalypse Now as a final exclamation point.) Although the low thudding drone of the helicopter is relatively subdued in the video, it emits a higher-pitched whistle that slowly rises and falls as the chopper circles. Only when the helicopter leaves (after about 04:30) is it possible to hear anything clear from the scene: police scanners and radios, doors slamming, and a few more orders being barked out. The helicopter not only provided technological cover, it provided audio cover too, masking sounds of police violence that might have further intensified the affective power of the video.
Occasionally the sound of voices near the camera becomes audible too. While the helicopter is present, these voices are unclear. Once it leaves, it’s possible to hear at least two different groups of people discussing what has just happened, with one group describing how the police had been beating King and another speaking in Spanish about the event more generally. In a sense, these are the first documented analysts of this violence, embedded in the video record as eyewitness interpreters. In addition, I hear handling noise on the camera. Holliday is often described as an amateur videographer, and these noises confirm that claim. The camera was primarily a tool of visual documentary; its microphone was an automatic but useful supplement, documenting audio traces of the event from a distance without recourse to an audio equivalent of video zoom technologies. As a result, we have two distinct audio-visual spaces: visually, we inhabit the space of the police violence; but aurally, we remain in conversational, close (and safe) proximity to Holliday, though with the looming sonic apparatus of police force circling overhead.
Sound and image have been disjoined. Given the horrific nature of the moment, it would be distasteful to call this disjuncture ‘productive’. But attending to both the audio and the visual, and how they overlap or document separate and asynchronous sensory spaces, allows a kind of mediated witnessing by potential viewing audiences – by which I mean, quite literally, those that see and hear. In recent years, media theorists have increasingly begun to raise the question of what it means to bear witness to an event – especially a traumatic or violent incident – that a person encounters only through indirect means like a recording (Peters 2001; Rentschler 2004; Frosh and Pinchevski 2009; Krämer and Weigel 2017). In the United States these questions have taken on a greater urgency in the past few years as police violence against black people (and especially black men) has come more forcefully into public consciousness beyond communities of colour – in particular as a result of the Movement for Black Lives (including Black Lives Matter activists), which connects the current predicament to the Los Angeles riots a quarter century ago. And indeed, writing a decade before Black Lives Matter emerged as a movement, Fred Moten traced a sonic history from the beating of Rodney King to the killing of Emmett Till, whose brutal death was immortalised in a photograph of his open-casket funeral, as events demanding audition in order to be understood properly: ‘This means we’ll have to listen to it along with various other sounds that will prove to be nonneutralizable and irreducible’ (2003, 196).
And Moten’s aural witnessing itself fits into an even older tradition of ‘bearing witness’ as a critical and collective sonic practice in African American religion and politics that remains highly relevant today (Ross 2003; Floyd-Thomas 2016). Indeed, these sonic forms of participatory witnessing that grow out of the Black Church and the Civil Rights Movement augment less race-conscious forms of media theory in which witnessing is in many ways a visual and individualistic practice.2
In the past few years, a recurring set of commentaries has highlighted connections between the video documentation of police violence in the Rodney King beating and the increasingly common (and deeply disturbing) digital recordings of more recent police violence. Headlines such as ‘The viral video that set a city on fire’ (Young 2017) have circulated online, while several film-makers have released documentary projects about the riots, including The Lost Tapes: LA Riots, composed almost exclusively from audio-visual footage from 1991–2. Throughout these discussions, music and sound have often emerged alongside the more obvious aspect of the visual. For instance, in his 2016 op-ed piece for the Los Angeles Times, James Peterson opens by connecting questions of music, documentary media, and race: ‘The rapper KRS-One famously posed this question to law enforcement: “Who protects us from you?” Exactly 25 years after Los Angeles police officers beat up Rodney King near a 210 Freeway offramp, the answer is the same as ever: The camera does, but only to a point.’ Peterson continues by noting that Holliday was ‘armed with an analog video camera’, nodding to the technological shifts that have taken place in the past quarter century. He then proceeds to discuss the recent deaths of Eric Garner and Walter Scott, victims of police violence who have been central to the Black Lives Matter movement. Although Peterson does not return to rap music, he easily could have, given the prominence in recent protests of rapper Kendrick Lamar’s 2015 song, ‘Alright’.3
In this chapter, I explore the sonic and musical aspects of digital screen culture. A near infinitude of possible directions for such an essay exists, spanning music videos and animal videos, whispered ‘ASMR massages’4 and chanted hate speech, soundmaps and audio-visual museum installations. Moving beyond such content-based themes, one might also write about the massive infrastructure that supports digital audio-visuality in its many manifestations, including the political-economic and environmental impact of server farms, smartphones (and their planned obsolescence), energy grids and the labour forces that are hidden behind these already-hidden infrastructures. But the questions of race, sound, digital transmission and power that swirl around the admittedly American-centric question of police violence and Black Lives Matter not only illustrate the breadth of contemporary media practices, but also point to a kind of media-cultural reckoning that is taking place today. If YouTube and other forms of new digital cinema previously offered a kind of expansive, quick-to-go-viral form of entertainment, the recent spate of video documentation of police violence reminds us of Friedrich Kittler’s dictum that ‘the entertainment industry is, in any conceivable sense of the word, an abuse of army equipment’ (1999, 96–7). In this case, however, we might invert this idea: in recent years, the do-it-yourself entertainment industry of homemade video has increasingly paid attention to the abuses of military-grade equipment passed along to American police forces. Online video services can no longer pretend to be simple distribution hubs for cat and music videos (though music videos will play an important role here). Rather, these technologies offer important new possibilities for addressing the trauma of such violence.
In particular, reconfiguring relationships between audio and video – as well as our expectations of those relationships and our abilities to ‘read’ them – may allow for new forms of witnessing that are expressly mediated. Nicholas Cook has written of the critical ‘perceptual interaction between [multimedia’s] various individual components, such as music, speech, moving images, and so on: for without such interaction there is nothing to analyse’ (1998, 24). Generic conventions or technical limitations may lead us to assume that the ‘perceptual interaction’ of a particular (multi)media piece is fixed: in the case of the Rodney King video, one may well assume there is no audio or that whatever it may include is unnecessary for understanding the video. As I’ve written above, I disagree. Those perceptual interactions are subject to manipulation (whether intentional or not) by artists and media forms. But they also allow for an audience to exercise what Ingrid Monson calls its ‘perceptual agency’ (2008): we can choose to attend to certain musical (or audio-visual) aspects more or less than others. And while Cook and Monson are concerned with things we would readily identify as ‘music’ or ‘musical multimedia’, I hope here to extend their models of interactive, dynamic sensation to include other forms of audio more generally, whether speech/recited poetry, ambient environmental sound, or the particularly violent sounds of police brutality. Experiencing those audio-visual media and perceiving – or more aptly, choosing to perceive – the ways they witness about the world, especially when relayed on further by ‘sharing’ or commenting on them, quickly moves beyond just analysis (though analysis remains critically important) and into a realm of mediated co-witnessing.
Thus, after offering some theoretical background, I focus in this essay on three case studies in which the interplay between audio and images creates new opportunities – as well as pitfalls – for digital witnessing: Beyoncé’s ‘visual album’ Lemonade; recordings of the killing of Philando Castile by police; and to conclude, protests at American football games against ‘The Star-Spangled Banner’, the national anthem of the United States. These examples, as well as related forms of music video and documented police violence, show some of the divergent uses of audio-visual media today, while underscoring the acute political forces at play within them and giving rise to them. These media offer an opportunity, in particular, to rethink longstanding notions of witnessing and mediation: bystanders can readily become activists with the push of a button, and distant viewers are invited to view decimated black bodies, as cultural witnesses and/or as voyeurs of violence. At the same time, these videos and crowdsourced documentary practices also raise unsettling questions about technocapitalism and the companies like Google (parent company of YouTube), Apple and Facebook that profit – whether inadvertently or, perhaps, by design – from violence against black bodies and the repeated viewings of media documenting that violence. In an age full of new forms of technological mediation, witnessing and gazing, producing and consuming, activism and spectatorship blur with one another, and the political consequences can be significant.
Musicians as Multisensory Witnesses
Let me proceed with a YouTube clip. At a concert in Seattle’s Key Arena in December 2014, Stevie Wonder prefaced his 1973 song ‘Living for the City’ with a short speech while he and his band played the vamping synthesiser ostinato over a bass pedal point that opens the song. As documented by YouTube user ‘Zoltan Grossman’ on what appears to be a camera phone, Wonder said:
I want you to know truly sincerely, I love sincerely each and every one of you. [audience cheers] You can put your heart on that. You know, I’ve always seen all of us, no matter what our ethnicities are, no matter what our color, are seen as one family. [cheers] And I’m not saying it just because I’m on stage. I’m saying it because that’s how I really feel.
Can you believe that within one month, two grand juries – secret grand juries – declined to indict two policemen for the killing of two Black men? I just don’t understand that.
Let me just say this also: I don’t understand why a legal system would choose secrecy when there’s so much mistrust of what they’re saying. [cheers] I don’t understand why there could not have been a public trial where we would be able to hear all sides to this deal. [cheers] I just don’t understand.
I tell you what I do understand. I heard Eric Garner say, with my own ears: ‘I can’t breathe!’ And as much as he’s apologized, I don’t understand why he [the police officer] did not stop … You see, I feel that – when people say to me – and you know, I’ve heard this from various politicians as well, ‘You’ve got all this black-on-black crime’. But my feeling’s that guns are too accessible to everybody. [cheers]
I do, I do – I do understand that something is wrong, real wrong. And we as family, Americans, all of us of all colors, need to fix it – with a quickness, real soon. [cheers]
I love you. And I really love you, you know that. And this is why this song unfortunately is still relevant today. If you know the words you can sing along with me.5
Several aspects of this performance bear on the question of audio-visuality, witnessing and screens. First, Stevie Wonder is functioning as a sensory witness of sorts, challenging the secret (and thus impossible-to-perceive) proceedings of the grand juries in question. In the American legal system, grand juries stand as a preliminary legal proceeding. Although Garner had been killed nearly five months earlier, Wonder’s remarks came immediately on the heels of the grand jury non-indictment on 3 December, which led to a wave of protests around the United States, as well as the recirculation of a video made by Ramsey Orta showing police choking Garner to death while he sputters, ‘I can’t breathe!’6 Wonder emphasises that he has personally listened to the audio from that same clip: ‘I heard Eric Garner say, with my own ears: “I can’t breathe!”’ Presumably Wonder is referring to the experience of hearing Garner’s recorded voice. Yet this mediated experience has an unmediated quality for Wonder, as he witnesses Orta’s technological witnessing and then attests to it as his own experience, no less authoritative for having been based on an audio-visual recording. This question of mediated sensation is heightened all the more because Wonder is himself blind: the key action here was hearing Garner speak in that fatal moment.
In some sense, Wonder’s statement fits neatly into a longstanding tradition of protest music, especially among African-American musicians. One strain of that tradition, ranging from spirituals and blues to contemporary hip-hop and R&B, places the musician him- or herself in a personalised role, as a kind of aesthetic witness. If a witness is generally understood to play a role as an epistemological medium – to transmit knowledge about a person or event, as Sybille Krämer emphasises (2015, 144–64) – this musical form of witnessing trains its focus on the affective dimensions of knowing, or what Tomie Hahn calls ‘sensational knowledge’ (2007). One might argue that Wonder’s performance, with musical accompaniment during the speech, followed by a song – all the while surrounded by dancers, bright lights and a cheering audience – is hardly the place for a nuanced transmission of knowledge. But this kind of knowing-affective mediation seems to be precisely Wonder’s aim, as he concludes by encouraging those who know the words to ‘Living for the City’ to sing along. The song’s lyrics chronicle the difficulties of life for a poor, young black man from the American South moving to New York before getting falsely arrested and imprisoned. Much of the dramatic action of the piece is told not so much through the lyrics but rather through a kind of micro-radio-play with the sounds and speech of a bus and driver, police sirens and handcuffs, a courtroom verdict and the clang of prison bars.
The song’s concluding stanzas recount how the protagonist, wandering the city (after release from prison?), is nearly dead from breathing the city air. In stark contrast to the magical kind of ‘world breath’ that Friedrich Kittler imagines as animating operatic heroes (1987), Wonder conjures instead the image of a gritty lack-of-breath that leads slowly but inexorably toward death – all too resonant with Eric Garner. The final lines shift to a first-person narrator – perhaps the protagonist, perhaps the singer – with a particular injunction for listeners: ‘I hope you hear inside my voice of sorrow / And that it motivates you to make a better tomorrow.’7 The voice is explicitly figured as a means of conveying not just words or semantic content, but feeling. Audition is a kind of burrowing-into: hear inside the voice, let its affective qualities resonate around the listener. And do something about it. Empathic hearing becomes a kind of testimonial action, even when displaced from the original circumstances in question. Wonder’s audiences can’t be there alongside the song’s protagonist; but precisely because of that displacement an ethical burden remains on them to hear-inside, to listen with care, and to respond accordingly.
Fittingly, Wonder’s speech was transmitted to the world as a multi-layered, sedimentary testimonial. The video’s creator, Zoltán Grossman, describes his actions as follows:
I started filming him when I guessed from a few chords that he was starting ‘Living for the City’ – one of my faves. At first I was disappointed that he started talking instead, but then realized that he had spoken about Ferguson before, and his remarks about Eric Garner’s death in New York could be valuable. They sure were, and I’m glad that I filmed it.8
Smartphone video not only serves as a bulwark against police violence, it also can transmit other acts of witnessing, as in this case. Of course, Ramsey Orta’s video of Eric Garner, which he alleges led to his own imprisonment, cannot fairly be compared to a bootleg video of a concert, not least because of the risk Orta took on while filming (some of which is captured in the confrontations with police during the filming itself). But both perform a similar kind of work as testimonials in the sense that Wonder’s lyrics suggest: they are not simply documents that capture an event, but rather invitations or even demands to be circulated and heard. Although witnessing traditionally shuns extra layers of mediation, in these somewhat paradoxical cases, the greater the number of mediations – repostings, shares online, embeddings – the more effective the witnessing has been. Mediation – and, specifically, remediation – becomes a form of amplification in the digital age. That amplified witnessing is, of course, subject to the technical constraints and (sometimes whimsical) human preferences of social media. But it amplifies nonetheless.
Digital Video and Its Transformations
Years before the invention of YouTube, the rapper Chuck D of the group Public Enemy famously described rap music as a kind of mass medium like a news broadcast: ‘For the first time, a kid from New York can understand how a kid from Los Angeles lives … You’ve got to understand, Public Enemy and rap music are dispatchers of information. We’re almost like headline news … the invisible TV station that black America never had’ (Jones 1989). Again, music has long served as a vehicle for communicating and transmitting ideas – including but not limited to ideas of protest – across considerable geographies. Internet-based platforms like YouTube offer a new set of possibilities for such transmissions but, as with any medium, those transmissions are constrained and facilitated by the particularities of that medium.
We might theorise YouTube and the digital video platforms that have emerged in its wake in any number of ways: as a site of physical interactivity and bodily re-performance in which aurality borders on touching, as I have explored elsewhere (McMurray 2014); as the ‘unruly’ heart of a new digital cinema (Vernallis 2013); or as a physical infrastructure, including the glass and plastic of screens, silicon wafers, server farms storing petabytes of ‘cloud’ data, as well as the human labour used to assemble the latest iPhone (Peters 2015; Kirschenbaum 2008). But, of course, YouTube and its competitors are not fixed entities; they have histories and are changing at this very moment. We might imagine a ‘golden age’ of YouTube, dating roughly from the purchase of YouTube by Google in 2006 to around 2012, when certain changes in scale and market became clear: Psy’s ‘Gangnam Style’ reached one billion views; Facebook bought the photo and video-sharing site Instagram; and smartphones became nearly ubiquitous (van Dijck 2013; Burgess and Green 2012).
In the wake of that golden age of YouTube, a major rupture has taken place, one that appears to be tied closely to the rise of the Black Lives Matter movement and especially the acts of digital witnessing that accompany it. The ubiquity of portable recording devices and options to share media made on those devices has given rise to new forms of political accountability. Digital cinema has taken on a certain social gravitas, and these ‘new media’ demand – returning to Stevie Wonder briefly – a hearing-inside that embraces the hypermediated audio-visual testimonial of events like protests and police violence. Journalist Stereo Williams has written about the kind of maturing that has come along with these musical – and I would add, audio-visual – testimonials. In a 2015 article entitled ‘Is hip-hop still “CNN for Black people”?’, riffing on Chuck D, Williams suggests that ‘this contemporary wave of social conscious music seems to be reflective of what the public is feeling, and that public doesn’t really seem to want it to be anything else … These guys [and all of Williams’s examples are men] are asking questions as opposed to acting as though they have the answers’, in contrast to previous generations of political artists.9
The following examples give a sampling of what audio-visual media – especially in the United States, but not limited to any single geography – have become in light of current tensions surrounding not just police violence but broader questions of race and justice, and to a certain degree gender, as well. They may seem like marginal or exceptional examples in unpacking what digital screen culture means today, but following Stereo Williams I argue that they raise critical questions (while not always providing complete answers) about the stakes of audio-visual media and their circulations. And again, they pose these questions through expressions of witnessing – but expressions that are always tinged in multivalent ways by capital, violence and other forms of institutionalised power.
Case 1. Beyoncé’s Lemonade: Video as Amplification
As is so often the case with multimedia work, Beyoncé’s (2016b) release, Lemonade – a self-described ‘visual album’ that premiered as an hour-long television show – raises a number of questions about definitions. What is a visual album? (And what is an album in the digital age?) Does that terminology mean that visuals take priority over music? Or vice versa, since Beyoncé is a singer? Or does she fit into a broader category of ‘entertainer’ given how she incorporates dance and video into her work? And what about live performances of the album’s material? Once again, Nicholas Cook’s formulation of tensions between media in multimedia (1998, 103) is helpful analytically: to what degree do the album’s audio and visual elements complement or contest one another? Or to reframe the question once more in terms of the audience, what does it mean for an audience to attend to certain components of this album more than others? As if Beyoncé had planned it precisely that way, these questions consumed the popular press and academic online spheres for months in the wake of Lemonade’s release (e.g. McFadden 2016, Pareles 2016, Vernallis 2016, among many others). Central to this reception was a further question: how should this material be positioned relative to Black Lives Matter?10 In a sense, the answer to all these questions seems to be: Yes. That is, the album indeed seems designed to provoke many (or perhaps all) of these questions. In so doing, it maximises its own self-amplification, with critics serving as the channel for that response. The degree to which that is a savvy business decision or an act of social conscience – or both – is less clear. But whatever Beyoncé’s internal motivations, Lemonade tapped into the same kind of amplifying channels as she had with earlier, single-song music videos like her 2008 hit, ‘Single Ladies’.
What seems different to me is precisely the massive tear in the American cultural fabric that had emerged since the late 2000s because of the visibility of police violence. And in many ways, the unfolding of Lemonade as an album follows that same progression.
Before tracing what Lemonade does internally, it bears mention that Lemonade did not arrive on the scene fully formed. Prior to its debut on HBO on 23 April 2016, shorter fragments were released, focusing on the song ‘Formation’. On 6 February, the song and its music video (from the full-length Lemonade visual album) were released, one day before a live performance of the song at the half-time show of the Super Bowl, the American football championship game. ‘Formation’, which serves as the finale of Lemonade, also includes some of the most overt political commentaries and imagery of the whole album – especially in contrast to the earlier segments, which focus more on questions of personal relationships and specifically on fidelity and betrayal (Beyoncé 2016a). The Super Bowl performance is notable not least because, as I discuss below, American football has been drawn into the audio-visual performance of race and anti-racism in surprisingly central ways. And Beyoncé appears to have taken full advantage of that platform, perhaps most strikingly in the outfits worn by her dance troupe. In the ‘Formation’ video itself (which, again, comments quite directly on questions of race in America), the dancers performing with Beyoncé wear multiple outfits, including old denim and white T-shirts. But at the Super Bowl, they donned outfits that were suggestive of the Black Panthers, the American black nationalist group formed fifty years earlier in 1966 – the same year the Super Bowl began (Caramanica et al. 2016).
While audiences’ response to the costuming varied, it offered a compelling reminder of the possibilities of amplifying certain qualities of Lemonade (or specifically of ‘Formation’) through visual elements: first, through the video itself, with its striking imagery of post-Hurricane-Katrina New Orleans as well as anti-police protest; and, secondly, through the additional costuming of the Super Bowl half-time show, itself one of the most important multimedia events in the United States.
But these officially released videos were not the only video precursors to Lemonade’s formal release. In May 2014 after the Met Gala in New York, silent video footage from a security guard’s phone filming a closed-circuit surveillance camera leaked, showing Beyoncé’s sister, Solange Knowles, hitting and kicking Jay-Z, Beyoncé’s husband, while Beyoncé stands by. Several critics and other media pundits weighed in on whether this incident was connected to the tale of infidelity that dominates the first half of Lemonade. Most have rightly dismissed the idea that Beyoncé is required to tell the truth about her life – as though she lacked the creativity to imagine something beyond the ‘authenticity’ of her own lived experience (Tinsley 2016; Als 2016). But by the same token, it does raise questions about how an audience should know when to flip on/off an authenticity filter. This kind of uncertain disjuncture is amplified by Beyoncé’s posture during the Solange/Jay-Z scuffle: she stands more or less motionless (at least as shown from above by the camera). Furthermore, when she exits the elevator, she seems poised for the paparazzi, smiling calmly, unlike the others leaving with her.11
More broadly, Lemonade is a series of music videos that feature Black women centrally throughout. These individual music videos are then connected with a mix of (often abstract) imagery accompanied by voice-over of Beyoncé speaking, often reciting poetry by Somali-British poet Warsan Shire. In addition, each song has a title, but those titles never appear in the visual album. Instead, they’re replaced by single-word titles (‘Intuition’, ‘Denial’, ‘Anger’ and so on) that evoke multiple stages of grieving. This thickly layered media constellation has proven to be a boon for interpretation, making nearly every moment of Lemonade overdetermined with possible meanings.
Unsurprisingly then, debates sprang up regarding several aspects of the visual album, including: the depiction of intersections of race, gender and sexuality; the respective roles music and visuals play in the album; and the economics of Beyoncé’s storytelling.12 For many fans and critics, Beyoncé’s depiction of the complex entanglements of race, gender and sexuality was thrilling. But at least one prominent author, bell hooks, challenged Beyoncé on the way she brought these issues together, criticising in particular Beyoncé’s apparent embrace of violence as a response to oppression – most memorably in ‘Hold Up’, as she walks down the street with a baseball bat smashing cars, fire hydrants, a CCTV security camera, and (it appears) even a camera operator.13 (This track shows the most explicit self-awareness of media in the album – and perhaps the most direct violence comes at the expense of an imagined human holding the camera when Beyoncé hits both with the bat to bring the track to a close.) Significantly, hooks responds to the audio-visual album primarily as a set of moving images, barely commenting on its aural aspects. In contrast, a certain set of music critics insisted on evaluating the album first and foremost as music – reviewing it like any other album (including earlier Beyoncé releases). For them, the cinematic version was secondary, much like any other music video would be relative to an album or song.
Robin James helpfully summarises the various positions taken on this debate, but makes the compelling case that attempts to interpret the album primarily (or solely) as ‘just music’ enact ‘epistemic violence’, demanding that it conform to standards of beauty and value developed for Western visual and musical arts.14 James’s point seems obvious but it underscores the fact that Lemonade sits between media, genre categories and critical discourses; there are no clear criteria or metrics for evaluating it, despite important audio-visual precedents from Prince to Beyoncé’s own ‘Single Ladies’ video.
If Lemonade has largely drawn acclaim for its audio-visual depictions of race, gender and sexuality, its connection to capitalism is more complex, if less commented upon – perhaps because that connection is so obviously present for a professional artist who makes money from her art. After premiering on the American cable television channel HBO, Lemonade was available only on Tidal, a music streaming service owned by Jay-Z (Rys 2016). The audio-visual material of the album itself suggests a deep-seated but ambivalent relationship with capitalism, most notably in the memorable line from ‘Formation’, that she ‘just might be a black Bill Gates in the making’. But beyond this kind of brash entrepreneurialism, which was normalised years ago by rappers, the album’s audio-visual ‘text’ (i.e. the album itself) and the context of its release (choices about record labels, streaming, etc.) begin to blur into one another. Stephen Witt (2016) describes the political economics of Lemonade as follows: ‘As art, it was an unforgettable act of public shaming. As business, though, it was a gift of surpassing value, suggesting a kind of Clintonian marital bargain, in which pride is sacrificed in service to dynasty. The irony is rich: the man whose presumptive philandering provided the subject matter for this album now stands to profit most from its distribution’. The comparison to the marriage and simultaneous careers of Bill Clinton and Hillary Clinton underscores the fact that the personal is political here and vice versa. Some critics, including Greg Tate, focused on the potential for profiteering from more obvious socio-political issues, suggesting that Beyoncé’s embrace of Black Lives Matter and race-related issues was in many ways a business decision.15
More broadly, Beyoncé is bearing witness to a cultural moment that extends beyond just the questions of love, race, gender and power she explicitly addresses, yet her witnessing is also marked by a kind of excess, sedimented with other cultural accretions: perhaps unintentionally, she is also documenting the broader neoliberal regime of music production she and we inhabit. But rather than argue the merits of that embrace of capitalism – whether as a taint on the album’s politics, a necessary evil, or even the successful ‘hustle’ of her musical entrepreneurialism – I would suggest that this complexity gives listeners/viewers greater perceptual agency in determining what exactly Lemonade means and to what issues it bears witness. Again, witnessing becomes highly mediated through legions of fans and critics (including those who dislike the album); they too are part of that witnessing. As such, Beyoncé’s ability to elicit responses from those audiences is integral to her ability to witness on her own terms. She gets us to talk, and we selectively amplify her audio-visual act of witnessing, itself an act of audio-visual amplification of resistance to police violence.
Case 2. Philando Castile’s Death: Audio as Amplification
On 6 July 2016, Philando Castile was driving with his girlfriend Diamond Reynolds and her young daughter in Falcon Heights, a suburb of Saint Paul, Minnesota. He was pulled over by officer Jeronimo Yanez and his partner, ostensibly over a broken tail light. Yanez then approached the driver’s side window and began talking to Castile. In less than a minute, Castile had been shot seven times. Castile was a registered gun owner and had properly disclosed to the officer that he had a gun in the car. The precise details of what happened in the next three seconds are subject to disagreement, but Yanez claims that Castile was reaching for his gun despite the officer’s warnings not to move. Reynolds in turn claims Castile was reaching for his driver’s licence, as instructed by the officer. What is clear is that Yanez began to shoot at him point-blank through the open window while yelling loudly. Reynolds then picked up her phone and began using Facebook Live to stream live video of what was unfolding (Reynolds 2016).16 That video is a chilling mix of grief, chaos and technological savvy. It is a compelling, if disturbing, act of witnessing – paradigmatic of digital video tools that have greatly expanded the affordances and meanings of video, while also greatly expanding access to video-making technologies. This expansion affects phones especially, thanks to a handful of massive tech companies (Apple, Google, Facebook) that are reaping profits from these technological ‘disruptions’ in video and media production.17
The video begins with Reynolds apparently addressing Castile, crying out, ‘Stay with me!’ She then begins addressing the generic Everyone of the Internet, saying:
We got pulled over for a busted taillight in the back. And the police, just – he’s covered … They killed my boyfriend! He’s licensed to carry [a firearm]. He was trying to get out his ID in his wallet out [of] his pocket and he let the officer know that he had a firearm and he was reaching for his wallet. And the officer just shot him in his arm. We’re waiting for a back–
At this point, Reynolds is interrupted by Yanez, who has been repeatedly cursing in the background (‘F**k!’). In one of the most telling moments of the exchange, he yells at Reynolds to keep her hands where they are. With tremendous poise, she replies, ‘Don’t worry, officer. I will.’ Before she can finish her words, Yanez screams out again: ‘F**k!’ Reynolds and Yanez then begin rehearsing events. Yanez, whose voice is raspy and panicky, says:
[Yanez:] ‘I told him not to reach for it. I told him to get his hand up.’
Reynolds: ‘You told him to get his ID, sir. You told him to get his driver’s licence. Oh, my God, please don’t tell me he’s dead. Please don’t tell me my boyfriend just went like that.’
Yanez (still pointing his gun through the window): ‘Keep your hands where they are, please.’
Reynolds: ‘Yes, I will, sir. I’ll keep my hands where they are. Please don’t tell me this, Lord. Please, Jesus, don’t tell me that he’s gone. Please don’t tell me that he’s gone. Please, officer, don’t tell me that you just did this to him. You shot four bullets into him, sir. He was just getting his licence and registration, sir.’ (Reynolds 2016)
The tragic cinematography of the scene intensifies as Reynolds is instructed to get out of the car with her hands up and visible to the officer. She begins asking about her daughter, who was riding in the back seat of the car and had been pulled out of the car immediately after the shooting by Yanez’s partner. Reynolds is told to walk backward, and responds by filming behind herself – suddenly we see the officers standing behind her with guns drawn, telling her repeatedly, ‘Keep walking!’ She is wrestled to the ground and, as she is handcuffed, her phone falls beside her, pointing up to the sky as a small child’s cry is heard, sirens approach, tyres squeal, and Reynolds begins wailing. But before doing so, she speaks to her still-livestreaming phone: ‘They threw my phone, Facebook.’18 Reynolds then began broadcasting her plight again from the back of a police car, retelling the story and also commenting that her phone battery was about to die. In a particularly poignant moment, we see that her daughter is sitting with her in the back. Reynolds continues to switch between audiences, speaking to her daughter and then the world (at least that subsection of it that had access to her Facebook stream): ‘I don’t know if he’s OK or if he’s not OK. I’m in the back seat of a police car, handcuffed. I need a ride. I’m on Larpenteur and Fry. They’ve got machine guns pointed. [inaudible from child] Don’t be scared. My daughter just witnessed this. The police just shot him for no apparent reason. No reason at all.’ As Reynolds breaks down, her daughter in turn comforts her: ‘It’s OK, Mommy. [Reynolds cries out.] It’s OK, I’m right here’ (Reynolds 2016).
In many ways, there is nothing that can be said about a video like this.
But something must be said about a video like this. So let me say that it is a masterpiece of audio-visual witnessing: it is impressive in its physical and technical execution, it is emotionally riveting, and it conveys the gravitas and profound loss that comes with such a traumatic death. That Reynolds manages to film at all after the shooting, let alone while walking backwards and while handcuffed in the back of a police car, is remarkable in itself. That virtuosity, if such a word can apply in such grim circumstances, is intensified by the rhetoric of hands: keep your hands where they are, keep your hands in the air, and implicitly, keep your hands cuffed behind your back. Needless to say, these are not the standard hand positions for shooting video. But beyond the presence of mind Reynolds shows to use these tools in real time in the midst of trauma, her ability to cogently narrate what she has seen and heard – and what she is seeing and hearing, even when we as viewers can no longer see her after her phone is thrown to the ground – demonstrates a deep commitment to the art of witnessing. Even when her body and camera/phone are forcibly displaced from one another, she continues to witness acousmatically as a voice without a visible body, a kind of violation of the most basic (old) rule that witnessing demands bodily presence. On account of the ubiquity of such audio-visual media devices, witnessing is changing. Nevertheless, the importance of a commitment to witnessing, even of such a brutal act, is central to what Reynolds’s actions mean in our current (social) media ecology. And that ecology quickly extends to encompass others beyond Reynolds, most painfully evident in her comments in the back of the police car with her daughter: she too witnessed this killing. Her daughter becomes a co-witness and an interlocutor, offering comfort while also coming to terms with extraordinarily complex circumstances.
On 16 June 2017, Yanez was acquitted of all charges, unleashing a wave of protest around the country. A few days later, a second video was released publicly, filmed from the dashboard camera of his police vehicle. This dashcam video, which had been used as evidence in the trial, was the centrepiece of a cluster of official, police-generated audio-visual fragments that documented various moments in the shooting and its aftermath. As I mention above, while it documents an act of police brutality, it inverts the audio-visual relationships found in the Rodney King video: it features a static wide shot instead of a tightly zoomed image, while the close-miked audio records Yanez, amplifying his spoken interactions with Castile, then the gunshots, and finally his anguished (perhaps panicky) vocalisations after shooting Castile. These vocalisations attracted considerable commentary: do they indicate that Yanez knew immediately he had made a mistake? A lack of professional composure? Two responses from police/criminology commentators underscore the affective impact of his voice crying repeatedly, ‘F**k!’:
Analyst 1, David A. Klinger, professor of criminology and former Los Angeles police officer:
‘Afterwards, he’s in a very emotionally wrought place. He’s screaming into his mike. There’s no composure. He did not present a very professional demeanor.’
Analyst 2, Paul Butler, law professor and former federal prosecutor:
‘Part of what may have made a difference to the jury was the officer’s very emotional reaction after the shooting. He’s somebody who realizes that he’s made a grievous mistake. It’s certainly an argument for a manslaughter conviction rather than a murder conviction. People who do harm in the heat of the moment still deserve punishment.’
(Bosman and Smith 2017)
In other words, Yanez’s vocal timbre matters for legal purposes. From a purely technical perspective, we hear Yanez’s voice overmodulate the microphone repeatedly, resulting in distortion as he curses about the predicament. The use of audio-visual recording media became a central part of the internal police investigation that followed the shooting, and the police force has gradually begun to police itself through the use of audio-visual equipment as a kind of auto-witnessing. (This is part of the move toward having police wear cameras on their bodies and mounted in their vehicles.) Yet there are many ifs and buts. The police investigation noted Yanez’s standard use of such media (e.g. having the dashcam running whenever pulling someone over) as well as deviations from this (the second officer to arrive did not do this). Yet the dashcam footage from Yanez’s vehicle was not released to the public until a few days after the trial (nearly a year after the shooting). Again, Yanez didn’t radio the general police radio dispatcher but rather contacted another officer directly; the only recording of that conversation – in which Yanez gives his dubious reasoning for deciding to pull over Castile, based solely on racial profiling, including the size of Castile’s nose – was made not by police but by a local citizen who was independently monitoring and recording the police scanner (Mannix 2016). In this way Yanez circumvented the technologies designed to police the police. And another key audio recording, an interview with Yanez as part of a state investigation into the killing, was disallowed from the court proceedings (Xiong and Mannix 2017). Juvenal’s aphoristic question, ‘Who will watch the watchmen?’, seems apt, if sensorily incomplete. (The same holds for the word ‘witnessing’ itself, with its etymological emphasis on vision.)
The shooting of Philando Castile reminds us that acts of witnessing, especially today, also demand a careful listening.
If Rodney King’s beating and trial and the Los Angeles riots that followed mark a starting point in mediatised witnessing about and against police violence, the shooting of Castile and its livestreaming by Reynolds mark a kind of climax. Other killings had been filmed on smartphones, including Eric Garner’s death-by-choking, discussed above.19 But in the case of Castile, the immediate aftermath of the shooting was streamed in real time. The relationship of audio and video also connects King and Castile, in an inverted way: like the King beating, Castile’s death was filmed from some distance, leaving certain actions illegible, but whereas the audio in King’s case clarified almost nothing about the specifics of police actions, the audio from Yanez’s microphone gives an intense feel of proximity to this fatal act of violence. When viewed and heard together, this bundle of media – Reynolds’s live video broadcast, the police video and other media (from police and other citizen bystanders/recordists) – bears a striking witness to Castile’s killing. And yet the legal results were the same: acquittal of the police officer(s) involved. On the one hand, we might read the acquittal of Yanez as the perennial failure of all media to effectively witness; as Ronell (1994) writes, these technical witnesses fail to analyse themselves – they fail to say what they mean, as it were. And on some level that seems apt in this case: even with audio-visual media produced by both parties, the evidence was found inconclusive. But I would interpret this case slightly differently. Those media were never designed to lead to justice. They are far too malleable, especially in the hands of a legal system that has shown little inclination to punish officers for the violence they commit.
Instead, as Diamond Reynolds clearly understood in her snap decision to start broadcasting Castile’s death, they are better suited to witnessing through amplification, aimed at a broader public that may take – but hasn’t yet taken – steps to bring about structural change in society in order to minimise such violence.
Conclusion: Oh Say, Can You See?
Some readers of this essay may find it too American-centric. These problems, the thinking goes, are unique to the United States, with its peculiar mix of a history of slavery, lingering racism, a massive prison system, and a vast media infrastructure that can readily amplify (or stifle) all kinds of performative utterances. Instead, a topic like the role of media in the Arab Spring or something about YouTube music more generally might have more obvious relevance to a wider readership. Unsurprisingly, I disagree: racism may be more visible (and audible) in the United States, but it would appear to be part of a larger global trend, both in overt politics (e.g. the re-emergence of global populism) and in more subtle manifestations through ethnic and religious conflict (e.g. the expulsion of the Rohingya from Myanmar or the Syrian Civil War and its fallout). And so I conclude briefly with an example I believe has broader relevance, despite the appearance of being the most American of examples.
Case 3. Football Players Protest ‘The Star-Spangled Banner’
Since 2016 a new practice of protest has become common: American football players kneeling, sitting or holding a fist in the air while the American national anthem (‘The Star-Spangled Banner’) is played at the beginning of games. It began in fall 2016 as a response by football player Colin Kaepernick to police killings of black Americans. Without any fanfare, Kaepernick would quietly sit on the bench alongside the field while his teammates would stand at attention in front of him. In the United States, as in many other countries, when the national anthem is played at sporting events, people – players, fans, officials – are expected to stand at attention and face the flag. Many put their hand over their heart. This is the stuff of national anthems everywhere – musical nationalism performed in highly public settings, especially those tied to sports.20 Kaepernick described his motivations as follows: ‘There are a lot of things that are going on that are unjust … There’s a lot of things that need to change. One specifically? Police brutality. There’s people being murdered unjustly and not being held accountable.’ He continues, with a more sonic allusion: ‘I’m seeing things happen to people that don’t have a voice, people that don’t have a platform to talk and have their voices heard, and effect change … No one’s tried to quiet me and, to be honest, it’s not something I’m going to be quiet about. I’m going to speak the truth when I’m asked about it.’21 Teams declined to hire him for the 2017 season, leading Kaepernick to file a lawsuit alleging that team owners and the National Football League were conspiring together to fire a warning shot at other players who might be similarly inclined (Belson 2017).
Unsurprisingly, with Kaepernick gone, the practice intensified, all the more so after Donald Trump commented repeatedly about how such players should be fired for what amounts to their exercising of free speech, a guaranteed right in American constitutional law.
The fallout of these exchanges is not yet clear but at risk of triteness I want to close with the question posed in the opening lines of that national anthem: ‘Oh say, can you see … ?’ As it turns out, Kaepernick had been sitting on the bench for the anthem for several weeks before news outlets noticed and reported on it. Whether Kaepernick wanted the media to notice or not (he hadn’t said anything about it prior to that first wave of reporting), media – and in this case, ‘the media’, including television, newspapers and online media platforms – amplified his protest and the responses to it, both negative and positive. Tellingly, almost all reporting on these protests has been mute: still images circulate widely showing players kneeling. The music is almost never shown with these images – perhaps because a national anthem is the kind of musical object that everyone assumes everyone knows intimately. Intentionally or otherwise, the effect is to eliminate an entire sensory register – music, sound, speech, hearing – that might lead to players being allowed to speak out about their concerns and be heard. Some broadcasts now simply skip the national anthem.22 As an exception, one sound-sensitive news piece on 11 September 2016 included not only audio-visual footage of the anthem as sung by firefighter Keith Taylor, but also an unprompted analysis of hearing and listening by Doug Baldwin, a player on the Seattle Seahawks: ‘There is a message that needs to be heard. And so, you heard us. Now listen to us’.23 Baldwin suggests it was not the singing firefighter but the kneeling (effectively silent) players who needed to be heard. Furthermore, the relationship between hearing and listening is not a theoretical question, as it might be understood in academic debates, but rather an invitation for participatory engagement by an audience. Collective witnessing calls for receptive listening.
This example may be quintessentially American but it recapitulates the broad question: how do people use media to witness in a time of violence, and what are the sensory ecologies of that witnessing? Following on from that, how do the audiences of such acts of witnessing then play a role in that witnessing? As audio-visual media become more readily shareable, the creation of digital cinema falls not only to those who produce those media but also to audiences who watch/listen, evaluate, debate about and perhaps share them. In an age where online circulation is so visibly quantified – how many times was Lemonade streamed in its first week, or how many times was the hashtag #Blacklivesmatter used on Twitter after a given police shooting? – witnessing becomes a distributed act. Viewers/listeners are pulled into a constellation of media, offering a reminder that those media choices have concrete political and social consequences. ‘New media’ may not be so new in this regard: from memorials to early religious martyrs (who combined death and witnessing in defence of the propagation of a message) to the Rodney King video, hearers can readily re-tell and viewers can otherwise inscribe, record and share images as well. But new media certainly heighten the impact of (some) individuals within that broader media ecology. And, of course, these ‘individuals’ need not be actual people, as seen in the rise and impact of ‘bots’ that automatically engage with humans in these media ecologies to, say, influence an election or replace telephone-based customer service lines. But these post-human extensions of media are precisely the point. What is at stake here, both in the filming and circulating of dramatic recordings of police violence and in the banal retweets generated by artificial intelligence, is the status of the human, and especially the human body.
Witnessing has long had a close connection to bodily presence; in the digital age, that connection has been distributed but has not disappeared. Although the distinctions between human and machine continue to blur increasingly quickly, basic functions like breathing, seeing and hearing remain critical.