Hostname: page-component-745bb68f8f-lrblm Total loading time: 0 Render date: 2025-02-10T21:30:30.223Z Has data issue: false hasContentIssue false

The nature of sociolinguistic perception

Published online by Cambridge University Press:  03 April 2009

Kathryn Campbell-Kibler
Affiliation:
The Ohio State University
Rights & Permissions [Opens in a new window]

Abstract

This study investigates how linguistic variation carries social meaning, examining the impact of the English variable (ING) on perceptions of eight speakers from the U.S. West Coast and South. Thirty-two excerpts of spontaneous speech were digitally manipulated to vary only in tokens of (ING) and used to collect listener perceptions in group interviews (N = 55) and an experiment (N = 124). Interview data and experimental results show that (ING) impacts social perception variably, inhabiting an indexical field of related meanings (Eckert, Penelope. [2008]. Variation and the indexical field. Journal of Sociolinguistics 12(4):453–476). One of these meanings, intelligence/education, is explored in detail to understand how a given meaning is realized or not in a specific context. Speakers were heard as less educated/intelligent when they used -in, but this effect is driven by reactions to speakers heard as aregional and not as working-class. Some implications on our future understanding of the processing of socially laden variation are discussed.

Type
Research Article
Copyright
Copyright © Cambridge University Press 2009

In recent years, variationist research has devoted increasing attention to the concept of social meaning, the idea that speakers and listeners use linguistic structures to carry social information and thus shape the situations and larger societal structures in which they participate. The link between social positioning and linguistic choices has been well established from early in the field (Labov, Reference Labov1963), and our understanding of social meaning has expanded more recently. Still, little is known about how listeners realize social meaning, how they receive sociolinguistic cues, and what they do with them. The essence of the claim that variation carries social meaning is that it connects speaker and hearer in social communication, making sociolinguistic comprehension basic to an understanding of what exactly social meaning is and how it is constructed on a minute-to-minute basis.

This article examines how listeners incorporate social information from a single variable into perceptions of speakers on the basis of brief but content-rich linguistic stimuli. I will present data from a matched guise study of the English variable (ING) (the alternation at the end of multisyllabic words between word-final [ɪn] or [ən], here referred to as -in and [ɪŋ], called -ing). This study manipulated (ING) in examples of spontaneous speech from eight speakers varied by gender and region and collected qualitative data through open-ended group interviews and quantitative data through a Web-based matched-guise experiment.

Results showed that (ING) does carry social meaning in that the manipulations changed the social perceptions of the speakers. It does not, however, influence a single percept in any consistent way. I will discuss how (ING) shifts meaning by taking one of its most commonly cited associations, that of education/intelligence, as an example and showing how perceptions of class and region shape when this meaning is incorporated into the perception of a speaker and when it is not. I propose that (ING) can usefully be thought of as inhabiting an indexical field, within which it shifts based on the other cues, linguistic and nonlinguistic, available to the hearer. Before presenting this evidence, I will discuss the concept of social meaning and its treatment in the field of variation.

SOCIAL MEANING

It is helpful to say a few words about what is intended here by the term “social meaning.” All sociolinguistic work is, by definition, concerned with how linguistic behavior relates to other aspects of social behavior. Proponents of social meaning argue that linguistic variation not only reflects social differences, but is also used by speakers to position themselves within the social world, and through such positioning, to build and rebuild that world (for discussions, see Eckert, Reference Eckert2005; Mendoza-Denton, Reference Mendoza-Denton, Chambers, Trudgill and Schilling-Estes2003). Inherent in this notion of construction is a claim that linguistic variation is linked in the minds of speakers and hearers to aspects of social structure, such as, for example, categories, situations, stances, personal qualities, intended audiences. The crucial observation is that not only do linguistic behaviors and other social structures correlate, but that they do so because speakers/hearers mentally connect them, whether consciously or unconsciously. Social meaning, then, is social content tied in the minds of a given speaker/hearer to a particular piece of linguistic behavior.

Much of the evidence for the existence of social meaning in this sense has come from studies of speaker performances, which have documented correlations between linguistic cues and social categories, particularly locally defined categories whose members join them through experiences as adolescents or adults. In these studies (e.g., Eckert, Reference Eckert2000; Labov, Reference Labov1963; Zhang, Reference Zhang2005), the social categorizations predict linguistic behavior in ways that cannot be explained via differences in purely linguistic background. Instead, the best explanation for this sociolinguistic connection is the association of linguistic variation to social meanings, whether categories (islander, burnout, yuppy) or more situational objects such as stances or speech acts (Ochs, Reference Ochs, Duranti and Goodwin1992).

The association between social categories and linguistic behaviors in the minds of speakers has been further supported by a range of exciting experimental research. A long tradition of work in language attitudes and the social psychology of language has demonstrated the often strong influence of language, language variety, and paralinguistic cues on social perception (for an overview, see Giles & Billings, Reference Giles, Billings, Davies and Elder2004). Variationists have likewise shown that social perceptions may be altered through the manipulation of individual linguistic variables (Fridland, Bartlett, & Kreuz, Reference Fridland, Bartlett and Kreuz2004; Labov, Ash, Baranowski, Nagy, Ravindranath, & Weldon, Reference Labov, Ash, Baranowski, Nagy, Ravindranath and Weldon2006; Plichta & Preston, Reference Plichta, Preston, Kristiansen, Coupland and Garrett2005). Sociophonetic work has demonstrated sociolinguistic connections influencing speech perception in subtle ways. For example, Strand (Reference Strand, Milroy and Preston1999) showed that listeners not only use visual information for direct cues as to linguistic production (McGurk & Macdonald, Reference McGurk and Macdonald1976), but also to assess social information such as gender typicality (a concept intimately tied to local understandings and categories) and apply that information to phonetic processing. National boundaries also influence listener phonetic boundaries, whether attributed to the speaker explicitly (Niedzielski, Reference Niedzielski, Milroy and Preston1999) or merely unconsciously evoked (Hay, Nolan, & Drager, Reference Hay, Nolan and Drager2006).

A discussion of social meaning inevitably raises the question of how it differs from other kinds of meaning, such as semantic or pragmatic meanings. It is not obvious where to place a boundary between these concepts, but discussions often emphasize the role of speaker intention, with semantic and pragmatic meaning being limited to those meanings that both speaker and hearer acknowledge the speaker to have intended, but social meaning consists of those meanings not controlled by a speaker, such as regional origin (e.g., Levinson, Reference Levinson1983:29). Work documenting the role of socially meaningful linguistic variation draws this divide into question, as speakers are shown engaging with linguistic markers of, for example, region in ways that seem to reflect intentional construction of situation and identity. In the intuitions of naive speakers, social meaning has a conflicted relationship to intention, as some meanings, such as incompetent, are unlikely to be intended by most speakers while others, such as jaded, lose their meaning when recognized as intentional (for a fuller discussion, see Campbell-Kibler, Reference Campbell-Kibler2008).

In the current discussion, I will not attempt to motivate a fundamental difference between social meaning as a cognitive construct and other kinds of meaning associated with linguistic forms. Jackendoff (Reference Jackendoff2002) discussed at length the problems with drawing a line in a principled way between “semantic meaning” and various other kinds of knowledge. Similar problems arise with any attempt to divorce mental representations of word meanings or pragmatic uses from their social significance (consider, for example, the problems in classifying the knowledge that the title Mr. indicates respect and/or distance under such a system). Nonetheless, I will continue to discuss social meaning with the understanding that I am referring to a portion of a much larger cognitive space and in the hopes that the dimensions of variation within that space will in the future be more fully understood.

METHODS

This study used the matched guise technique, an approach based on eliciting listener reactions to sets of linguistic performances that differ in specific and controlled ways. Performances for matched guise technique studies are usually created by having speakers consciously change their speech styles, but as technology has improved, we have been increasingly able to directly manipulate recorded speech for a wider range of variables. The current study used the software package Praat (Boersma & Weenink, Reference Boersma and Weenink2007) to “cut and paste” tokens of (ING) into recordings, a technique also seen in Labov et al. (2006). This manipulation gave me close control over the exact elements that were altered and minimized confounding variables. It also facilitated working with speech spontaneously produced in sociolinguistic interviews, rather than read aloud. Perceptions elicited by spontaneous speech samples are advantageous given evidence that read and spontaneous speech differ in systematic ways (Hirose & Kawanami, Reference Hirose and Kawanami2002; Laan, Reference Laan1997) and that listeners perceive these differences (Guaïtella, Reference Guaïtella1999; Mehta & Cutler, Reference Mehta and Cutler1988), making it difficult to generalize based on listener perceptions of read speech. Along similar lines, excerpts were selected to provide a range of messages with different content for each speaker, instead of attempting to produce “neutral” messages. It has been shown both theoretically (Bradac, Cargile, & Hallett, Reference Bradac, Cargile, Hallett, Robinson and Giles2001) and practically (Giles, Coupland, Henwood, Harriman, & Coupland, Reference Giles, Coupland, Henwood, Harriman, Coupland and  Ramgaran1990; Rogers & Smyth, Reference Rogers, Smyth, Solé, Recasens, Romero and 2003) that there is no such thing as truly neutral content and in seeking it, we are likely not only to fail but to sacrifice important insights about the complex interplay between content and form. Developing multiple stimuli for each speaker allowed me to tease out factors related to content from those related to speaker-specific linguistic or paralinguistic cues.

Another important aspect of the methodology was the combination of group interviews with survey data, allowing multiple perspectives on the results. Interview data give a richer image of the processes at work during perception and provide a check on analysts' interpretations of the reasons for quantitative patterns. Conversely, the survey data provide a testing ground for theories presented in or inspired by the interviews. By conducting the interviews first (see Williams, Hewett, Hopper, Miller, Naremore, & Whitehead, Reference Williams, Hewett, Hopper, Miller, Naremore and Whitehead1976), I was able to use interview data as a pilot to aid in the development of the survey itself, improving the fit of the survey questions to the specific population under study.

The literature on (ING) suggests that Southerners in the U.S. differ from other U.S. speakers in their (ING) use (see Hazen, Reference Hazen and Brown2005), so the study design incorporated region as an independent variable, drawing both listeners and speakers from two locations: North Carolina and California. The eight speakers in the study were two men and two women from each location, all of whom were university students who had grown up in the state (with the exception of one of the California women, Elizabeth, who was originally from Seattle). Table 1 gives the names (pseudonyms) of the speakers, divided by region and sex. The listeners likewise were university students in these two states, although they included a wider range of geographic backgrounds.

Table 1. Speakers, by region and sex

The original recordings used to develop the stimuli for the study were gathered in informal, hour-long interviews in two rough parts, one focused on work or school topics and the other on recreational activities. The speakers returned after the interviews to produce alternate examples of (ING) tokens from their original interviews, giving an -in and an -ing for each.

From each interview, I selected four short (10 to 20 second) excerpts, each with two to six tokens of (ING) and created stimuli by replacing the (ING) tokens with alternate tokens manipulated to match their length, intensity, and pitch to the original. Because the tokens were spliced rather than synthesized, there was no distortion of the speech sounds and the resulting voices remained human sounding. Great care was taken to arrange the splices at pauses (e.g., during a stop closure) when possible, although, in some cases, splices during sonorants were used, matching the formant structures closely. The manipulations yielded natural-sounding stimuli, with the only noticeable artifact being occasional variation in the loudness of background noise during the (ING) tokens, as a result of having manipulated the loudness of those tokens to match that of the original. This is not to say that the altered tokens exactly matched the originals; -in tokens especially ended up somewhat longer and more audible than many, although not all, of the naturally occurring -in examples. Although this did not make them strange or unnatural, it does raise interesting questions regarding exactly what we consider to be “matched” in matched guise work. Once the paired recordings were completed, I briefly piloted them for naturalness and identifiability before moving on to the first stage of data collection, which was the group interviews.

Qualitative data on listener reactions to the speakers and listener beliefs about (ING) were gathered in interview sessions with groups ranging from 1 to 6 participants, most commonly 2 or 3. In these sessions, similar to focus groups, participants were asked questions as a group and interacted with both the moderator and each other in building their responses. The first goal for the interviews was to determine what the general reactions were to the speakers and what terms were used spontaneously to describe them. The second was to gather intuitions and ideologies regarding (ING) and its effect on these utterances. In the first part of the interview, listeners heard individual recordings from 4 of the 8 speakers and answered general questions about the speaker and situation. In the second part, I played the same recordings in their matched pairs, asking the listeners to comment explicitly on how (ING) changed their perceptions. In all, 20 groups consisting of 55 participants were analyzed, 1 group having been eliminated due to problems with the recording and another due to a preponderance of nonnative speakers whose English skills were not sufficient for them to perceive (ING) variation reliably.

The goal of the survey that followed the interview sessions was to investigate covert reactions, so listeners were not directed toward (ING) or any other linguistic attribute. The survey was conducted after the interviews had been completed, transcribed, and analyzed; new subjects were recruited who had not participated in the group interviews and did not know the study's focus on (ING). Listeners in the survey heard a single recording from each of the 8 speakers and answered a series of questions about each, shown in the Appendix. A total of 124 participants completed the study. An additional 36 began it but failed to finish, and their data were removed from the analyses out of concern that their lack of interest (or other factors that caused them to quit the study) might have influenced the attention with which they approached the task.

Statistical analyses of the experimental data were carried out using linear mixed-effects models for the ratings variables and generalized linear mixed effects models for the binary checkbox variables. These models, fitted using the software package R (Hornik, Reference Hornik2007) and the lmer command from the lme4 package, incorporate random effects that account for the fact that differences between, for example, individual subjects influence the data in ways that are not captured by analyses that assume that each observation is independent from all the others. This subject-related patterning may cloud real effects or create the illusion of false effects if not accounted for statistically (Baayen, Reference Baayen2008). The random effects included in these models were subject (listener) and recording (i.e., which of the four excerpts for each speaker the listener heard). Models were fit for each dependent variable by stepping down from a model that included speaker, (ING), listener gender, and their interactions as fixed effects and removing nonsignificant terms, based on analysis of variance (ANOVA) on the model, until the simplest model was found. Alternate models were run using perceived speaker region instead of speaker as the first term and compared with the first model using ANOVA as well. These techniques provided the basic picture and were fit for each dependent variable. Differences between individual speakers were classed as fixed rather than random effects because these speakers were examined as individuals, not as representative samples of their larger identity groups. The statistical analyses assess the generalizability of the detailed patterns to other groups of listeners from the same population (students at high-status universities) responding to these same eight speakers. With a different population of listeners or different speakers, I would not expect the same response patterns to emerge. It is the structure of the relationship between (ING), other cues, and evaluative responses that is the basis of the larger claims about sociolinguistic meaning more generally.

Two other types of models were used to expand on particular questions. One investigated other listener factors to determine if any of these could improve the models already described. The independent variables added in these cases were the school the listener attended (one in each region represented by the speakers); the listener's regional background; and the listener's racial, ethnic, and religious backgrounds. The exclusion of race and region from the primary models was not because I considered them inherently less important, but rather because they were either unevenly distributed (for race) or difficult to analyze cleanly (region) in the data. None of these factors affected the results I will be presenting in this discussion, and I will not be discussing them further here. The second type of “second run” models tested relationships between dependent variables by using one as a term when fitting a model for another. These were run to investigate particular hypothesized relationships, rather than testing all possible combinations. Results presented in this article give the F and p values for a given term or interaction based on ANOVA done on the fitted models.

In the following discussion, I will draw on both the interview data and the survey data. Although I will attempt to provide an overall picture of the shape of the data, space considerations prohibit a full discussion (or even mention) of every statistically significant result in the experimental data. The results presented were chosen because they show the complexity of (ING)'s impacts on intelligence/education, one of the most important percepts in the literatures on both (ING) and social perception through language. There are many other patterns of results in this data, some of which are presented elsewhere (Campbell-Kibler, Reference Campbell-Kibler2007, Reference Campbell-Kibler2008), but all of which support the overall claim that the social contribution of (ING) is highly dependent on the other social information available in the message content, speech stream, or outside context, as well as on the overall reactions of the listeners to the speakers.

RESULTS

The discussion of the results will focus on the strongest and perhaps most expected, based on existing literature, statistical impact of (ING): on perceived speaker intelligence and education. I will first motivate the creation of a combined factor based on ratings of intelligence and education, and then I will document the influence of (ING) on this perceptual factor. I will show, however, that this effect is limited in its scope to a particular subset of speaker-listener pairs, defined by the regional and class background attributed to the speaker by the listener. This limitation presents an analytical problem, a solution for which is suggested by Eckert (Reference Eckert2008). Following her work, I argue for an understanding of linguistic variation as linked to indexical fields, sets of interlinked meanings within which a given resource may be based on other available social information.

CONTEXT DEPENDENCE

It is common to subject data from matched guise studies to factor analysis to determine the underlying social factors that are responsible for the observable data (Zahn & Hopper, Reference Zahn and Hopper1985). Although there are clear benefits to investigating co-occurrence structures in experimental responses, there are also problems with applying factor analysis to data of this kind. In this particular study, there is also the practical concern that factor analysis does not apply to binary data, which means that a large portion of the responses is excluded. It is likely that several of the binary variables, especially the label articulate, are actually connected to the educated/intelligent factor I will be motivating, but due to the different format of the variable (see Appendix), I was unable to test this theory, and so it was not included in this discussion.

A more fundamental concern is that factor analysis rests on an assumption of independence between each observation, an assumption that is violated by this style of experiment, where one subject evaluates multiple speakers, because the groupings of speakers and subjects provide structure to the data set. To address this concern, I first ran factor analysis on the entire data set and then ran it on smaller data sets made up of responses to a single speaker, in which the observations, each from a different listener, were truly independent. The analysis on the whole set fit a two-factor model, given in Table 2, consisting of a main factor loadedFootnote 1 with educated and intelligent and a second factor loaded with outgoingness and to a lesser extent speech rate. A marginal model (p = .087) added a third factor to these two, consisting of the casual/formal dimension and how well the speaker knew their addressee. The analysis on the individual speaker data sets consistently showed a first factor loaded with only educated and intelligent, but the second factors varied across speakers, if any were fit at all. For example, Sam, one of the West Coast men, showed a second factor of outgoing, casual, and not knowing his addressee, and the other West Coast male, Jason, had a second factor in which accent was tied to lack of masculinity. From these analyses, only the educated/intelligent factor was supported enough to use in further analysis and was formed by averaging responses to these two ratings. All other responses were analyzed independently.

Table 2. Factor analysis on full data set p < 0.001

This intelligent/educated percept shows a strong link to (ING) in both the quantitative and qualitative data. In the experimental results, speakers were heard as more intelligent/educated when they used -ing than when they used -in, as shown in Table 3. These results come from mixed-effects models fitted with subject and recording as random effects. The p values given are based on ANOVA performed on models fitted using the step-down process described earlier.

Table 3. Educated/intelligent factor by (ING) F(1,976) = 4.56, p = 0.033

Intelligence and education are also included in the set of concepts invoked by interview participants; in fact, many interview participants explicitly said that education and intelligence were the primary things that they associated with (ING). These associations support previous findings in the literature that show correlations between (ING) and actual educational background as well as task manipulations designed to vary the amount of effort speakers devote to standardizing their speech (Labov, Reference Labov1966; Trudgill, Reference Trudgill1974). Both education and intelligence are also tied to many of the other concepts impacted by or discussed in relation to (ING), in my data and elsewhere (e.g., Wald & Shopen, Reference Wald, Shopen, Clark, Escholtz and Rosa1985). Overall, this quality is, on the basis of many metrics, one of the most tempting candidates for a core or central meaning for (ING), though not the only one (the casual/formal dimension is another such candidate).

Closer examination, however, reveals two interaction effects in the experimental results, one within the other, that show that the connection between (ING) and education/intelligence is restricted to a subset of speaker/listener pairs, based on perceived speaker region and class backgrounds. I will describe the effect related to class first and then turn to region.

The questions in the experiment about speaker class were presented as questions about background, rather than current class status, both in the hope of making listeners more comfortable expressing perceptions about class and to provide a different type of information from the questions about education and occupation, which were already included elsewhere. Listeners were invited to select any of three descriptors regarding class, indicating whether the speaker sounded like she or he was from “a working class background,” “a middle class background,” or “a wealthy background.” Listeners could select these descriptors in any combination or leave all three blank (as nearly half the respondents did). All three class responses showed interesting patterns relative to (ING), but it is the descriptor working-class whose relationship with education/intelligence is influenced by (ING).

The statistical analysis was run including the selection of working-class as a term in a linear mixed-effects model. This is a methodological tactic only, allowing me to ask questions about how (ING) shifts relationships between multiple percepts, and is not meant to suggest that this class perception is in some way more basic than perceptions of how educated or intelligent the speaker is. The first level of interaction may be seen in Table 4, showing that overall, speakers received lower ratings for education/intelligence when they used -in and were heard as working-class. Indeed, the overall pattern of -ing promoting perceptions of educated/intelligent is primarily driven by responses in which listeners also indicated they thought the speaker sounded working-class. When listeners were responding to speakers they did not mark as working-class, (ING) had no impact on educated/intelligent ratings. It is not possible to tell whether this pattern involves (ING)'s contribution being shaped by other cues that are influencing the perceived class background or whether it is impacting some combination of class and intelligence/education in conjunction. The selection of working-class on its own showed no statistically reliable effect of (ING) (-in = 19%, -ing = 16%, F[1, 987] = 1.03, p = .311).

Table 4. Educated/intelligent responses by (ING) and working-class F(1,976) = 8.48, p = 0.004

This interaction is not articulated in the interview data, perhaps because it is not consciously available and perhaps because interview participants were reluctant to refer directly to class, although they did invoke it indirectly, as in (1) where a participant uses a description of a familiar individual to convey the occupational and educational category to which he was assigning the speaker. When class labels were mentioned explicitly in the interviews, it usually involved an observation that all of the speakers sounded educated and middle-class.

  1. (1)

    Scott: I think from the first conversation, like, most of us felt he was some type of young professional. But now I kinda get the sense he's some type—he reminds me of my sister's fiancé kind of just graduated from high school, didn't go to college, didn't do anything. But got a job like at the local auditorium and really knows what he's doin' there knows how to kind of, you know, he could change the court from ice to, you know, to a basketball court in half hour, you know, stuff none of us would have any idea about but he's not formally educated and he's really kinda excited, like, excited about his job.

    (Group 9, Duke. In response to Ivan, recording: crucial, -in guise.)Footnote 2

Although class was a relatively uncomfortable topic for interview participants, regional difference emerged as the dominant theme for the interviews, inspired by the inclusion of both audibly Southern and non-Southern voices among the stimuli. Throughout the interviews, there was a division based on perceived regional accent that shaped the explanations participants sought for linguistic behavior, as well as the questions they saw as in need of explanation. During the second half of the group interviews, I played the stimuli in their matched (ING) pairs and asked participants what, if any, difference the change in (ING) made in their image of who each speaker was. Despite this wording, many participants interpreted me as asking which variant sounded most natural in the context of a given speaker's performance, or what they thought the speaker was most likely to have really said. The responses to this unasked question, such as in as example (2), were easily given, with few hesitations or hedges. Interview participants almost universally described the West Coast speakers as likely to use -ing in their everyday speech, and they described -in as the more natural form for the Southerners.

  1. (2)

    Sally: The second one sounded more natural.

    Moderator: Okay.

    ???: Yeah.

    Sarah: I agree.

    Tom: It was kind of like the same situation as Tricia. Just went with how she speaks better.

    Moderator: Okay.

    Tom: It's natural.

    (Group 14, University of North Carolina, Chapel Hill. In response to Bonnie, recording: classes, comparison phase.)

A common follow-up theme to this judgment was to provide explanations for unexpected variants. These explanations are interesting both in shedding light on what participants took to be an adequate explanation for a given linguistic choice, but also in what they considered to be worthy of explanation. Overall, -ing was less seen as needing explanation than -in was, a pattern that was tied to ideas about correct speech as normal. Within this larger expectation, however, variant choices that violated regional stereotypes were seen as needing explanation, though the explanations for -in and for -ing were different, as will be seen in Table 6. Whether participants reported a difference in social judgment based on (ING) or not, the use of -in was consistently aligned with having a Southern accent, as in (2). Some reported that hearing -in increased their sense of a speaker's Southern accent, as in (3).

  1. (3)

    Alice: There were several places that were um, the -ings I thought make—made the accent much less pronounced. So to me, unfortunately as a Southerner, it sou—she sounded more educated in the second [-ing guise].

    (Group 18, Duke. In response to Tricia, recording: work-school, comparison phase.)

Even when saying that they heard no difference between the (ING) guises, participants drew on different explanations to explain the equivalence for Southerners and non-Southerners. In (4), Tricia's Southern accent is credited with obscuring any contribution of (ING), and in (5), it is Elizabeth's “accent-free” status as someone who “usually” said -ing that blocked a social judgment based on -in.

  1. (4)

    Tracey: It seems like actually, the second one seems more natural to her the rest of her, you know, speech. Because the -ing sounds really forced. And the rest of the conversation.

    Carlos: Yeah. I didn't, um, really it didn't sound that bad. The second recording. It wasn't like [startling?] it was like it was pretty moderate.

    (pause)

    Amy: I think the -in marched her, the -in matched her [??]. I thought it was more natural.

    Amelia: Well I think her accent's so heavy that the one thing doesn't make that big of a difference.

    Carlos: Yeah, if anything it would just make it sound weird.

    (Group 7, Stanford. In response to Tricia, recording: work-school, comparison phase.)

  2. (5)

    Karen: Right right it's not like she usually goes around saying you know I'm tearin' stuff. You know (laughter) well what are you doing? I'm tearing.

    ???: (laughter)

    Karen: She would say the G usually. So then it was ok. Cause it was the context—it was—it was the right context to leave it out.

    (pause)

    Karen: Yeah, part of the rest of her speech just she kind of sounds um she yeah just the way she spoke. She pretty much pronounced every single word fully usually. Um, and then this was just, I seriously think it was just like the situation the story she was telling she was just going so fast you know if I get really excited and tell a story I'll leave off the ends of some words and stuff.

    (Group 5, Stanford. In response to Elizabeth, recording: hair, comparison phase.)

These data suggest that hearers do distinguish between variable use by different speakers, and especially on the basis of region. However, it is difficult to see exactly how, because the same qualities can be presented as explaining a difference or no difference, depending on the reactions and social goals of the participant in a given interview. The experimental results regarding education/intelligence and perceived region help to clarify the most common contributions of region on the impact of (ING).

To understand them, it is necessary to understand the patterns of perceived region, which are different from the actual regional backgrounds of the original speakers. The study design included speakers from two regional groups: Californians and North Carolinians. One male speaker from each group (Ivan and Jason) showed idiosyncratic patterns of perceived region, coming across as coastal residents (a West Coast surfer “cool dude” and a bicoastal cosmopolitan gay or metrosexual man, respectively), and the three remaining speakers from each region formed larger patterns (for a more comprehensive discussion of this division, see Campbell-Kibler, Reference Campbell-Kibler2007). The three remaining North Carolina speakers (Bonnie, Tricia, and Robert) were overwhelmingly described both as being from the South (77%) and as coming from rural and working-class backgrounds. The remaining Californians (Valerie, Elizabeth, and Sam) had as their most common regional attribution might be from anywhere (39%) leading to the title “anywhere speakers” for this trio. These speakers personified Lippi-Green's (Reference Lippi-Green1997) “myth of the non-accent,” as speakers whose speech was perceived as unmarked by regional or ethnic cues, and were seen as coming from either the city or the suburbs, but not the country. This division between the perceived Southerners and the anywhere speakers (the archetypal accented and accent-free U.S. speakers), turns out to structure the relationships between (ING) and working-class and intelligent/educated.

Figure 1 shows that the interaction just described is actually specific to the anywhere speakers. The Southerners show no impact of (ING) on perceptions of their intelligence and education; regardless of (ING) use, particularly when heard as working-class, their scores are in the same range as the lowered ratings the anywhere speakers received in response to their -in guises with working-class perceptions.Footnote 3

Figure 1. Perceived region shaping (ING) effects on the connection between working-class and intelligent/educated (p = 0.028)

This double layer of interactions demonstrates (ING)'s contextual dependence even with respect to one of its most central meanings. Different speakers have different aspects of their identity left open to manipulation by (ING), depending on the other information that is available to the listeners. In this case, it appears that other cues associated with a Southern accent provide information that causes these listeners to downgrade Southern speakers on the intelligent/educated dimension. This “hit” that Southerners take for their accent seems to in some way use up or account for the available language-related downgrading, leaving (ING) with nothing else to do in their speech with respect to intelligence/education. Likewise, “anywhere speakers” who are not being perceived as working-class seem to be “bullet-proof” against the effect of (ING); as Karen explains for us in (5): “She would say the G usually. So then it was ok.” This leaves the working-class non-Southerners as the only ones in this data set who are influenced by the “general” effect of (ING) as indicative of education/intelligence.

INDEXICAL FIELDS

The flexibility documented presents a problem for understandings of socially meaningful variation or more accurately, a gap in our theoretical toolbox. We would like to capture the notion that intelligence/education is centrally linked to (ING) while incorporating (and eventually explaining) that flexibility. Eckert (Reference Eckert2008) offered a solution in the form of an indexical field, “a constellation of ideologically related meanings, any one of which can be activated in the situated use of the variable.”

This concept suggests that rather than meaning one particular thing, (ING) is tied to a network of related concepts. It may influence the perception of any one of these qualities under the right circumstances. Which one it is used to mean (or ends up meaning) is different based on a number of contextual factors. In this section, I will first present some of the social objects in (ING)'s indexical field, as marked by either explicit comments made by interview participants or by statistical patterns in the experimental results.

The notion of an indexical field is tremendously helpful in understanding how (ING) operates within the social world because (ING) can mean a wide variety of things. Just as a given word's referential meaning depends on the context in which it is used, social meaning too is highly flexible. One view of this flexibility is in the experimental responses that show a main effect for (ING). Table 5 shows individual responses that overall are made more likely by -in or -ing, once other significant effects (both random and fixed) are taken into account.

Table 5. Responses showing main effects of (ING)

Interview participants, talking explicitly about (ING), painted a distinct but overlapping view of (ING)'s indexical field, shown in Table 6. The overlap between these two sets of terms is in one sense an artifact of experimental design; one of the purposes of the group interviews was to collect terms that the population under study (undergraduates at two particular universities) used spontaneously to articulate their ideas about (ING) and this particular set of speakers. As a result, the terms given in Table 5 were included in the experimental instrument because of their use in the group interviews. Conversely, some of the differences between the lists (for example, the lack of “proper grammar” in Table 5) are artifacts of those choices, driven by the need to keep the experimental instrument to a reasonable length. In another sense, this overlap does mark the degree of convergence between two measures of related but distinct objects. The inclusion of a term in an experiment is no guarantee that (ING) or anything else will affect it and some terms that interview participants named as central to their understanding of (ING), such as the casual/formal continuum or speaker/addressee intimacy, proved unaffected in the experiment.

Table 6. Terms used by interview participants to characterize (ING)

The lack of response along the casual/formal and intimacy dimensions speaks to another useful caveat regarding this mapping of (ING)'s indexical field. The picture being painted here shows only one area of social meaning within which (ING) is able to operate. It is not exhaustive, having been collected in response to a particular set of speakers, from a particular population. The influence of the population is visible in the inclusion of qualities such as Southern and the association of -ing with sounding older. Different speakers and listeners might have led to other social categories emerging, such as Black or might even have reversed some of these, for example, speech samples from elderly speakers may have lead to -in being the variant associated with sounding older. The task itself is also an important factor, in that these listeners were responding to voices of strangers, rather than evaluating the mood or context of a familiar speaker. However, even though the specific qualities included in a given indexical field are going to shift depending on the context of investigation, the fundamental pattern that a linguistic resource may mean a range of different things is not likely to change.

Given this diversity of meanings, all connected to the same linguistic variable, what determines when (ING) “means” one of these versus another? How do listeners decide which of the many associations they will assign to a given token? Part of the answer lies in what the listeners already know, guess, or assume about the speaker and speech situation. The speech signal is rich with information, in the form of sociolinguistic cues as well as semantic and pragmatic content. In most speech situations, the listeners have independent knowledge about the speaker, from earlier in the interaction, other interactions, hearsay from mutual acquaintances, or simple role assignment, among many other sources. This other knowledge makes particular aspects of the speaker and speech situation more or less available to be affected by (ING).

Instances of (ING) impacting perceptions of different speakers differently are seen in the experimental data through interaction effects, for example, those in Table 7, for which (ING)'s impact is significantly different for different speakers. The terms listed in Table 7 are those for which the regression model includes a term indicating an interaction between (ING) and either speaker or perceived speaker region. These are terms that are influenced by (ING) use significantly differently for different speakers. They show that the intelligence/education dimension is not the only one whose relationship with (ING) varies based on contextual factors. Such variation is, in fact, the norm.

Table 7. Experimental responses with significant interactions between (ING) and speaker or perceived speaker region

THEORETICAL IMPLICATIONS

The analysis of the results presented here has a temporal tone to it, which could be taken to imply that the evaluation of all other qualities somehow precedes (ING), leaving it as the last and most constrained social cue. In the context of this experiment, this is literally true; the (artificial) manipulation of (ING) did follow the production of the other linguistic material. However, it would be ridiculous to suppose that (ING) enjoys (suffers?) such a status in perceptual processes more generally. Doubtless, at the same time that (ING)'s field of operation is being shaped by other cues, it is doing the same to them. Exactly how this works and what, if any, ranking is applied to different sources, both linguistic and nonlinguistic, of social information is an open question.

Another idea potentially implied by this discussion is that the social images of both the speaker and a given variable are somehow fixed after the first impression. Again, this is true of the experimental data presented here, in that listeners gave their responses to the speaker immediately after hearing them, and these responses were preserved indefinitely on a hard drive. In real situations, however, such reactions are always subject to challenge. Even after an utterance has been made and answered, speakers and hearers can and do continue to construct and reconstruct linguistic performances in new ways (Chun, Reference Chun2006). One fascinating set of questions involves how these intersubjective, necessarily social, processes of contestation inform the person-internal cognitive processes used to understand socially meaningful speech and vice versa.

The relationship of intersubjective sociolinguistic processes to internal mental processes is also tied to the question of conscious reasoning and the role it plays in sociolinguistic processing. Clearly not all of this calculation is conscious, given how rapidly it can be carried out and the disconnect between online reactions and explicit discussions. There is within variationist studies a peculiar resistance to the idea that sociolinguistic processing can be carried out unconsciously. At the end of his 1963 Martha's Vineyard study, Labov stated without argument that variables not consciously accessible to speakers “can hardly therefore be the direct objects of social affect” (Labov, Reference Labov1972:40). Discussions of social meaning have largely avoided making cognitive claims about the conscious or automatic status of such processing, but their use of terms such as “aware,” “agency,” “actively” have at times prompted responses dubious of the ability of speakers to consciously manage small details of linguistic behaviors on a moment-to-moment basis. This dubiousness is both reasonable and misplaced, as the concept of social meaning, or even of speakers building social selves, does not require online management by the conscious mind. Research in the field of social cognition has repeatedly demonstrated the ease with which people perform complex social calculations quickly and automatically (see Wyer, Reference Wyer2004).

However, it is not certain what aspects of the process are available to conscious introspection or what the relationship is between consciously expressed beliefs and more automatic sociolinguistic processes. Even though speakers/hearers may not be consciously considering a variable as they evaluate a given performance or make their own linguistic choices, they may at other times consciously think or even speak about its social content. (ING) is relatively conscious in this sense, being a linguistic stereotype (Labov, Reference Labov1966), a linguistic variable that is culturally acknowledged to the extent of having a specific term (“dropping one's Gs”) to refer to it. Given that the research presented here addresses only this one variable, it remains an open question how important its high level of cultural salience is to the results. It is possible that the impact of (ING) on listener perceptions resulted (wholly or in part) from its status as a linguistic stereotype. In the pilot study for this project, (ING) affected more ratings and with larger effect sizes than the other variable, /t/ release, a variable with less conscious cultural capital (Campbell-Kibler, Reference Campbell-Kibler2005). Because these were the only two variables addressed in the pilot, it is not clear whether this difference is idiosyncratic to the two of them or reflective of their relative salience.

Another open set of questions concern what, if any, grouping is applied to linguistic resources in assigning them social meaning. The patterns displayed by the listeners in my study suggest that they are making reference to some linguistic structures while assigning meaning to others. It is not clear whether all linguistic qualities have the ability to influence the impact of all others or whether there is a ranking based on salience, perceived immutability, or other factors. It is possible that part of this mechanism involves grouping linguistic behaviors into styles (Half Moon Bay Style Collective, 2006). If this is the case, it raises the question whether stylistic packages involve all recognized traits in a given performance or subsets that may be combined with each other.

These open questions need to be addressed if we are to understand the role of social meaning in shaping linguistic behaviors and, conversely, the role of linguistic behaviors in maintaining social structures. Recent work (Eckert, Reference Eckert2000; Zhang, Reference Zhang2005) has made it clear that language variation is an integral part of a matrix of social practices through which speakers make a social world and move within it. This article has shown that linguistic variation can and does carry social meaning that is interpretable by listeners, even in spontaneous speech when the manipulated variable is not obvious to participants, and it has shown some aspects of how that interpretation is accomplished. What remains to be developed is more precisely how, at the immediate and cognitive level, this process is accomplished and how these social processes relate to the other cognitive tasks necessary to producing and using language.

APPENDIX

Survey instrument

This is Ivan:

Press the play button to hear the recording. You can play it as many times as you like. After listening to him, tell me as much as you can about Ivan, based on what you hear.

How old does Ivan sound (check all that apply, must choose at least one)?

From you heard, does Ivan sound like he might be (check all that apply):

How well does he know the person he's talking to?

Right now, does he sound like he might be (check all that apply):

Where does Ivan sound like he might be from (check all that apply, must choose at least one)?

Footnotes

1. I considered a response to be loaded onto a given factor if the factor loading given was greater than .5.

2. Bracketed text in transcripts indicates difficult to understand speech. Identifiers at the end of each excerpt specify the interview excerpted and the campus on which the interview was conducted. Also given are which speaker and which speech are being responded to, as well as which (ING) guise. “Comparison phase” indicates the comment was made during the second portion of the interview when listeners were contrasting the two (ING) guises.

3. The numbers given here are for the full regression model. Southern-sounding speakers, however, were much more frequently described as working-class, raising potential problems with including both in a regression model. To investigate this concern, smaller models were fit to subsets of the data and the results echoed that of the full model.

References

REFERENCES

Baayen, Harald. (2008). Analyzing linguistic data: A practical introduction to statistics. Cambridge: Cambridge University Press.CrossRefGoogle Scholar
Boersma, Paul, & Weenink, David. (2007). Praat: doing phonetics by computer (version 4.6.29) http://www.fon.hum.uva.nl/praat/.Google Scholar
Bradac, James J., Cargile, Aaron Castelan, & Hallett, Jennifer S. (2001). Language attitudes: Retrospect, conspect and prospect. In Robinson, W. P. & Giles, H. (eds.), The new handbook of language and social psychology. New York: John Wiley & Sons. 137155.Google Scholar
Campbell-Kibler, Kathryn. (2005). Listener perceptions of sociolinguistic variables: The case of (ING). Ph.D. thesis, Stanford University.Google Scholar
Campbell-Kibler, Kathryn. (2007). Accent, (ING), and the social logic of listener perceptions. American Speech 82(1):3264.CrossRefGoogle Scholar
Campbell-Kibler, Kathryn. (2008). I'll be the judge of that: Diversity in social perceptions of (ING). Language in Society 37(5):637659.CrossRefGoogle Scholar
Chun, Elaine. (2006). Talking preppy: Indeterminacies of style, structure and social meaning. Paper presented at New Ways of Analyzing Variation 35, Columbus Ohio.Google Scholar
Eckert, Penelope. (2000). Linguistic variation as social practice: The lingustic construction of identity in Belten High, Language in Society. Vol. 27. New York: Blackwell.Google Scholar
Eckert, Penelope. (2005). Variation, convention, and social meaning. Paper presented at the Annual Meeting of the Linguistic Society of America. Oakland, California.Google Scholar
Eckert, Penelope. (2008). Variation and the indexical field. Journal of Sociolinguistics 12(4):453476.CrossRefGoogle Scholar
Fridland, Valerie, Bartlett, Kathryn, & Kreuz, Roger. (2004). Do you hear what I hear? Experimental measurement of the perceptual salience of acoustically manipulated vowel variants by Southern speakers in Memphis, TN. Language Variation and Change 16:116.CrossRefGoogle Scholar
Giles, Howard, & Billings, Andrew C. (2004). Assessing language attitudes: Speaker evaluation studies. In Davies, A. & Elder, C. (eds.), The handbook of applied linguistics. Malden, MA: Blackwell. 187209.CrossRefGoogle Scholar
Giles, Howard, Coupland, Nikolas, Henwood, Karen, Harriman, Jim, & Coupland, Justine. (1990). The social meaning of RP: An intergenerational perspective. In  Ramgaran, S. (ed.), Studies in the pronunciation of English: A commemorative volume in honor of A. C. Gimson. New York: Routledge. 191211.Google Scholar
Guaïtella, Isabelle. (1999). Rhythm in speech: What rhythmic organizations reveal about cognitive processes in spontaneous speech production versus reading aloud. Journal of Pragmatics 31:509523.CrossRefGoogle Scholar
Half Moon Bay Style Collective. (2006). Elements of style. Poster presented by Kathryn Campbell-Kibler, Penelope Eckert, Norma Mendoza-Denton, & Emma Moore at New Ways of Analyzing Variation 35. Columbus, Ohio.Google Scholar
Hay, Jennifer, Nolan, Aaron, & Drager, Katie. (2006). From fush to feesh: Exemplar priming in speech perception. The Linguistic Review 23(3):351379.CrossRefGoogle Scholar
Hazen, Kirk. (2005). The in/ing variable. In Brown, K. (ed.), Encyclopedia of language and linguistics. Vol. 5. 2nd ed.St. Louis, MO: Elsevier.Google Scholar
Hirose, Keikichi, & Kawanami, Hiromichi. (2002). Temporal rate change of dialogue speech in prosodic units as compared to read speech. Speech Communication 36:97111.CrossRefGoogle Scholar
Hornik, Kurt. (2007). R FAQ. Available at: http://CRAN.R-project.org/doc/FAQ/R-FAQ.html. Accessed: 1/11/09.Google Scholar
Jackendoff, Ray. (2002). Foundations of grammar. Oxford: Oxford University Press.Google Scholar
Laan, Gitta P. M. (1997). The contribution of intonation, segmental durations, and spectral features to the perception of a spontaneous and a read speaking style. Speech Communication 22:4365.CrossRefGoogle Scholar
Labov, William. (1963). The social motivation of a sound change. Word 19:273309.CrossRefGoogle Scholar
Labov, William. (1966). The social stratification of English in New York City. Washington, DC: Center for Applied Linguistics.Google Scholar
Labov, William. (1972). Sociolinguistic patterns. Philadelphia: University of Pennsylvania Press.Google Scholar
Labov, William, Ash, Sharon, Baranowski, Maciej, Nagy, Naomi, Ravindranath, Maya, & Weldon, Tracy. (2006). Listeners' sensitivity to the frequency of sociolinguistic variables. Penn Working Papers in Linguistics: Selected papers from NWAV 34. Vol. 12.2. Philadelphia: University of Pennsylvania, Penn Linguistics Club. 105129.Google Scholar
Levinson, Stephen C. (1983). Pragmatics. Cambridge: Cambridge University Press.CrossRefGoogle Scholar
Lippi-Green, Rosina. (1997). English with an accent: Language, ideology, and discrimination in the United States. New York: Routledge.Google Scholar
McGurk, Harry, & Macdonald, John. (1976). Hearing lips and seeing voices. Nature 264:746748.CrossRefGoogle ScholarPubMed
Mehta, Gita, & Cutler, Anne. (1988). Detection of target phonemes in spontaneous and read speech. Language and Speech 31(2):135157.CrossRefGoogle ScholarPubMed
Mendoza-Denton, Norma. (2003). Language and identity. In Chambers, J. K.Trudgill, P. & Schilling-Estes, N. (eds.), The handbook of language variation and change. Hoboken, NJ: Wiley-Blackwell. 475499.Google Scholar
Niedzielski, Nancy A. (1999). The effect of social information on the perception of sociolinguistic variables. In Milroy, L. & Preston, D. R. (eds.), Special issue: Attitudes, perception, and linguistic features. Journal of Language and Social Psychology 18(1):6285.Google Scholar
Ochs, Elinor. (1992). Indexing gender. In Duranti, A. & Goodwin, C. (eds.), Rethinking context: Language as an interactive phenomenon. Cambridge: Cambridge University Press. 335358.Google Scholar
Plichta, Bartek, & Preston, Dennis R. (2005). The /ay/s have it: The perception of /ay/ as a North-South stereotype in US English. In Kristiansen, T.Coupland, N. & Garrett, P. (ed.), Theme issue: Subjective processes in language variation and change. Acta Linguistica Hafniensia 37:243285.Google Scholar
Rogers, Henry, & Smyth, Ron (2003). Phonetic differences between gay- and straight-sounding male speakers of North American English. In Proceedings of the 15th International Congress of Phonetic Sciences. Solé, M. J.Recasens, D. & Romero, J., J. (ed.), Barcelona: Universitat Autònoma de Barcelona. 18551858.Google Scholar
Strand, Elizabeth A. (1999). Uncovering the roles of gender stereotypes in speech perception. In Milroy, L. & Preston, D. R. (eds.), Special issue: Attitudes, perception, and linguistic features. Journal of Language and Social Psychology 18(1):8699.Google Scholar
Trudgill, Peter. (1974). The social differentiation of English in Norwich. Cambridge: Cambridge University Press.Google Scholar
Wald, Benji, & Shopen, Timothy. (1985). A researcher's guide to the sociolinguistic variable (ING). In Clark, V.Escholtz, P. & Rosa, A. (eds.), Language: Introductory readings. New York: St. Martin's Press. 515542.Google Scholar
Williams, Frederick, Hewett, Nancy, Hopper, Robert, Miller, Leslie M., Naremore, Rita C., & Whitehead, Jack L. (1976). Explorations of the linguistic attitudes of teachers. Rowley, MA: Newbury House.Google Scholar
Wyer, Robert S. Jr. (2004). Social comprehension and judgment: The role of situation models, narratives and implicit theories. Philadelphia: Lawrence Erlbaum Associates.Google Scholar
Zahn, Christopher J., & Hopper, Robert. (1985). Measuring language attitudes: The speech evaluation instrument. Journal of Language and Social Psychology 4(2):113123.Google Scholar
Zhang, Qing. (2005). A Chinese yuppie in Beijing: Phonological variation and the construction of a new professional identity. Language in Society 34:431466.CrossRefGoogle Scholar
Figure 0

Table 1. Speakers, by region and sex

Figure 1

Table 2. Factor analysis on full data set p < 0.001

Figure 2

Table 3. Educated/intelligent factor by (ING) F(1,976) = 4.56, p = 0.033

Figure 3

Table 4. Educated/intelligent responses by (ING) and working-class F(1,976) = 8.48, p = 0.004

Figure 4

Figure 1. Perceived region shaping (ING) effects on the connection between working-class and intelligent/educated (p = 0.028)

Figure 5

Table 5. Responses showing main effects of (ING)

Figure 6

Table 6. Terms used by interview participants to characterize (ING)

Figure 7

Table 7. Experimental responses with significant interactions between (ING) and speaker or perceived speaker region