The research reported in this article aims to fill a distinct gap in our understanding of how children's perception of the social-indexical meaning of linguistic variation develops. This study explores development of phonological knowledge during the language acquisition process at a critical point in the intervening years between infancy and young adolescence.
The preschool and early school years mark a very significant period for children as they start to develop social networks through attending nursery/school and interacting with their peers. There is evidence of stylistic shifting of some variables among 3-4-year-olds in response to caregiver input (Smith, Durham, & Fortune, 2007; Smith, Durham, & Richards, 2013), and social interactions with peer groups between the ages of four and five are seen to affect children's adoption of sociolinguistic variables (Nardy, Chevrot, & Barbu, 2014). The current research offers insights into children's developing sociolinguistic perception by investigating preschool children's emerging awareness of regional accent variation.
Adult listeners are able to group speakers according to broad regional accent distinctions, such as New England versus South versus North/West in the U.S. (Clopper & Pisoni, 2004a, 2004b) and North versus South in Britain (Lawrence, 2014). This study sets out to discover whether children of a preschool age are able to use phonological regional accent information to categorize speakers, and whether this sociolinguistic awareness develops as a step beyond the stage at which children have achieved phonological constancy and are therefore able to comprehend familiar words in unfamiliar accents.
Perceptual awareness of accent features is investigated on three different levels. Level 1 of the experiment tests the children's ability to interpret accent variation as evidence for categorizing the speakers (e.g., grouping speakers according to whether they say ‘b[a]sket’ or ‘b[ɑ:]sket’). Level 2 tests the children's ability to abstract across this variation on the phonological level and group speakers according to their pronunciation of the same phoneme in different words (e.g., grouping speakers together who say ‘gr[a]ss’ and ‘p[a]th’ versus speakers who say ‘gr[ɑ:]ss’ and ‘p[ɑ:]th’). Finally, Level 3 tests the children's more abstract awareness of regional accent variation by investigating their ability to group speakers into accent categories based on their pronunciation of different phonemes (e.g., speakers who say ‘gl[a]ss’ and ‘br[e:]k’ versus speakers who say ‘gl[ɑ:]ss’ and ‘br[eɪ]k’). Three independent variables explore the effects of both maturational and exposure-related factors on the children's performance in the experiment: the children's age, their gender, and their exposure to regional variation via the linguistic input they receive from their parents/carers. The developmental trajectory will be proposed through an exemplar-based account, linking children's developmental awareness to their experience and exposure to variation.
BACKGROUND
Development of phonological constancy
From five months, infants have been shown to demonstrate a preference for a familiar, local accent over unfamiliar, nonlocal accents (Butler, Floccia, Goslin, & Panneton, 2011). Studies of infants’ word learning have found that accent differences initially prevent the recognition of familiar words when they are heard in an unfamiliar accent (cf., Best & Kitamura, 2012; Schmale, Cristià, Seidl, & Johnson, 2010). These studies have found that by 12–19 months, infants can abstract across different and unfamiliar accents in order to understand familiar words.
The ability to comprehend familiar words in an unfamiliar accent reflects children's development toward understanding the principle of ‘phonological constancy’ (Best, Tyler, Gooding, Orlando, & Quann, 2009), whereby the phonology of a word is kept intact despite variations in its phonetic realization. Best et al. (2009) found that 19-month-olds but not 15-month-olds accepted nonnative pronunciations of familiar words and advocate a ‘perceptual attunement’ account to explain this development. They suggest that young children first develop dialect-specific phonetic patterns through their perception of the articulatory gestures of their native language/dialect. By 19 months, children have learned the phonological distinctiveness of the phonemes in their native language/dialect and have acquired the corresponding skill of phonological constancy; they are able to detect the phonetic variations they encounter as belonging to the same phoneme, by comparing the variations in terms of their articulatory gestures. The perceptual attunement account draws on the ‘perceptual assimilation model’ (PAM) (Best, 1995), originally developed as an explanation of how adult listeners deal with nonnative phonemes in cross-language speech perception. Therefore, PAM is able to explain and link together processes of speech perception throughout childhood and adulthood, as well as accounting for perceptual development of both native and nonnative contrasts.
Best et al.’s (2009) account focuses on children's linguistic input and their developing comparisons between incoming sounds. As such, it is compatible with a usage-based explanation such as exemplar theory, in which both probabilistic methods of learning and the creation of higher-level abstractions are proposed. The advantage of such an account for the current study is that the explanation incorporates a description of both the storage and accessing of a combination of phonological and social information (see ‘Theoretical account’ below). Such an account is therefore at least partly based on the individual child's exposure.
Previous research provides conflicting results regarding the role of infants’ previous exposure to accent varieties once they have reached the stage of phonological constancy. Prior exposure to an unfamiliar accent under laboratory conditions was found by Schmale, Cristia, and Seidl (2012) to help 24-month-old infants’ understanding and processing of the accent, but this was contradicted by van Heugten and Johnson's (2015) study of 28-month-old infants.
Similarly inconsistent results are found in studies of children's immersive experience of an accent. Floccia, Delle Luche, Durrant, Butler, and Goslin (2012) found that 20-month-old children brought up in a rhotic community were quicker at recognizing words pronounced in a rhotic form, regardless of whether they were “mono-accentual” (with two rhotic parents) or “bi-accentual” (with at least one nonrhotic parent). They interpret this as suggesting that a child's phonological representations are conditioned by their community rather than by their parents. However, van der Feest and Johnson's (2015) study of 24-month-old Dutch children found that children with a “Mixed Input” (nonlocal parents) were able to detect mispronunciations by a speaker with their parents’ accent, whereas children with “Uniform Input” (local parents) ignored these mispronunciations because they were not familiar with the accent. Therefore, rather than ignoring the input from their parents (as Floccia et al.’s [2012] study suggests), the children with Mixed Input were able to utilize the mixed evidence from their linguistic input to decide the relevance of phonological contrasts that they heard.
Overall, the results of studies with infants demonstrate a development in children's sensitivities to accent variability. Infant studies are based on speaker discrimination and/or word learning which reflects the infants’ familiarity with the individual or accent that they are hearing. Their findings are, therefore, a preliminary step in understanding more about how speakers with the same accent can become categorized together conceptually as children mature. The current research investigates the emergence and development of such categorization among children at a key point in their sociolinguistic development.
Children's sociolinguistic development
Studies uncovering the development of sociolinguistic skills in the preschool years have found that children from the age of two acquire accent-specific phonological variation in their production (Foulkes, Docherty, & Watt, 1999; Roberts, 1997; Roberts & Labov, 1995). Small-scale sociolinguistic patterns relating to gender have also been discovered, such as girls’ higher rates of (-t, d) deletion in Roberts’ (1997) study of preschool children in Philadelphia, and gender-specific variation of /p t k/ in preschool Tyneside English (Foulkes, Docherty, & Watt, 2005) and in primary school Australian English (Tait & Tabain, 2016). Style-shifting of certain variables has been evidenced in the speech of children from the age of three (Smith et al., 2007, 2013) and has been found to develop as they get older (Kerswill & Williams, 2002). The probability and rates of such style-shifting vary according to the variable itself, as well as according to the levels of style-shifting that children are exposed to in their input (Roberts, 1997; Smith, Durham, & Fortune, 2009; Smith et al., 2007, 2013).
A few studies have investigated young children's perceptual awareness of regional accents after infancy, from the age of five. However, their differing methodologies and assumptions deliver conflicting conclusions. Studies by Floccia, Butler, Girard, and Goslin (2009) and Wagner, Clopper, and Pate (2014) found that children under the age of seven were not able to group speakers according to their regional accent. In these studies, the children only heard two example sentences in each accent before they were then asked to categorize ‘aliens from elsewhere’ (Floccia et al., 2009) or different color puppets (Wagner et al., 2014) into two groups based on these examples. Therefore, the tasks may simply have been too difficult as they required the children to have a very good working memory. Additionally, Floccia et al.’s (2009) task assumed that the children would know what an alien is and understand the link between where speakers are from and how they speak. Such metalinguistic awareness is a difficult and abstract skill that is being developed but is not fully mastered at such a young age, as Beck (2014) found in a task investigating children's explicit awareness of the link between accent and regional origin. Beck directly addressed this question by asking 5-7-year-olds to listen to two speakers with different regional accents and answer the question, “Can you guess why these two people talk differently?” She found that only 32% of the children gave the correct answer, and 42% declined to answer at all, indicating that most of the children had not made a metalinguistic connection between regional accent and geography.
Beck (2014) also investigated regional accent awareness among 5-7-year-olds by running an experiment with a simpler design than the grouping tasks discussed above. In her ABX discrimination task, children heard single words which were chosen to reflect the vowel quality differences between two different accents. They heard three speakers pronounce each word and were asked which two speakers sounded most alike. Beck found that the children in her study were able to discriminate between a familiar, local (Philadelphian) and a nonfamiliar, nonlocal (General Southern) regional accent, with an average of 64% correct answers. Beck's study therefore presents some evidence that children from five years are able to match speakers based on regionally distributed pronunciations. However, her experiment was limited to testing children's ability to discriminate between sounds and match them accordingly.
The current research goes beyond investigating sound-matching within words by testing children's interpretation of variation at the level of the phoneme across different words (Level 2 of the experiment) as well as across different phonemes (Level 3 of the experiment). Additionally, by using a categorization task rather than an ABX discrimination task (see ‘Methodology’ section), the results of the current study can be said to indicate more reliably that the children are using the differences to group speakers rather than simply matching sounds. Furthermore, a driving question for the current research is whether children of an even younger, preschool age show an emerging awareness of accent; therefore, the current study focuses on 3-4-year-olds.
The current study also investigates the children's performance across the age range, as well as considering gender-based differences in the results. These factors are included as a rough approximation of the potential impact of maturational factors, such as increased processing speed that develops with maturation of the brain (Murphy, 2004). Infant girls have been found to mature physically at a faster rate than boys (Bornstein, Hahn, & Haynes, 2004) and, therefore, they may have an early advantage in language processing tasks due to earlier brain maturation. More pertinent to the current study's focus on sociolinguistic development, any gender-related differences are likely to represent children's early gendered socialization. As mentioned above, research has found differences in the production patterns of young children in line with gendered norms, indicating that this particular aspect of their language socialization starts early. Furthermore, research shows that girls and boys can receive different linguistic input from their parents. In a community in Newcastle, Foulkes et al. (2005) found that the child-directed speech (CDS) produced by mothers to their sons contained a higher proportion of nonstandard variants compared to CDS produced by mothers to their daughters. The authors proposed that “mothers are tuning their phonological performance in line with their child's developing gender identity” (Foulkes et al., 2005:198). It is also possible that this form of children's linguistic socialization affects their overall perception of linguistic variation, and, therefore, gender is an important consideration in the current study.
A theoretical account
A theoretical account of the stages in the phonological/sociolinguistic acquisition process is needed to explain how perceptual awareness progresses throughout childhood and beyond. Many usage-based accounts of the cognitive processes involved in language acquisition now advocate exemplar theory (ExT) as the best way to explain the storing of both linguistic and social information (cf., Pierrehumbert, 2003). Exemplar theory is a theory of memory and categorization originally developed in psychology. ExT proposes that we store detailed episodic traces in memory and that these memory traces affect how we process and interpret our future experiences.
At the heart of an exemplar model of memory is the idea that individually encountered stimuli are stored with details of the individual encounter. In speech encounters, this can range from the phonetic detail of the pronunciation made by the individual at the time, to social detail such as aspects of the speaker's accent or social background. When similar stimuli are then encountered in speech processing at a later point, these episodic traces, and the details stored alongside them, are accessed together.
In describing how children learn socially structured variation alongside their phonology, Docherty, Foulkes, Tillotson, and Watt (2006) credited an exemplar approach with being able to account for the connection between language and its social context, and, in particular, how phonetic properties can be aligned with social referents, such as particular speakers, particular genders, and particular accents. They compared data of child-directed speech in Newcastle from Foulkes et al. (2005) with adult-to-adult speech in the same community. Findings indicated that some of the community patterns of adult-to-adult speech were emphasized in child-directed speech. For example, in adult-to-adult speech, women were found to produce a higher proportion of (word-medial) standard /t/ compared to men. In CDS, both women and men increased their use of standard /t/, but there was still a much larger proportion of standard /t/ among women. They also found evidence that the children's own productions were reproducing the fine-grained phonetic variability to which they were exposed. As Docherty et al. (2006) described, in line with the patterns of their local accent and their gender, children specifically associate phonetic variability with certain kinds of speaker. “Thus exemplar models may offer a plausible means of accounting for the learning and emergence of features of socially structured variation alongside other systematic aspects of sound patterning” (Docherty et al., 2006:414). From this sociolinguistic perspective, children are learning socially structured variation alongside their phonology, and both kinds of patterning lead to stored abstractions across their encounters.
While the focus in ExT is on the role of individually stored episodic traces and their detail, there is a consensus view among many proponents of the theory that some level of abstraction is also an important part of the process (Docherty & Foulkes, 2014:46). Cutler, Eisner, McQueen, and Norris (2010) suggested that only a hybrid model of speech processing, which includes a role for both abstractions and episodes, can account for evidence showing that listeners adjust their interpretation of phonemes after limited exposure to deviant realizations by individuals (such as found by Norris, McQueen, & Cutler [2003] and McQueen, Cutler, & Norris [2006]). In this case, stored individual encounters, detailing the phonetic realizations of the speakers, contribute to a changed distinction that listeners make on the phonological level. Similarly, the build-up of encounters that listeners have with speech exemplars indexing social information about speakers can explain the development of categories pertaining to social-indexical distinctions, such as those based on speakers’ regional accents.
Foulkes (2010) hypothesized that we cognitively categorize speakers based on our accumulation of individual speakers’ exemplars and, as a result, that differences between individual speakers form the basis for the development of these speaker categories. For example, it is likely that fairly early in life, children are exposed to individual speakers who are easily categorizable “in a relatively neat tripartite structure” (Foulkes, 2010:20) as ‘adult males,’ ‘adult females,’ and ‘children.’ Due to large differences in pitch and formant frequencies, exemplars from these individuals are grouped together in this three-way distinction. Less tangible groupings, such as those based on accent, are likely to develop later, through the accumulation of more exposure to speakers with these accents. Therefore, an individual's experience of individual speakers with different accents (as explored in the current study) is central to such a model.
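The exemplar-based categorization outlined in this section can be made concrete with a toy sketch. The code is an illustration only and is not a model used in this study: each stored exemplar pairs a vector of phonetic values with a social label, and a new token is assigned to the label whose stored exemplars are, in sum, most similar to it (a similarity-weighted rule in the spirit of exemplar models of categorization from psychology). The feature values, the labels `accent_A` and `accent_B`, and the `sensitivity` parameter are all invented for illustration.

```python
import math
from collections import defaultdict


def similarity(a, b, sensitivity=1.0):
    """Similarity decays exponentially with Euclidean distance."""
    return math.exp(-sensitivity * math.dist(a, b))


def categorize(token, memory, sensitivity=1.0):
    """Return the social label with the largest summed similarity to the token."""
    support = defaultdict(float)
    for features, label in memory:
        support[label] += similarity(token, features, sensitivity)
    return max(support, key=support.get)


# Hypothetical stored exemplars: (phonetic detail, social label).
# The numbers are placeholders, not measurements from any speaker.
memory = [
    ((0.2, 0.3), "accent_A"),
    ((0.3, 0.2), "accent_A"),
    ((0.8, 0.9), "accent_B"),
    ((0.9, 0.8), "accent_B"),
]

print(categorize((0.25, 0.25), memory))  # accent_A
print(categorize((0.85, 0.85), memory))  # accent_B
```

In a sketch like this, a category only becomes usable once enough labeled exemplars have accumulated in `memory`, which is one way to picture why accent-based groupings might emerge later than groupings supported by large acoustic differences.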
THE CURRENT RESEARCH
The current research aims to capture a stage in children's development that, in an Exemplar Theory account, is at a point when they have built up enough exemplars to be able to categorize speakers according to the links between the phonetic and the social information that they contain. The overarching question is whether, at an age when they have developed a stable phonological system and are able to ignore superfluous variation for the purposes of understanding the meaning, children can nonetheless organize this variation in a socially meaningful way. Are they able to implement their now established phonological constancy, while also being able to interpret the variation they hear as something categorical? In particular, can 3-4-year-olds categorize speakers by phonological variables indexing regional accents:
(1) when the speakers produce the same phoneme within the same word?
(e.g., ‘b[a]sket’ or ‘b[ɑ:]sket’)
(2) when the speakers produce the same phoneme but within different words?
(e.g., ‘gr[a]ss’ and ‘p[a]th’ versus ‘gr[ɑ:]ss’ and ‘p[ɑ:]th’)
(3) when the speakers produce different phonemes in different words?
(e.g., ‘gl[a]ss’ and ‘br[e:]k’ versus ‘gl[ɑ:]ss’ and ‘br[eɪ]k’)
(4) To what extent do these abilities vary with age, gender, and parental input from different regional accents?
The phonological variables
This study investigates preschool children's awareness of accent features indicative of the distinction between speakers from the north and south of England, using the phonological variables in the bath, strut, and face lexical sets (Wells, 1982a). In York, North Yorkshire, where the research took place, the local accent includes pronunciations of bath, strut, and face, which are prototypical of the Central North (a region defined by Hughes, Trudgill, & Watt [2012]).
The Yorkshire accent extends to cover the accent of speakers from the other county subdivisions of Yorkshire (West and East) that the city of York itself borders (cf., Haddican, Foulkes, Hughes, & Richards, 2013; Tagliamonte & Roeder, 2009). The bath vowel (and its realization as [a] in the North or [ɑ:] in the South) and the strut vowel (and its realization as [ʊ] in the North or [ʌ] in the South) are described as among the most conspicuous accent features in differentiating a Northern accent from a Southern one (cf., Hughes et al., 2012; Wells, 1982a/b). The differences in pronunciation of the strut and bath vowels are commonly seen as linguistic stereotypes (in Labov's [1972] terms) of the north/south of England as they are often overtly commented on by lay listeners as characterizing speakers from these two broad geographic regions. The realization of the face vowel differentiates speakers from the Central North who use [e:], and those from both the Midlands and the south of England who use [eɪ]. The monophthongal variant [e:] is a ‘mainstream’ Northern variant (Watt, 2002); however, it is often not used by middle-class speakers in these areas, who use the more typically Southern/SSBE (Standard Southern British English) diphthongal pronunciation [eɪ]. The ongoing rise in the use of the diphthongal variant in Central North regions such as York is a change in progress, as reported by Haddican et al. (2013). Together, therefore, the bath, strut, and face¹ vowels form a distinction between the vowels used in a local Yorkshire accent (situated in the Central North) and those used in a SSBE accent (pronunciations typical of the southeast of England).
Participants and background information
Twenty preschool children (ten 3-year-olds, ten 4-year-olds; twelve females, mean age 3;10; eight males, mean age 3;11) took part in the experiment. These children were all attending one of two different nurseries in York: nine children aged 3;1 to 4;6 from one nursery and eleven children aged 3;2 to 4;7 from another nursery.
The children's parents were asked to provide regional background information. Eighteen of the twenty children were born in York; one child moved from Germany to York aged five months, and another child moved from London and had been living in York for 17 months. For the purposes of the statistical analysis, the children were split into two groups according to whether they had at least one Yorkshire parent (ten children, mean age 3;10) or no Yorkshire parents (nine children, mean age 3;11; information was missing for one child), with the region of Yorkshire defined as set out above. This distinction was made in line with second dialect studies in which children are usually classified as bidialectal if they move to a new town from elsewhere and have two nonlocal parents (e.g., Chambers, 1992; Payne, 1980; Tagliamonte & Molfenter, 2007; Trudgill, 1981). Although most of the children in the current experiment were born in the local region, the ‘outsider’ status of their parents represented their exposure to nonlocal varieties at home. Those children with no Yorkshire parents had parents from a range of regions throughout the UK², such as Hampshire, London, the West Midlands, Northern Ireland, Tyne & Wear, and Aberdeenshire (see Appendix, Table A1 for a full list).
Experiment design: Stimuli
Sentence-length stimuli were constructed with the target word (i.e., the word featuring the accent difference) at the end of each sentence in order to draw the children's attention to the target vowel. The rest of the words in each sentence were chosen carefully in order to avoid including any words with accent differences corresponding to diagnostic features of the north and south (as defined by Wells, 1982a). For example, the sentence shown in (1) features the bath vowel in the word ‘basket.’
(1) This is my basket
In each iteration of the experimental procedure (see below), children were presented with two ‘reference sentences,’ each spoken by a different cartoon image. They were then asked to match a set of ‘grouping sentences’ (again each linked to a separate cartoon image) to one or other of the reference sentences. The reference sentences in each set were worded the same as each other, so that the only difference between them was the target vowel pronunciation. For example, there is only a bath vowel ([a]/[ɑ:]) distinction in (2) and (3).
(2) This is my b[a]sket
(3) This is my b[ɑ:]sket
The grouping sentences were designed with vowel pronunciation differences in line with three different levels of the experiment (referred to as levels 1–3), in order to test different aspects of the children's accent awareness. In line with the findings from infant phonological development described above, the children were presumed to have reached the stage of phonological constancy. It was therefore assumed that the children would have no trouble comprehending the words when they were pronounced either in a Yorkshire or a SSBE accent. The first level in the experiment aimed to explicitly test whether the children were able to pick up these lower-order phonetic patterns by testing their ability to group speakers based on different pronunciations of the same word, as shown in (4).
(4) Reference sentence: ‘This is my basket’
Grouping sentence: ‘Put me in a basket’
Accent difference: ‘b[a]sket’ versus ‘b[ɑ:]sket.’
This first level was therefore testing whether, despite having reached a level of phonological constancy, children were still able to interpret the phonetic variation between these sounds to the extent that they would be able to use them as grouping criteria. The second level was a higher-level test of the extent to which children were able to use their knowledge of both abstraction and variation, as this task asked the children to group speakers based on different pronunciations of the same phoneme but in different words, as shown in (5).
(5) Reference sentence: ‘We need to walk on the path’
Grouping sentence: ‘I want to walk on the grass’
Accent difference: ‘p[a]th’/‘p[ɑ:]th’ vs. ‘gr[a]ss’/‘gr[ɑ:]ss’
The second level was therefore testing whether the children could hear these phonetic differences across different words, relying on their awareness of phonological constancy across words, as well as their ability to interpret phonetic variation within the phonemes.
The third level tested children's more abstract knowledge, as the task asked the children to group speakers across different phonemes, as shown in (6).
(6) Reference sentence: ‘What did you break?’
Grouping sentence: ‘It was a glass’
Accent differences: ‘br[e:]k’/‘br[eɪ]k’ vs. ‘gl[a]ss’/‘gl[ɑ:]ss’
This third level was therefore more explicitly testing the social indexical knowledge developing among the children in relation to their awareness of speakers belonging to abstract regional categories (Yorkshire versus SSBE). Overall, these different levels aimed to track the stages of development: from children achieving phonological constancy to being able to use their wider abstract reasoning to link phonetic variation to higher-order differences relating to regional accent groups.
Table 1 presents a summary of the different levels, with example words used in the experiment.
Table 1. Experiment design with examples
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20200221033344821-0903:S0954394519000176:S0954394519000176_tab1.png?pub-status=live)
Stimuli recordings
Stimuli for the experiment were recorded from one bidialectal speaker, a 25-year-old female, who was able to switch between two different accents and produce vowel pronunciations typical of both Yorkshire and the South East of England. Using the same speaker helped to ensure that the children would focus on the phonological accent variation of the speaker guises during the experiment, rather than making decisions based on other characteristics of individual speakers’ voices.
The speaker was recorded reading a list of sentences, using a Zoom H4n recorder set to record at 16-bit depth and a 44.1 kHz sampling rate. The speaker was asked to read the set of sentence stimuli, first with a SSBE pronunciation and then with a Yorkshire pronunciation of the target word. In order to keep the pronunciation of the rest of the stimulus consistent, she was instructed to read the rest of the sentence naturally, in her normal accent (standard and not regionally distinctive). This meant that the comparison would focus on the final word itself, as this was the only one that differed. Furthermore, as the speaker's prosody was another variable with the potential to indicate her regional background, this was controlled to some extent by keeping stress placement consistently on the final word in the sentence.
Experimental procedure
The children took part in the experiment individually, either in a quiet corner of the nursery or at the child's home with their parent(s) present. The experiment was presented on a laptop computer, and the children listened to the audio stimuli through headphones.
The experiment was designed to run as a Microsoft PowerPoint slideshow containing pictures and sound clips, with each slide presenting a different grouping task. In order to keep the experiment short, there were ten tasks in all, each consisting of five trials. Each task was presented as a different screen (see Figure 1), and each trial involved matching a cartoon image to its group on the basis of a stimulus sentence. Tasks 1–3 (the first 15 trials) consisted of Level 1 sentences, tasks 4–7 (the next 20 trials) of Level 2 sentences, and tasks 8–9 (the next 10 trials) of Level 3 sentences. Task 10 consisted of a mixture of Levels 2 and 3, with three of the grouping sentences matching the phoneme of the reference sentence and two containing a different phoneme. Therefore, in total the children carried out 15 trials at Level 1, 23 trials at Level 2, and 12 trials at Level 3. This uneven number of trials per level is accounted for in the statistical analysis of the results, which considers the results of each level separately.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20200221033344821-0903:S0954394519000176:S0954394519000176_fig1.png?pub-status=live)
Figure 1. Screen shot of first ‘teddy bear’ grouping task.
The first three grouping tasks (for the Level 1 sentences) involved cartoon bears. In each of these tasks, two mother bears were displayed at the bottom of the screen. They were made distinguishable by having different color patches on their fur (see Figure 1). Each mother bear was linked to a sound clip of one of the reference sentences and next to each mother bear was a picture of the subject of the sentence, for example a basket in the case of the first sentence. At the top of the screen, pictures of five identical baby bears were displayed, each linked to an audio clip of one of the grouping sentences.
The experiment was presented as a game for the children to play; they were asked to group the ‘lost’ baby bears with their mother bears. The experimenter [Footnote 3] controlled the playing of the audio files, clicking on each character, which rocked from side to side while the corresponding sound clip played. Each mother bear was heard first, and then each baby bear was heard, after which the child was asked to indicate the mother bear it belonged to by pointing at the screen. The experimenter then dragged the picture of the baby bear over to the mother bear that the child had indicated. For the sake of consistency, in each task, three of the baby bears were linked to the Yorkshire sound clip and two of the bears were linked to the SSBE sound clip [Footnote 4]. The baby bears were arranged in a random order each time, to prevent children from basing their decisions on a pattern. Similarly, while the mother bear on the left of the screen was consistently played first, the accents of the mother bears were played in a pseudorandom order (either Yorkshire or SSBE first). A trial was logged as ‘correct’ if the baby bear was grouped with the same-accented mother bear.
To keep the task varied and interesting, the grouping tasks for Levels 2 and 3 used pictures of cartoon mothers and their daughters instead of teddy bears. The grouping tasks were otherwise essentially the same, with two mothers (distinguished by differently colored dresses) displayed at the bottom of the screen and five daughters arranged randomly at the top of the screen.
Children who did not want to continue and those who failed to understand the task did not take part in the second part of the experiment based on the stimuli from Levels 2 and 3 (featuring the mothers and daughters). Out of the 20 child participants from the first part of the experiment, 15 went on to do the second part of the experiment: six 3-year-olds (five females, one male) and nine 4-year-olds (five females, four males). Overall, the mean age for the ten females was 3;11 and for the five males was 4;1. There were five children who had no Yorkshire parents (mean age 4;2) and nine children who had one or more Yorkshire parents (mean age 3;10; information was missing for one participant).
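As a cross-check on the design, the following minimal Python sketch tallies the trials per level from the task layout described above and multiplies them by the number of children who completed each part. The variable names are illustrative and are not part of the original experiment materials; the resulting response counts match the n values reported with the models in Tables 2–4.

```python
# Task layout as described in the procedure: each task maps level -> number of trials.
# Task 10 mixes three Level 2 trials with two Level 3 trials.
tasks = (
    [{1: 5}] * 3      # tasks 1-3: Level 1
    + [{2: 5}] * 4    # tasks 4-7: Level 2
    + [{3: 5}] * 2    # tasks 8-9: Level 3
    + [{2: 3, 3: 2}]  # task 10: mixed Levels 2 and 3
)

# Sum the trials belonging to each level across all ten tasks.
trials_per_level = {1: 0, 2: 0, 3: 0}
for task in tasks:
    for level, n_trials in task.items():
        trials_per_level[level] += n_trials

# Children contributing data: 20 for Level 1, 15 for Levels 2 and 3.
children_per_level = {1: 20, 2: 15, 3: 15}
responses_per_level = {
    level: trials_per_level[level] * children_per_level[level]
    for level in trials_per_level
}

print(trials_per_level)     # trials per level
print(responses_per_level)  # total responses per level
```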
RESULTS AND ANALYSIS
Overall results across the different levels
Figure 2 presents an RDI plot of the overall results across the different levels of the experiment, created using the yarrr package (Phillips, 2017) in R. An RDI plot displays the raw data as well as the descriptive statistics and the inferential statistics. We can therefore see the overall density (the colored ‘beans’), 95% Highest Density Intervals (HDIs) (the boxes), and the mean as a measure of central tendency (the horizontal bands). For each level, the mean performance is above chance at 50%.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20200221033344821-0903:S0954394519000176:S0954394519000176_fig2.png?pub-status=live)
Figure 2. RDI plot: All children's results, divided by level.
The children perform best in Level 1, in which the mean score is 65% correct answers and the HDI of the mean has the smallest range, from 58% to 72%. Performance in Level 3 shows the sparsest density, with a mean score of 63% correct answers but an HDI of the mean from 45% to 76%. The overlapping HDIs of all levels indicate that we cannot draw any strong conclusions as to how the children performed comparatively across the levels overall. The full results and background information for each child are shown in the Appendix, Table A1.
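To make the notion of an interval around the mean more concrete, the sketch below finds the narrowest interval containing 95% of a set of sampled means. It is an illustration only: the scores are hypothetical and are not the study's data, and the sampled means come from a simple bootstrap rather than from the Bayesian procedure behind the HDIs in the RDI plots, so the numbers are not comparable to those reported above.

```python
import random

def hdi(samples, mass=0.95):
    """Return the narrowest interval containing `mass` of the samples."""
    ordered = sorted(samples)
    n = len(ordered)
    window = max(1, int(round(mass * n)))  # number of samples inside the interval
    # Slide a fixed-size window over the sorted samples and keep the narrowest one.
    best_start = min(
        range(n - window + 1),
        key=lambda i: ordered[i + window - 1] - ordered[i],
    )
    return ordered[best_start], ordered[best_start + window - 1]

# Hypothetical per-child percent-correct scores (NOT the study's data).
scores = [40, 47, 53, 60, 60, 67, 67, 73, 80, 87]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
boot_means = []
for _ in range(5000):
    # Bootstrap stand-in for posterior samples of the mean (an assumption).
    resample = [rng.choice(scores) for _ in scores]
    boot_means.append(sum(resample) / len(resample))

low, high = hdi(boot_means)
mean_score = sum(scores) / len(scores)
print(f"mean = {mean_score:.1f}%, 95% interval of the mean = {low:.1f} to {high:.1f}%")
```

Two groups whose intervals do not overlap, as in some of the comparisons reported below, can be read as differing with high confidence; overlapping intervals leave the comparison open.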
Effects of the independent variables
Due to the variation in performances across the levels and the different abilities being tested in each of the levels, the effects of the independent variables were analyzed in statistical models run separately for each level of the experiment.
Mixed effects statistical modeling
Binary mixed effects logistic models were fitted in R (R Core Team, 2013) through a stepwise backward regression method using the lme4 package (Bates, Maechler, Bolker, & Walker, 2015). The three independent variables under investigation were included as binary independent variables: Age group (3-year-old/4-year-old), Yorkshire parent (Yes/No), and Gender (Female/Male), with the reference level in the model amounting to ‘3-year-old girl with no Yorkshire parent(s).’ This reference level acted as a baseline against which the models could measure the rest of the results (i.e., 4-year-olds’ results were measured in comparison to 3-year-olds’ results and boys’ results were measured in comparison to girls’ results) [Footnote 5]. Age was treated as a categorical variable as the children formed two distinct age groups, with a six-month gap between the oldest 3-year-old and the youngest 4-year-old. Overall, there were ten 3-year-olds (aged 3;0 to 3;8, mean age 3;4) and ten 4-year-olds (aged 4;2 to 4;7, mean age 4;5). In order to account for individual variation, individual child was included as a random effect [Footnote 6].
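For readers less familiar with this type of model, the full model from which the backward selection starts can be written roughly as follows. The notation is an illustrative assumption rather than the article's own: $y_{ij}$ is 1 if child $j$ grouped the speaker correctly on trial $i$, each predictor is coded 0 for the reference level (3-year-old, no Yorkshire parent, female) and 1 otherwise, and $u_j$ is the random intercept for child $j$.

```latex
\operatorname{logit} P(y_{ij} = 1)
  = \beta_0
  + \beta_1\,\mathrm{Age}_j
  + \beta_2\,\mathrm{YorkshireParent}_j
  + \beta_3\,\mathrm{Gender}_j
  + u_j,
\qquad u_j \sim \mathcal{N}(0, \sigma_u^2)
```

Under this coding, $\beta_0$ is the log-odds of a correct response for the reference level, and each remaining coefficient is the change in log-odds relative to that baseline. The stepwise backward procedure removes predictors that do not improve the fit, so the best fit model for a given level may retain only a subset of these terms.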
Age
The 4-year-olds (M = 65.4, SD = 20.2) performed significantly better overall than the 3-year-olds (M = 39, SD = 18.36); t(18) = 3.06, p = 0.007. Age was found to be a significant predictor for the children's performance in Levels 1 and 2 but not in Level 3.
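The reported test statistic can be recovered from the summary statistics alone. The sketch below assumes a pooled-variance (Student's) independent-samples t-test with ten children per age group; the article does not name the test, but the 18 degrees of freedom are consistent with that assumption. The function name is illustrative.

```python
import math

def pooled_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Student's independent-samples t statistic from summary statistics."""
    df = n_a + n_b - 2
    # Pooled variance weights each group's variance by its degrees of freedom.
    pooled_var = ((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / df
    se = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / se, df

# Summary statistics as reported: 4-year-olds first, then 3-year-olds.
t_stat, df = pooled_t(65.4, 20.2, 10, 39.0, 18.36, 10)
print(f"t({df}) = {t_stat:.2f}")  # prints "t(18) = 3.06"
```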
The RDI plot in Figure 3 illustrates these findings. The mean score for the 4-year-olds is consistently above chance for each of the levels, whereas the 3-year-olds’ mean scores are more variable. The lack of overlap in the HDIs indicates that we can conclude with high confidence that the 4-year-olds performed better than the 3-year-olds in Level 2.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20200221033344821-0903:S0954394519000176:S0954394519000176_fig3.png?pub-status=live)
Figure 3. RDI plot: Results for each level, divided by age group.
Gender
As shown in Table 2, the best fit regression model finds gender to be a significant predictor in Level 1. The models fit to Levels 2 and 3 (Tables 3 and 4) do not select gender as significant.
Table 2. Logistic mixed effects model for experiment Level 1, investigating the grouping of the same phonological variables within the same word among 20 Yorkshire children (n responses = 300, significance level: ‘*’ = 0.05, ‘**’ = 0.01, ‘***’ = 0.001)
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20200221033344821-0903:S0954394519000176:S0954394519000176_tab2.png?pub-status=live)
Table 3. Logistic mixed effects model for experiment Level 2, investigating the grouping of the same phonological variables within different words among 15 Yorkshire children (n responses = 345, significance level: ‘*’ = 0.05, ‘**’ = 0.01, ‘***’ = 0.001)
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20200221033344821-0903:S0954394519000176:S0954394519000176_tab3.png?pub-status=live)
Table 4. Logistic mixed effects model for experiment Level 3, investigating the grouping of different phonological variables across different words among 15 Yorkshire children (n responses = 180, significance level: ‘*’ = 0.05, ‘**’ = 0.01, ‘***’ = 0.001)
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20200221033344821-0903:S0954394519000176:S0954394519000176_tab4.png?pub-status=live)
The RDI plot in Figure 4 illustrates the results of the statistical models, showing that the girls have a higher mean than the boys in Level 1. They also have a higher mean in Level 3, but the plot shows sparse densities and particularly large HDIs for this level, indicating a wide range of scores.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20200221033344821-0903:S0954394519000176:S0954394519000176_fig4.png?pub-status=live)
Figure 4. RDI plot: Results for each level, divided by gender.
Parental input
The effect of Yorkshire parents was only found to be a significant predictor in the best fit statistical model run on Level 3, as Table 4 shows. This was the only significant predictor for Level 3.
The RDI plot in Figure 5 illustrates this finding; the HDIs for the Level 3 results do not overlap across the two groups, indicating high confidence of a difference in performance. Although not significant in the statistical model, the plot shows that children who have no Yorkshire parents scored higher on average in Levels 1 and 2 as well.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20200221033344821-0903:S0954394519000176:S0954394519000176_fig5.png?pub-status=live)
Figure 5. RDI plot: Results for each level, divided by parental input.
Overview of the results within the different levels
Overall, the results of the experiment indicate that both maturational factors and exposure-related factors play a role in children's performance. Maturation is approximated by considering the children's age and gender. Performance improved between the ages of 3 and 4 in Levels 1 and 2 of the experiment, and the girls were found to perform significantly better than the boys in Level 1 of the experiment. In Levels 1 and 2 of the experiment, children were asked to group speakers based on hearing the same word featuring an accent difference, or the same phoneme featuring an accent difference. Therefore, the improvement with age appears to demonstrate a development in the understanding that variable realizations of phonemes can represent a categorical difference between speakers. In Level 3, the children were asked to group speakers based on hearing different words, and were therefore grouping speakers across different realizations of two different phonemes. In this part of the experiment, varied input appeared to help in the creation of more robust accent categories, as children with parents from outside the Yorkshire region performed significantly better than those with at least one Yorkshire parent. This finding suggests that, beyond perceiving the phonetic differences between variable realizations of the same phoneme, these children were able to group together different phonemes representing an accent distinction. The children with parents from outside the region therefore appear to have more robust cognitive accent categories, which facilitates their grouping of speakers into these categories.
DISCUSSION
A challenge to previous perceptual experiments with young children
This experiment has found that 3-4-year-olds perform better than chance when grouping speakers together based on regionally distributed pronunciation features. Children of this age have not been tested for this ability previously, and, indeed, these results challenge the conclusion of earlier studies that found that children under the age of seven were not able to group speakers according to accent criteria. The more refined task design used in the current study enabled the capture of a hitherto unexposed developmental ability among young preschool children. The experiment was explicitly designed to test children's ability based on a limited number of key phonological variables pertaining to well-known broad accent distinctions in the UK. This contrasts with the studies by Floccia et al. (2009) and Wagner et al. (2014), which used tasks with longer stimuli and had no experimental control over the stimuli that the children heard. The current study's direct focus on key segmental variables makes the results easier to interpret, as we can more reliably infer that the children are reacting to these specific differences in pronunciation when grouping the speakers.
Sociolinguistic development in the preschool years: beyond phonological constancy
Overall, a development in performance was found between the 3- and 4-year-olds. The significance of the improvement was found to be most robust in Level 2 (the same phoneme condition) that tested their extension of phonological constancy from within words to between words, as well as their ability to group speakers according to variation in the phonetic realizations of the phonemes. Therefore, the age-related improvement in this process of abstraction shows a development from phonological constancy (as the infancy literature posits at around 18/19 months), to the increasing ability to interpret the variation within these phonemes as indicating something socially meaningful.
This improvement throughout the preschool years contributes to the collective findings of others who have investigated sociolinguistic developments in production occurring around the age of three years (cf. Barbu, Nardy, Chevrot, & Juhel, 2013; Foulkes et al., 1999; Roberts & Labov, 1995; Smith et al., 2007). It seems that the preschool years see rapid changes in the sociolinguistic competence of children, both in perception and production.
Parental input and the role of variation
Children with parents from outside of Yorkshire had a higher chance of performing better in Level 3, which added a further level of abstraction to the task as the children were being tested on matching different phonemes, essentially as either ‘Yorkshire sounding’ or ‘Southern sounding’ [Footnote 7]. Variation in children's input from parents with a nonlocal accent contributed to children's successful performance in this particular task, suggesting that those with outsider parents have made a more distinct category division between ‘local/nonstandard’- and ‘nonlocal/standard’-sounding speakers.
As the current findings are based on a relatively small number of children, and the parents from outside of Yorkshire all come from different regions (see Appendix, Table A1), it is difficult to reliably interpret the particular ‘outsider’ status of the children's parents and how their own production patterns may have differed from each other. The measurement of this independent variable was a simplified representation of the children's exposure to regional variation. In order to validate the findings, future studies would need to formulate a more comprehensive way of measuring their exposure to regional varieties.
However, it is worth noting that three of the top four performers in Level 3 (who scored above 90% correct answers) have at least one Southern parent (M7, F2, and F5, see Table A1). Their high performance in this task can therefore potentially be explained as a result of their experience with a Southern accent in particular, although more detailed work needs to be done.
The role of gender
On average, the girls outperformed the boys in this experiment, although the difference was only significant in Level 1 (the same word condition). Overall, the boys’ ability varied much more, particularly in Level 3 (the different phoneme condition). This difference between the genders could partly be due to the nature of the task itself, which was centered on the speech of females, using female cartoon pictures and run by a female experimenter. Support for this interpretation can be found in the results of Cvencek, Greenwald, and Meltzoff (2011), who ran implicit association tests with preschool children and found that girls showed a stronger implicit preference for stereotypically ‘girly’ flowers (versus insects), as well as a stronger implicit preference for their own gender than the boys. Foulkes et al.’s (2005) account of child-directed speech (CDS) provides the basis for a sociolinguistic interpretation of the data, with girls receiving more standard forms in CDS than boys. As the current experiment focuses on a comparison between SSBE and Yorkshire forms, the girls’ potentially higher exposure to SSBE forms may have aided them in the experiment. The contributing effect of overall exposure to variation can be interpreted through an exemplar theoretic model, as the next section explores.
Putting the pieces together: an exemplar model
This study set out to investigate a stage in children's perceptual development at which they are able to interpret the social information encoded in the phonetic realizations of speakers’ pronunciations. The investigation has been framed as an exploration of an exemplar-led explanatory model of sociolinguistic acquisition, whereby children's exposure to variation has built up in the form of individual exemplars that have social information encoded within them. There has been a rise in studies advocating exemplar models of memory and conceptualization in fields of linguistic research relevant to the current study, such as sociolinguistics, phonological development, and speech perception (Docherty & Foulkes, 2014; Johnson, 1997, 2006; Pierrehumbert, 2003). These models differ in the extent to which they treat memories as purely episodic or as also involving a process of abstraction. As Docherty and Foulkes (2014:49) pointed out, the exact nature of the connection between individually stored exemplars and the abstracted categories that develop as a result of their similarities is “under-theorised.” However, the importance of linguistic input is foregrounded in all such accounts. The findings from the current study relating to the three measures associated with linguistic input (age, gender, Yorkshire parents) provide strong evidence for an exemplar account of indexical learning and the development of social-indexical knowledge, supporting the account developed by Docherty et al. (2006), Foulkes (2010), and Docherty and Foulkes (2014).
In the current study, the 4-year-olds have built up a larger store of exemplars than the 3-year-olds, giving them more variation to draw upon when categorizing the speakers. Girls are more often addressed using stylistically shifted standard variants in CDS (Foulkes et al., 2005), providing them with more exposure to standard forms that aids their categorization of speakers into Yorkshire and SSBE in the current study.
The strongest support for an exemplar model comes from the effect of children's exposure to variation in their input, which plays the most significant role in Level 3 of the experiment. The level of abstraction required to group speakers based on different phonemes marks a crucial stage in their sociolinguistic development as it signifies the initial stages of the evolution of cognitive speaker categories based on accent criteria (a developmental stage posited by Foulkes [2010]). This provides convincing evidence for an exemplar theoretic account; children's individual experience of phonetic variation relating to different accents results in a larger store of exemplars of speakers with different accents. In turn, this store of exemplars is used as a basis for grouping the unfamiliar speaker guises that they hear.
The results from this study have captured aspects of early development in children's sociolinguistic awareness. Identifying a developing perceptual awareness of regional accents in the preschool years, based partly on an individual's exposure to variation, adds to our understanding of the context in which this sociolinguistic development is happening. The extent to which we experience variation through the speakers we hear around us inevitably impacts the way that we use the cognitive strategy of categorization to understand our social world. It is anticipated that the results from this research will form the basis for further research into the broader implications of the association between linguistic and social information, such as the formation of linguistic stereotypes.
APPENDIX
TABLE A1. Children's background information and performance across the levels
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20200221033344821-0903:S0954394519000176:S0954394519000176_tab6.png?pub-status=live)