
Working memory performance in children with and without specific language impairment in two nonmainstream dialects of English

Published online by Cambridge University Press:  02 November 2017

JANET L. MCDONALD*
Affiliation:
Louisiana State University
CHRISTY M. SEIDEL
Affiliation:
Louisiana State University
REBECCA HAMMARLUND
Affiliation:
Louisiana State University
JANNA B. OETTING
Affiliation:
Louisiana State University
*ADDRESS FOR CORRESPONDENCE: Janet L. McDonald, Department of Psychology, Louisiana State University, Baton Rouge, LA 70803. E-mail: psmcdo@lsu.edu

Abstract

Using speakers of either African American English or Southern White English, we asked whether a working memory measure was linguistically unbiased, that is, equally able to distinguish between children with and without specific language impairment (SLI) across dialects, with similar error profiles and similar correlations to standardized test scores. We also examined whether the measure was affected by a child's nonmainstream dialect density. Fifty-three kindergarteners with SLI and 53 typically developing controls (70 African American English, 36 Southern White English) were given a size judgment working memory task, which involved reordering items by physical size before recall, as well as tests of syntax, vocabulary, intelligence, and nonmainstream density. Across dialects, children with SLI earned significantly poorer span scores than controls, and made more nonlist errors. Span and standardized language test performance were correlated; however, they were also both correlated with nonmainstream density. After partialing out density, span continued to differentiate the groups and correlate with syntax measures in both dialects. Thus, working memory performance can distinguish between children with and without SLI and is equally related to syntactic abilities across dialects. However, the correlation between span and nonmainstream dialect density indicates that processing-based verbal working memory tasks may not be as free from linguistic bias as often thought. Additional studies are needed to further explore this relationship.

Type
Articles
Copyright
Copyright © Cambridge University Press 2017 

Children with specific language impairment (SLI) have normal intelligence, but have more difficulty with various aspects of language including syntax and morphology, vocabulary, and phonological processing than typically developing (TD) children (Leonard, 2014; Schwartz, in press). They also tend to have deficits in verbal working memory (Briscoe & Rankin, 2009; Ellis Weismer, Plante, Jones, & Tomblin, 2005; Frizelle & Fletcher, 2015; Lum, Conti-Ramsden, Page, & Ullman, 2012; Mainela-Arnold & Evans, 2005; Mainela-Arnold, Evans, & Coady, 2010; Montgomery, 2000a, 2000b; Montgomery & Evans, 2009; Montgomery, Evans, & Gillam, 2009; Vugs, Hendriks, Cuperus, & Verhoeven, 2014). Working memory, the ability to hold and simultaneously process information, is measured by tasks that require an individual to perform some kind of operation while also remembering information for recall (Just & Carpenter, 1992).
In TD children, performance on working memory tasks correlates to standardized tests of syntax (Engel de Abreu, Gathercole, & Martin, 2011; Haake, Hansson, Gulz, Schötz, & Sahlén, 2014; Magimairaj & Montgomery, 2012; but see Lum et al., 2012) and vocabulary (Adams, Bourke, & Willis, 1999; Engel de Abreu et al., 2011), as well as to grammaticality judgments (McDonald, 2008) and often, but not always, to sentence comprehension (Montgomery, 2000a, 2000b; Montgomery & Evans, 2009; Montgomery et al., 2009). The picture is murkier for children with SLI, with working memory scores and sentence comprehension sometimes positively correlating (Montgomery & Evans, 2009; Montgomery et al., 2009), sometimes not correlating (Montgomery, 2000b), and once even negatively correlating (Montgomery, 2000a).

In the current work, we further explored working memory as an important correlate to children's language abilities and the language deficits of children with SLI using speakers of two nonmainstream dialects of English: African American English (AAE) and Southern White English (SWE). Our focus on AAE and SWE was strategic because both are spoken in the rural south and in both of these dialects, a child's language aptitude is difficult to assess with traditional measures. For morphology in particular, utterances with omitted grammatical morphemes (e.g., She Ø walking) are both a hallmark feature of these dialects and fit the linguistic profile of SLI (Oetting, Lee, & Porter, 2013; Oetting & McDonald, 2001; Seymour, Bland-Stewart, & Green, 1998). This overlap in features across dialects and the SLI condition makes grammatical morphology extremely difficult to evaluate in speakers of these nonmainstream dialects. Given this, working memory measures, if they are free from linguistic bias, could be extremely useful for helping to identify childhood SLI within AAE and SWE and perhaps within other groups of linguistically diverse learners.

In the current work, we examined the dialect neutrality of a verbal working memory measure by asking if across dialects the measure is sensitive to variation in language aptitude to the same degree, generates similar types of error profiles in children, and correlates similarly to other ability measures. Between and within the dialects, we also asked whether the measure is influenced by the density with which children produce nonmainstream forms, which should not be the case if the measure is free from bias. Between dialects, AAE and SWE are ideally suited to examine this question because AAE tends to have much higher nonmainstream density than SWE (Oetting, 2015; Oetting & McDonald, 2001). Within the dialects, speakers also differ in the densities of their nonmainstream forms, with some speakers producing low levels, others producing moderate levels, and still others producing high levels (Oetting & McDonald, 2002; Terry, Connor, Thomas-Tate, & Love, 2010; Washington & Craig, 1994). Below we review the literature on working memory deficits in children with SLI, the identification of SLI within AAE and SWE, and dialect density as an important metric for studies of linguistic bias.

WORKING MEMORY DEFICITS IN CHILDREN WITH SLI

As detailed below, poor verbal working memory capacity as a possible deficit in children with SLI has been investigated in a number of studies using several different verbal working memory tasks. It is generally found that children with SLI do worse than age-matched controls regardless of the task.

Backward digit span tasks require children to repeat lists of digits that vary in length in reverse order. Results show that children with SLI, aged 4 to 12 years, have lower backward digit span scores than age-matched controls (Briscoe & Rankin, 2009; Frizelle & Fletcher, 2015; Lum et al., 2012; Vugs et al., 2014; for studies that found trends but not significant differences, see Petruccelli, Bavin, & Bretherton, 2012; Quail, Williams, & Leitão, 2009). Counting span tasks ask children to count the number of objects in successive arrays, and then recall the number of objects in each array at the end of the sequence. Sequences increase in length until children are no longer able to recall the counts. As with the backward digit span, studies show that children with SLI, aged 4 to 12 years, have lower spans than their age-matched counterparts (Frizelle & Fletcher, 2015; Lum et al., 2012; Vugs et al., 2014). Finally, listening span tasks require children to make true/false judgments of sentences they hear while simultaneously retaining the final word of each for later recall.
Across various versions of this task, children with SLI, aged 4 to 14 years, generally do not differ from controls in their ability to judge the truth value of the sentences; however, in terms of words recalled, children with SLI score lower than age- but not language-matched controls (Briscoe & Rankin, 2009; Ellis Weismer, Evans, & Hesketh, 1999; Ellis Weismer et al., 2005; Frizelle & Fletcher, 2015; Laing & Kamhi, 2003; Lum et al., 2012; Mainela-Arnold & Evans, 2005; Mainela-Arnold et al., 2010; Marton & Eichorn, 2014; Marton, Kelmenson, & Pinkhasova, 2007; Marton & Schwartz, 2003; Marton, Schwartz, Farkas, & Katsnelson, 2006; Montgomery & Evans, 2009; Rodekohr & Haynes, 2001; Vugs et al., 2014). In addition to their listening span scores, children with SLI and their age-matched controls differ in the type and amount of nonlist words produced, with children with SLI more likely to give words that had occurred on previous lists, or had occurred elsewhere in the sentence (Ellis Weismer et al., 1999; Marton & Eichorn, 2014; Marton & Schwartz, 2003; Marton et al., 2006, 2007).

The task used in this study, size judgment (Cherry, Elliott, & Reese, 2007; Cherry & Park, 1993), is a verbal working memory measure that does not require children to count or process syntax or morphology. It involves hearing a list of nouns, and then upon recall reordering them in terms of the physical size of the referent, from smallest to largest. Performance on this task is highly correlated to performance on both the backward digit span and listening span tasks in adults (Cherry et al., 2007), validating its use as a working memory measure. Montgomery (2000a, 2000b; Montgomery et al., 2009) developed a version of the task appropriate for children that involves three levels of processing difficulty. Their lists contain items of various sizes from two different semantic categories (e.g., clothing and animals). In the easiest level, children say the words back in any order; in the intermediate level, they repeat the items back in order of size regardless of semantic category; and in the hardest level, they must do two reorderings upon recall, one by semantic category and then within each category, by size. Montgomery (2000a, 2000b; Montgomery et al., 2009) found that children with SLI, aged 7 to 10 years, did not differ from their age-matched controls on the easy or intermediate levels of this task, but they scored lower than the controls for the hardest level of the task.

IDENTIFICATION OF CHILDHOOD SLI WITHIN NONMAINSTREAM DIALECTS OF ENGLISH

It is difficult to identify children with SLI in the context of different nonmainstream dialects such as SWE and AAE. Relative to mainstream dialects of English, the study of AAE and SWE in children has been minimal, and linguistic milestones to benchmark typical development (or flag impairment) within these dialects have yet to be fully established (for recent work, see Newkirk-Turner, Oetting, & Stockman, 2016; Stockman, Guillory, Seibert, & Boult, 2013; Stockman, Newkirk-Turner, Swartzlander, & Morris, 2016). As mentioned earlier, dialects such as AAE and SWE allow grammatical omissions, including those that mark tense and agreement morphemes (e.g., auxiliaries BE and DO, third-person morphemes, and past-tense morphemes) that are well known to be characteristic of the SLI condition. Although there is now some evidence showing that children with and without SLI can be differentiated within these dialects by the frequency of their omissions for at least some grammatical morphemes (Cleveland & Oetting, 2013; Garrity & Oetting, 2010; Oetting & McDonald, 2001; Oetting & Newkirk, 2008; Rivière & Oetting, 2017), much of this evidence has come from labor-intensive analyses of language samples that have included over 200 utterances per child. Within clinical practice, detailed analyses of language samples of this length are likely not feasible.

Linguistic biases within standardized tests also contribute to the difficulty of identifying children with SLI within AAE and SWE. These biases have been shown to surface not only when the children being tested speak a nonmainstream dialect but also when they come from a minority cultural background and/or are economically disadvantaged (Wyatt, 2015). Given this, multiple researchers advocate for assessments to include processing-based measures such as working memory tasks to circumvent language-related and/or experience-related test biases (Campbell, Dollaghan, Needleman, & Janosky, 1997; Craig & Washington, 2000; Engel, Santos, & Gathercole, 2008; Laing & Kamhi, 2003; Oetting & Cleveland, 2006; Rodekohr & Haynes, 2001; Washington & Craig, 2004). However, there is some evidence that working memory tasks are not completely free from bias. A closer look at Engel et al. (2008) shows that while no differences were found by socioeconomic status on backward digit span or counting span when using a Bonferroni correction, the counting span results were at the p = .03 level, hinting that there may be some biases. This trend was confirmed in further work by Engel de Abreu, Puglisi, Cruz-Santos, Befi-Lopes, and Martin (2014), who found that children with poor schooling, and presumably less practice at counting, had significantly lower counting spans than children with better schooling.

Moreover, only a very limited number of studies have examined whether working memory tasks are good at discriminating between children with and without language impairments across different races and dialects. Laing and Kamhi (2003) found clinical status effects (African American language impaired < African American controls) but not race effects (African American TD = White TD) in third and fourth graders using a listening span task. Rodekohr and Haynes (2001) also found a clinical status effect (language impaired < controls) but not a dialect/race effect (African American children who were confirmed AAE speakers = White children whose dialect was not confirmed) in children, aged 7 years, using a listening span task. Nevertheless, in this study there was a marginal interaction (p = .067) between clinical status and dialect, which related to a larger clinical effect for the White children than for the AAE-speaking children. Ideally a test designed to detect differences between children with and without SLI would show the same magnitude of effect across different races and dialects.

In addition to identifying children with SLI with the same degree of accuracy across dialects, an unbiased measure of working memory should also show similar correlations to other measures of ability across dialects. Specifically, we should find similar correlations between working memory scores and standardized language and intelligence measures in both AAE and SWE. In both dialects we should replicate the previously mentioned findings that children's working memory scores correlate with standardized measures of syntax (Engel de Abreu et al., 2011; Haake et al., 2014; Magimairaj & Montgomery, 2012; but see Lum et al., 2012) and vocabulary (Adams et al., 1999; Engel de Abreu et al., 2011); they should also correlate with standardized measures of nonverbal intelligence (Adams et al., 1999; Alloway, Gathercole, Willis, & Adams, 2004; Ellis Weismer et al., 1999; Engel de Abreu, Conway, & Gathercole, 2010; Engel de Abreu et al., 2011).

NONMAINSTREAM DIALECT DENSITY AS AN IMPORTANT METRIC FOR STUDIES OF BIAS

Besides being a speaker of a nonmainstream dialect, another factor that appears to have important implications for investigating SLI in diverse groups of language learners is a child's nonmainstream dialect density. There are multiple ways to measure a child's nonmainstream dialect density, but all are correlated to each other and involve calculating the relative frequency (i.e., rate) with which a child produces nonmainstream forms (Horton & Apel, 2014; Oetting & McDonald, 2002). Both internal variables such as gender, age, and socioeconomic status and external variables such as type of task, modality of task, and speaking partner contribute to this variation (for some examples of child studies, see Barbu, Martin, & Chevrot, 2014; Craig, Kolenic, & Hensel, 2014; Craig & Washington, 2004; Craig, Zhang, Hensel, & Quinn, 2009; Ivy & Masterson, 2011; Mills, 2015; Van Hofwegen & Wolfram, 2010; Washington & Craig, 1998).

In recognition of this variation, a growing number of researchers now include nonmainstream dialect density metrics in their studies. In studies of mainstream English ability, a child's nonmainstream dialect density impacts performance. For example, high nonmainstream dialect density child speakers have more difficulty than low density speakers in identifying aurally presented mainstream words as words (Brown, 2011), and identifying words in mainstream contexts that are ambiguous in their nonmainstream dialect (Edwards et al., 2014). Children's nonmainstream densities have also been found to correlate, often negatively, to children's standardized language test scores (Charity, Scarborough, & Griffin, 2004; Connor & Craig, 2006; Craig, Thompson, Washington, & Porter, 2004; Terry et al., 2010). For example, in Craig et al. (2004) the Gray Oral Reading Tests were identified as containing biases because children's densities were negatively correlated to their reading rates. In Terry et al. (2010), children with high nonmainstream densities also tended to have lower vocabulary and phonological awareness test scores in the early grades (but, for a study showing no differences between density groups using a median split procedure and raw test scores, see Moyle, Heilmann, & Finneran, 2014).

Of concern here, children's nonmainstream dialect densities have been shown to influence their performance on nonword repetition, another processing-based task that is often viewed as less culturally and linguistically biased than traditional language tests. Nonword repetition involves having children repeat nonwords of increasing syllable lengths; children with SLI are poorer at the task than TD children, especially at long syllable lengths (Dollaghan & Campbell, 1998). In two early studies, Campbell et al. (1997) and Oetting and Cleveland (2006) found no race or dialect effects for nonword repetition, although in both of these studies, race and/or dialect was treated as a nominal variable and they did not look at dialect density. In contrast to these studies, when Moyle et al. (2014) examined children's dialect use using a nonmainstream density measure, children with higher densities were found to earn lower nonword repetition scores than those with lower densities. Results from the Moyle et al. study, as well as the marginally significant interaction of race/dialect with diagnostic group observed on a working memory measure in Rodekohr and Haynes (2001), call into question the unbiased nature of processing tasks in general.

SUMMARY, RESEARCH QUESTIONS, AND PREDICTIONS

Studies have found children with SLI evidence weaknesses in working memory and produce more nonlist errors relative to controls. Correlations between children's working memory and measures of their syntax, vocabulary, and intelligence also have been found. In the current work, we extended the study of working memory to children who spoke either AAE or SWE, two nonmainstream dialects within which it has proven difficult to identify children with SLI with traditional assessment tools. We were motivated to do this work because working memory tasks, since they are processing based, are often viewed as free from cultural and linguistic biases, and if this is the case, these tasks would be ideally suited for the identification of childhood SLI in nonmainstream dialects, such as AAE and SWE.

Using a size judgment task, we first asked whether this working memory measure was equally able to distinguish between children with and without SLI across dialects, with similar child error profiles and similar correlations to other measures in the two dialects. We also asked whether this working memory measure was affected by a child's nonmainstream dialect density. We hypothesized that regardless of the children's nonmainstream dialect type, those with SLI would have lower span scores than the controls and that the magnitude of the effect would be the same across the dialects. We also expected to find more nonlist errors on the working memory task made by the children with SLI than by the TD controls and similar correlations between the children's span scores and their scores on other measures of syntax, vocabulary, and intelligence across dialects. Finally, we expected the children's nonmainstream dialect densities to be unrelated to their span scores.

METHOD

Participants

Participants were 106 kindergarteners (M = 66.24 months, SD = 3.78, range = 59–74 months) who were classified by dialect as speakers of either SWE (n = 36) or AAE (n = 70) and by clinical status as either SLI (n = 53, 18 SWE, 35 AAE) or TD (n = 53, 18 SWE, 35 AAE). Details related to each child's classification by dialect and clinical status can be found in Oetting, McDonald, Seidel, and Hegarty (2016) as these children also participated in that study. For convenience, a summary of their testing profiles is presented in this paper as well.

As confirmed by listener judgments and a screening test, children's dialect corresponded to their race, with SWE speakers being non–African American and AAE speakers being African American (AA). Specifically, the listener judgment task involved two out of three judges, blind to the child's race and gender, agreeing on the dialect classification from 1-min samples of conversation using a holistic impression (Oetting & McDonald, 2002); using this method, 94% of non–African American children were classified as SWE speakers and 90% of African American children were classified as AAE speakers. The nonmainstream dialect status of the remaining children was verified from longer speech samples, and from the presence of nonmainstream responses on the language variation portion of the Diagnostic Evaluation of Language Variation–Screening Test (DELV-S; Seymour, Roeper, & de Villiers, 2003).

The DELV-S was also used to classify the children's nonmainstream dialect densities as low, medium, or high using the three-level classification system provided by the test developers. The rating system considers each child's number of nonmainstream and mainstream responses compared to age-delimited criteria. For speakers of SWE, there were 14 classified as low variation from mainstream English (SLI: 0; TD: 14), 6 classified as medium with some variation (SLI: 4; TD: 2), and 16 classified as high with strong variation (SLI: 14; TD: 2). For speakers of AAE, there were 6 classified as low (SLI: 0; TD: 6), 12 classified as medium (SLI: 6; TD: 6), and 52 classified as high (SLI: 29; TD: 23). As mentioned earlier, while there are many ways to measure a child's nonmainstream dialect density, such as DELV-S scores, listener judgments, or calculating the number of types or tokens of nonmainstream structures produced, they all are correlated to each other (Horton & Apel, 2014; Oetting & McDonald, 2002). While the test developers of the DELV-S categorize children into three dialect density groups based on their responses, we and others have also used the DELV-S items to calculate the more continuous measure of each child's percentage of nonmainstream responses over the sum of the child's nonmainstream and mainstream responses (Oetting et al., 2016; Terry et al., 2010). For the children studied here, both these DELV-S indices are highly correlated (r = .86, p < .001), and the results presented here do not differ significantly as a function of the DELV-S metric used (see Footnote 1). We chose the DELV-S three-level score as recommended by the test developers because of its ease of use and relevance to clinical practice.
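The continuous density index described above can be illustrated with a minimal sketch. The function name and the choice to raise an error when a child has no scoreable responses are assumptions made for illustration; they are not taken from the DELV-S manual or from the authors' scoring materials.

```python
def nonmainstream_density(nonmainstream: int, mainstream: int) -> float:
    """Percentage of nonmainstream responses over the sum of
    nonmainstream and mainstream responses.

    Hypothetical helper: the name and the error handling are assumptions.
    """
    total = nonmainstream + mainstream
    if total == 0:
        # Assumption: a child with no scoreable responses has no defined density.
        raise ValueError("no scoreable responses")
    return 100.0 * nonmainstream / total


# Illustrative values only, not data from the study.
print(nonmainstream_density(3, 9))
```

Responses that are neither mainstream nor nonmainstream do not enter the denominator under this definition, which follows the wording of the measure above.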

The children's clinical status was determined through a battery of standardized tests. Children in both groups passed a hearing screening, performed at or above –1.2 SD of the normative mean on the Primary Test of Nonverbal Intelligence (PTONI; Ehrler & McGhee, 2008), and at or above –1 SD of the normative mean on the Goldman–Fristoe test of articulation (Goldman & Fristoe, 2000). Children in the SLI group performed at or below –1 SD of the normative mean on the syntax portion of the Diagnostic Evaluation of Language Variation–Norm Referenced (DELV-NR; Seymour, Roeper, & de Villiers, 2005), while those in the TD group performed above this cutoff. Although not used to exclude or classify the participants, all of the children also completed the Peabody Picture Vocabulary Test IV (PPVT; Dunn & Dunn, 2007), and 52 completed the grammar subtests of the Test of Language Development–Primary 4 (TOLD; Newcomer & Hammill, 2008), which was added in the final years of data collection. Each child with SLI was matched to a TD control based on dialect spoken, age, and nonverbal IQ (PTONI), and then as much as it was possible, maternal educational level. Maternal education was gathered via parental questionnaire, and measured by number of years of formal education (e.g., 12 = completion of high school).
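The inclusion and classification cutoffs above can be summarized as a short decision rule. The sketch below assumes that every test score has already been converted to a z score relative to its normative mean (the conversion itself depends on each test's norms and is not shown); the function and parameter names are hypothetical.

```python
from typing import Optional


def classify_child(passed_hearing: bool,
                   ptoni_z: float,
                   articulation_z: float,
                   syntax_z: float) -> Optional[str]:
    """Return "SLI", "TD", or None if the child does not meet inclusion criteria.

    All *_z arguments are assumed to be z scores relative to the normative mean.
    """
    eligible = (passed_hearing
                and ptoni_z >= -1.2          # nonverbal IQ at or above -1.2 SD
                and articulation_z >= -1.0)  # articulation at or above -1 SD
    if not eligible:
        return None
    # Syntax at or below -1 SD places the child in the SLI group.
    return "SLI" if syntax_z <= -1.0 else "TD"


# Illustrative values only, not data from the study.
print(classify_child(True, 0.1, 0.3, -1.4))
```

The matching of each child with SLI to a TD control (dialect, age, nonverbal IQ, maternal education) is a separate step that this sketch does not attempt.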

Materials

The size judgment task consisted of three lists at each of five list lengths, ranging from lengths of 2 to 6 words. Thus, there were a total of 60 words across the lists; each word was assigned to a particular list of a particular list length. All words were one- or two-syllable concrete nouns whose size should be known to kindergarteners (e.g., penny, book, and coat). Since the task involved reordering the lists in terms of size, lists were devised so that the correct ordering of smallest to largest size was fairly obvious, and this was validated by the researchers who scored the task. With the exception of one list of length 2, none of the lists was presented in the smallest to largest order of the objects, thus reordering was necessary to give the lists back in order of size. The one list of length 2 that was given in smallest to largest order was used to be sure children did not think they should simply give the words back in reverse order. Words were digitally recorded by a southern African American female native speaker of English and presented via a laptop computer, with a 500-ms transition between each word in a list.

Procedure

The study was approved by the Louisiana State University Institutional Review Board, and parental consent and child assent were obtained for all participants. Children were tested across multiple days, with the size judgment task given after standardized testing and a language sample. As part of a larger study, the children also completed grammar probes and a sentence recall task. The size judgment task, because it took less than 10 min, was fit in around these other tasks, generally at the end or near the end of testing. All standardized tests were administered as recommended. The size judgment task was administered according to a script that explained that the task was to listen to the list, and then say it back to the experimenter in order of the size of the physical object, starting with the smallest, and proceeding to the largest. There were three practice lists of length 2, where the experimenter explicitly asked the children which item was smaller, and then which one was larger. The child was then asked to put the two words together in that order. After these three explicitly guided practice lists, the children did three additional practice lists of length 2 with corrective feedback. Across the six practice lists, four had items arranged largest first, and two had the smallest first. After the practice lists, children started the experimental lists, starting with three lists of length 2, with lists increasing to a maximum of length 6. Lists were given to all children in the same order. The children's responses were recorded online by the experimenter, as well as digitally recorded for later checking.

Scoring

The size judgment task was scored in two ways, using an all or none method and a partial credit method. Previous comparisons between these scoring systems have shown that partial credit scoring generally demonstrates more sensitivity and shows higher correlations to other variables of interest (Conway et al., 2005; Friedman & Miyake, 2005; Giofrè & Mammarella, 2014; St. Clair-Thompson & Sykes, 2010). Greatest list length, an all or none method used by Montgomery (2000a, 2000b), awarded children credit for the highest list length in which two of the three lists had all the items recalled in the correct reordered sequence. If children failed to correctly recall and reorder the lists at length 2, they earned a score of 1. Scores ranged from 1 to 3 for those with SLI and 1 to 4 for the TD children. Total links, the partial credit scoring method developed for this paper, awarded 1 point every time the order of recall for a word pair from the list went from small to large; no points were awarded when it went from large to small. For example, if children were asked to reorder the list “pony, ring, wolf, ocean, chicken, house” and said “ring, ocean, chicken, pony, wolf, house,” they earned 3 points (ring to ocean, chicken to pony, wolf to house). If they recalled “ocean, wolf, ring, chicken, house, pony,” they earned 2 points (ring to chicken and chicken to house). If they did not recall all the words, links were still scored; for example, if they said “ocean, pony, house,” they earned 1 point (pony to house). Consonant with the high score on the greatest list length measure being 4, inspection of the results showed that performance on the task tended to fall after list length 4. We therefore computed the total links score considering only list lengths 2 through 4 (see Footnote 2). Scores ranged from 1 to 13 for children with SLI and from 1 to 17 for the TD children.
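The two scoring rules can be illustrated with a minimal Python sketch. It is not the authors' scoring procedure: the function names, the `SIZE_RANK` table (which assumes the ordering ring < chicken < wolf < pony < house < ocean), the decision to skip words without a known size, and the decision to take the highest passing length without requiring that shorter lengths were also passed are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence

# Hypothetical size ranks (1 = smallest) for the example list in the text.
# The ordering ring < chicken < wolf < pony < house < ocean is an assumption.
SIZE_RANK: Dict[str, int] = {
    "ring": 1, "chicken": 2, "wolf": 3, "pony": 4, "house": 5, "ocean": 6,
}


def total_links(response: Sequence[str], size_rank: Dict[str, int]) -> int:
    """Partial credit: 1 point per adjacent recalled pair going small to large."""
    # Words without a known size are skipped; the text does not say how
    # nonlist words affect link scoring, so this is an assumption.
    ranks = [size_rank[word] for word in response if word in size_rank]
    return sum(1 for first, second in zip(ranks, ranks[1:]) if first < second)


def greatest_list_length(correct_by_length: Dict[int, List[bool]]) -> int:
    """All or none: highest length with at least two fully correct lists.

    `correct_by_length` maps a list length to one boolean per list at that
    length (True = all items recalled in the correct order). A child who
    passes no length receives a score of 1, as described in the text.
    """
    passed = [
        length
        for length, results in correct_by_length.items()
        if sum(results) >= 2
    ]
    return max(passed, default=1)


if __name__ == "__main__":
    # The three worked examples from the text.
    print(total_links(["ring", "ocean", "chicken", "pony", "wolf", "house"], SIZE_RANK))
    print(total_links(["ocean", "wolf", "ring", "chicken", "house", "pony"], SIZE_RANK))
    print(total_links(["ocean", "pony", "house"], SIZE_RANK))
    # A hypothetical child who passes length 2 but not length 3.
    print(greatest_list_length({2: [True, True, False], 3: [True, False, False]}))
```

Under these assumptions the sketch reproduces the 3, 2, and 1 points of the worked examples. A list of n words has n − 1 adjacent pairs, so the maximum total links score over three lists at each of lengths 2 through 4 is 3 × (1 + 2 + 3) = 18.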

Finally, nonlist words that the children produced during recall were classified into three subtypes: words from previous lists, words that rhyme with current list words, and other errors (this included repeating a word from the current list more than once, and words that never appeared in the size judgment stimuli). These were totaled across all five list lengths.

Reliability

Reliability of scoring the size judgment task was checked by having a second person independently score 20% of the data, and individual link scoring and nonlist word types were compared. Agreement was high between the two scorers (97%).

RESULTS

Clinical status and dialect effects on matching variables and standardized tests

Group profiles of the children by dialect and clinical status are presented in Table 1. As reported in Oetting et al. (2016), we first analyzed the matching variables of age, PTONI, and maternal education in a 2 (clinical status: SLI vs. TD) × 2 (dialect: SWE vs. AAE) between-subjects analysis of variance (ANOVA). There were no main effects of clinical status or dialect or their interaction for age or PTONI. For maternal education, there was a main effect of clinical status, with the level of the SLI group less than that of the TD group, F (1, 98) = 4.96, p = .028, ηp² = 0.05. Thus, matching for maternal education level was not completely successful, but the effect size was small.

Table 1. Characteristics and scores on standardized tests of SLI and TD groups by dialect

Note: SLI, specific language impairment; TD, typically developing; SWE, Southern White English; AAE, African American English; PTONI, Primary Test of Nonverbal Intelligence; DELV-NR, Diagnostic Evaluation of Language Variation—Norm Referenced; TOLD, Test of Language Development; PPVT, Peabody Picture Vocabulary Test; DELV-S dialect density, DELV Screening Test language variation subsection.

a Male/female ratio.

All of the standardized tests also showed main effects of clinical status with the SLI group earning lower scores than the TD group: DELV-NR, F (1, 102) = 328.80, p < .001, ηp² = 0.76; TOLD, F (1, 48) = 118.61, p < .001, ηp² = 0.71; and PPVT, F (1, 102) = 122.32, p < .001, ηp² = 0.55. The PPVT also showed a main effect for dialect (AAE < SWE), F (1, 102) = 5.20, p = .03, ηp² = 0.05. Knowledge-based tests, such as vocabulary, often show such cultural, socioeconomic, or linguistic group differences (see Engel et al., 2008; see also Qi, Kaiser, Milan, & Hancock, 2006; Restrepo et al., 2006), and we confirmed such findings here.

Next we applied the same analysis to the variable of nonmainstream dialect density. Here there were main effects for clinical status, F (1, 102) = 52.59, p < .001, ηp² = 0.34, and dialect, F (1, 102) = 23.83, p < .001, ηp² = 0.19, and an interaction, F (1, 102) = 19.98, p < .001, ηp² = 0.16. The interaction was due to a density difference for the TD groups (AAE > SWE), F (1, 51) = 28.01, p < .001, ηp² = 0.36, but not the SLI groups, F (1, 51) = 0.19, p = .66, ηp² = 0.004. This finding in the TD group was not unexpected because differences in rates of use are one of the primary ways in which AAE and SWE differ from each other (Oetting, 2015; see Cleveland & Oetting, 2013; Oetting & Newkirk, 2008). Failure to find a density difference in the SLI group may be because both SWE and AAE speakers were near ceiling on this measure (see Footnote 3). Examining the interaction from within each dialect, we see there is an effect of clinical status for both the SWE speakers, F (1, 34) = 57.46, p < .001, ηp² = 0.63, and the AAE speakers, F (1, 68) = 5.44, p = .023, ηp² = 0.07, although it was stronger in the SWE speakers. This clinical status difference in dialect density is unexpected. We therefore examine the effects of dialect density in the correlational analyses reported later, and detail possible reasons for differences in dialect density by clinical status in the discussion.

Clinical status and dialect effects on working memory

Next we turned to our main question: whether the working memory task showed an effect of clinical status that was equivalent across the dialects. For both greatest list length and total links scoring methods, we performed a 2 (clinical status) × 2 (dialect) between-subjects ANOVA and examined how well the measure allowed us to classify the children into the two clinical groups.

Greatest list length

There was a main effect for clinical status, with the SLI group earning lower scores (M = 1.60, SD = 0.60) than the TD group (M = 2.15, SD = 0.74), F (1, 102) = 20.99, p < .001, ηp² = 0.17. The interaction between clinical status and dialect, while not reaching conventional levels of significance, F (1, 102) = 3.54, p = .063, ηp² = 0.03, echoed similar tendencies found by Rodekohr and Haynes (2001). When tested separately within dialects, clinical status remained significant for both dialects, although the effect tended to be larger in SWE, F (1, 34) = 14.70, p < .001, ηp² = 0.30; SLI M = 1.44, SD = 0.51; TD M = 2.33, SD = 0.84, than in AAE, F (1, 68) = 5.58, p = .021, ηp² = 0.08; SLI M = 1.69, SD = 0.63; TD M = 2.06, SD = 0.68.
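The partial eta squared values reported alongside each F test can be cross-checked from the F statistic and its degrees of freedom, using the standard identity for between-subjects designs: ηp² = (F × df_effect) / (F × df_effect + df_error). The following Python sketch is only a reader's check, not part of the original analysis; the function name is hypothetical.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Recover partial eta squared from an F statistic and its degrees of freedom.

    Uses eta_p^2 = SS_effect / (SS_effect + SS_error), rewritten in terms of F.
    """
    numerator = f_value * df_effect
    return numerator / (numerator + df_error)


if __name__ == "__main__":
    # Greatest list length, main effect of clinical status: F(1, 102) = 20.99
    print(round(partial_eta_squared(20.99, 1, 102), 2))
    # Clinical status within SWE speakers: F(1, 34) = 14.70
    print(round(partial_eta_squared(14.70, 1, 34), 2))
    # Clinical status within AAE speakers: F(1, 68) = 5.58
    print(round(partial_eta_squared(5.58, 1, 68), 2))
```

Because the published F values are themselves rounded, a recomputed effect size can differ from the reported one in the last decimal place.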

In classifying the children into those with and without SLI, a score of 2 on the 1 to 4 scale was identified as the optimal cut point. It classified 63% of the children correctly, with sensitivity (94%) being excellent, but specificity (32%) being poor. Classification was better for SWE speakers (sensitivity 100%, specificity 44%) than for AAE speakers (sensitivity 91%; specificity 26%).

Total links

Again, there was a main effect of clinical status, with the SLI group earning lower scores (M = 6.70, SD = 2.53) than the TD group (M = 9.87, SD = 3.23), F (1, 102) = 35.25, p < .001, ηp² = 0.26. Although the interaction between clinical status and dialect was not statistically reliable, F (1, 102) = 2.93, p = .09, ηp² = 0.03, the effect again tended to be larger in SWE speakers, F (1, 34) = 30.01, p < .001, ηp² = 0.47; SLI M = 6.50, SD = 1.92; TD M = 11.00, SD = 2.91, than in AAE speakers, F (1, 68) = 11.62, p = .001, ηp² = 0.15; SLI M = 6.80, SD = 2.82; TD M = 9.29, SD = 3.27. In addition, independent t tests showed that the SLI group earned lower link scores than the TD group at each list length (all ts ≤ –3.86, all ps < .001). Thus, even at the short list length of two words, children with SLI were different from TD children.

In terms of classification, the total links measure was superior to the greatest list length measure, and it correctly classified 75% of the children by clinical status. A cut point at 8 total links yielded a sensitivity of 77% and a specificity of 72%. Classification again was better for SWE speakers (sensitivity 83%, specificity 83%) than for AAE speakers (sensitivity 74%, specificity 66%).
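The classification figures for the two scoring methods follow from a 2 × 2 table of clinical status against the cut-point decision. The sketch below shows the computation in Python. The cell counts are not reported in the article: they are back-calculated here from the published percentages and the group sizes of 53 children each, and should be read as assumptions. The class and function names are hypothetical.

```python
from typing import NamedTuple


class ClassificationSummary(NamedTuple):
    sensitivity: float  # proportion of children with SLI classified as SLI
    specificity: float  # proportion of TD children classified as TD
    accuracy: float     # proportion of all children classified correctly


def summarize_classification(
    true_pos: int, false_neg: int, true_neg: int, false_pos: int
) -> ClassificationSummary:
    """Compute sensitivity, specificity, and overall accuracy from cell counts."""
    total = true_pos + false_neg + true_neg + false_pos
    return ClassificationSummary(
        sensitivity=true_pos / (true_pos + false_neg),
        specificity=true_neg / (true_neg + false_pos),
        accuracy=(true_pos + true_neg) / total,
    )


# Assumed counts, inferred from the reported percentages with 53 SLI and 53 TD.
GREATEST_LIST_LENGTH = summarize_classification(
    true_pos=50, false_neg=3, true_neg=17, false_pos=36
)
TOTAL_LINKS = summarize_classification(
    true_pos=41, false_neg=12, true_neg=38, false_pos=15
)

if __name__ == "__main__":
    for name, summary in [
        ("greatest list length", GREATEST_LIST_LENGTH),
        ("total links", TOTAL_LINKS),
    ]:
        print(name, [round(value * 100) for value in summary])
```

With these assumed counts, the rounded percentages match the reported values (94%, 32%, and 63% for greatest list length; 77%, 72%, and 75% for total links). The comparison makes the trade-off explicit: greatest list length identifies nearly all children with SLI but misclassifies most TD children, whereas total links balances the two error types.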

Nonlist words

Nonlist words produced during the working memory task were analyzed in a 2 (clinical status) × 2 (dialect) × 3 (error type) ANOVA. There was a main effect of clinical status; as predicted, the SLI group produced more nonlist words (M = 8.38, SD = 6.21) than the TD group (M = 6.04, SD = 5.23), F (1, 102) = 4.65, p = .033, ηp² = 0.04. Although the AAE speakers tended to produce more nonlist words (M = 7.89, SD = 6.26) than the SWE speakers (M = 5.89, SD = 4.70), the main effect of dialect did not reach statistical significance, F (1, 102) = 2.91, p = .091, ηp² = 0.03. There was also a main effect of error type, F (2, 204) = 39.91, p < .001, ηp² = 0.28. Rhyming errors (M = 0.66, SD = 0.91) were less frequent than the other two types of errors, other errors (M = 2.92, SD = 3.25) and words from a prior list (M = 3.62, SD = 3.30); the latter two did not differ statistically by a Bonferroni corrected post hoc test. This pattern of results occurred for both dialects; there were no significant interactions.

Correlations with standardized tests

Next, we examined the correlations between the two scoring methods for the working memory task and the standardized test measures as well as both of these to nonmainstream density. These are shown below the diagonal in Table 2. As expected from previous research (e.g., Friedman & Miyake, 2005), the two scoring methods were significantly intercorrelated. In examining correlations of working memory to the standardized language and intelligence measures, we see that both working memory scoring methods showed strong correlations to measures of syntax (DELV-NR and TOLD) and vocabulary (PPVT) but not intelligence, as measured by the PTONI.

Table 2. Correlations between two scoring methods for working memory task and standardized test measures and among these two measures and nonmainstream density

Note: Correlations below the diagonal are the measures of working memory, language and intelligence measures, and dialect density. Correlations above the diagonal are with dialect density partialled out. GLL, Greatest List Length; DELV-NR, Diagnostic Evaluation of Language Variation—Norm Referenced; TOLD, Test of Language Development; PPVT, Peabody Picture Vocabulary Test; PTONI, Primary Test of Nonverbal Intelligence; DELV-S dialect density, DELV Screening Test language variation subsection.

*p ≤ .05. **p ≤ .01. ***p ≤ .001.

However, note that children's nonmainstream densities also were significantly negatively correlated to the two working memory scoring methods. This indicates that the higher the child's density, the lower the working memory score. In order to check that this was not due to the higher densities found in the children with SLI as compared to TD children, we ran a partial correlation. The relationship between density and working memory held true with clinical status category partialled out (greatest list length r = –.21, p = .028; total links r = –.27, p = .005). Conversely, the relationship between clinical status group and working memory held true with density partialled out (greatest list length r = .26, p = .007; total links r = .36, p < .001).
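For readers unfamiliar with partialing, a first-order partial correlation between two variables x and y, controlling for a third variable z, can be computed from the three zero-order correlations as r_xy.z = (r_xy − r_xz r_yz) / √((1 − r_xz²)(1 − r_yz²)). The Python sketch below implements this standard formula. The function name and the input values in the usage example are hypothetical, chosen only to show the mechanics; they are not correlations from this study.

```python
import math


def partial_correlation(r_xy: float, r_xz: float, r_yz: float) -> float:
    """First-order partial correlation of x and y with z held constant."""
    denominator = math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    if denominator == 0.0:
        raise ValueError("z is perfectly correlated with x or y")
    return (r_xy - r_xz * r_yz) / denominator


if __name__ == "__main__":
    # Hypothetical values: x = dialect density, y = span, z = clinical status.
    print(round(partial_correlation(r_xy=-0.40, r_xz=0.50, r_yz=-0.45), 3))
```

If the relationship between x and y were carried entirely by z, the numerator would shrink toward zero; a partial correlation that remains reliably different from zero, as reported above, indicates that the association is not explained by the controlled variable alone.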

In addition, the children's nonmainstream dialect densities correlated to their syntax and vocabulary measures. These correlations were also negative, showing that children with higher densities earned lower scores on these measures. Here again we checked these relationships with clinical status partialled out, and found only the relationship between dialect density and PPVT remained significant (r = –.30, p = .002). Conversely, the relationship between clinical status and the DELV-NR syntax measure (r = .85, p < .001), TOLD (r = .86, p < .001), and the PPVT (r = .67, p < .001) remained when dialect density was partialled out.

We next calculated the correlations between the working memory scores and the standardized tests of language and intelligence with nonmainstream dialect density partialled out to see if correlations between working memory and the standardized tests still held. These are shown above the diagonal in Table 2. Even with density partialled out, there was a significant positive correlation between both working memory scoring methods and the measures of syntax (DELV-NR and TOLD) and between total links and the vocabulary measure (PPVT). Thus, the relationships between working memory, syntax, and vocabulary were not solely attributable to the children's nonmainstream dialect densities.

We then divided the participants by dialect and examined the correlations in these subgroups (see Table 2); we had predicted similar patterns of correlations across the two dialects. In the correlations shown below the diagonal, we see significant positive correlations between one or both of the working memory scoring methods and the DELV-NR syntax measure, and the PPVT vocabulary measure for speakers of both dialects. The correlation with working memory was not significant for the TOLD in the SWE group, probably due to the low number of participants on this test. Dialect density was generally negatively correlated with both working memory scores and standardized tests scores within these subpopulations. Looking above the diagonals, we see the correlations with density partialled out in each dialect group. For the SWE speakers, a significant connection was still seen between the total links working memory score and the DELV-NR syntax measure as well as the PPVT measure of vocabulary. For AAE speakers, significant partial correlations were seen between both working memory scores and the DELV-NR syntax measure and between the total links measure and the TOLD. Thus when considering the partial correlations, both dialect groups showed connections between working memory scores and syntax measures of a similar strength; the relationship between working memory and vocabulary held for only SWE speakers.

DISCUSSION

We examined if a verbal working memory measure would prove to be linguistically unbiased by looking at the performance of speakers of two nonmainstream dialects. AAE and SWE were chosen as their dialectal forms strongly overlap with those produced by children with SLI, making it hard to tell dialect from disorder, and because they also differ in the density with which such forms are produced. If verbal working memory proves to be linguistically unbiased, it should be able to distinguish equally well between children with SLI and TD children in each dialect as well as yield similar patterns of nonlist errors and show similar correlations to standardized tests of language and nonverbal intelligence in both dialects. In addition to dialect type, we also examined whether the density with which a dialect is spoken affects performance. To be linguistically unbiased, nonmainstream dialect density should have no effect on performance either between or within dialects.

Scores on the working memory measure, size judgment, were significantly lower for children with SLI than age-matched TD children, and this was true for speakers of both dialects. This held for both the all or none scoring method (greatest list length) and the partial credit method (total links), although consonant with previous studies investigating scoring methods (Conway et al., 2005; Friedman & Miyake, 2005; Giofrè & Mammarella, 2014; St. Clair-Thompson & Sykes, 2010), the partial scoring method was generally superior. Specifically, it accounted for more variance in the ANOVA (links ηp² = 0.26; greatest list length ηp² = 0.17), and it correctly classified more children (links 75%; greatest list length 63%). The ability of the size judgment test to distinguish nonmainstream dialect speaking children with SLI from TD children is consistent with the two other studies which examined working memory in terms of listening span in minority populations (Laing & Kamhi, 2003; Rodekohr & Haynes, 2001).

We can also compare our results to those of Montgomery (2000a, 2000b; Montgomery et al., 2009), who used a size judgment task in mainstream English speakers, aged 7 to 10 years. They found that the task could differentiate between SLI and TD groups only when two levels of reordering (size and semantic category) were involved. With our younger age group, one level of reordering was sufficient to show group differences. We note that we did not have a group of mainstream English-speaking children in our study, so we cannot say whether or not our nonmainstream dialect-speaking children would score differently than mainstream speakers on the size judgment task, but we did find that the size judgment task can distinguish between children with SLI and TD children as Montgomery et al. found.

Across both nonmainstream dialects, we also found that children with SLI gave more nonlist words in the size judgment task than did TD children. Previous research using listening span tasks with mainstream English-speaking children (Ellis Weismer et al., 1999; Marton & Eichorn, 2014; Marton & Schwartz, 2003; Marton et al., 2006, 2007) also found a higher number of nonlist words given by children with SLI than TD children, and thus we replicated this finding.

Finally, we found good evidence that working memory correlated with measures of syntax and vocabulary; this held true in each dialect as well. This is consistent with the findings of others using mainstream English speakers (Adams et al., 1999; Magimairaj & Montgomery, 2012). However, we did not find good evidence for this relationship with nonverbal intelligence. Because all of our participants had a score on the PTONI of at least 82, it is possible that restriction of range prevented us from finding this correlation. Using a group of children with a broader range of ability, such a correlation was found for SWE-speaking children in our lab (McDonald, Seidel, Porter, Oetting, & Hegarty, 2011).

Unique to our study was the inclusion of children who spoke one of two nonmainstream dialects. As expected, nonmainstream dialect density differed across these dialects, with the AAE speakers producing higher rates than the SWE speakers. Nonmainstream dialect density also differed within each dialect, with individuals varying in use, from low to moderate to high rates. We found, contrary to our hypothesis, that nonmainstream density was significantly negatively correlated to working memory performance; across dialects this was true when SLI status was partialled out, and it was also true within each dialect. Because of the intercorrelations that were observed between the children's nonmainstream dialect densities, working memory scores, and standardized test scores, we reran the correlations with dialect density partialled out. When both dialect groups were considered together, significant partial correlations were found between working memory scores and measures of syntax and vocabulary. When each dialect group was considered separately, significant partial correlations were found between working memory scores and syntax, and the magnitude was similar across the dialects (e.g., partial correlation between total links and DELV-NR Syntax was .34 for SWE and .38 for AAE). However, we only found evidence of a correlation with vocabulary in SWE children. Recall, however, that the PPVT did not prove itself to be free of linguistic biases either in terms of type of dialect or nonmainstream dialect density, and this may account for this lack of correlation in the AAE group.

Our working memory measure was not nonmainstream dialect density neutral. This finding is parallel to that found by Moyle et al. (2014), who also found an effect of children's nonmainstream dialect density on a different processing-based task, that of nonword repetition. Both size judgment and nonword repetition involve perceiving and repeating phonological strings, and may therefore be influenced by phonological or phonotactic factors that may differ in speakers with higher dialect densities (Brown, 2011; Edwards, Beckman, & Munson, 2004; Edwards et al., 2014).

Nonmainstream dialect density could also be negatively related to verbal working memory performance for reasons beyond phonology, because it is likely correlated to the amount of exposure to formal education and to the speed of access to mainstream lexical items. In AAE-speaking children, nonmainstream dialect density has been shown to decrease as formal education increases, especially between kindergarten and first grade (Craig & Washington, 2004). Recall that our participants were kindergarteners who lived in a rural community. As such, differences in the children's nonmainstream dialect densities could partially reflect how much exposure they have had to mainstream English, most likely in a formal educational setting. At least in adults, the size judgment task has been shown to be sensitive to differences in formal educational levels (Cherry et al., 2007). In addition, children with high dialect density could be having some of the same difficulties a bilingual has with linguistically based working memory measures that necessitate lexical access. Hansen et al. (2016) showed that even when tested in their first language, children who were immersed in a second language in elementary school showed deficits compared to monolinguals on a first language reading span task early on in the immersion experience, and these deficits were correlated to their speed of lexical access. They did not show this deficit on another measure of working memory, the n-back task, which is not highly linguistically loaded, indicating the deficit was specific to tests with high lexical access demands. Thus, demands of mainstream lexical access or other working memory demands may be impeding children with high dialect density from performing as well as lower dialect density children on the size judgment task.

Recall also that within both dialect groups, nonmainstream dialect density scores were highest for the children with SLI, and density differences between those with and without SLI were larger for the SWE than AAE groups. While this finding needs to be confirmed with other groups of nonmainstream English child speakers, it is possible that children with SLI, relative to their TD peers, are less able to shift their dialects to a more mainstream variety when engaged in school-based tasks (Craig & Washington, 2004). Alternatively, or in addition, it is possible that the language variation portion of the DELV-S is not ability neutral across dialects. In support of this possibility, the dialect portion of the DELV-S includes eight items that target third-person marking, and at least one study has shown children with and without SLI to differ in their marking of this grammatical structure in SWE but not AAE (Cleveland & Oetting, 2013).

Are processing-based measures free of cultural and/or linguistic bias?

Although the working memory task differentiated between children with and without SLI in the two nonmainstream dialects studied here, there were several findings that suggested that this processing-based measure was not entirely free of cultural and/or linguistic biases. First, although not statistically significant, clinical status effects tended to be larger for SWE than AAE speakers, a finding similar to that of Rodekohr and Haynes (2001). Finding such a tendency across studies, even though none reached the conventional level of significance, should at least give us pause before we confidently conclude that processing-based measures are free of linguistic biases. Second, there were strong correlations between the children's working memory scores and their nonmainstream dialect densities, with higher densities corresponding to poorer working memory scores. It is possible that the tendency for stronger differences between TD children and children with SLI in SWE speakers than in AAE speakers is actually an effect of nonmainstream dialect density rather than dialect type. This would not be surprising, as one of the major differences between AAE and SWE is the frequency with which dialectal forms are used. To test this idea, we reran the ANOVA analyses on working memory span with dialect density as a covariate. There were no longer any tendencies for interactions between clinical status and dialect type (ps > .31), indicating that the children's nonmainstream dialect density rather than their dialect type may have been the operational factor.

Taken together, the current study and these previous studies suggest that processing-based tasks involving verbal materials may not be free from all cultural and/or linguistic biases. In terms of practical use of such tests for diagnosis and research, it may be fruitful to go beyond tests of verbal working memory such as listening span or size judgment, and look at tests of nonverbal working memory. There is evidence that children with SLI also show deficits in working memory using nonverbal items (Henry, Messer, & Nash, 2012; Marton, 2008); it would be important to assess such measures for bias in speakers of nonmainstream dialects.

Finally, it is clear that in looking for possible linguistic biases in testing materials, either standardized or processing based, it is important to look not only at the type of dialect spoken, but also at the nonmainstream dialect density, both between dialects and by individuals within any dialect. This is especially important as effects of nonmainstream density have been found on phonological processing and lexical access (Brown, 2011; Edwards et al., 2014), word reading and literacy (Connor & Craig, 2006; Craig et al., 2004; Terry et al., 2010) as well as processing-based tasks (Moyle et al., 2014). We would therefore advocate that researchers include a measure of nonmainstream dialect density when investigating linguistic bias. It need not be a labor-intensive measure, as we found our results with the easily administered and scored language variation portion of the DELV-S (Seymour et al., 2003). It is also important that more research be done on measures of nonmainstream dialect density to be sure they are ability neutral.

In summary, this study adds to a body of literature that finds working memory deficits in children with SLI when compared to age-matched controls, by replicating the findings of lower span scores and higher number of nonlist words in speakers of two nonmainstream English dialects. In addition, it is the first study to look at the effect of children's dialects on a verbal working memory measure by measuring children's dialects in two ways, by type of dialect (AAE vs. SWE) and by nonmainstream dialect density (low vs. medium vs. high). While dialect type was not found to affect the children's span scores at a statistically reliable level, nonmainstream dialect density did. We found a complex relationship between children's nonmainstream dialect density, working memory capacity, and performance on standardized tests. The results raise the possibility that like many standardized language measures, processing-based measures (at least those involving verbal stimuli) may not be as free from cultural and/or linguistic bias as is often purported. Similarly, the results raise the possibility that nonmainstream dialect density measures such as the DELV-S may also not be as ability neutral as many who have used this tool to index dialect differences between and within groups assume.

ACKNOWLEDGMENTS

Funding for this study was provided through NIDCD RO1DC009811. We appreciate the assistance of Jessica Berry, Kyomi Gregory, Ryan James, Christy Moland, Karmen Porter, Andrew Rivière, Tina Villa, and a number of others who helped create the stimuli and collect the data. We also thank the teachers, families, and children who participated in the study.

Footnotes

1. Specifically, while the correlation coefficients vary slightly depending on scoring method used, they have similar levels of significance to each of the other variables in the correlation matrix later given in Table 2.

2. Similar results in the analyses were generally found for total link scores over list lengths 2 to 6. However, scores were less skewed when only including list lengths 2 to 4, possibly because some children were giving up or getting frustrated at the higher list lengths.

3. This explanation of ceiling performance is confirmed when looking at the continuous rather than categorical scoring of dialect density. The continuous scoring method still showed the main effects for clinical status, F (1, 102) = 50.23, p < .001, ηp² = 0.33, and dialect, F (1, 102) = 44.67, p < .001, ηp² = 0.30, and an interaction, F (1, 102) = 26.95, p < .001, ηp² = 0.14. But when examining the children with SLI alone, the SWE speakers with SLI (M = 0.79, SD = 0.17) did show a lower dialect density than AAE speakers with SLI (M = 0.89, SD = 0.13), F (1, 51) = 6.06, p = .017, ηp² = 0.11.

REFERENCES

Adams, A., Bourke, L., & Willis, C. (1999). Working memory and spoken language comprehension in young children. International Journal of Psychology, 34, 364–373.
Alloway, T. P., Gathercole, S. E., Willis, C., & Adams, A. (2004). A structural analysis of working memory and related cognitive skills in young children. Journal of Experimental Child Psychology, 87, 85–106.
Barbu, S., Martin, N., & Chevrot, J. (2014). Maintenance of regional dialects: A matter of gender? Boys, but not girls, use local varieties in relation to their friend's nativeness and local identity. Frontiers in Psychology, 5, 1–11.
Briscoe, J., & Rankin, P. M. (2009). Exploration of a “double-jeopardy” hypothesis within working memory profiles for children with specific language impairment. International Journal of Language & Communication Disorders, 44, 236–250.
Brown, M. C. (2011). Dialect and lexical access: An investigation into dialect density, dialect environment and word knowledge (Unpublished doctoral dissertation, University of Wisconsin, Madison).
Campbell, T., Dollaghan, C., Needleman, H., & Janosky, J. (1997). Reducing bias in language assessment: Processing-dependent measures. Journal of Speech and Hearing Research, 40, 519–525.
Charity, A. H., Scarborough, H. S., & Griffin, D. (2004). Familiarity with “school English” in African-American children and its relation to early reading achievement. Child Development, 75, 1340–1356.
Cherry, K. E., Elliott, E. M., & Reese, C. M. (2007). Age and individual differences in working memory: The size judgment span task. Journal of General Psychology, 134, 43–65.
Cherry, K. E., & Park, D. C. (1993). Individual difference and contextual variables influence spatial memory in younger and older adults. Psychology and Aging, 8, 517–526.
Cleveland, L. H., & Oetting, J. B. (2013). Children's marking of verbal –s by nonmainstream English dialect and clinical status. American Journal of Speech-Language Pathology, 22, 604–614.
Connor, C. M., & Craig, H. K. (2006). African American preschoolers’ language, emergent literacy skills, and use of African American English: A complex relation. Journal of Speech, Language, and Hearing Research, 49, 771–792.
Conway, A. R. A., Kane, M. J., Bunting, M. F., Hambrick, D. Z., Wilhelm, O., & Engle, R. W. (2005). Working memory span tasks: A review and a user's guide. Psychonomic Bulletin and Review, 12, 769–786.
Craig, H. K., Kolenic, G. E., & Hensel, S. L. (2014). African American English-speaking students: A longitudinal examination of style shifting from kindergarten through second grade. Journal of Speech, Language, and Hearing Research, 57, 143–157.
Craig, H. K., Thompson, C. A., Washington, J. A., & Porter, S. L. (2004). Performance of elementary-grade African American students on the Gray Oral Reading Tests. Language, Speech, Hearing Services in Schools, 34, 141154.Google Scholar
Craig, H. K., & Washington, J. A. (2000). An assessment battery for identifying language impairments in African American children. Journal of Speech, Language, and Hearing Research, 43, 366379.CrossRefGoogle ScholarPubMed
Craig, H. K., & Washington, J. A. (2004). Grade-related changes in the production of African American English. Journal of Speech, Language, and Hearing Research, 47, 450463.CrossRefGoogle ScholarPubMed
Craig, H. K., Zhang, L., Hensel, S. L., & Quinn, E. J. (2009). African American English-speaking students: An examination of the relationship between dialect shifting and reading outcomes. Journal of Speech, Language, And Hearing Research, 52, 839855.Google Scholar
Dollaghan, C., & Campbell, T. F. (1998). Nonword repetition and child language impairment. Journal of Speech, Language, and Hearing Research, 41, 11361146.Google Scholar
Dunn, L. M., & Dunn, D. M. (2007). Peabody Picture Vocabulary Test (4th ed.). Toronto: Pearson Education.Google Scholar
Edwards, J., Beckman, M. E., & Munson, B. (2004). The interaction between vocabulary size and phonotactic probability effects on children's production accuracy and fluency in nonword repetition. Journal of Speech, Language, and Hearing Research, 47, 421436.Google Scholar
Edwards, J., Gross, M., Chen, J., MacDonald, M. C., Kaplan, D., Brown, M., & Seidenberg, M. S. (2014). Dialect awareness and lexical comprehension of mainstream American English in African American English-speaking children. Journal of Speech, Language, and Hearing Research, 57, 18831895.CrossRefGoogle ScholarPubMed
Ehrler, D. J., &; McGhee, R. L. (2008). Primary Test of Nonverbal Intelligence. Austin, TX: PRO-ED.Google Scholar
Ellis Weismer, S., Evans, J., & Hesketh, L. J. (1999). An examination of verbal working memory capacity in children with specific language impairment. Journal of Speech, Language, and Hearing Research, 42, 12491260.Google Scholar
Ellis Weismer, S., Plante, E., Jones, M., & Tomblin, J. B. (2005). A functional magnetic resonance imaging investigation of verbal working memory in adolescents with specific language impairment. Journal of Speech, Language, and Hearing Research, 48, 405425.Google Scholar
Engel, P. J., Santos, F. H., & Gathercole, S. E. (2008). Are working memory measures free of socioeconomic influence? Journal of Speech, Language, and Hearing Research, 51, 15801587.Google Scholar
Engel de Abreu, P., Conway, A. A., & Gathercole, S. E. (2010). Working memory and fluid intelligence in young children. Intelligence, 38, 552561.Google Scholar
Engel de Abreu, P., Gathercole, S. E., & Martin, R. (2011). Disentangling the relationship between working memory and language: The roles of short-term storage and cognitive control. Learning and Individual Differences, 21, 569574.Google Scholar
Engel de Abreu, P. M. J., Puglisi, M. L., Cruz-Santos, A., Befi-Lopes, D. M., & Martin, R. (2014). Effects of impoverished environmental conditions on working memory performance. Memory, 22, 323331.Google Scholar
Friedman, N. P., & Miyake, A. (2005). Comparison of four scoring methods for the reading span test. Behavior Research Methods, 37, 581590.Google Scholar
Frizelle, P., & Fletcher, P. (2015). The role of memory in processing relative clauses in children with specific language impairment. American Journal of Speech-Language Pathology, 24, 4759.Google Scholar
Garrity, A. W., & Oetting, J. B. (2010). Auxiliary BE production by AAE-speaking children with and without specific language impairment. Journal of Speech, Language, and Hearing Research, 53, 13071320.Google Scholar
Giofrè, D., & Mammarella, I. C. (2014). The relationship between working memory and intelligence in children: Is the scoring procedure important? Intelligence, 46, 300310.Google Scholar
Goldman, R., & Fristoe, M. (2000) Goldman-Fristoe Test of Articulation (2nd ed.). Circle Pines, MN: American Guidance Services.Google Scholar
Haake, M., Hansson, K., Gulz, A., Schötz, S., & Sahlén, B. (2014). The slower the better? Does the speaker's speech rate influence children's performance on a language comprehension test? International Journal of Speech-Language Pathology, 16, 181190.Google Scholar
Hansen, L. B., Macizo, P., Duñabeitia, J. A., Saldaña, D., Carreiras, M., Fuentes, L. J., & Bajo, M. T. (2016). Emergent bilingualism and working memory development in school aged children. Language Learning, 64(Suppl. 2), 5175.Google Scholar
Henry, L. A., Messer, D. J., & Nash, G. (2012). Executive functioning in children with specific language impairment. Journal of Child Psychology and Psychiatry, 53, 3745.Google Scholar
Horton, R., & Apel, K. (2014). Examining the use of spoken dialect indices with African American children in the Southern United States. American Journal of Speech-Language Pathology, 23, 448460.Google Scholar
Ivy, L. J., & Masterson, J. J. (2011). A comparison of oral and written English styles in African American students at different stages of writing development. Language, Speech, Hearing Services in Schools, 42, 3140.Google Scholar
Just, M. A., & Carpenter, P. A. (1992). A capacity theory of comprehension: Individual differences in working memory. Psychological Review, 99, 122149.Google Scholar
Laing, S. P., & Kamhi, A. (2003). Alternative assessment of language and literacy in culturally and linguistically diverse populations. Language, Speech, and Hearing Services in Schools, 34, 4455.Google Scholar
Leonard, L. (2014). Children with specific language impairment. Cambridge, MA: MIT Press.Google Scholar
Lum, J. G., Conti-Ramsden, G., Page, D., & Ullman, M. T. (2012). Working, declarative and procedural memory in specific language impairment. Cortex, 48, 11381154.Google Scholar
Magimairaj, B. M., & Montgomery, J. W. (2012). Children's verbal working memory: Role of processing complexity in predicting spoken sentence comprehension. Journal of Speech, Language, and Hearing Research, 55, 669682.Google Scholar
Mainela-Arnold, E., & Evans, J. L. (2005). Beyond capacity limitations: Determinants of word recall performance on verbal working memory span tasks in children with SLI. Journal of Speech, Language, and Hearing Research, 48, 897909.Google Scholar
Mainela-Arnold, E., Evans, J. L., & Coady, J. (2010). Beyond capacity limitations: II. Effects of lexical processes on word recall in verbal working memory tasks in children with and without specific language impairment. Journal of Speech, Language, and Hearing Research, 53, 16561672.Google Scholar
Marton, K. (2008). Visuo-spatial processing and executive functions in children with specific language impairment. International Journal of Language & Communication Disorders, 43, 181200.Google Scholar
Marton, K., & Eichorn, N. (2014). Interaction between working memory and long-term memory: A study in children with and without language impairment. Zeitschrift für Psychologie, 222, 9099.CrossRefGoogle Scholar
Marton, K., Kelmenson, L., & Pinkhasova, M. (2007). Inhibition control and working memory capacity in children with SLI. Psychologia, 50, 110121.Google Scholar
Marton, K., & Schwartz, R. G. (2003). Working memory capacity and language processes in children with specific language impairment. Journal of Speech, Language, and Hearing Research, 46, 11381153.Google Scholar
Marton, K., Schwartz, R. G., Farkas, L., & Katsnelson, V. (2006). Effect of sentence length and complexity on working memory performance in Hungarian children with specific language impairment (SLI): A cross-linguistic comparison. International Journal of Language & Communication Disorders, 41, 653673.Google Scholar
McDonald, J. L. (2008). Grammaticality judgments in children: The role of age, working memory and phonological ability. Journal of Child Language, 35, 247268.Google Scholar
McDonald, J. L., Seidel, C. M., Porter, K. L., Oetting, J. B., & Hegarty, M. (2011). Size judgment: Working memory and standardized test performance. Poster presented at the 2011 meeting of the American Speech-Language Hearing Association, San Diego, CA.Google Scholar
Mills, M. T. (2015). The effects of visual stimuli on the spoken narrative performance of school-age African American children. Language, Speech, and Hearing Services in Schools, 46, 337351.Google Scholar
Montgomery, J. W. (2000a). Relation of working memory to off-line and real-time sentence processing in children with specific language impairment. Applied Psycholinguistics, 21, 117148.Google Scholar
Montgomery, J. W. (2000b). Verbal working memory and sentence comprehension in children with specific language impairment. Journal of Speech, Language, and Hearing Research, 43, 293308.Google Scholar
Montgomery, J. W., & Evans, J. L. (2009). Complex sentence comprehension and working memory in children with specific language impairment. Journal of Speech, Language, and Hearing Research, 52, 269288.Google Scholar
Montgomery, J. W., Evans, J. L., & Gillam, R. B. (2009). Relation of auditory attention and complex sentence comprehension in children with specific language impairment: A preliminary study. Applied Psycholinguistics, 30, 123151.Google Scholar
Moyle, M. J., Heilmann, J. J., & Finneran, D. A. (2014). The role of dialect density in nonword repetition performance: An examination with at-risk African American preschool children. Clinical Linguistics and Phonetics, 28, 682696.Google Scholar
Newcomer, P. L. & Hammill, D. D. (2008). Test of Language Development—Primary (4th ed.). Austin, TX: PRO-ED.Google Scholar
Newkirk-Turner, B. R., Oetting, J. B., & Stockman, I. J. (2016). Development of auxiliaries by young children learning African American English. Language, Speech, Hearing Services in Schools, 47, 209224.Google Scholar
Oetting, J. B. (2015). Dialect differences between African American English and Southern White English in children. In Lanehart, S.. (Ed.), Oxford handbook of African American language (pp. 512518). New York: Oxford University Press.Google Scholar
Oetting, J. B., & Cleveland, L. H. (2006). The clinical utility of nonword repetition for children living in the rural of the US. Clinical Linguistics & Phonetics, 20, 553561.Google Scholar
Oetting, J. B., Lee, R., & Porter, K. (2013). Evaluating the grammars of children who speak nonmainstream dialects of English. Topics in Language Disorders, 33, 140151.CrossRefGoogle ScholarPubMed
Oetting, J. B., & McDonald, J. L. (2001). Nonmainstream dialect use and specific language impairment. Journal of Speech, Language, and Hearing Research, 44, 207223.Google Scholar
Oetting, J. B., & McDonald, J. L. (2002). Methods for characterizing participants’ nonmainstream dialect use in child language research. Journal of Speech, Language, and Hearing Research, 45, 508518.Google Scholar
Oetting, J. B., McDonald, J. L., Seidel, C. M., & Hegarty, M. (2016). Sentence recall by children with SLI across two nonmainstream dialects of English. Journal of Speech, Language, and Hearing Research, 59, 183194.Google Scholar
Oetting, J. B., & Newkirk, B, L. (2008). Subject relatives by children with and without SLI across different dialects of English. Clinical Linguistics & Phonetics, 22, 111125.Google Scholar
Petruccelli, N., Bavin, E. L., & Bretherton, L. (2012). Children with specific language impairment and resolved late talkers: Working memory profiles at 5 years. Journal of Speech, Language, and Hearing Research, 55, 16901703.CrossRefGoogle ScholarPubMed
Qi, C. H., Kaiser, A. P., Milan, S., & Hancock, T. (2006). Language performance of low-income African American and European American preschool children on the PPVT-III. Language, Speech, Hearing Services, in Schools, 37, 516.Google Scholar
Quail, M., Williams, C., & Leitão, S. (2009). Verbal working memory in specific language impairment: The effect of providing visual support. International Journal of Speech-Language Pathology, 11, 220233.Google Scholar
Restrepo, M. A., Schwanenflugel, P. J., Blake, J., Neuharth-Pritchett, S., Cramer, S. E., & Ruston, H. P. (2006). Performance on the PPVT-III and the EVT: Applicability of the measures with African American and European American preschool children. Language, Speech, and Hearing Services in Schools, 37, 1727.Google Scholar
Rivière, A. M., & Oetting, J. B. (2017). Marking of infinitival TO is influenced by a child's dialect and clinical status. Unpublished manuscript.Google Scholar
Rodekohr, R. K., & Haynes, W. O. (2001). Differentiating dialect from disorder: A comparison of two processing tasks and a standardized language test. Journal of Communication Disorders, 34, 255272.Google Scholar
Schwartz, R. G. (Ed.). (in press). Handbook of child language disorders (2nd ed.). New York: Psychology Press.Google Scholar
Seymour, H. N., Bland-Stewart, L., & Green, L. J. (1998). Difference versus deficit in child African American English. Language, Speech, and Hearing Services in Schools, 29, 96108.Google Scholar
Seymour, H. N., Roeper, T., & de Villiers, J. G. (2003). Diagnostic Evaluation of Language Variation Screening Test. San Antonio, TX: Psychological Corporation.Google Scholar
Seymour, H. N., Roeper, T., & de Villiers, J. G. (2005). Diagnostic Evaluation of Language Variation: Norm-Referenced Test. San Antonio, TX: Psychological Corporation.Google Scholar
St. Clair-Thompson, H., & Sykes, S. (2010). Scoring methods and the predictive ability of working memory tasks. Behavior Research Methods, 42, 969975.Google Scholar
Stockman, I. J., Guillory, B., Seibert, M., & Boult, J. (2013). Toward validation of a minimal competence core of morphosyntax for African American children. American Journal of Speech-Language Pathology, 22, 4056.Google Scholar
Stockman, I. J., Newkirk-Turner, B. L., Swartzlander, E., & Morris, L. R. (2016). Comparison of African American children's performances on a minimal competence core for morphosyntax and the Index of Productive Syntax. American Journal of Speech-Language Pathology, 25, 8096.Google Scholar
Terry, N. P., Connor, C. M., Thomas-Tate, S., & Love, M. (2010). Examining relationships among dialect variation, literacy skills, and school context in first grade. Journal of Speech, Language, and Hearing Research, 53, 126145.Google Scholar
Van Hofwegen, J., & Wolfram, W. (2010). Coming of age in African American English: A longitudinal study. Journal of Sociolinguistics, 14, 427455.Google Scholar
Vugs, B., Hendriks, M., Cuperus, J., & Verhoeven, L. (2014). Working memory performance and executive function behaviors in young children with SLI. Research in Developmental Disabilities, 35, 6274.Google Scholar
Washington, J. A., & Craig, H. K. (1994). Dialectal forms during discourse of poor, urban, African American preschoolers. Journal of Speech and Hearing Research, 37, 816823.Google Scholar
Washington, J. A., & Craig, H. K. (1998). Socioeconomic status and gender influences on children's dialectal variations. Journal of Speech, Language, and Hearing Research, 41, 618626.Google Scholar
Washington, J. A., & Craig, H. K. (2004). A language screening protocol for use with young African American children in urban settings. American Journal of Speech-Language Pathology, 13, 329340.Google Scholar
Wyatt, T. A. (2015). Assessing the language skills of African American English child speakers. In Lanehart, S. (Ed.), The Oxford handbook of African American language (pp. 526546). New York: Oxford University Press.Google Scholar
Table 1. Characteristics and scores on standardized tests of SLI and TD groups by dialect

Table 2. Correlations between two scoring methods for working memory task and standardized test measures and among these two measures and nonmainstream density