
Effects of bilingualism, noise, and reverberation on speech perception by listeners with normal hearing

Published online by Cambridge University Press:  14 July 2006

CATHERINE L. ROGERS, University of South Florida
JENNIFER J. LISTER, University of South Florida
DASHIELLE M. FEBO, University of South Florida
JOAN M. BESING, Montclair State University
HARVEY B. ABRAMS, Department of Veterans Affairs Medical Center, Audiology and Speech Pathology Services

Abstract

This study compared monosyllabic word recognition in quiet, noise, and noise with reverberation for 15 monolingual American English speakers and 12 Spanish–English bilinguals who had learned English prior to 6 years of age and spoke English without a noticeable foreign accent. Significantly poorer word recognition scores were obtained for the bilingual listeners than for the monolingual listeners under conditions of noise and noise with reverberation, but not in quiet. Although bilinguals with little or no foreign accent in their second language are often assumed by their peers, or their clinicians in the case of hearing loss, to be identical in perceptual abilities to monolinguals, the present data suggest that they may have greater difficulty in recognizing words in noisy or reverberant listening environments.

Article
© 2006 Cambridge University Press

Although speech is typically well understood under quiet conditions and low task demands, many environmental factors such as noise and reverberation negatively affect speech understanding (Crandell & Smaldino, 2000; Nabelek & Mason, 1981). Both noise and reverberation are present to some degree in the listening environments encountered in everyday life (Helfer & Wilbur, 1990). When noise is present in an acoustic environment, it masks the speech signal by obscuring the less intense portions of the signal (Helfer & Wilbur, 1990). The result is a reduction in the redundancy of acoustic and linguistic cues in speech, an effect that increases as the signal-to-noise ratio (SNR) decreases. That is, performance on speech-perception tasks tends to deteriorate as the SNR decreases (e.g., Miller, Heise, & Lichten, 1951).
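For readers unfamiliar with the measure, SNR in decibels is simply ten times the base-10 logarithm of the ratio of signal power to noise power. The short sketch below is only an illustration of the definition; it is not part of the study's procedures.

```python
import math

def snr_db(signal_power: float, noise_power: float) -> float:
    """Signal-to-noise ratio in decibels: 10 * log10(Ps / Pn)."""
    return 10.0 * math.log10(signal_power / noise_power)

# Equal signal and noise power gives 0 dB SNR;
# halving the signal power lowers the SNR by about 3 dB.
print(snr_db(1.0, 1.0))             # → 0.0
print(round(snr_db(0.5, 1.0), 1))   # → -3.0
```

Negative SNRs, such as the −6 dB condition used later in this study, mean the noise is more intense than the speech.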

Reverberation refers to the persistence of a sound in an enclosed environment. It is measured in reverberation time (RT), the time required for a sound pressure wave of a specific frequency to decay by 60 dB after the signal ceases. Speech perception tends to deteriorate as RT increases (e.g., Moncur & Dirks, 1967; Steinberg, 1929). Although both noise and reverberation can degrade a speech signal in isolation, these distortions often occur simultaneously and, together, are more detrimental than the sum of the component distortions (Nabelek, 1988).
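The RT definition above can be made concrete with a toy calculation. Assuming an idealized, constant decay rate in dB per second (a simplification of real room acoustics), the reverberation time is just the time needed for 60 dB of decay:

```python
def rt60(decay_rate_db_per_s: float) -> float:
    """Reverberation time (RT): seconds for the sound level to fall
    by 60 dB, assuming an idealized constant decay rate."""
    return 60.0 / decay_rate_db_per_s

# A sound field decaying at 150 dB/s has an RT of 0.4 s,
# a mild level of reverberation typical of small meeting rooms.
print(rt60(150.0))  # → 0.4
```

Real measurements estimate the decay rate from a recorded decay curve per frequency band, which is why RT is reported separately for different frequencies later in this article.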

Other factors, including cognitive demand (Luce, Feustel, & Pisoni, 1983), and listener- or speaker-related variables such as language background may affect speech understanding even in quiet, and can combine with environmental factors to further degrade speech understanding (Helfer & Huntley, 1991; Nabelek, 1988; Newman & Hochberg, 1983; Takata & Nabelek, 1990). For bilinguals, the documentation of language background variables is of particular importance; these variables may include language history (age of onset of acquisition), percentage of language use for both languages, language competency in both languages, language stability (the extent to which proficiency is changing) for both languages, and contexts of language use in both languages (Grosjean, 1997; von Hapsburg & Peña, 2002).

Regardless of the amount of language background information provided, however, the psychological effects of bilingualism are known to be highly complex (cf. Bialystok, 2002). Early bilinguals appear to have a performance advantage over monolinguals for many cognitive abilities such as problem solving and creativity (Kessler & Quinn, 1980, 1987) and for tasks involving memory or inhibition of attention (Bialystok, Craik, Klein, & Viswanathan, 2004; Bialystok & Martin, 2004; Kormi-Nouri, Moniri, & Nilsson, 2003; Ransdell, Arecco, & Levy, 2001). For literacy development, on the other hand, the results are more mixed. Early bilinguals have an apparent advantage over monolinguals for some skills, such as knowledge of print invariance (Bialystok, 1997; Bialystok, Shenfield, & Codd, 2000), but not for others, such as oral proficiency and acquisition of reading (Bialystok, 1988; Rickard Liow, 1999). With relatively little currently known about the complex effects of bilingualism, it is important to explore the impact of bilingualism at each level of language processing, in both the perception and production domains. The focus of the current study is on the effects of early bilingualism on adults' speech perception in adverse listening environments (both noisy and reverberant).

Although studies of the speech-perception abilities of bilingual listeners are abundant, most studies of speech perception by bilinguals in adverse acoustic environments have focused on children, on those who acquired a second language as adults or adolescents (Crandell & Smaldino, 1996; Nabelek & Donahue, 1984; Takata & Nabelek, 1990), or on nonfluent English speakers (Abel, Alberti, & Riko, 1980). A more limited number of studies have considered the effects of language background variables on the perception of speech in noise by adult bilinguals who learned both languages in early childhood (e.g., Mayo, Florentine, & Buus, 1997; Meador, Flege, & Mackay, 2000). These studies have indicated that, although the speech perception of early bilinguals and monolinguals is very similar in quiet, the performance of the two groups may differ significantly in noise. The effects of other environmental acoustic factors (such as reverberation) that may differentially influence speech perception in early bilinguals and monolinguals have apparently not been investigated. Mayo et al. (1997) examined the effect of noise on the perception of sentences by bilingual (Spanish–English) listeners. They focused primarily on the age of acquisition of English by bilingual participants and divided the bilingual participants into three groups based on this variable: (a) bilingual since infancy (BSI) with three participants, (b) bilingual since toddlerhood (BST) with nine participants, and (c) bilingual postpuberty (BPP) with nine participants. All groups learned Spanish from birth and began learning English at birth, by age 6, or after age 14, respectively. A monolingual group was composed of nine participants who learned only English from birth. Their results indicated that, although the BSI and BST groups performed significantly better than the BPP group, they did not perform as well as the monolingual group, despite similar performance in quiet. Although Mayo et al. (1997) determined through informal interaction that their bilingual participants were fluent in Spanish and English, accent and proficiency were not directly assessed. Therefore, it is not clear if any of their listeners would be perceived by others to be native speakers of English.

Meador et al. (2000) examined the effects of age of first exposure to English on native Italian speakers' recognition of words in English sentences presented in noise. Bilingual participants were divided into four groups, based on age of arrival in Canada, and a fifth group contained only English-speaking monolingual listeners. Although the earliest arriving group (at an average age of 7 years) performed better than the later arriving groups, the word recognition of the earliest arriving group was significantly poorer than that of the monolinguals. In this study, however, even the speakers in the early bilingual group who used Italian most frequently were still reported to have a noticeable foreign accent.

The results of these studies suggest that learning a second language at an early age is important for the ability to understand that language in noise but that speech understanding under degraded acoustic conditions is difficult even for those who learned both languages early in life. It is important to investigate the speech-perception abilities of early bilingual individuals such as the participants in the Mayo et al. (1997) BSI and BST groups, due to the increasing number of persons learning more than one language from birth and early childhood in the United States.

Early learners of English who received most or all of their education in their second language may have little or no foreign accent in their second language. In many cases, the bilingualism of such persons may not be recognized by their interlocutors, including teachers or, in the case of hearing loss or acquired speech impairment, clinicians. Even when their bilingual status is known, these highly proficient bilinguals may be assumed by some persons to be identical to monolinguals in all speech abilities (perception and production), because their production abilities are apparently the same. For both these reasons, no adjustment or accommodation is typically made for these highly proficient bilinguals. Thus, understanding differences in speech-perception abilities in this population may be particularly important.

Such cases are particularly common among persons of Hispanic ethnicity, who represent approximately 12% of the US population, comprise the fastest growing minority group in the United States, and are projected to become the largest minority group in the future (Therrien & Ramirez, 2000; US Bureau of the Census, 1999, 2000, 2001). Findings of studies investigating the effects of adverse listening environments on speech perception by these apparently nativelike bilingual participants may have important implications in setting standards for educational, occupational, and rehabilitative settings for this population.

In the present study, the speech-perception abilities of Spanish–English bilinguals were investigated because of the important demographic factors outlined above. We elected to focus on early bilinguals whose second language proficiency is recognized by listeners to be nativelike or near native because few studies have focused specifically on the abilities of such highly proficient bilinguals. Furthermore, apparently no studies have investigated the effects of adverse acoustic environmental factors other than noise on speech perception by early bilinguals.

The present study, therefore, examined the performance of a group of listeners similar to the BSI and BST groups of Mayo et al. (1997), and included a reverberant condition as well as direct evaluation of language proficiency and accent. The purpose of this study was to examine the effects of noise and reverberation on the perception of American English speech by adult early Spanish–English bilingual participants with normal hearing and similar self-reported language background and usage patterns for first language (L1) and second language (L2). A test of reverberant speech intelligibility (Koehnke & Besing, 1996) was used to assess performance in a listening environment typical of everyday communication. Bilingual listeners were selected by obtaining information about their language background. Conversational speech samples were obtained to ensure that others would perceive the bilingual participants as native or nativelike speakers of English, the latter representative of an emerging population in this country.

METHOD

Participants

The word recognition of two groups of young listeners with normal hearing was assessed in quiet, in the presence of noise alone, and in noise plus reverberation: (a) 15 monolingual English speakers (mean age = 25.3 years), and (b) 12 Spanish–English bilingual speakers (mean age = 24.7 years). All participants were between the ages of 18 and 35 years. All participants had normal hearing, defined as pure tone air conduction thresholds of 20 dB HL or better from 250 to 8000 Hz and air-bone gaps of 10 dB or less, bilaterally. All participants had normal middle ear function.

Both monolingual and bilingual participants were recruited from the students, faculty, and staff of the University of South Florida's Tampa Campus and the surrounding community. Eighteen potential monolingual participants and 14 potential bilingual participants were recruited, although only 15 monolingual and 12 bilingual participants subsequently qualified for participation (see below). All monolingual participants were native speakers of American English only. All bilingual participants were exposed to Spanish from birth, and were exposed to American English in early childhood (before the age of 6 years). This age was chosen because Mayo et al. (1997) found that the BSI and BST groups in that study performed similarly on speech perception in noise. Listeners in their BST group had learned English as an L2 before the age of 6 years, and listeners in their BSI group had acquired both languages in infancy. It should be noted, however, that the BSI group in Mayo's study contained only three participants, and a larger sample of participants who were bilingual from infancy may have performed differently.

Materials and instrumentation

A Grason–Stadler GSI-61 audiometer, Panasonic CD player, TDH-49 headphones, and Tele-acoustics double-walled sound-treated booth were used for the hearing evaluation and for the administration of the speech-perception tests. A Grason–Stadler Tymp Star was used to assess middle ear function. For the speech-perception tests, CD recordings of the Central Institute for the Deaf (CID) W-22 monosyllabic words were used. The CID W-22s are organized into four phonetically balanced lists (Hirsh et al., 1952), and are a commonly used test of word recognition. The words are in consonant–vowel, vowel–consonant, or consonant–vowel–consonant format, and were all recorded from the same male speaker of General American English. Sample items are “ace, ache, bathe, carve,” and “chew.”

The Speech Intelligibility Gain—Reverberant (SIG-R) Test (Koehnke & Besing, 1996) was used to assess binaural speech perception in a simulated reverberant environment with background noise. The SIG-R Test was administered at the following SNRs: +4, +2, and 0 dB. These SNRs were suggested by the developers of the SIG-R Test as challenging for young listeners with normal hearing in each environment, and were based on pilot studies (cf. Besing, Koehnke, Fedor, Lister, & Febo, 2001; Dethloff, Besing, & Koehnke, 1998). The simulated reverberant environment was created by first recording broadband noise in a naturally reverberant conference room and then using digital signal processing techniques to extract the effects of the reverberation. These reverberation effects were then applied to the speech and noise stimuli used in the present study (Besing & Koehnke, 1995; Koehnke & Besing, 1996). The RT was approximately 0.25 s at frequencies below 800 Hz and 0.4 s at frequencies above 800 Hz. This represents a mild level of naturally occurring reverberation common to many public meeting rooms. Because the sound source was located 1 meter in front of the recording microphone when the reverberation was recorded, the simulated location of the speech signal (words from the CID W-22 word lists) and noise signal (speech-spectrum noise) was approximately 1 meter from the listener's head at 0° azimuth (straight ahead). Detailed information about the SIG-R Test and the simulation of the reverberation conditions can be found in Koehnke and Besing (1996).
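The general technique behind this kind of simulation is convolution of a dry (anechoic) signal with a room impulse response: each output sample is a sum of delayed, scaled copies of the input. The study's exact processing is documented in Koehnke and Besing (1996); the sketch below is only a generic, pure-Python illustration using a made-up two-tap impulse response (direct sound plus one echo at half amplitude).

```python
def convolve(dry, impulse_response):
    """Apply room reverberation by direct convolution:
    each output sample accumulates delayed, scaled copies
    of the input samples."""
    out = [0.0] * (len(dry) + len(impulse_response) - 1)
    for i, x in enumerate(dry):
        for j, h in enumerate(impulse_response):
            out[i + j] += x * h
    return out

# Hypothetical impulse response: direct sound, then an echo
# two samples later at half amplitude.
ir = [1.0, 0.0, 0.5]
print(convolve([1.0, 2.0], ir))  # → [1.0, 2.0, 0.5, 1.0]
```

A real room impulse response would contain thousands of taps whose decay envelope determines the RT values reported above; practical implementations use FFT-based convolution rather than this direct double loop.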

The unprocessed W-22 words from the SIG-R, to which no reverberation was added (henceforth referred to as Noisy W-22), were used to measure diotic speech perception in quiet and in the presence of speech-spectrum noise (equal energy per frequency from 250 to 1000 Hz with a 12 dB/octave rolloff from 1000 to 6000 Hz). Speech perception was measured diotically at SNRs of 0, −2, and −6 dB. For all speech-perception tests, the intensity level of the speech stimuli remained fixed at approximately 50 dB HL, and the intensity of the noise was varied to achieve the various SNRs. To equalize the level of the speech stimuli, the RMS amplitude of each individual word was adjusted to match that of a 1000-Hz calibration tone. To calibrate, the 1000-Hz calibration tone was played out of the audiometer at a level of 50 dB HL and the audiometer's VU meter was adjusted to 0 dB. The speech-spectrum noise was generated by the audiometer.
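The RMS equalization step described above amounts to scaling each word's samples so that its root-mean-square amplitude matches that of a reference. The following is a minimal sketch of that idea with made-up sample values; the study's actual calibration used a 1000-Hz tone and the audiometer's VU meter rather than digital target values.

```python
import math

def rms(samples):
    """Root-mean-square amplitude of a sample sequence."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def match_rms(samples, target_rms):
    """Scale a word's samples so its RMS equals the reference RMS."""
    scale = target_rms / rms(samples)
    return [s * scale for s in samples]

# Hypothetical word waveform (a few illustrative samples):
word = [0.2, -0.4, 0.1, -0.3]
leveled = match_rms(word, target_rms=0.5)
print(round(rms(leveled), 6))  # → 0.5
```

Because every word is brought to the same RMS level, varying only the noise intensity then yields the intended SNR for every item.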

All participants completed a questionnaire that provided detailed information concerning language background (see Appendix A). The 14 language background questions exclusively for bilingual participants included questions probing language status, language history, language competency, and language mode, as suggested by Grosjean (1997). Five questions for both groups of participants probed hearing history and seven questions were used to confirm bilingualism status.

A Tascam DA-P1 digital audio tape recorder and an AKG C420 microphone were used to record conversational language samples and citation speech of all participants.

Procedure

All potential participants completed the participant questionnaire prior to audiometric testing. Selected information from this questionnaire is presented in Table 1 only for the 12 bilingual participants whose data were used in the study (see below).

Twelve of the 14 potential bilingual participants reported that they used English (L2) at least 50% of the time and Spanish (L1) at least 25% of the time, and that they understood (listening ability) both English and Spanish equally well. The remaining two potential participants indicated that they spoke Spanish relatively rarely, and that they understood English better than Spanish. These two participants were later eliminated from the study (see below). Of the 12 remaining participants, whose data are shown in Table 1, 7 indicated that they spoke English more fluently than Spanish, whereas 5 indicated that they spoke both equally fluently. In comparing reading and writing abilities in English and Spanish, 8 of the 12 potential participants whose data were used indicated better reading abilities in English and 10 indicated better writing abilities in English; 4 of these 12 indicated equal reading abilities in English and Spanish and 2 indicated equal writing abilities.

For all potential participants, a standard pure-tone hearing evaluation was performed, including spondee thresholds, word recognition in quiet (using the CID W-22s), and tympanometry. All participants who passed the aforementioned criteria for normal hearing were audio taped while engaging in conversation in English and while repeating 20 sentences (two sets) selected from the Harvard sentences (Egan, 1948; IEEE, 1969). The Harvard sentences were selected because the phoneme frequency in each list is designed to match the overall frequency of occurrence of phonemes in English (Egan, 1948). The sentences were all declarative and of similar length, each containing four or five one- or two-syllable content words and between two and five function words. The sentences contain mostly high-frequency words but have relatively low semantic predictability. A typical sentence is “The plant grew large and green in the window.”

Bilingual participants were also audio taped while engaging in conversation in Spanish. The Harvard sentences were presented via the Panasonic CD player and TDH-49 headphones to the bilingual participants, who were asked to repeat them. The third author (a Spanish–English bilingual) elicited the conversational samples by asking the same question in English and then in Spanish: (a) “What are your plans for the future?” and (b) “¿Cuáles son sus planes para el futuro?” Of the 14 potential Spanish–English bilinguals, two were unable to produce a conversational speech sample of 1 min without extensive prompting by the third author. Their speech samples were therefore considered not ratable, and were discarded for both English and Spanish rating. These were the same two potential subjects described above who had indicated that they spoke Spanish rarely and that they understood English better than Spanish.

Two speech–language pathology graduate students from the Speech, Language, and Hearing Center at the University of South Florida evaluated the recorded English speech samples for foreign and/or regional accent using a 9-point scale (1 = little or no accent, 9 = very heavy accent). A 9-point scale was used because Southwood and Flege (1999) found a 9-point scale to be more sensitive than 5- and 7-point scales for rating foreign “accentedness.” Raters indicated their judgments by circling the appropriate response for each utterance presented on a rating form. Both of the English raters were monolingual native speakers of American English. Two Spanish–English bilingual graduate students (one from Argentina and one from the Dominican Republic) judged Spanish conversational samples of the bilingual participants for foreign accentedness. Both began learning English intensively at age 12 or later. The same rating procedures were used for the Spanish conversational samples and the English sentences and conversational samples.

Sound files for all speakers were created by the first author and an assistant by digitally recording the output of the DAT recorder to the digital input of a high-quality sound card. Separate files were then created for the sentences and the conversational speech sample for each talker. If more than 1 min of conversational speech was available, samples slightly longer than 1 min were allowed to end the sample at the end of a sentence or phrase. These sound files were then copied to CD in a format that would allow them to be played on an ordinary CD player. Speech samples for each speaker were presented in sequence: first sentences, and then the conversational excerpt; only conversation was presented for the Spanish samples. English samples and Spanish samples were recorded onto separate CDs and rated separately. The order of presentation of speakers was randomized in each case.

For the English accentedness rating task, an average (across raters) rating between 1 and 3 was taken to indicate little or no regional or foreign accent because scores between 1 and 3 represented the bottom third of the scale. Sixteen of the 18 potential monolingual participants were judged to have little or no regional accent in conversational American English and in isolated sentences, as indicated by an average score between 1 and 3. The two potential monolingual participants judged to have a noticeable regional accent were eliminated from the study, leaving a sample of 16 monolingual participants. The 12 bilingual participants whose Spanish samples were rated were also judged by both raters to speak American English with little or no foreign or regional accent, as indicated by an average score between 1 and 3. A bivariate correlation was performed between the 30 scores for the two English raters (r = .499, p = .005). Although significant, the degree of correlation is considered low in that it accounts for only about 25% of the variance across listener ratings. The correlation would be expected to be rather low, however, given the restricted range of scores (mostly between 1 and 3). Furthermore, Southwood and Flege (1999) also report relatively low interjudge reliability for judgments of accentedness.
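The claim that r = .499 accounts for about 25% of the variance follows from squaring the correlation coefficient (.499² ≈ .249). A minimal pure-Python Pearson correlation, shown here with hypothetical rater scores rather than the study's data, makes the computation explicit:

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation between two rating sets."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical accentedness ratings from two raters (1-9 scale,
# restricted to the low end as in the study):
rater1 = [1, 1, 2, 2, 3, 1, 2]
rater2 = [1, 2, 2, 3, 2, 1, 3]
r = pearson_r(rater1, rater2)
print(round(r * r, 2))  # r squared = proportion of shared variance
```

Note how a restricted range of scores (here mostly 1-3) compresses the variance available to correlate, which is one reason modest interrater correlations are expected in this kind of rating task.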

The average of the crossrater average scores was 1.44 for the 16 monolingual speakers with average scores below 3.0; the average of the crossrater average scores was 1.62 for the 12 bilingual speakers. A two-tailed two independent samples t test indicated no significant difference between the average ratings for these two groups of speakers, t(26) = −.728, p = .467.

For the Spanish rating task, the average ratings ranged from 1 to 7.5, indicating a substantial degree of foreign accent while speaking Spanish for some of the bilingual participants. A bivariate correlation was performed between the scores for the two Spanish raters (r = .678, p = .015). Of 6 bilingual participants who received average ratings above 3.0, 4 were among the 7 who had indicated that they spoke English more fluently than Spanish (see Table 1). Thus, it may be that some of these early bilinguals had suffered a degree of loss of L1. Nevertheless, all had been able to engage in a minute or more of conversational narrative in Spanish without substantial prompting by the third author. Another factor is that the bilinguals spoke a variety of regional dialects of Spanish. Although they were instructed to ignore regional variation, both raters commented that they found the rating task difficult for this reason. Given this difficulty and the fact that these participants were able to engage in conversational narrative with reasonable fluency, these 12 participants were retained.

During the speech-perception tests (Quiet, Noisy W-22, SIG-R), listeners were instructed to repeat the monosyllabic words that they heard. The noise (if applicable) began before the presentation of each word.

A single list of 25 words was presented at each SNR and in quiet for a total of four lists per participant. Three word lists were presented twice to each listener: once for the Noisy W-22 (noise alone) condition, and once for the SIG-R (noise plus reverberation) condition. The fourth word list was used for the quiet condition only. The number of correctly repeated words was recorded by the experimenter (the third author) for each word list used. The Noisy W-22 lists were all presented before the SIG-R lists for each participant. The SNRs for the Noisy W-22 and SIG-R words were presented in the following orders (most favorable to least favorable SNR) for each participant: (a) 0, −2, and −6 dB for the Noisy W-22 stimuli and (b) +4, +2, and 0 dB for the reverberant stimuli. Within a listening condition (Noisy W-22 or SIG-R), the matching of the three different word lists used to the SNRs was counterbalanced across participants using a modified Latin square design (Maxwell & Satake, 1997). Thus, all lists were presented an equal number of times at each SNR within a processing condition. The repetition of word lists across listening conditions may have led to a practice effect. This possibility motivated presenting the Noisy W-22 condition before the SIG-R condition, which was assumed, based on pilot data, to be the more difficult condition.
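The list-to-SNR counterbalancing can be sketched as a cyclic 3 × 3 Latin square: across every block of three participants, each word list appears exactly once at each SNR position. The list names below are illustrative labels, not the actual W-22 list identifiers, and this is only one simple way to realize the design the authors describe.

```python
def latin_square_assignment(lists, participant_index):
    """Cyclic Latin-square rotation: participant k hears the lists
    shifted by k positions, so across any block of len(lists)
    participants each list occurs once at each SNR position."""
    n = len(lists)
    k = participant_index % n
    return lists[k:] + lists[:k]

lists = ["List A", "List B", "List C"]  # three W-22 lists (labels hypothetical)
snrs = [0, -2, -6]                      # Noisy W-22 SNRs, most to least favorable (dB)
for p in range(3):
    order = latin_square_assignment(lists, p)
    print(p, list(zip(snrs, order)))
```

This is why the number of analyzed participants per group had to be a multiple of three (15 monolinguals, 12 bilinguals): an incomplete block, such as the 16th monolingual, leaves the square unbalanced.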

In the Latin square design, the counterbalancing of lists across conditions must be complete for the design to be balanced. In the present case, three participants were needed for the three different lists presented across the SNRs to be balanced across subjects. For this reason, only data for 15 of the 16 monolingual participants whose recorded speech samples were judged to have little or no foreign or regional accent were used in the analysis of results. Data for the 16th speaker formed an incomplete square and were not used. Thus, the design was completed five times for the monolingual English participants and four times for the bilingual participants.

RESULTS

W-22s in quiet and in noise

All participants obtained scores of 100% word recognition in quiet, as shown by the rightmost data point for each listener group in Figure 1. In the added noise conditions, the monolingual group repeated more words correctly than the bilingual group at all three SNRs for the Noisy W-22 test, as shown by the remaining data points in Figure 1. Performance for both participant groups was poorest at −6 dB and best at 0 dB SNR.

Figure 1. The number of Noisy W-22 monosyllabic words repeated correctly by (●) monolingual and (■) bilingual listeners at three SNRs (−6, −2, and 0 dB) and in quiet. Standard error bars are plotted for each group and condition.

A two-way mixed-design analysis of variance (ANOVA) with one between-subjects factor (participant group) and one within-subjects factor (SNR) was performed on the Noisy W-22 results only (excluding quiet). This analysis revealed a significant main effect of group, F (1, 25) = 26.71, p = .00002, and a significant main effect of SNR, F (2, 50) = 43.57, p < .00001. The interaction between group and SNR was not significant (p = .15). A Tukey honestly significant difference (HSD) post hoc analysis of the main effect of SNR revealed that performance at all SNRs differed significantly from each other (p = .01 for 0 vs. −2 dB SNR; p = .0001 for 0 vs. −6 dB SNR; p = .0001 for −2 vs. −6 dB SNR). The quiet condition was excluded from the analysis because all subject scores were identical (i.e., 25 words correct) in this condition, and, therefore, were not normally distributed.

SIG-R

Performance for the two groups of participants in the simulated reverberant environment (SIG-R Test) is shown in Figure 2, following a pattern similar to that for the Noisy W-22 lists. The monolingual group obtained better scores than the bilingual group at all three SNRs on the SIG-R Test; however, the group difference for the +2 dB SNR was minimal. Performance for both participant groups was best at the +4 dB SNR and poorest at the 0 dB SNR. Note that despite the larger SNRs and the repetition of the stimulus lists across listening conditions, overall performance was poorer in the SIG-R condition than in the Noisy W-22 condition. If a practice effect due to list repetition mitigated the performance difference between the Noisy W-22 and SIG-R conditions, these data suggest that the true difference between the two conditions may be larger than is described here.

Figure 2. The number of monosyllabic words repeated correctly by (●) monolingual and (■) bilingual listeners across SNRs for the simulated reverberant (SIG-R) environment. Standard error bars are plotted for each group and condition.

A two-way mixed-design ANOVA with one between-subjects factor (subject group) and one within-subjects factor (SNR) was also performed for this condition and revealed a significant main effect of group, F (1, 25) = 6.87, p = .015, and a significant main effect of SNR, F (2, 50) = 36.35, p < .00001. The interaction between group and SNR was not significant (p = .34). A Tukey HSD post hoc analysis of the main effect of SNR revealed that performance at all SNRs differed significantly from each other (p = .0001 for +4 vs. +2 dB SNR, p = .0001 for +4 vs. 0 dB SNR, p = .04 for +2 vs. 0 dB SNR).

To compare the performance of the monolingual and bilingual groups across the noise and reverberant conditions at 0 dB SNR (the only SNR used in common between the two conditions), a third two-way mixed-design ANOVA with one between-subjects factor (subject group) and one within-subjects factor (listening condition) was performed. The main effect of group was significant, F (1, 25) = 10.80, p = .003. The main effect of listening condition was also significant, F (1, 25) = 344.27, p < .00001. The interaction between group and listening condition was not significant (p = .81). To illustrate the differences between the groups and listening conditions, the number of correct responses for each participant group is plotted as a function of listening condition for 0 dB SNR in Figure 3. The figure shows poorer performance by the bilingual group compared to the monolingual group for both test conditions. Both listener groups showed better performance in the Noisy W-22 condition than in the SIG-R condition. The degree of difference between groups is approximately equal across the two listening conditions.

Figure 3. The number of monosyllabic words repeated correctly by (open bars) monolingual and (filled bars) bilingual listeners for 0 dB SNR across the two listening conditions. Standard error bars are plotted for each group and condition.

DISCUSSION

The purpose of this study was to examine the effects of both noise and reverberation on the perception of American English speech by highly proficient, early (from age 6 years or earlier) Spanish–English bilingual adults with normal hearing who reported similar language background and usage patterns for L1 and L2 and who were judged to speak English with little or no foreign accent. Monosyllabic word recognition of bilingual and monolingual participants was compared in quiet, in noise (Noisy W-22), and in a simulated noisy, reverberant environment (SIG-R). In both the noise alone and the noise plus reverberation conditions, word recognition was measured at three SNRs. Significantly poorer performance was measured for the bilingual listeners than for the monolingual listeners across the degraded listening conditions and SNRs, whereas all participants obtained identical, perfect scores (100%) on the word recognition test in quiet. For both groups, performance decreased significantly with decreases in SNR, and overall performance was poorer in the simulated noisy, reverberant environment than in noise alone. These results suggest that, although early bilinguals have little difficulty understanding speech in quiet, they are less able than monolingual listeners to tolerate the acoustic degradations typical of everyday listening environments.

The present study differs from similar studies by Mayo et al. (1997) and Meador et al. (2000) in three main ways: a noise plus reverberation condition was included; all bilingual listeners were judged to have little or no foreign accent in English and rated themselves as fluent or more fluent in speaking English than in Spanish; and perception of monosyllabic words, rather than of words in sentences, was studied. As stated in the introductory section, it is important to understand differences between the speech-perception abilities of early Spanish–English bilinguals and those of monolingual English speakers, because early bilinguals may not be recognized as such, or their speech-perception abilities may be assumed to be identical to those of monolinguals. As a consequence, such persons may receive no adjustment or accommodation from interlocutors or from providers of speech pathology or audiology services.

The results for the monolingual listeners across SNR may be compared with those collected by the authors of the SIG-R Test (Koehnke & Besing, 1996) in pilot studies (cf. Besing et al., 2001; Dethloff et al., 1998) and with classic studies of word recognition in noise (e.g., Miller et al., 1951). Dethloff et al. (1998) measured monosyllabic word recognition by six young listeners with normal hearing in a simulated reverberant environment identical to the one used in the present study. For the range of SNRs used in the present study, they measured reverberant word recognition scores ranging from approximately 50 to 65% correct. The performance of our monolingual listeners in reverberation ranged from 45 to 67% correct.

Overall performance of the monolingual listeners in the noise-added conditions of the present study was better than that of listeners in Miller et al. (1951). The slopes estimating the increase in performance with increasing SNR are quite similar between the two studies, however. Miller et al. (1951) measured open-set monosyllabic word recognition that improved by approximately 2.8% correct per decibel as SNR increased from −6 to 0 dB; the word recognition performance of our monolingual listeners improved by 2.7% correct per decibel over the same range of SNRs. Differences in overall performance between the two studies may be attributable to differences in the method of presentation and calibration, the speaker, the frequency response of the system, and the type of noise used.1

Miller et al. (1951) used monitored live voice presentation with a carrier phrase. The peak level of the carrier phrase was used to equalize the output level to approximately 90 dB SPL. The speech stimuli were filtered between 200 and 3000 Hz. The interfering noise had a bandwidth of 7000 Hz and a uniform spectrum.
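The slope comparison above reduces to a least-squares fit of percent correct against SNR. The sketch below illustrates the calculation with invented scores chosen to give a slope near the reported 2.7% per decibel; the actual data points from either study are not reproduced here.

```python
import numpy as np

# Hypothetical percent-correct scores at the SNRs shared by the two
# studies (-6, -2, and 0 dB); these are illustrative values only.
snr_db = np.array([-6.0, -2.0, 0.0])
pct_correct = np.array([64.0, 75.0, 80.0])

# Least-squares line: the slope is the improvement in percent correct
# per decibel of SNR.
slope, intercept = np.polyfit(snr_db, pct_correct, 1)
print(f"slope: {slope:.2f}% per dB")
```

For these made-up scores the fitted slope is about 2.7% per dB, in the range both studies report; the intercept estimates performance at 0 dB SNR.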

Comparisons with previous results from bilingual listeners are limited by differences in stimuli and listener characteristics. Despite these differences, however, the pattern of performance by our monolingual and bilingual listeners is very similar to that found by Mayo et al. (1997) and by Meador et al. (2000): their early bilingual listeners also showed significantly poorer speech understanding in noise than their monolingual listeners. The present results replicate those findings and further suggest that highly proficient bilingual listeners may experience greater difficulty than monolingual English speakers in understanding degraded speech in a variety of contexts, even when the L2 is acquired early (before age 6) and little or no foreign accent is detected in the bilinguals' L2. It should be noted, however, that, as in Mayo et al. (1997), only three bilingual participants reported being bilingual from birth. Therefore, neither study is conclusive with regard to the performance of persons learning two languages from birth; it may be that, as a group, those who are bilingual from birth would not perform more poorly than their monolingual peers.

It is also of interest to recall that 7 of the 12 bilingual participants received relatively high accentedness ratings in Spanish (see Table 1), but not in English, perhaps indicating some degree of language loss. Anecdotally, some loss of L1 is not uncommon among second-generation young Hispanic adults in Florida, whose parents often switch from speaking Spanish in the home to speaking English in the home when the child enters school and does not understand English. Furthermore, 6 of the 12 participants indicated a degree of L2 dominance by rating themselves more highly for their abilities to speak, read, and write English, compared to Spanish. All 12 rated themselves as equally proficient in understanding English and Spanish. None of the bilingual participants would appear to be L1 dominant because none rated themselves more highly for Spanish than English in any category. An informal analysis of the data showed no tendency for the six apparently L2-dominant bilinguals to perform better than the six more balanced bilinguals. In fact, the average performance across the noisy and reverberant conditions was slightly lower for the more L2-dominant bilinguals than for the more balanced bilinguals (by 0.33 items correct). Considering the apparent L2 dominance of half of the bilingual participants, the differences in performance found between the monolingual and bilingual participants are perhaps more striking.

The differences in performance observed between monolingual and bilingual listeners in the present study may be explained by increased demand for attentional resources or increased processing demand for the bilinguals, due to one or more of the following factors: (a) the need to deactivate the nonactive language for bilinguals but not monolinguals (Grosjean, 1997; MacKay & Flege, 2004); (b) the need for bilinguals to select a target phoneme from among a larger number of alternatives that are more densely distributed in a common phonological space (Flege, 1995); or (c) the need for bilinguals to match native speaker productions to a perceptual category that may be intermediate between the norms for the two languages (Flege, 1987, 1995). The need for greater attentional resources for bilinguals to select the appropriate target word may have no functional effect under conditions of high signal quality and low task demand. When the target word is degraded by noise or reverberation, however, or when task demand is high, the effects of such differences may be evidenced by lower response accuracy, as in the present study and others of similar nature (e.g., Mayo et al., 1997; Meador et al., 2000).

A similar hypothesis was offered by Pichora-Fuller, Schneider, and Daneman (1995) to explain differences in the processing abilities of young and old monolingual adults. They compared the ability of young and old monolingual adults to perceive and recall words heard in noise at different levels of context. Their older adults performed similarly to younger adults in relatively undemanding conditions, but performed less well when task demand increased, especially when contextual information was removed and cognitive load was increased (with greater differences at the lowest SNRs). They suggest that, as speech processing becomes more effortful with aging, even in good environments, fewer resources remain available when processing demands increase.

Furthermore, the hypothesis of increased processing demand for the bilinguals is compatible with one hypothesis offered to explain the advantage of early bilinguals in cognitive skills such as memory and inhibition of attention (Bialystock et al., 2004; Bialystock & Martin, 2004; Kormi-Nouri et al., 2003; Ransdell et al., 2001). Bialystock (2002), for example, suggests that these advantages stem from the “constant management of two competing languages” (p. 290). That is, early bilinguals have superior cognitive abilities because they are always working harder while processing language. Given that assumption, it is not entirely surprising that the effects of this greater speech-processing workload would be seen in speech perception only when task demand is high, while advantages are seen mainly in tasks that do not involve speech processing. Similarly, Pichora-Fuller et al. (1995) suggested that their older monolingual adults benefited more from semantic context than younger monolingual adults because the increased processing demands associated with aging forced them to rely on contextual information more often.

Although all of the factors offered above as potential sources of increased speech-processing demand for bilinguals are compatible with lower performance by highly proficient bilinguals than by monolinguals under noisy and reverberant conditions, the present data do not offer a way of distinguishing among these explanations, or other (possibly even sensory) ones. Future studies should consider designs that may confirm or distinguish among them. Furthermore, although similar in design to other studies of speech perception by bilinguals, the present study included no measures of language or cognition and a relatively small number of participants per group. Thus, despite the groups' identical (perfect) performance in quiet, it is possible that between-group differences on variables such as vocabulary size could have accounted for the differences in the speech-perception measures obtained here. Future studies would therefore be of greater interest if they included a larger number of participants in each group and collected a variety of language and cognition measures alongside the speech-perception measures.

Although the bilingual listeners included in this study are representative of the Spanish–English bilinguals living in many areas of the United States today, they are not necessarily representative of the bilingual populations that have been the focus of many previous studies of speech perception (e.g., Italian–English, Mexican Spanish–English, Japanese–English). The bilingual listeners included in this study were of Puerto Rican, Cuban, Panamanian, or Colombian descent, and had lived in the southeastern United States for the majority of their lives. They were young adults who were apparently stable in American English, although some may have been in the process of losing Spanish. All participants were educated exclusively in American English. The bilingual participants reported speaking English between 50 and 75% of the time during a typical day and reported either studying or working in American English-speaking environments. Thus, the participants in this study are representative of a growing population in this country (Therrien & Ramirez, 2000; US Bureau of the Census, 1999, 2000, 2001).

The results of the current study may lead to several educational and clinical considerations. Various researchers have suggested modifications of room acoustics for children learning English as an L2 (Crandell & Smaldino, 2000; Picard & Bradley, 2001). From the results of the present study, as well as studies showing similar results (e.g., Mayo et al., 1997; Meador et al., 2000), such modifications would also appear to be beneficial for early bilingual students in postsecondary educational settings, such as the large auditoriums typically used on college campuses, and in occupational situations. Improvements to room acoustics may be made by replacing hard flooring with carpet, installing acoustic ceiling tiles, placing drapes around windows, placing acoustically treated panels on the walls, and simply keeping the door closed. More costly modifications include lowering ceilings and removing windows. Other nonacoustic modifications, such as improved lighting, placement of the student near the front of the room, and delivery of information in multiple modalities, may also benefit bilingual students of any age. The argument for such modifications is strengthened by the results of Abel et al. (1980), who found a significant negative effect of hearing protection on the intelligibility of speech in noisy environments for nonfluent bilinguals, but not for monolinguals. In particular, these results suggest that attenuators (i.e., hearing protection), maskers, or distractions that have no significant effect on intelligibility for monolinguals may have significant detrimental effects for bilinguals; further investigation of the effects of any potentially disadvantageous environment on perception by bilinguals is therefore warranted.
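The benefit of such acoustic treatments can be sketched with the classic Sabine approximation, RT60 ≈ 0.161·V/A, where V is the room volume in cubic meters and A is the total absorption (surface area times absorption coefficient, summed over surfaces). The room dimensions and absorption coefficients below are invented but typical mid-frequency values, not measurements from this or any cited study.

```python
# Sabine estimate of reverberation time for a bare 10 m x 8 m x 3 m room,
# before and after treatments like those suggested above. Coefficients
# are typical mid-frequency (500 Hz) values; an occupied, furnished room
# would have additional absorption.
def rt60(volume_m3, surfaces):
    """surfaces: list of (area_m2, absorption_coefficient) pairs."""
    total_absorption = sum(area * alpha for area, alpha in surfaces)
    return 0.161 * volume_m3 / total_absorption

volume = 10 * 8 * 3  # 240 m^3
floor, ceiling, walls = 80.0, 80.0, 108.0  # surface areas in m^2

untreated = rt60(volume, [(floor, 0.03),    # hard flooring
                          (ceiling, 0.05),  # painted gypsum
                          (walls, 0.05)])   # painted walls
treated = rt60(volume, [(floor, 0.30),      # carpet
                        (ceiling, 0.60),    # acoustic ceiling tile
                        (walls, 0.15)])     # drapes/absorptive panels
print(f"untreated: {untreated:.2f} s, treated: {treated:.2f} s")
```

For these assumed values the estimated reverberation time drops from roughly 3.3 s in the bare room to about 0.4 s after treatment, below the 0.6 s maximum often recommended for classrooms.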

The results of this study may also have implications for the management of hearing loss among older early bilinguals. Although it is known that age and hearing loss impair speech perception in adverse listening environments, the effects of noise and reverberation on older early bilinguals have not been investigated. The perception of speech in reverberation becomes more challenging as adults age (Nabelek & Robinson, 1982) and experience hearing loss (Helfer & Wilber, 1990; Nabelek & Pickett, 1974). As speech processing becomes more effortful for elderly monolinguals, the detrimental effects of increasing task demand and removing semantic context have also been found to be greater for this population, compared to younger monolinguals (Pichora-Fuller et al., 1995).

The results of the present study, when considered with the findings of poorer speech perception and greater effects of task demand among elderly monolingual listeners in real-world listening conditions, suggest that aging and hearing loss may have an even greater negative impact on the speech understanding of early bilinguals than on that of monolinguals. That is, older bilinguals may face increased processing demand on two fronts, both sensory (from declining auditory processing abilities) and cognitive (from the need to manage two language systems), thus depleting attentional resources needed for speech perception more than for older monolinguals. Thus, future studies examining speech perception by older early bilinguals could aid the diagnosis and rehabilitation of auditory disorders among this population. It will be important for such studies to include data on such factors as language proficiency, demand for use, and amount of use of each language in everyday contexts.

A final consideration for aging early bilinguals may be in the area of counseling, particularly as it regards the performance of bilinguals with little or no foreign accent, who may not be recognized as bilingual by their interlocutors. Pending the results of studies suggested above, the clinician might consider counseling elderly hearing-impaired early bilinguals that they may experience greater difficulties than their monolingual peers in understanding speech in difficult environments.

APPENDIX A

PARTICIPANT QUESTIONNAIRE

ACKNOWLEDGMENTS

Portions of this paper were presented at the 2004 convention of the American Academy of Audiology in Salt Lake City, UT. We gratefully acknowledge the assistance of Ms. Teresa DeMasi in the preparation of stimuli for accentedness rating.

References

Abel, S. M., Alberti, P. W., & Riko, K. (1980). Speech intelligibility in noise with ear protectors. Journal of Otolaryngology, 9, 256–265.
Besing, J., & Koehnke, J. (1995). A test of virtual auditory localization. Ear and Hearing, 16, 220–229.
Besing, J., Koehnke, J., Fedor, A., Lister, J., & Febo, D. (2001). Evaluating listeners with normal and impaired hearing on clinical tests of spatial localization and speech intelligibility gain. Association for Research in Otolaryngology Abstracts, 395, 111.
Bialystock, E. (1988). Levels of bilingualism and levels of linguistic awareness. Developmental Psychology, 24, 560–567.
Bialystock, E. (1997). Effects of bilingualism and biliteracy on children's emerging concepts of print. Developmental Psychology, 33, 429–440.
Bialystock, E. (2002). Acquisition of literacy in bilingual children: A framework for research. Language Learning, 52, 159–199.
Bialystock, E., Craik, F. I. M., Klein, R., & Viswanathan, M. (2004). Bilingualism, aging, and cognitive control: Evidence from the Simon task. Psychology and Aging, 19, 290–303.
Bialystock, E., & Martin, M. M. (2004). Attention and inhibition in bilingual children: Evidence from the dimensional card sort task. Developmental Science, 7, 325–339.
Bialystock, E., Shenfield, T., & Codd, J. (2000). Languages, scripts, and the environment: Factors in developing concepts of print. Developmental Psychology, 36, 66–76.
Crandell, C., & Smaldino, J. (1996). Speech perception in noise by children for whom English is a second language. American Journal of Audiology, 5, 47–51.
Crandell, C., & Smaldino, J. (2000). Classroom acoustics for children with normal hearing and with hearing impairment. Journal of Language, Speech, and Hearing Services in Schools, 31, 362–370.
Dethloff, C., Besing, J., & Koehnke, J. (1998). Effects of presentation method on virtual speech intelligibility in noise. Paper presented at the Meeting of the American Speech–Language and Hearing Association, San Antonio, TX.
Egan, J. P. (1948). Articulation testing methods. Laryngoscope, 58, 955–991.
Flege, J. E. (1987). The production of "new" and "similar" phones in a foreign language: Evidence for the effect of equivalence classification. Journal of Phonetics, 15, 47–65.
Flege, J. E. (1995). Second-language speech learning: Theory, findings, and problems. In W. Strange (Ed.), Speech perception and linguistic experience: Issues in cross-linguistic research (pp. 229–273). Timonium, MD: York Press.
Grosjean, F. (1997). Processing mixed language: Issues, findings, and models. In A. M. B. de Groot & J. F. Kroll (Eds.), Tutorials in bilingualism: Psycholinguistic perspectives (pp. 225–253). Hillsdale, NJ: Erlbaum.
Helfer, K., & Huntley, R. (1991). Aging and consonant errors in reverberation and noise. Journal of the Acoustical Society of America, 90, 1786–1796.
Helfer, K., & Wilber, L. (1990). Hearing loss, aging, and speech perception in reverberation and in noise. Journal of Speech and Hearing Research, 33, 149–155.
Hirsch, I., Davis, H., Silverman, S., Reynolds, E., Eldert, E., & Benson, R. (1952). Development of materials for speech audiometry. Journal of Speech and Hearing Disorders, 17, 321–337.
IEEE. (1969). IEEE recommended practice for speech quality measurements. IEEE Transactions on Audio and Electroacoustics, AU-17, 225–246.
Kessler, C., & Quinn, M. E. (1980). Positive effects of bilingualism on science problem-solving abilities. In J. E. Alatis (Ed.), Current issues in bilingual education: Proceedings of the Georgetown Roundtable on Languages and Linguistics (pp. 295–308). Washington, DC: Georgetown University Press.
Kessler, C., & Quinn, M. E. (1987). Language minority children's linguistic and cognitive creativity. Journal of Multilingual and Multicultural Development, 8, 173–186.
Koehnke, J., & Besing, J. (1996). A procedure for testing speech intelligibility in a virtual listening environment. Ear and Hearing, 17, 211–217.
Kormi-Nouri, R., Moniri, S., & Nilsson, L.-G. (2003). Episodic and semantic memory in bilingual and monolingual children. Scandinavian Journal of Psychology, 44, 47–54.
Luce, P. A., Feustel, T. C., & Pisoni, D. B. (1983). Capacity demands in short-term memory for synthetic and natural word lists. Human Factors, 25, 17–32.
MacKay, I. R. A., & Flege, J. E. (2004). Effects of the age of second language learning on the duration of first and second language sentences: The role of suppression. Applied Psycholinguistics, 25, 373–396.
Maxwell, D., & Satake, E. (1997). Research and statistical methods in communication disorders. Baltimore, MD: Williams & Wilkins.
Mayo, L., Florentine, M., & Buus, S. (1997). Age of second-language acquisition and perception of speech in noise. Journal of Speech, Language, and Hearing Research, 40, 686–693.
Meador, D., Flege, J. E., & MacKay, I. R. A. (2000). Factors affecting the recognition of words in a second language. Bilingualism: Language and Cognition, 3, 55–67.
Miller, G., Heise, G., & Lichten, W. (1951). The intelligibility of speech as a function of the context of the speech materials. Journal of Experimental Psychology, 41, 329–335.
Moncur, J., & Dirks, D. (1967). Binaural and monaural speech intelligibility in reverberation. Journal of Speech and Hearing Research, 10, 186–195.
Nabelek, A. (1988). Identification of vowels in quiet, noise, and reverberation: Relationships with age and hearing loss. Journal of the Acoustical Society of America, 84, 476–484.
Nabelek, A., & Donahue, A. (1984). Perception of consonants in reverberation by native and non-native listeners. Journal of the Acoustical Society of America, 75, 632–634.
Nabelek, A., & Mason, D. (1981). Effect of noise and reverberation on binaural and monaural word identification by subjects with various audiograms. Journal of Speech and Hearing Research, 24, 375–383.
Nabelek, A., & Pickett, J. (1974). Monaural and binaural speech perception through hearing aids under noise and reverberation with normal and hearing-impaired listeners. Journal of Speech and Hearing Research, 17, 724–739.
Nabelek, A., & Robinson, P. (1982). Monaural and binaural speech perception in reverberation for listeners of various ages. Journal of the Acoustical Society of America, 71, 1242–1248.
Newman, A., & Hochberg, I. (1983). Children's perception of speech in reverberation. Journal of the Acoustical Society of America, 73, 2145–2148.
Picard, M., & Bradley, J. (2001). Revisiting speech interference in classrooms. Audiology, 40, 221–244.
Pichora-Fuller, M. K., Schneider, B. A., & Daneman, M. (1995). How young and old adults listen to and remember speech in noise. Journal of the Acoustical Society of America, 97, 593–608.
Ransdell, S., Arecco, R., & Levy, C. M. (2001). Bilingual long-term working memory: The effects of working memory loads on writing quality and fluency. Applied Psycholinguistics, 22, 113–128.
Rickard Liow, S. (1999). Reading skill development in bilingual Singaporean children. In M. Harris & G. Hatano (Eds.), Learning to read and write: A cross-linguistic perspective (pp. 196–213). Cambridge: Cambridge University Press.
Southwood, M. H., & Flege, J. E. (1999). Scaling foreign accent: Direct magnitude estimation versus interval scaling. Clinical Linguistics and Phonetics, 13, 335–349.
Steinberg, J. (1929). Effects of distortion on the recognition of speech sounds. Journal of the Acoustical Society of America, 1, 121–137.
Takata, Y., & Nabelek, A. (1990). English consonant recognition in noise and in reverberation by Japanese and American listeners. Journal of the Acoustical Society of America, 88, 663–666.
Therrien, M., & Ramirez, R. (2000). The Hispanic population in the United States: March 2000 (Current Population Reports, P20-535). Washington, DC: US Census Bureau.
US Bureau of the Census. (1990). Census of Population, CPHL-133. Washington, DC: US Department of Commerce.
US Bureau of the Census. (2000). Statistical abstract of the United States. Washington, DC: US Department of Commerce.
US Bureau of the Census. (2001). Supplementary survey profile. Washington, DC: US Department of Commerce.
Von Hapsburg, D., & Peña, E. (2002). Understanding bilingualism and its impact on speech audiometry. Journal of Speech, Language, and Hearing Research, 45, 202–213.
Table 1

Language background and age descriptors as reported by Spanish–English bilingual participants

Figure 1

The number of Noisy W-22 monosyllabic words repeated correctly by (●) monolingual and (■) bilingual listeners at three SNRs (−6, −2, and 0 dB) and in quiet. Standard error bars are plotted for each group and condition.
