Introduction
Presbycusis is a condition associated with ageing; its most pertinent clinical characteristic is difficulty in understanding and discriminating speech sounds (either with or without hearing loss), particularly in noisy and reverberant listening environments.Reference Tremblay, Piskosz and Souza 1 , Reference Van Rooij, Plomp and Orlebeke 2 Reduced hearing sensitivity (particularly above 3 kHz) is a significant contributory factor in reduced speech recognition in people older than 60 years.Reference van Rooij and Plomp 3 – Reference Martini, Comacchio and Magnavita 6
A reduced ability of the ageing auditory system to process spectral and temporal cues in speech at both the subcortical and cortical levels has been attributed to degenerative alterations in the peripheral and central auditory pathways.Reference Ottaviani, Maurizi, D'alatri and Almadori 7 , Reference Pichora-Fuller, Schneider, MacDonald, Pass and Brown 8 Poor discrimination of auditory signals on psychoacoustic testing suggests that central auditory processing is compromised in the older population.Reference Lister, Maxfield, Pitt and Gonzalez 9 , Reference Tremblay, Billings and Rohila 10 However, the results of psychoacoustic tests are affected by cognition and memory, which confound their clinical interpretation.
Hence, auditory-evoked potentials represent an objective test for examining the neural representation of temporal speech cues at the cortical level. Many studies into neural encoding of the spectral and temporal features of speech have reported deficient neural representation of temporal cues at the cortical level in older listeners compared with young adults.Reference Van Rooij, Plomp and Orlebeke 2 , Reference Tremblay and Ross 11 – Reference Oku and Hasegewa 14 Nonetheless, neural processing of speech information may also be impaired at the subcortical level.
Conventional auditory brain stem response (ABR) testing using clicks has demonstrated delayed latencies in geriatric people compared with younger adults, suggesting subcortical involvement in poor speech discriminationReference Rosenhall, Björkman, Pedersen and Kall 15 , Reference Rosenhall, Pedersen and Dotevall 16 ; in contrast, other studies have found minimal or no effect of advancing age on ABR latencies.Reference Jerger and Johnson 17 – Reference Chandrasekaran and Kraus 20 Thus, the results of click-evoked ABRs are contradictory and controversial. Hence, ABRs evoked with speech syllables are likely to be more effective and provide additional information for evaluating neural speech encoding at the subcortical level. Consequently, speech-evoked ABR testing provides an accurate, precise method of describing the neural representation of timing events within a consonant–vowel stimulus.Reference Chandrasekaran and Kraus 20 – Reference Russo, Nicol, Trommer, Zecker and Kraus 22 In this test, transient components of the response indicate the encoding of syllable onset and offset and the sustained portion of the response indicates a neural frequency following response to the fundamental frequency and vowel formants of the syllable.Reference Chandrasekaran and Kraus 20 , Reference Song, Banai, Russo and Kraus 21 , Reference Wible, Nicol and Kraus 23
Autism spectrum disorder patients have abnormal speech-evoked ABRs despite having normal click-evoked ABRs, suggesting that they have inadequate speech transcription in both quiet and noisy conditions.Reference Hall 24 Similarly, an exploratory study into the relationship between click-evoked and speech-evoked ABRs in people with learning disabilities concluded that the responses obtained by the two stimuli reflect separate neural processes and that only the processes involved in speech encoding are altered in children with learning disabilities.Reference Wible, Nicol and Kraus 23
Therefore, it has been suggested that impaired speech processing at the brainstem level may be an objective indicator of alterations in the physiological mechanisms responsible for deficient and abnormal speech perception in children and adults, including the geriatric population.Reference Krishnan 25 This indicator can differentiate between poor discrimination caused by subcortical and cortical contributions, which may help identify appropriate amplification strategies to resolve discrimination problems and thus provide informative counselling and promote realistic expectations of amplification in older adults.
However, very few studies have investigated speech stimulus processing at the subcortical level. Hence, there is a need for more research into neural speech encoding at this level. The current study aimed to record speech-evoked ABRs and describe speech processing in a geriatric population with clinically normal hearing to establish a routine tool for clinical decision-making.
Materials and methods
Study design
This prospective survey was conducted at the Electrophysiological Laboratory, Department of Audiology, Ali Yavar Jung National Institute for the Speech & Hearing Disabilities (Divyangjan). Ethical approval was obtained from the institutional review board and informed consent was obtained from all participants.
Participants
A total of 50 adults of both sexes were recruited into the study and separated into two groups: a young adult group (n = 25; age range 18–25 years; mean ± standard deviation (SD) 21.3 ± 3.2 years) and a geriatric group (n = 25; age range 60–75 years; mean ± SD 66.1 ± 6.2 years). All participants were right-handed with no known history of neurological disease, otological disease, trauma or psychiatric problems. All participants had pure-tone thresholds of 25 dB HL or below at octave frequencies from 0.25 kHz to 8 kHz in both ears and had normal middle-ear function on immittance evaluation.
Further, inclusion of participants in the young adult group also required a wave V latency within the range of normative values (5.41–5.96 ms; mean ± 1.5 SD) in response to a 100 μs click presented at 45 dB nHL in rarefaction polarity to the right ear at a rate of 21.1 clicks per second, and a speech identification score of at least 90 per cent at 40 dB SL in both ears. The geriatric group was the study group and the young adult group served as the reference group in all analyses.
Stimulus and recording parameters
A Hindi stop voiced phoneme of the consonant–vowel combination |da| of 40 ms duration and comprising five formants was synthesised: the initial noise burst was 10 ms and the formant transition between the consonant and vowel was 30 ms. The fundamental frequency (F0) and the first three formants (F1, F2, F3) change linearly over the duration of the stimulus: F0, 0.113–0.147 kHz; F1, 0.24–0.77 kHz; F2, 1.67–1.35 kHz; and F3, 2.68–2.55 kHz. Formants F4 and F5 remain constant at 3.70 and 4.60 kHz, respectively. Although the stimulus does not contain any steady-state portion of vowel, it is still perceived as the syllable |da|.
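The linear trajectories described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `linear_track` is hypothetical, frequencies are expressed in Hz, and only F0 and F1 are shown. It is not the synthesis procedure used in the study.

```python
def linear_track(start_hz: float, end_hz: float, t_ms: float,
                 duration_ms: float = 40.0) -> float:
    """Frequency at time t_ms for a parameter that changes linearly
    from start_hz to end_hz over the stimulus duration."""
    if not 0.0 <= t_ms <= duration_ms:
        raise ValueError("t_ms must lie within the stimulus duration")
    return start_hz + (end_hz - start_hz) * t_ms / duration_ms


# F0 and F1 endpoints taken from the stimulus description (converted to Hz).
F0_START_HZ, F0_END_HZ = 113.0, 147.0
F1_START_HZ, F1_END_HZ = 240.0, 770.0

# Print the two trajectories at 10 ms steps (step size chosen for illustration).
for t in (0.0, 10.0, 20.0, 30.0, 40.0):
    f0 = linear_track(F0_START_HZ, F0_END_HZ, t)
    f1 = linear_track(F1_START_HZ, F1_END_HZ, t)
    print(f"t = {t:4.1f} ms  F0 = {f0:6.1f} Hz  F1 = {f1:6.1f} Hz")
```

The same function could be applied to any other parameter that changes linearly over the stimulus by supplying its start and end values.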
With participants comfortable on a reclining chair, speech-evoked ABRs were recorded through Ag–AgCl electrodes with a surface contact impedance of less than 5 kΩ positioned centrally on the scalp at Cz (high forehead placement), behind the right mastoid (reference) and on the forehead (ground). Stimuli were applied to the right ear at a rate of 11.1 per second at a comfortable listening level of 65 dB SL relative to the threshold at 1 kHz in alternating polarity through ER-3 insert earphones (Etymotic Research, Elk Grove Village, Illinois, USA).
Two blocks of 2000 sweeps were collected in quiet conditions and noisy conditions (ipsilateral white Gaussian noise, +5 dB signal-to-noise ratio). The sampling rate was 20 kHz and responses were band-pass filtered online from 0.1 to 3 kHz at 12 dB per octave. Trials with eye blinks or other motion artefacts greater than ± 35 μV were subject to online automatic rejection during the recording. The recording window was 50 ms, starting 10 ms prior to stimulus onset. Waveforms were averaged online using SmartEP software version 2.39 (Intelligent Hearing Systems, Miami, Florida, USA).
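The recording parameters above imply a fixed number of samples per sweep and a simple reject-then-average rule. The following Python sketch illustrates that logic under stated assumptions: the function names are hypothetical, sweeps are plain lists of amplitudes in μV, and the code is not the algorithm implemented in the SmartEP software, whose internals are not described here.

```python
from typing import List, Optional

SAMPLING_RATE_HZ = 20_000   # sampling rate stated in the recording parameters
WINDOW_MS = 50              # recording window
PRESTIM_MS = 10             # window starts 10 ms before stimulus onset
REJECT_UV = 35.0            # artefact rejection criterion (plus or minus, in microvolts)


def samples_in(duration_ms: int, rate_hz: int = SAMPLING_RATE_HZ) -> int:
    """Number of samples covering duration_ms at rate_hz."""
    return rate_hz * duration_ms // 1000


def average_sweeps(sweeps: List[List[float]],
                   reject_uv: float = REJECT_UV) -> Optional[List[float]]:
    """Average sweeps point by point, discarding any sweep whose absolute
    amplitude exceeds reject_uv at any sample. Returns None if every sweep
    is rejected."""
    kept = [s for s in sweeps if max(abs(v) for v in s) <= reject_uv]
    if not kept:
        return None
    n = len(kept)
    return [sum(column) / n for column in zip(*kept)]


# Toy example with three very short "sweeps"; the third exceeds the criterion.
toy_sweeps = [[1.0, 2.0], [3.0, 4.0], [40.0, 0.0]]
print(samples_in(WINDOW_MS), samples_in(PRESTIM_MS))
print(average_sweeps(toy_sweeps))
```

With these settings each 50 ms sweep holds 1000 samples, of which the first 200 precede stimulus onset.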
Data analysis
The obtained waveforms were labelled V, A, C, D, E, F, and O. Waves V and A reflect the onset of the response, wave C the transition response, waves D, E and F the periodic response (i.e. the frequency-following response), and wave O the offset of the response. The absolute latencies and amplitudes of these waves were determined. The V-A complex latency, amplitude, area (VA) and slope (V/A = VA amplitude ÷ VA duration) were determined.
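The V-A complex measures follow directly from the two peak picks. The sketch below is a minimal illustration, assuming that VA amplitude is taken as the peak-to-trough difference (V minus A) and VA duration as the latency difference (A minus V); the function name and the example values are hypothetical and are not data from this study. The area measure is omitted because its computation is not specified in the text.

```python
from typing import Tuple


def va_complex(v_latency_ms: float, v_amplitude_uv: float,
               a_latency_ms: float, a_amplitude_uv: float
               ) -> Tuple[float, float, float]:
    """Return (duration_ms, amplitude_uv, slope_uv_per_ms) for the V-A complex.

    Assumptions: amplitude is peak V minus trough A, duration is the latency
    of A minus the latency of V, and slope is amplitude divided by duration.
    """
    duration_ms = a_latency_ms - v_latency_ms
    if duration_ms <= 0:
        raise ValueError("wave A must follow wave V")
    amplitude_uv = v_amplitude_uv - a_amplitude_uv
    slope = amplitude_uv / duration_ms
    return duration_ms, amplitude_uv, slope


# Illustrative peak picks only (not values from the study).
duration, amplitude, slope = va_complex(6.5, 0.25, 7.5, -0.25)
print(duration, amplitude, slope)
```

Under this convention, a delayed wave A or a shallower trough both lower the slope, which is why the slope is often read as a combined index of onset timing and synchrony.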
Mean and SD values were calculated for each sample. Independent sample two-tailed t-tests were used to assess differences in speech-evoked response between young adult and geriatric groups in quiet and noisy conditions. Statistical significance was set at a p value of 0.05.
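For readers who wish to reproduce this kind of between-group comparison, the following Python sketch computes the test statistic of an independent-samples t-test. It assumes the pooled-variance (Student) form, which the text does not specify, and the function name and sample values are hypothetical. The two-tailed p value would then be read from the t distribution with the returned degrees of freedom, for example with a statistics package such as SciPy (`scipy.stats.ttest_ind`).

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence, Tuple


def independent_t(a: Sequence[float], b: Sequence[float]) -> Tuple[float, int]:
    """Pooled-variance t statistic and degrees of freedom for two
    independent samples (equal variances assumed)."""
    na, nb = len(a), len(b)
    if na < 2 or nb < 2:
        raise ValueError("each group needs at least two observations")
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))
    return t, df


# Illustrative latencies in ms (not data from the study).
young = [6.5, 6.6, 6.7]
older = [6.9, 7.0, 7.1]
t_value, dof = independent_t(young, older)
print(f"t = {t_value:.3f}, df = {dof}")
```

With two groups of 25 participants, as in this study, the pooled test has 48 degrees of freedom.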
Results
The neurophysiological responses of the brainstem to a click and to a Hindi stop voiced phoneme of the consonant–vowel combination |da| were recorded in all participants in the young adult and geriatric groups.
Click-evoked auditory brainstem response
Statistical analysis of click-evoked ABRs of both groups demonstrated that the latencies of waves I, III and V were slightly delayed in the geriatric group compared with the young adult group, although all values were within the normal range according to Hall's normative data (Table I): wave I, 1.54 ± 0.10 ms; wave III, 3.70 ± 0.15 ms; wave V, 5.60 ± 0.19 ms; I–III inter-peak latency, 2.20 ± 0.16 ms; III–V inter-peak latency, 1.84 ± 0.17 ms; and I–V inter-peak latency, 4.04 ± 0.18 ms.Reference Russo, Nicol, Musacchia and Kraus 26
Data are means ± standard deviation. *p < 0.05
Absolute latencies were slightly delayed in the geriatric group compared with the young adult group. This difference was significant for wave V but not for waves I and III. The inter-peak latency values were within the normal range for all participants, suggesting that auditory brainstem transmission in response to a click stimulus is intact in both groups.
Speech-evoked auditory brainstem response
The waveforms obtained in response to the stimulus structure comprising the consonant and vowel portions of the syllable |da| in quiet conditions indicated that the onset (peaks V and A) and the transition (peak C) were observed in all participants (100 per cent), whereas the frequency-following response (peaks D, E and F) was present in 93.3 per cent and 84 per cent, respectively, and the offset response (peak O) was detected in 99 per cent and 79 per cent, respectively, of participants in the young adult and geriatric groups.
Upon the introduction of noise at a +5 dB signal-to-noise ratio, onset peaks V and A, peak C and the sustained portion (peaks D, E and F) were present in 86 per cent, 77 per cent and 69 per cent, respectively, of the young adult group, and in 53 per cent, 45 per cent and 23 per cent, respectively, of the geriatric group. Transient waves V and A, as well as C, were particularly affected: these were severely degraded and completely obscured in more than 53 per cent of participants in the geriatric group. Waves V and A were present in most participants (86 per cent and 76 per cent, respectively) in the young adult group.
Latencies and amplitudes of discrete peaks
Speech-evoked ABR waveform latencies, the amplitudes of discrete peaks (V, A, C, D, E, F and O), and the latency, amplitude, area and slope between waves V and A (i.e. the V/A complex) were calculated for all participants in the young adult and geriatric groups in quiet and noisy conditions. The absolute mean latencies and amplitudes of peaks for both groups in quiet and noisy conditions are presented in Table II (including mean ± SD for discrete peak latencies, amplitudes and slope in both conditions).
Data are means ± standard deviation. *p < 0.05.
Analysis of variance showed that both noise condition (F(1, 47) = 84.12, p < 0.05) and group (F(1, 50) = 19.07, p < 0.05) had significant effects on the latency and amplitude of speech-evoked ABRs. A paired sample t-test showed a significantly delayed mean latency and a significantly reduced mean amplitude in the geriatric group.
There was a significant interaction between study group and noisy conditions (F(1, 47) = 5.49, p < 0.05), indicating that the effect of noisy conditions differed between groups. Data were more variable among participants in the geriatric group than among participants in the young adult group in noisy conditions. The latencies of waves V, A and O in noisy conditions exhibited significant between-group differences (p < 0.05). The V, A, O and V-A amplitudes differed significantly between groups, as did the slope and area of the V/A complex (p < 0.05). The differences in the magnitudes of the V and A amplitudes and V/A slope between the young adult group and the geriatric group in noisy conditions are shown in Figure 1. The latency of peak V was significantly increased in the speech-evoked ABR compared with the click-evoked ABR for the geriatric group (p < 0.05).
Speech stimulus evoked auditory brainstem response: sustained portion
For all participants, responses to formant transitions of the stimulus were analysed using fast Fourier transform and root mean square measures in noisy and quiet conditions. These measures provide information about the overall magnitude of sustained neural activity and the phase-locking capabilities of the neural population in the auditory system.Reference Chandrasekaran and Kraus 20 , Reference Kraus and Nicol 27 – Reference Vander Werff and Burns 29
Fast Fourier transform was performed to quantify the frequency-following response, and the magnitude of the neural response over the entire period of the stimulus was measured as the root mean square amplitude. Timing of the frequency-following response is indicated by the stimulus-to-response (S–R) correlation, and the magnitude of the response was evaluated with root mean square, F0, F1 and high frequency amplitudes (shown in Table III).
Data are means ± standard deviation. RMS = root mean square; F0 = fundamental frequency; F1 = first formant; HF = high frequency. *p < 0.05.
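The sustained-response measures named above (root mean square amplitude, spectral amplitude and S–R correlation) can be illustrated with a short Python sketch. It is a simplified illustration under stated assumptions and not the analysis pipeline used in the study: the function names are hypothetical, the spectral amplitude is evaluated at a single frequency rather than averaged over a band, the S–R correlation is taken as the largest Pearson correlation over a caller-supplied range of lags, and the test signal is a synthetic 100 Hz sine.

```python
import cmath
import math
from typing import Sequence, Tuple


def rms(x: Sequence[float]) -> float:
    """Root mean square amplitude of a waveform."""
    return math.sqrt(sum(v * v for v in x) / len(x))


def spectral_amplitude(x: Sequence[float], freq_hz: float, rate_hz: float) -> float:
    """Amplitude of the component at freq_hz (single-frequency Fourier sum)."""
    acc = sum(v * cmath.exp(-2j * math.pi * freq_hz * k / rate_hz)
              for k, v in enumerate(x))
    return 2 * abs(acc) / len(x)


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return 0.0 if sxx == 0 or syy == 0 else sxy / math.sqrt(sxx * syy)


def best_lag_correlation(stim: Sequence[float], resp: Sequence[float],
                         rate_hz: float, min_lag_ms: float,
                         max_lag_ms: float) -> Tuple[float, float]:
    """Shift the response against the stimulus and return (r, lag_ms)
    for the lag with the highest correlation."""
    first = int(round(min_lag_ms * rate_hz / 1000))
    last = int(round(max_lag_ms * rate_hz / 1000))
    best_r, best_lag = -1.0, first
    for lag in range(first, last + 1):
        n = min(len(stim), len(resp) - lag)
        if n < 2:
            break
        r = pearson(stim[:n], resp[lag:lag + n])
        if r > best_r:
            best_r, best_lag = r, lag
    return best_r, best_lag * 1000 / rate_hz


# Synthetic demonstration: a 100 Hz sine "stimulus" sampled at 20 kHz for
# 40 ms, and a "response" that is the same signal delayed by 7 ms.
RATE_HZ = 20_000
stimulus = [math.sin(2 * math.pi * 100 * k / RATE_HZ) for k in range(800)]
response = [0.0] * 140 + stimulus[:660]
print(rms(stimulus), spectral_amplitude(stimulus, 100.0, RATE_HZ))
print(best_lag_correlation(stimulus, response, RATE_HZ, 5.0, 10.0))
```

In this toy case the lag search recovers the 7 ms delay that was built into the synthetic response; with real recordings the lag range and the frequency bands would need to be chosen to match the stimulus and the expected neural conduction delay.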
The addition of noise at a +5 dB signal-to-noise ratio obscured onset peaks in the responses of many participants, but participants in the geriatric group had the most severely affected responses. These participants had significantly delayed latencies and reduced amplitudes for waves D, E and O. The composite frequency-following response was almost indiscernible in noisy conditions in most participants in the geriatric group. The root mean square amplitude and S–R correlations showed significant reductions in noisy conditions (p < 0.05).
The spectral magnitude of F0, F1 and the high frequency component were also significantly affected by the presence of noise in the older population (p < 0.05). There was a significant difference in representation at the brainstem level in older people in noisy conditions (shown in Figure 2).
Discussion
The current study investigated age-related alterations in subcortical neural encoding of speech features by comparing neurophysiological responses to both click and speech stimuli in young adults and geriatric people with normal hearing.
Click-evoked auditory brainstem response
Significantly longer wave V latency values were noted in older than in younger adults. However, the inter-peak latency values were within the normal limits in both groups. This finding suggests that there are subtle differences in brainstem neural timing from the auditory nerve to the inferior colliculus in older and younger adults.Reference Rosenhall, Björkman, Pedersen and Kall 15 – Reference Jerger and Johnson 17 This is consistent with reports suggesting subcortical involvement in poor speech discrimination in older adults due to impaired neural processing.
However, in 1991, both Martini et al. and Ottaviani et al. reported contradictory results.Reference Martini, Comacchio and Magnavita 6 , Reference Ottaviani, Maurizi, D'alatri and Almadori 7 These researchers concluded that click-evoked ABR latency differences between younger and older groups result from audiometric threshold differences rather than impaired neural processing in the central auditory system. Thus, click-evoked ABRs are probably affected by confounding factors and may therefore be unsuitable for identifying possible abnormalities in the subcortical portion of the auditory pathway in older people.
Speech-evoked auditory brainstem response
Transient peak latency and amplitude
There were significant differences in the absolute mean latencies and amplitudes of peaks between groups. Latencies of the V and A waves were significantly longer and amplitudes of the A, C and O waves were reduced in the geriatric population compared with younger adults. The latency, amplitude and slope of the VA complex were reduced in older adults. The transient onset and offset components of the speech-evoked ABR were recently reported to have longer latencies and reduced amplitudes with advancing age.Reference Vander Werff and Burns 29
Sustained portion peaks latency and amplitude
All peak latencies within the sustained portion (frequency-following response) of the speech-evoked ABR in noisy conditions were significantly different between groups. There were significant between-group differences in stimulus–response (lag) timing and in spectral amplitudes related to the F0, F1 and high frequency components of the frequency-following response. This finding is consistent with studies into speech-evoked ABRs in children with language-based learning, auditory processing and phonological processing problems who had poor speech perception and discrimination ability attributable to a reduced subcortical capacity to precisely encode dynamic temporal–spectral features of speech.Reference Russo, Nicol, Trommer, Zecker and Kraus 22 , Reference Wible, Nicol and Kraus 23
Degenerative changes and increased susceptibility to neural de-synchronisation in the ageing auditory system have been reported.Reference Vander Werff and Burns 29 Therefore, deficits in response generator synchronisation (amplitude differences) and nervous signalling transmission velocity (latency differences) due to structural changes in ageing subcortical structures may affect sensorineural coding of speech stimuli and thus limit the effectiveness of acoustic information processing at the cortical level, resulting in poor speech perception in older people.
• Speech-evoked auditory brainstem response testing provides important information and quantifiable measures about the neural mechanisms responsible for normal and altered auditory function
• This test is important for investigating the auditory processes involved in speech discrimination deficits in elderly people
• Test findings may be critical for designing effective rehabilitation strategies for elderly people with auditory discrimination deficits
It is therefore likely that speech-evoked ABRs can detect degenerative age-related alterations and neural de-synchronicity at the subcortical level in the geriatric population. However, it is not possible to draw firm conclusions about speech-evoked ABR utility in older individuals because speech recognition was not assessed in this study. Hence, it is difficult to link age-related differences in speech-evoked ABR measures to difficulties in understanding speech in older adults. Therefore, the potential correlation between reduced speech perception abilities and speech-evoked ABR measures should be investigated in older people.
In addition, further study into the relationship between neural processing of speech at the brainstem and higher processing levels in older adults could be informative for understanding the underlying mechanism of age-related auditory processing difficulties. This knowledge could lead to objective diagnostic tests, as well as techniques to determine appropriate intervention strategies and to monitor their effectiveness in the elderly population.
Conclusion
This study found that the early stages of auditory pathway processing of speech stimuli differ between geriatric and younger adults. The geriatric population had longer latencies of transient peaks and smaller spectral magnitudes of the higher frequency components, reflecting reduced synchronous neural activity in the geriatric population in response to the rapidly changing features of an acoustic stimulus.
These results suggest that geriatric individuals have a general reduction in synchronous neural firing in response to transient speech information at the onset of a speech syllable in noisy conditions. Impaired timing of the neural response to the offset of the stimulus may partly explain the reduced ability of the ageing auditory system to encode the temporal features of speech. Therefore, analysis of speech-evoked ABRs may provide insight into the biological processes influencing speech processing in the geriatric population.
However, studies that directly assess speech perception and temporal processing in the elderly and compare performance of these tasks with speech-evoked ABRs are needed to determine the clinical relevance of assessing individuals with difficulties in understanding speech. Previous reports of the brainstem response to speech stimulus in populations with poor speech discrimination and understanding highlight the importance of performing similar studies in the elderly.