Temporal acoustic correlates of the voicing contrast in European Portuguese stops

Published online by Cambridge University Press:  24 November 2010

Marisa Lousada
Affiliation:
Escola Superior de Saúde da Universidade de Aveiro, Universidade de Aveiro, Portugal marisalousada@ua.pt
Luis M. T. Jesus
Affiliation:
Escola Superior de Saúde da Universidade de Aveiro & Instituto de Engenharia Electrónica e Telemática de Aveiro, Universidade de Aveiro, Portugal lmtj@ua.pt
Andreia Hall
Affiliation:
Departamento de Matemática da Universidade de Aveiro e Centro de Investigação e Desenvolvimento em Matemática e Aplicações, Portugal andreia.hall@ua.pt

Abstract

This study focuses on the temporal analysis of stops /p b t d k ɡ/ and devoicing analysis of voiced stops /b d ɡ/ produced in different word positions by six native speakers of European Portuguese. The study explores acoustic properties related to voicing. The following acoustic properties were measured: voice onset time (VOT), stop duration, closure duration, release duration, voicing into closure duration, duration of the preceding vowel and duration of the following vowel. Results suggested that when [b d ɡ] were devoiced, the acoustic properties stop duration, closure duration, duration of the following vowel, duration of the preceding vowel and duration of voicing into closure were relevant for the voicing distinction. Implications for research and practice in speech and language therapy are discussed. Further investigation is needed to find how the productions analysed in the present study were perceived by listeners, specifically productions of devoiced stops.

Type
Research Article
Copyright
Copyright © International Phonetic Association 2010

1 Introduction

There have been many studies of stop voicing distinctions (Caramazza & Yeni-Komshian 1974, Klatt 1975, Luce & Charles-Luce 1985, Cho & Ladefoged 1999, Brunner, Fuchs, Perrier & Kim 2003, van Alphen & Smits 2004), showing that the properties that are relevant vary across languages. However, there are few studies of European Portuguese (EP), none of which analyse the different stops in word-final position (Andrade 1980, Viana 1984, Veloso 1995, Castro & Barbosa 1996). This is a study of the voicing distinction and other production characteristics in EP using new detailed temporal descriptions and devoicing criteria.

In a cross-linguistic study of stops in word-initial position, Lisker & Abramson (1964) found three categories of voice onset time (VOT): voicing lead (voiced), short lag (voiceless unaspirated) and long lag (voiceless aspirated). Keating, Linker & Huffman (1983) examined 51 languages and showed that the voiceless unaspirated category is the most common: almost all the languages studied used it. The voiced and voiceless aspirated categories appeared equally frequently as the voicing category contrasting with the voiceless unaspirated category. Keating et al. (1983) also observed that the use of these different VOT categories varied as a function of word position in many languages.

Andrade (1980) compared the VOT of homorganic stops, in initial position before a vowel, in words produced by a speaker of EP. Results showed that some voiced stops had a period of prevoicing (between 120 and 130 ms) followed by a devoiced period (between 10 and 20 ms). Results also showed that the VOT was larger for velars than for labials and dentals, as in English (Klatt 1975, Cho & Ladefoged 1999). Viana (1984) concluded that [b d ɡ] were sometimes devoiced in EP. Viana (1984) and Veloso (1995) observed that stop duration and duration of the following vowel were acoustic properties that cued voicing in EP.

Voicing of stops produced by speakers of various other languages has long been studied. Caramazza & Yeni-Komshian (1974) concluded that in Canadian French more than 58% of the voiced tokens were produced without prevoicing. Luce & Charles-Luce (1985) suggested that vowel duration was the most reliable correlate of voicing for stops in word-final position. The mean vowel duration was longer for words ending in voiced stops (177 ms) than those ending in voiceless stops (122 ms). Brunner et al. (2003) suggested other acoustic properties related to the voicing distinction for Korean velar stops in medial position, namely closure duration, duration of the following vowel, duration of the preceding vowel and voicing into closure duration. Van Alphen & Smits (2004) showed that stops were often devoiced in Dutch and that there were multiple acoustic properties related to the voicing distinction, e.g. the duration of prevoicing for both labial and alveolar stops, F0 movement for labials and the spectral centre of gravity of the burst for alveolars.

Vowel devoicing and deletion in EP, as well as the effect of deletion on the neighbouring segments, have also been studied for more than a century (Andrade 1994). Andrade (1993, 1995) presented acoustic and perceptual studies of CC stop clusters with identical and different places of articulation, as well as with and without an underlying vowel. Those studies provide important information on the effects of vowel devoicing and deletion on the temporal and spectral characteristics of the consonants. Moreover, in a sequence of two words, when the last syllable of the first word is identical or very similar to the first syllable of the second word, the final vowel of the former may be deleted and the two consonants may become a geminate or reduce to just one consonant. This phenomenon, known as syllable degemination (or haplology), was originally studied by Sá Nogueira (1938, 1941) and more recently re-analysed by Frota (2000) in the Prosodic Phonology framework. Frota (2000) showed, based on the analysis of spectrograms and auditory tests, that a C1V1C2V2 sequence (see Footnote 1) reduces to C2V2 at prosodic word boundaries within a phonological phrase, to C1C2V2 at a phonological phrase boundary and remains as such at intonational phrase boundaries, where reduction is most disfavoured/blocked.

The principal aim of this study is to contribute to the knowledge of the acoustic properties related to the voicing distinction of stop consonants /p b t d k ɡ/. So far, there have been no journal studies published about the production of EP stops, so there is a lack of reference data for normal cross-language studies, and no baseline data that can be used by Portuguese speech and language therapists in their clinical practice (e.g. VOT values could be used in the differential diagnosis of different types of dysarthria, as reported for other languages; Morris 1989, Ackermann & Hertrich 1997; see Footnote 2).

In this study, the stops were produced in all word positions (initial, medial and final), as there is no research that has analysed EP stops in word-final position. An exhaustive temporal analysis was conducted (an EGG signal was used to determine voicing onset/offset) to obtain different acoustic properties (VOT, stop duration, closure duration, release duration, voicing into closure duration, duration of the preceding vowel and duration of the following vowel) in all word positions. Also, criteria that have been previously used for devoicing analysis of fricatives (Jesus & Shadle 2002, 2003; Jesus & Jackson 2008) were adapted to the study of stops.

2 Method

2.1 Recording and annotation

A corpus of 54 EP real words containing six stops, /p b t d k ɡ/, was recorded using a Philips SBC ME 400 unidirectional condenser microphone located 20 cm in front of the subject's mouth. An electroglottograph (EGG) signal was also collected using an EGG processor (model EG-PC3 produced by Tiger DRS, Inc., USA). The acoustic and EGG signals were pre-amplified (Rane MS 1-b) and recorded with a Sony PCM-R300 DAT recorder, each with 16 bits and a sampling frequency of 48 kHz.

The corpus contained an equal number of words (eighteen) with stops in three positions: initial position, followed by the vowels /a/, /i/ and /u/; medial position, preceded by the vowels /a/, /i/ and /u/ and followed by the vowel /ɐ/; and final position, preceded by the vowels /ɔ/ and /a/. The words were produced within the frame sentence Diga ___ por favor (see Footnote 3). The words in the corpus are listed in the Appendix. The subjects were six native speakers of EP: three men – LJ (second author), HR and PA (aged 25–34 years) – and three women – ML (first author), IM and SC (aged 24–42 years) – all without any speech, language or hearing problems. LJ, HR, ML and IM lived in Aveiro, and PA and SC lived in Porto. In these regions, due to dialect characteristics, voiced stop consonants are often produced as non-strident continuants (fricated stops; Viana 1984, Cruz-Ferreira 1999). We used six stops in three word positions, produced by six speakers in a frame sentence.

The corpus was manually segmented with Adobe Audition 3.0. The words were then analysed using the Speech Filing System (SFS) version 4.7/Windows (Huckvale et al. 1987). The acoustic events that were annotated are listed here (see Figures 1 and 2).

  1. The beginning of a preceding vowel (IV1) was defined using the following criterion: the instant in time at which the second formant intensity becomes characteristic for a vowel in the spectrogram (Brunner et al. 2003; see Footnote 4). Figure 1 shows an example where this criterion was used (tokens where the EGG signal was clearly periodic before F2 onset). When the EGG signal was not periodic before F2 onset, IV1 was marked where the periodic signal begins both in the acoustic and EGG signals.

  2. The end of a preceding vowel and beginning of closure (IO) was marked where the second formant was no longer visible in the spectrogram (Brunner et al. 2003). It was always possible to determine IO because the words were produced within a frame sentence, so even word-initial stops were preceded by a word that ended with a vowel.

  3. The beginning of prevoicing (IPV) was defined as the instant in time at which evidence of vocal fold vibration could be observed in the acoustic and EGG signals (van Alphen & Smits 2004).

  4. The voice offset (FV) was marked at the point where the periodic signal ends (the vocal folds ceased to vibrate) in the acoustic and EGG signals, as can be seen in Figure 1.

  5. The end of prevoicing (FPV) was defined as the instant in time where the burst started.

  6. The end of closure and the beginning of the release (IR) was identified by a sudden peak in the waveform and a vertical bar in the spectrogram. When there were multiple bursts, the one with the highest intensity was chosen, as it is believed to correspond to the actual opening (Fuchs 2005).

  7. The end of release and beginning of the following vowel (FR) was marked where the second formant amplitude begins in the spectrogram (Brunner et al. 2003) or where the periodic signal begins both in the acoustic and in the EGG signals.

  8. The end of the following vowel (FV2) was set where the second formant was no longer visible in the spectrogram (Brunner et al. 2003).

Figure 1 Waveform and its spectrogram, and EGG signal of the VCV sequence in the word [ˈnapɐ] ‘sheepskin’ produced by speaker PA.

Figure 2 Waveform and its spectrogram, and EGG signal of the first CV sequence in the word [ˈbufu] ‘owl’ produced by speaker PA.

In the annotation files, we also registered the position in the word (initial, medial and final) and the type of voicing according to new criteria based on those previously proposed for fricatives by Jesus & Shadle (2002):

  • When voicing was present during less than a third of the closure interval, the stop was classified as devoiced.

  • When voicing lasted between a third and a half of the closure duration, the stop was classified as partially devoiced.

  • When the duration of voicing was greater than half the closure duration, the stop was classified as voiced (see Footnote 5).
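The three criteria above can be sketched as a small classification function. This is only an illustration in Python, not the authors' implementation: the function name, the argument names and the handling of the exact boundary values (voicing of exactly one third or exactly one half of the closure is treated here as partially devoiced) are assumptions.

```python
def classify_voicing(voicing_ms: float, closure_ms: float) -> str:
    """Classify a stop from the duration of voicing during closure.

    Hypothetical helper: names and boundary handling are assumptions,
    not taken from the original study.
    """
    if closure_ms <= 0:
        raise ValueError("closure duration must be positive")
    if voicing_ms < 0:
        raise ValueError("voicing duration cannot be negative")
    # Compare with integer multipliers to avoid rounding issues with 1/3.
    if voicing_ms * 3 < closure_ms:
        return "devoiced"            # voicing < 1/3 of closure
    if voicing_ms * 2 <= closure_ms:
        return "partially devoiced"  # 1/3 <= voicing <= 1/2 of closure
    return "voiced"                  # voicing > 1/2 of closure
```

For example, with a 90 ms closure, 20 ms of voicing would be classified as devoiced and 60 ms of voicing as voiced.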

2.2 Temporal analysis

The following temporal measurements were extracted from the annotation files of the corpus for all speakers:

  1. Duration of preceding vowel: IO-IV1.

  2. Voicing into closure duration: FV-IO (for stops in medial and in final positions).

  3. Closure duration: IR-IO.

  4. Voice onset time (VOT): FR-IR (for voiceless, devoiced and partially devoiced stops) or IPV-IR (for voiced stops).

  5. Release duration: FR-IR (positive VOT).

  6. Duration of following vowel: FV2-FR.

  7. Stop duration: FR-IO.

This list of temporal measures presents some redundancies (e.g. stop duration could be calculated through the duration of all components of the stop), but it allowed us to design an algorithm to automatically extract the durations, using Matlab scripts.
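The extraction step can be illustrated with a minimal sketch. The authors used Matlab scripts that are not reproduced in the paper; the Python version below is an assumption-laden stand-in. The function name, the dictionary format for the annotations (event label mapped to time in ms) and the `fully_voiced` flag are all hypothetical; only the event labels and the subtractions come from the list above.

```python
from typing import Dict, Optional


def extract_durations(labels: Dict[str, float],
                      fully_voiced: bool) -> Dict[str, Optional[float]]:
    """Compute the seven temporal measures from annotated event times (ms).

    `labels` maps event names (IV1, IO, IPV, FV, IR, FR, FV2) to times.
    A measure is None when one of its two events was not annotated,
    e.g. when a stop was produced without a release.
    """
    def diff(end: str, start: str) -> Optional[float]:
        if end in labels and start in labels:
            return labels[end] - labels[start]
        return None

    return {
        "preceding_vowel": diff("IO", "IV1"),
        "voicing_into_closure": diff("FV", "IO"),
        "closure": diff("IR", "IO"),
        # Negative VOT (IPV-IR) for voiced stops, positive (FR-IR) otherwise.
        "vot": diff("IPV", "IR") if fully_voiced else diff("FR", "IR"),
        "release": diff("FR", "IR"),
        "following_vowel": diff("FV2", "FR"),
        "stop": diff("FR", "IO"),
    }
```

With this layout, the redundancy mentioned above is easy to see: whenever IO, IR and FR are all annotated, the stop duration equals the closure duration plus the release duration.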

2.3 Statistical tests

The software package SPSS 10.0 for Windows was used to run the statistical tests for stop durations. We first determined if the distribution was normal, based on the observation of histograms, analysis of QQ-plots of normal distribution and the results of both the Kolmogorov-Smirnov test with Lilliefors correction and Shapiro-Wilk test in initial, medial and final positions. The results rejected normality for the data in medial (p < .05) and final (p < .05) positions, so we used the Mann-Whitney U test to compare the groups (duration of voiceless stops vs. duration of voiced stops). This non-parametric test allows us to compare two groups when the data are not normally distributed (see Footnote 6).
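As an illustration of the group comparison, the sketch below computes the Mann-Whitney U statistic and an approximate two-sided p-value in plain Python. It is not the SPSS procedure used in the study: the function name is hypothetical, and the p-value uses the large-sample normal approximation without tie or continuity correction, so it may differ slightly from the output of a statistics package.

```python
import math
from typing import Sequence, Tuple


def mann_whitney_u(sample_a: Sequence[float],
                   sample_b: Sequence[float]) -> Tuple[float, float]:
    """Return (U for sample_a, approximate two-sided p-value)."""
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a == 0 or n_b == 0:
        raise ValueError("both samples must be non-empty")
    # U counts the pairs in which a value from sample_a exceeds a value
    # from sample_b; tied pairs count as one half.
    u_a = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    # Normal approximation (no tie or continuity correction).
    mean_u = n_a * n_b / 2.0
    sd_u = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12.0)
    z = (u_a - mean_u) / sd_u
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return u_a, p_two_sided
```

A call such as `mann_whitney_u(voiceless_durations, voiced_durations)`, where the two arguments are lists of stop durations in ms for one word position, would return the statistic and the approximate p-value for that position.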

3 Results

3.1 Devoicing analysis

Results of devoicing as a function of place of articulation showed that in initial position, the percentage of devoiced stops was exactly the same for [b] and [d] and was zero for [ɡ]. However, [ɡ] was partially devoiced in some cases. The percentage of devoicing increased as the place of articulation moved posteriorly in medial position (see Figure 3). In word-final position, the percentage of devoiced stops was greater for [ɡ] than for [b] and [d].

Figure 3 Devoicing of [b d ɡ] for all speakers in different word positions.

3.2 Temporal analysis

The results of stop duration, closure duration, release duration, duration of the preceding vowel and duration of the following vowel, voicing into closure duration, and VOT are presented separately in the following sections.

3.2.1 Stop duration

The mean duration (averaged over the six speakers) of stops in initial position was: 174 ms for [p], 173 ms for [t], 178 ms for [k], 111 ms for [b], 102 ms for [d] and 104 ms for [ɡ]. Mean stop duration was longer for voiceless (mean = 175 ms, N = 54) than for voiced stops (mean = 106 ms, N = 54), as shown in Figure 4. For stops in medial position, the values were: 129 ms for [p], 135 ms for [t], 140 ms for [k], 74 ms for [b], 85 ms for [d] and 76 ms for [ɡ]. Mean stop duration was longer for voiceless (mean = 134 ms, N = 54) than for voiced stops (mean = 78 ms, N = 53). In final position, the values were: 164 ms for [p], 176 ms for [t], 173 ms for [k], 131 ms for [b], 117 ms for [d] and 132 ms for [ɡ]. Mean stop duration was also longer for voiceless (mean = 172 ms, N = 44) than for voiced stops (mean = 126 ms, N = 38). According to the Mann-Whitney U test, the differences between the duration of voiced stops and the duration of voiceless stops were statistically significant (p < .001) in initial, medial and final positions. The statistical tests were not applied to the other parameters because the database did not always provide more than 30 tokens in each group. This happened for several reasons, e.g. in some cases the closure duration could not be determined (it was impossible to determine the end of closure because many speakers produced the voiced stops without a release, particularly in medial and in final positions).

Figure 4 Stop duration of [p t k b d ɡ] in initial position.

3.2.2 Closure duration

Mean closure duration of stops in initial position was: 155 ms for [p], 146 ms for [t], 128 ms for [k], 108 ms for [b], 90 ms for [d] and 75 ms for [ɡ]. Mean closure duration was longer for voiceless (mean = 143 ms, N = 54) than for voiced stops (mean = 88 ms, N = 20), as can be seen in Figure 5. In medial position, the values of closure duration were: 110 ms for [p], 114 ms for [t], 104 ms for [k], 102 ms for [b], 57 ms for [d] and 79 ms for [ɡ]. Mean closure duration was longer for voiceless (mean = 109 ms, N = 54) than for voiced stops (mean = 65 ms, N = 16). The values obtained for stops in final position were: 123 ms for [p], 131 ms for [t], 132 ms for [k], 90 ms for [b], 77 ms for [d] and 86 ms for [ɡ]. Mean closure duration was longer for voiceless (mean = 129 ms, N = 44) than for voiced stops (mean = 82 ms, N = 21).

Figure 5 Closure duration of [p t k b d ɡ] in initial position.

3.2.3 Release duration

The release duration was longer for voiceless (mean = 33 ms, N = 54) than for voiced stops (mean = 26 ms, N = 33) in initial position. In medial position the release duration was longer for voiced (mean = 32 ms, N = 17) than for voiceless stops (mean = 25 ms, N = 54). In final position the release duration was also longer for voiced (mean = 60 ms, N = 18) than for voiceless stops (mean = 43 ms, N = 44).

3.2.4 Duration of vowels

The duration of the following vowel was longer in voiced–stop contexts (N = 54) than in voiceless–stop contexts (N = 54) when stops occurred in initial position (except for [pi] and [bi]; see Table 1) and in medial position. The duration of the preceding vowel was longer in voiced–stop contexts than in voiceless–stop contexts (N = 54 in both groups) when stops occurred in medial (see Table 2) and final positions.

Table 1 Mean duration of the following vowel in the context of stops [p t k b d ɡ] in initial position.

Table 2 Mean duration of the preceding vowel in the context of stops [p t k b d ɡ] in medial position.

3.2.5 Voicing into closure duration

The voicing into closure duration was longer for voiced (N = 10 in word-medial position and N = 23 in word-final position) than for voiceless stops (N = 54 in word-medial position and N = 52 in word-final position) in medial and word-final positions (see Figure 6). The voicing into closure duration in voiceless stops corresponds to the interval of time during which the vocal folds continue to vibrate after the beginning of closure, so it is expected to be shorter for these stops.

Figure 6 Voicing into closure duration of [p t k b d ɡ] in final position.

3.2.6 VOT

In initial and medial position, voiceless stops had a positive VOT, and voiced stops had a negative VOT (fully voiced stops) or a positive VOT. The duration of positive VOT (averaged over all speakers, N = 18) in initial position was: 20 ms for [p], 28 ms for [t], 51 ms for [k], 28 ms for [b], 16 ms for [d] and 17 ms for [ɡ]. Mean VOT duration for voiceless stops was 33 ms and mean VOT duration for voiced stops was 20 ms. The average values of negative VOT were: -114 ms for [b], -89 ms for [d] and -73 ms for [ɡ] (mean [b d ɡ] = -88 ms).

In medial position, the duration of positive VOT (averaged over all speakers, N = 18) was: 19 ms for [p], 22 ms for [t], 35 ms for [k], 33 ms for [d] and 38 ms for [ɡ]. The mean value for [b] was not reported here because many speakers produced the stop without a burst (it was not possible to determine release onset). Mean VOT duration for voiceless stops was 25 ms, and mean VOT duration for voiced stops was 36 ms. The values of negative VOT were: -102 ms for [b] and -52 ms for [d] (mean [b d] = -59 ms).

Overall, VOT was on average shorter for [p] than for [t], and shorter for [t] than for [k] in initial and medial positions. In addition, VOT was on average longer before high vowels than before low vowels, which suggests that the VOT changed as a function of the characteristics of the following vowel (see Table 3).

Table 3 VOT in initial position as a function of the following vowel. Vowels are grouped in terms of their height.

4 Discussion

This study examined the acoustic properties correlated with the voicing distinction, based on a corpus that included stops in different word positions, as a contribution to the acoustic description of EP stops.

The analysis of devoicing showed that stops [b d ɡ] were sometimes partially devoiced or devoiced as reported previously for Canadian French (Caramazza & Yeni-Komshian 1974), EP (Andrade 1980, Viana 1984) and Dutch (van Alphen & Smits 2004). Vibration of the vocal folds can only occur when two physiological and aerodynamic conditions are met. First, the vocal folds must be adducted and tensed. Second, a sufficient transglottal pressure gradient is needed to cause enough positive airflow through the glottis to maintain vibration (van den Berg 1958). There is a recognised difficulty in supporting vibration during stop production because the air flowing through the glottis accumulates in the oral cavity, causing oral pressure to approach subglottal pressure (Ohala 1983).

The results also showed that the percentage of devoicing increased as the place of articulation moved posteriorly except for stops in initial position. There is evidence that devoicing varies as a function of place of articulation. Ohala (1983) has suggested that voiced velar stops are more easily devoiced than voiced stops produced at other places of articulation. This is due to aerodynamic reasons and to the net compliance of the surfaces on which oral air pressure impinges during the production of stops. In velar stops, only the pharyngeal walls and part of the soft palate can yield to the air pressure. In dentals, these surfaces plus the greater part of the tongue surfaces and all of the soft palate are involved. In labials, these surfaces plus all of the tongue surface and some parts of the cheeks participate (Rothenberg 1968).

In initial position, the mean stop duration was longer for voiceless (175 ms) than for voiced stops (106 ms). These results agree with those of Viana (1984) and Veloso (1995). In medial position, the stop duration was longer for voiceless (134 ms) than for voiced stops (78 ms). In final position, this measure was also longer for voiceless (172 ms) than for voiced stops (126 ms). The duration of the preceding vowel was longer in voiced stop contexts, when stops occurred in medial position, as shown previously by Brunner et al. (2003). The duration of the preceding vowel before voiced stops has been shown to be longer than before voiceless stops in English (Peterson & Lehiste 1960, Luce & Charles-Luce 1985). The mean closure duration in initial position was longer for voiceless (143 ms) than for voiced stops (88 ms) confirming the results reported by Viana (1984). In medial position, voiceless stops had longer closure durations (109 ms) than voiced stops (65 ms), as previously reported for Korean (Brunner et al. 2003). The closure duration was also longer for voiceless (129 ms) than for voiced stops (82 ms) in final position.

Longer vowel durations were accompanied by shorter closure durations for /b d ɡ/ while shorter vowel durations were associated with longer closure durations for /p t k/. Kluender, Diehl & Wright (1988) have suggested an auditory explanation for these results. They have proposed that covariation of voicing correlates is planned to increase perceptual distinctiveness. Consequently, vowel duration differences serve to augment the distinctiveness of the closure duration cue to the voicing distinction. Long vowel durations preceding short closures result in the perception of even shorter closures, whereas short vowels preceding long closure intervals are perceived as longer closures. This auditory hypothesis suggested that speakers can exert control over the cues that have mutually reinforcing auditory effects to signal phonetic contrasts.

The release duration was longer for voiceless (33 ms) than for voiced stops (26 ms) in initial position, as in other studies of EP (Viana 1984) and in a study of Dutch (van Alphen & Smits 2004). In medial position, the release duration was longer for voiced (32 ms) than for voiceless stops (25 ms), which was not expected; however, the number of occurrences of voiced stops was much lower than that of voiceless stops, so these results should not be generalised. In final position, the release duration was also longer for voiced (60 ms) than for voiceless stops (43 ms).

The duration of the following vowel was longer in voiced stop contexts than in voiceless stop contexts, when stops occurred in initial and medial positions, confirming results from previous studies of EP (Viana 1984) and Korean (Brunner et al. 2003). The voicing into closure duration was longer for voiced than for voiceless stops in word-medial and word-final positions, as we expected. The results of voicing into closure duration for stops in medial position were similar to those presented by Brunner et al. (2003).

These results suggested that when [b d ɡ] were devoiced, the acoustic properties stop duration, closure duration, duration of the following vowel, duration of the preceding vowel and duration of voicing into closure were relevant for the voicing distinction, and not only the two properties (stop duration and duration of the following vowel) that have been previously proposed for EP (Viana 1984, Veloso 1995). Perceptual tests are needed to complement our data and to reveal how the productions of the present study, specifically the devoiced stops, were perceived by listeners.

Results also showed that many voiced stops were produced without a release, particularly in medial and in final positions. This may be because voiced stops can be produced as non-strident continuants (fricated stops), which often occurs in the regions where the speakers lived. These fricated stops may be characterised by weak friction noise or by a transitional approximant-like formant structure, depending on the surrounding segmental and prosodic context. Frication of stops is most favoured in word-medial position (Viana 1984, Cruz-Ferreira 1999). Concerning final position, release can also be missing because a stop followed by another stop ([p] of por favor in the second part of the frame sentence) may be unreleased (Andrade 1994).

Results confirmed that place of articulation and VOT were related (VOT was longer for [k] than for [t] and [p]), as previously reported by Klatt (1975), Andrade (1980), Viana (1984) and Cho & Ladefoged (1999). In velar stops, the volume of the cavity behind the point of constriction is relatively smaller than in bilabial and dental stops, and this causes greater pressure, which will take longer to fall, allowing adequate transglottal pressure for the beginning of vocal fold vibration (Cho & Ladefoged 1999: 209). This could explain why the VOT was longer for [k] than for [t] and [p]. The volume of the cavity in front of the point of constriction is greater in velar stops than in bilabial and dental stops, and this causes a greater obstruction to the release of the pressure behind the velar stop, thus this pressure will take longer to fall, provoking a delay in the production of adequate transglottal pressure (Cho & Ladefoged 1999: 209), also resulting in a longer VOT for [k] than for [t] and [p]. A faster articulatory velocity in bilabials (movement of the lips) than in velars (movement of tongue dorsum) allows a more rapid decrease in the pressure behind the closure and consequently a shorter time before an appropriate transglottal pressure is reached (Cho & Ladefoged 1999: 210), which results in a shorter VOT for [p] than for [k]. Results also indicate that the VOT changed as a function of the characteristics of the following vowel, as previously observed by Klatt (1975) and Viana (1984).

Results show that the stops /b d ɡ/ in EP are generally fully voiced (see Figure 3). Voiceless stops /p t k/ are unaspirated, as in French, Spanish, Italian and many other Romance languages (Ladefoged 2006: 148), making the stop /p/ of these languages similar to English initial /b/. The VOT data presented in this study complement previous studies (Lisker & Abramson 1964, Keating et al. 1983) that compare VOT in different languages.

5 Conclusions

Until now, studies of Portuguese stops have been few and limited. The analysis of EP stops in different word positions contributes to the acoustic phonetic description of EP stops. The devoicing analysis and the detailed temporal description used in this work are also new for EP stops. The data obtained in the present study could now be used by Portuguese speech and language therapists: VOT values could be used in the differential diagnosis of types of dysarthria, as previously reported in other studies (Morris 1989, Ackermann & Hertrich 1997), and devoicing values could be used as reference data for comparison with dysarthric patients, who usually devoice consonants, particularly in word-final position (Scott & Ringel 1971, Platt, Andrews & Howie 1980).

The criteria adapted from Jesus & Shadle (2002) are a new, useful and practical method to analyse devoicing in stops. These criteria have been applied to the analysis of stops for the first time in the current study and have been applied in other studies (Barroco et al. 2007, Pinho, Jesus & Barney 2010) for normal and disordered speech. The exhaustive temporal description presented in this study allows analysis of different acoustic events that are relevant to the voicing distinction in stops when these consonants are devoiced.

Acknowledgements

The authors would like to thank Dr. Amália Andrade and Dr. Maria do Céu Viana, at the Centro de Linguística da Universidade de Lisboa (CLUL), Portugal. This work was developed during the first author's M.Sc. in Speech and Hearing Sciences at the Universidade de Aveiro, Portugal.

Appendix. Corpus used for the experiments

The transcriptions using the International Phonetic Alphabet (IPA) were adapted from the illustration proposed by Cruz-Ferreira (1999) for European Portuguese.

Table A1 Words of corpus with stops /p b/ in initial, medial and final position.

Table A2 Words of corpus with stops /t d/ in initial, medial and final position.

Table A3 Words of corpus with stops /k ɡ/ in initial, medial and final position.

Footnotes

1 C1 = C2

2 The only related study published in a journal is that of Barroco, Domingues, Pires, Lousada & Jesus (2007). It analyses stops in two Portuguese children, one with speech disorders and one with normal speech.

3 Sentences were produced with a pause between the end of the target word and the voiceless bilabial stop consonant that followed, so the carrier sentence did not have a direct impact on the duration measurements of the target stop consonants occurring in word-final position.

4 The words were analysed using wideband spectrograms. The bandwidth for the wideband display was fixed at 300 Hz.

5 These criteria are also used in Pinho, Jesus & Barney (2010).

6 The use of non-parametric statistical tests does not imply that sample means and standard deviations are meaningless. In addition, we provide the sample median as a complement to the traditional mean and standard deviation statistics. Since the sample sizes considered lie between 38 and 54, we believe that the goodness-of-fit tests considered (Kolmogorov-Smirnov with Lilliefors correction and Shapiro-Wilk) are meaningful.
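The reasoning in footnote 6 can be illustrated with a minimal sketch in Python. It is not the analysis procedure used in this study: the duration values are synthetic, the function names `lilliefors_statistic` and `summarise` are hypothetical, and the 5% critical value 0.886/√n is assumed here as the usual large-sample (n > 30) approximation for the Lilliefors test. The sketch reports the mean, standard deviation and median of one sample and checks whether the Kolmogorov-Smirnov distance to a normal distribution fitted to that sample exceeds the approximate critical value.

```python
import math
import random
import statistics


def lilliefors_statistic(sample):
    """Kolmogorov-Smirnov distance D between the empirical distribution of
    the sample and a normal distribution fitted to the sample itself."""
    n = len(sample)
    fitted = statistics.NormalDist(statistics.mean(sample),
                                   statistics.stdev(sample))
    d = 0.0
    for i, x in enumerate(sorted(sample), start=1):
        cdf = fitted.cdf(x)
        # Compare the fitted CDF with the empirical CDF at x and just below x.
        d = max(d, i / n - cdf, cdf - (i - 1) / n)
    return d


def summarise(sample):
    """Descriptive statistics plus an approximate Lilliefors decision."""
    n = len(sample)
    d = lilliefors_statistic(sample)
    # Assumed large-sample approximation of the 5% critical value (n > 30).
    critical = 0.886 / math.sqrt(n)
    return {
        "n": n,
        "mean": statistics.mean(sample),
        "sd": statistics.stdev(sample),
        "median": statistics.median(sample),
        "D": d,
        "critical_5pct": critical,
        "reject_normality": d > critical,
    }


# Synthetic durations in ms (illustrative only, not data from this study);
# the sample size 46 falls in the 38-54 range mentioned in the footnote.
rng = random.Random(0)
durations_ms = [rng.gauss(100.0, 15.0) for _ in range(46)]
summary = summarise(durations_ms)
print(summary)
```

When normality is rejected for a sample, a non-parametric test would be the natural choice for group comparisons, while the mean, standard deviation and median remain useful as descriptive statistics.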

References

Ackermann, Hermann & Hertrich, Ingo. 1997. Voice onset time in ataxic dysarthria. Brain and Language 56 (3), 321–333.
Andrade, Amália. 1980. Estudos experimentais aerodinâmicos, acústicos e palatográficos do vozeamento nas consoantes. Lisboa, Portugal: Centro de Linguística da Universidade de Lisboa.
Andrade, Amália. 1993. Estudo experimental de sequências de oclusivas em Português Europeu. IX Encontro Nacional da Associação Portuguesa de Linguística, Coimbra, Portugal, 1–15.
Andrade, Amália. 1994. Reflexões sobre o ‘e-mudo’ em Português Europeu. Congresso Internacional Sobre o Português, Lisboa, Portugal, vol. 2, 303–344.
Andrade, Amália. 1995. Percepção de C ou CC oclusivas por ouvintes nativos do Português. XI Encontro Nacional da Associação Portuguesa de Linguística, Lisboa, Portugal, vol. 3, 153–186.
Barroco, Mário, Domingues, Marta, Pires, Maria F., Lousada, Marisa & Jesus, Luis M. T. 2007. Análise temporal das oclusivas orais do Português Europeu: Um estudo de caso de normalidade e perturbação fonológica (Temporal analysis of European Portuguese stops: a case study of normality and phonologically disturbance). Revista CEFAC 9 (2), 154–163.
Brunner, Jana, Fuchs, Susanne, Perrier, Pascal & Kim, Hyeon-Zoo. 2003. Mechanisms of contrasting Korean velar stops: A catalogue of acoustic and articulatory parameters. ZAS – Papers in Linguistics 32, 15–30.
Caramazza, Alfonso & Yeni-Komshian, Grace. 1974. Voice onset time in two French dialects. Journal of Phonetics 2, 239–245.
Castro, São L. F. & Barbosa, Manuel F. S. 1996. Estudo percepto-acústico do contraste de vozeamento em oclusivas portuguesas: Primeiros resultados. Psychologica 15, 159–164.
Cho, Taehong & Ladefoged, Peter. 1999. Variation and universals in VOT: Evidence from 18 languages. Journal of Phonetics 27, 207–229.
Cruz-Ferreira, Madalena. 1999. Portuguese (European). In IPA (eds.), Handbook of the International Phonetic Association: A guide to the use of the International Phonetic Alphabet, 126–130. Cambridge: Cambridge University Press.
Frota, Sónia. 2000. Prosody and focus in European Portuguese: Phonological phrasing and intonation. New York: Garland.
Fuchs, Susanne. 2005. Articulatory correlates of the voicing contrast in alveolar obstruent production in German. Ph.D. dissertation, Queen Margaret University College, Edinburgh, UK.
Huckvale, Mark, Brookes, David Michael, Dworkin, Leigh, Johnson, Michael, Pearce, David & Whitaker, Louise. 1987. The SPAR Speech Filing System. European Conference on Speech Technology, Edinburgh, 305–308.
Jesus, Luis M. T. & Jackson, Philip J. B. 2008. Frication and voicing classification. In Teixeira, António, Lima, Vera, Oliveira, Luís & Quaresma, Paulo (eds.), Computational processing of the Portuguese language, 11–20. Berlin: Springer.
Jesus, Luis M. T. & Shadle, Christine H. 2002. A parametric study of the spectral characteristics of European Portuguese fricatives. Journal of Phonetics 30 (3), 437–464.
Jesus, Luis M. T. & Shadle, Christine H. 2003. Devoicing measures of European Portuguese fricatives. In Mamede, Nuno J., Baptista, Jorge, Trancoso, Isabel & Nunes, Maria G. V. (eds.), Computational processing of the Portuguese language, 1–8. Berlin: Springer.
Keating, Patricia A., Linker, Wendy & Huffman, Marie. 1983. Patterns in allophone distribution for voiced and voiceless stops. Journal of Phonetics 11, 277–290.
Klatt, Dennis H. 1975. Voice onset time, frication, and aspiration in word-initial consonant clusters. Journal of Speech and Hearing Research 18 (4), 686–706.
Kluender, Keith R., Diehl, Randy L. & Wright, Beverly A. 1988. Vowel length differences before voiced and voiceless consonants: an auditory explanation. Journal of Phonetics 16, 153–159.
Ladefoged, Peter. 2006. A course in phonetics, 5th edn. Boston: Thomson Wadsworth.
Lisker, Leigh & Abramson, Arthur S. 1964. A cross-language study of voicing in initial stops: acoustical measurements. Word 20, 384–422.
Luce, P. & Charles-Luce, J. 1985. Contextual effects on vowel duration, closure duration, and the consonant/vowel ratio in speech production. Journal of the Acoustical Society of America 78 (6), 1949–1957.
Morris, Richard J. 1989. VOT and dysarthria: A descriptive study. Journal of Communication Disorders 22 (1), 23–33.
Ohala, John J. 1983. The origin of sound patterns in vocal tract constraints. In MacNeilage, Peter F. (ed.), The production of speech, 189–216. New York: Springer.
Peterson, Gordon & Lehiste, Ilse. 1960. Duration of syllable nuclei in English. Journal of the Acoustical Society of America 32 (6), 693–703.
Pinho, Cátia, Jesus, Luis M. T. & Barney, Anna. 2010. Aerodynamics of voiced stop production. In International Conference on Voice Physiology and Biomechanics, Madison, WI.
Platt, L. J., Andrews, Gavin & Howie, Pauline M. 1980. Dysarthria of adult cerebral palsy II. Phonemic analysis of articulation errors. Journal of Speech and Hearing Research 23, 41–55.
Rothenberg, Martin. 1968. The breath-stream dynamics of simple–released–plosive production (Bibliotheca Phonetica 6). Basel: Karger.
Sá Nogueira, Rodrigo. 1938. Elementos para um tratado de fonética Portuguesa. Lisboa: Centro de Estudos Filológicos.
Sá Nogueira, Rodrigo. 1941. Tentativa de explicação dos fenómenos fonéticos do Português. Lisboa: Livraria Clássica Editora.
Scott, Cheryl M. & Ringel, Robert L. 1971. The effects of motor and sensory disruptions on speech: A description of articulation. Journal of Speech and Hearing Research 14, 819–828.
van Alphen, Petra M. & Smits, Roel. 2004. Acoustical and perceptual analysis of the voicing distinction in Dutch initial plosives: the role of pre-voicing. Journal of Phonetics 32 (4), 455–491.
van den Berg, Janwillem. 1958. Myoelastic-Aerodynamic theory of voice production. Journal of Speech and Hearing Research 1 (3), 227–243.
Veloso, João. 1995. Aspectos da percepção das “oclusivas fricatizadas” do Português: Contributo para a compreensão do processamento de contrastes alofónicos. Provas de Aptidão Pedagógica e Capacidade Científica, Universidade do Porto, Porto, Portugal.
Viana, Maria C. 1984. Étude de deux aspects du consonantisme du portugais: fricatisation et dévoisement. Ph.D. dissertation, Université des Sciences Humaines de Strasbourg, Strasbourg, France.

Figure 1 Waveform and its spectrogram, and EGG signal of the VCV sequence in the word [ˈnapɐ] ‘sheepskin’ produced by speaker PA.

Figure 2 Waveform and its spectrogram, and EGG signal of the first CV sequence in the word [ˈbufu] ‘owl’ produced by speaker PA.

Figure 3 Devoicing of [b d ɡ] for all speakers in different word positions.

Figure 4 Stop duration of [p t k b d ɡ] in initial position.

Figure 5 Closure duration of [p t k b d ɡ] in initial position.

Table 1 Mean duration of the following vowel in the context of stops [p t k b d ɡ] in initial position.

Table 2 Mean duration of the preceding vowel in the context of stops [p t k b d ɡ] in medial position.

Figure 6 Voicing into closure duration of [p t k b d ɡ] in final position.

Table 3 VOT in initial position as a function of the following vowel. Vowels are grouped in terms of their height.