1. Introduction
Spoken Nigerian English (hereafter NigE) is said to differ significantly from Received Pronunciation (hereafter RP). Several studies (e.g. Adetugbo, 1977, 2004; Atoye, 1991; Udofot, 2004), conducted particularly from segmental and suprasegmental perspectives, have established this. However, much less has been done to verify this at the level of connected speech. Yet the features of connected speech contribute significantly to the marked difference between native and non-native English accents and are capable of impairing intelligibility between speakers of both varieties (Allen, 1961: xiv; Laver, 1968: 156). Therefore, this study investigates two connected speech features (assimilation and elision) at morpheme and word boundaries, in order to provide explanations for how spoken educated NigE approximates to and deviates from RP.
In view of the fact that NigE is yet to be codified, the study focuses on the educated variety. Speakers of this variety are exposed to learning of English within Nigerian schools up to at least the post-secondary level, use the language for daily communication, academic activities and official purposes, and have achieved a minimum level of mastery considered to be socially acceptable and internationally intelligible (Banjo, 1996: 75–79).
2. Appropriateness of RP as a Model in Contemporary Nigeria
RP is the accent most generally associated with Standard British English. It is an adopted prestige variety devoid of any geographical affiliation in England and a codified model of pronunciation used in the teaching of English, especially as a second or foreign language (Hannisdal, 2006: 11). However, in recent times the appropriateness of RP and General American (GA) as pronunciation models in non-native English settings has been questioned, in view of the democratisation and globalisation of English. Other varieties like English as a Lingua Franca (ELF), English as an International Language (EIL) and regional and continental standards are being proposed as alternative models (Akinjobi, 2012: 54; Awonusi, 2004: 189–190; Jenkins, 2002). Divergent views have been expressed by scholars in this regard.
Jenkins (2002) is of the opinion that English is no longer confined to communication between native and non-native speakers, but now serves as a lingua franca for non-native speakers from different L1 backgrounds. Therefore, the ability to communicate with other L2 speakers, rather than the acquisition of a native-like accent, should be the target for non-native learners. She accordingly advocates the adoption of English as an International Language (EIL) as a pronunciation model for non-native speakers, and further proposes what she refers to as the Lingua Franca Core (LFC), a list of features that should be the minimum acceptable standard for intelligible communication among non-native speakers of English, and on which the pronunciation syllabus of these learners should be based (Zoghbor, 2011: 285).
In Awonusi's (2004: 189) view, RP is already outdated and has been subjected to internal changes. Besides, the sociolinguistic realities which maintained it are no longer in existence, leading to its decline and the emergence of many non-RP accents in government, media and industry which are now preferred to RP in the United Kingdom. He, therefore, proposes ‘the development and codification of regional and continental standards (native or non-native)’ to replace RP as pronunciation models (Awonusi, 2004: 190).
However, Akinjobi (2012: 58–59) considers the adoption of a non-native model for non-native speakers of English unrealistic. She argues that this option may impair intelligibility with native speakers, which according to her is paramount in an age of technology-driven globalisation. She therefore supports the retention of RP, albeit aided by technology-based non-enculturation sources of speech practice, as a pronunciation model.
These arguments notwithstanding, RP remains, to date, the constitutionally recognised standard entrenched in the national curricula in Nigeria (Jowitt, 1991: 70). It is the target accent for Nigerian learners of English, and the model that examination bodies adopt for teaching and tests in Nigerian schools. This is because RP is codified and well documented, and is the pronunciation model for most pronouncing dictionaries and textbooks on phonetics, especially for foreign learners (Hannisdal, 2006: 15; Roach, 2000: 3–4). Therefore, until other suggested options are fully developed and accepted, RP remains the pronunciation model in use for teaching and learning purposes in Nigeria. This explains why the educated NigE accent is assessed against RP.
3. Connected Speech Features in RP
Connected speech features refer to the phonetic variations that typify words in continuous speech compared to when produced in isolation (Gimson, 1980: 283). When sounds occur close to each other in a connected utterance, various phonetic alterations and phonemic modifications, occasioned by the phonological environment of the phonemes or speaker's articulatory mechanisms, normally occur. According to Cruttenden (2001: 278), these modifications may influence a whole word or segments at word or morpheme boundaries. Prominent among such processes in RP are assimilation, elision, liaison, lenition, and reduction of weak forms (Nolan and Kerswill, 1990: 296). The present study examines assimilation and elision.
3.1 Assimilation
Assimilation, a process whereby a sound segment is modified to resemble an adjacent one within a word or at word boundary, is a common feature of speech in RP. According to Farnetani (1999: 6), it is a ‘Contextual variability of speech sounds, by which one or more of their phonetic properties are modified and become similar to those of the adjacent segments’. An example is the final sound of the word this, pronounced as /s/ in isolation; but when followed by a word beginning with /ʃ/ in fast speech (e.g. shop) it tends to become /ʃ/, as in /ðɪʃ ʃɒp/ (Roach, 2009: 7). A related concept is coarticulation, which, however, is concerned with neurological and mechanical explanations for the occurrence of assimilation, is governed by language-universal rules, and covers changes that extend over a number of segments rather than those affecting just two contiguous sounds (Farnetani, 1999: 6; Roach, 2009: 15–16).
Different assimilatory processes exist and have been categorised by scholars (e.g. Roach, 2000: 138–142; Simo Bobda and Mbangwana, 1993: 79–81; Skandera and Burleigh, 2005: 90–94). Skandera and Burleigh (2005: 90) identified four categorisations based on:
• the distance between the two sounds involved: contiguous/contact and non-contiguous/distant assimilation
• the direction of the influence exerted: regressive, progressive and coalescent assimilation
• the particular distinctive feature affected: assimilation of voice, place and manner
• the degree to which one sound assimilates to another: partial and total assimilation
Contiguous (also contact, contextual or juxtapositional) assimilation is a process whereby the pronunciation of a segment is altered under the influence of an adjacent sound, especially at a word boundary. An example is is she, pronounced /ɪz/ and /ʃi:/ respectively in isolation, but as /ɪʒ ʃi:/ in conversational speech. Non-contiguous assimilation, which according to Skandera and Burleigh (2005: 90) is a modification involving two distant sounds, is not commonly found in English.
Regressive, progressive and coalescent assimilation concern the direction of the influence that the adjacent sounds exert on each other. The most common type in this category in RP is regressive (anticipatory) assimilation. It is a process whereby a sound exerts influence on the preceding one, e.g. ten bikes /ten baɪks/ becoming [tem baɪks]. Progressive (perseveratory) assimilation, in which the preceding phoneme influences the subsequent one, e.g. lunch score /lʌnʧ skɔ:/ becoming [lʌnʧ ʃkɔ:], is less common (Roach, 2000: 138–140; Cruttenden, 2001: 286). Coalescent assimilation, also called yod coalescence by Wells (1982), is a common and permitted process in RP colloquial speech (Cruttenden, 2001: 212, 286) in which the palatal approximant /j/ fuses with preceding alveolar consonants /t, d, s, z/, either within a word or across word boundary, to become palato-alveolar /ʧ, ʤ, ʃ, ʒ/ respectively; for example, issue /ɪsju:/ becoming [ɪʃu] and would you? /wʊd ju:/ becoming [wʊʤu].
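The fusion involved in yod coalescence can be sketched as a simple rewrite rule. The snippet below is only an illustration under stated assumptions: the function name coalesce, the table name YOD_COALESCENCE and the representation of an utterance as a list of one-symbol segments (with length marks omitted) are hypothetical conveniences, not part of any of the cited descriptions. It pairs /t/ with /ʧ/, /d/ with /ʤ/, /s/ with /ʃ/ and /z/ with /ʒ/.

```python
# Hypothetical lookup table: alveolar consonant + /j/ -> palato-alveolar consonant.
YOD_COALESCENCE = {"t": "ʧ", "d": "ʤ", "s": "ʃ", "z": "ʒ"}


def coalesce(segments):
    """Return a new segment list in which each alveolar + /j/ pair is fused."""
    out = []
    i = 0
    while i < len(segments):
        current = segments[i]
        following = segments[i + 1] if i + 1 < len(segments) else None
        if current in YOD_COALESCENCE and following == "j":
            out.append(YOD_COALESCENCE[current])
            i += 2  # the /j/ has been absorbed into the new consonant
        else:
            out.append(current)
            i += 1
    return out


# Word boundaries are ignored here, since the process applies within
# a word as well as across a word boundary.
print("".join(coalesce(["w", "ʊ", "d", "j", "u"])))  # would you
print("".join(coalesce(["ɪ", "s", "j", "u"])))       # issue
```

A speaker who retains the yod simply leaves the segment list unchanged, which is the alternative to coalescence at such sites.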
Assimilation processes in the third category are concerned with the distinctive feature affected by the change. While assimilation of place relates to changes in the place of articulation of a segment, e.g. that person /ðæt pɜ:sṇ/ as [ðæp pɜ:sṇ] (alveolar stop /t/ changes to the same place of articulation as bilabial stop /p/), assimilation of manner, which is believed to be rare in RP, concerns changes that affect manner of articulation, e.g. that side /ðæt saɪd/ as [ðæs saɪd] (Roach, 2000: 140). In assimilation of voice, contiguous consonants tend to be either all voiced or all voiceless, depending on the state of the glottis. However, what is commonly observed in RP is devoicing, where a voiceless segment affects a voiced one irrespective of the relative order of the two. This may be either regressive or progressive. Regressive devoicing occurs if a voiced sound is modified to become more like the voiceless one following it; for example, I have to go is pronounced as [aɪ hæftə gəʊ], not as [aɪ hævtə gəʊ]. The devoicing, however, becomes progressive when a voiced consonant is devoiced to reflect a preceding voiceless sound; for example, black dog pronounced as [blæk d̥ɒg] rather than [blæg dɒg] (Gimson, 1980: 289; Katalin and Szilárd, 2006: 96).
Nevertheless, progressive voicing is also possible in RP, especially in the plural morpheme, e.g. dogs [dɒgz] (voiceless /s/ changes to voiced /z/), in the reduced form of the third person singular form of the verb be, e.g. she's a girl [ʃɪz], and in the possessive marker, e.g. John's [dʒɒnz] (Simo Bobda, 2007: 299).
The last category is the degree to which one sound assimilates to another. In what Skandera and Burleigh (2005: 90–94) describe as partial assimilation (perhaps for lack of a more apt term), the contiguous sounds involved differ from each other in at least one of the distinctive features. For example, the assimilated /b/ of good pen [gʊb pen] shares place and manner of articulation with the following /p/ of pen but differs in terms of voicing. On the other hand, the two sounds involved in total assimilation are completely alike. For instance, the /t/ of that cup [ðæk kʌp] takes on all the features of the /k/ it precedes.
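The distinction between partial and total assimilation amounts to comparing the features of the assimilated sound with those of the sound that triggers the change. The following is a minimal sketch of that comparison; the feature table FEATURES is a deliberately simplified, hypothetical three-way description (voicing, place, manner) covering only the plosives needed for the two examples, and the function name degree is likewise an assumption made for illustration.

```python
# Hypothetical, simplified feature table: (voicing, place, manner).
FEATURES = {
    "p": ("voiceless", "bilabial", "plosive"),
    "b": ("voiced", "bilabial", "plosive"),
    "t": ("voiceless", "alveolar", "plosive"),
    "d": ("voiced", "alveolar", "plosive"),
    "k": ("voiceless", "velar", "plosive"),
    "g": ("voiced", "velar", "plosive"),
}


def degree(assimilated, trigger):
    """Classify an assimilation outcome as 'total' or 'partial'.

    'assimilated' is the sound actually produced after the change;
    'trigger' is the adjacent sound that caused the change.
    """
    if FEATURES[assimilated] == FEATURES[trigger]:
        return "total"
    return "partial"


print(degree("b", "p"))  # good pen [gʊb pen]: voicing still differs
print(degree("k", "k"))  # that cup [ðæk kʌp]: all features shared
```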
3.2 Elision
Elision refers to the omission of one or more sounds (a vowel, a consonant or a whole syllable) within a word or at a word boundary in order to maximise articulatory ease. Jackson (1982: 32) refers to it as a process ‘involving the complete disappearance of a phoneme from a phonetic environment.’ This normally happens when there is a cluster of two or more consonants word-internally or across word boundary, as in han(d)kerchief, Chris(t)mas, nex(t) day, I don'(t) know, etc.
Simo Bobda and Mbangwana (1993: 81–82) identify two types of elision: historical and contextual. Historical elision concerns sounds that have disappeared in the course of the evolution of a language, and are no longer pronounced in the contemporary form of that language. Such elided forms are already established, though the old spelling may still be retained, e.g. cupboard /kʌbəd/ and talk /tɔ:k/. Contextual (juxtapositional) elision, on the other hand, relates to cases of sounds that exist in a word said in isolation but are omitted in the environment of another word in spontaneous speech, e.g.:
[əgʊ dil] a good deal for /əgʊd dil/
[lɑ:s taɪm] last time for /lɑ:st taɪm/
[blaɪn mæn] blind man for /blaɪnd mæn/
[le ðǝm] let them for /let ðǝm/
4. Connected Speech Features in Nigerian English
Scholars have identified aspects of connected speech features observed in NigE in one form or another. There is a general consensus that NigE tends towards regressive assimilation, e.g. in case [iŋ kes]; final devoicing, e.g. the dog's mine [dɔks]; and consonant elision, e.g. don't buy [don bai] (Jibril, 1982: 110–114; Josiah, 2009: 291–298; Laver, 1968: 158–160; Simo Bobda, 2007: 299).
As shown in Laver's (1968: 159–160) analysis, NigE also exhibits extensive cases of assimilation of place, e.g. iron bar [aiɔm ba], hard blow [hab blo], and allows regressive voicing assimilation, which RP does not, e.g. make them [meg dεm]. Jibril (1982: 110–113), however, claims that only nasals undergo assimilation of place in NigE, e.g. man power [mam pa:wa:], and that regressive assimilation of voice affects final plosives only, which become devoiced or voiced before a word beginning with a voiceless or voiced consonant, e.g. with the [wid di], twelve thousand [twep θauzn].
However, none of these studies was able to provide phonological explanations for how these features approximate to and deviate from RP norms. This is the gap this study intends to fill.
5. Methodology
The data were collected from 360 educated Nigerian speakers of English, randomly sampled from different language groups in Nigeria. Two RP speakers served as controls. All participants produced 25 utterances and a short passage, which were recorded on digital devices. All potential assimilation and elision sites extracted from the data were grouped into related boundary contexts. The features observed in these contexts were transcribed perceptually to reflect the NigE accent, using a pronunciation scheme proposed by Adetugbo (2004: 181–186). The analysis was carried out statistically: an appropriate RP pronunciation in each case was allotted 1 mark, while 0 marks were recorded for each non-RP variant. The total score for all participants in each identified feature was converted to a percentage, the higher percentage being taken as the norm. The potential assimilation and elision sites found in the data were grouped as follows:
Assimilation sites:
1. A word-final voiced obstruent followed by a word-initial voiceless obstruent, e.g. chose six, have to, live show, of course, we’ve planned and five pounds.
2. The reduced form of the third person singular form of the verb be preceded by a voiced segment, e.g. she's, he's, dog's mine.
3. A word-initial voiced obstruent preceded by a word-final voiceless obstruent, e.g. black dress, half-done, nice boy, ice blue.
4. The alveolar stops /t, d/ followed by bilabial or velar stops /p, b, k, g/ at word boundary, e.g. met Peter, that case, good bye and good girl.
5. The alveolar nasal /n/ followed by bilabial stops /b, p/ or velar stop /k/ at word boundary e.g. ten boys, ten pounds and in case.
6. /t, d, s and z/ followed by the palatal glide /j/ at word boundary, e.g. miss your, those young men, what you want, could you.
Elision sites:
1. Word-final /t/ before another consonant at word boundary, e.g. doesn't she, won't do it, exact colour, test drive, don't buy it.
2. Morpheme-final /t/ before another consonant at word boundary, e.g. kept quiet, jumped well, equipped with, fixed price.
3. Word-final /d/ before another consonant at word boundary, e.g. old man, cold launch.
4. Morpheme-final /d/ before another consonant at word boundary, e.g. found five, seemed glad, robbed both, advertised car.
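The marking scheme described in this section (1 mark for an RP variant, 0 for a non-RP variant, totals converted to percentages, the higher percentage taken as the norm) can be illustrated with a short sketch. This is not the procedure or software actually used in the study; the function name score_feature, the dictionary keys and the four sample marks are hypothetical, and the treatment of an exact 50/50 split (here assigned to the non-RP variant) is an assumption, since the text does not say how a tie would be resolved.

```python
def score_feature(marks):
    """Summarise the marks for one feature.

    marks: iterable of 1 (RP variant produced) or 0 (non-RP variant),
    one entry per potential site across all participants.
    """
    marks = list(marks)
    total = len(marks)
    if total == 0:
        raise ValueError("at least one site is required")
    rp_count = sum(marks)
    rp_pct = 100 * rp_count / total
    return {
        "rp_count": rp_count,
        "non_rp_count": total - rp_count,
        "rp_pct": round(rp_pct, 1),
        "non_rp_pct": round(100 - rp_pct, 1),
        # Assumption: a tie is not treated as an RP norm.
        "norm": "RP variant" if rp_pct > 50 else "non-RP variant",
    }


# Made-up marks for four sites, purely for illustration.
print(score_feature([1, 1, 1, 0]))
```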
6. Data Analysis
6.1 Assimilation
Table 1 shows the frequency and percentage scores in assimilatory processes for NigE and RP speakers.
Table 1: Frequency and percentage scores for assimilation processes
Key: RD: Regressive Devoicing; PV: Progressive Voicing; PD: Progressive Devoicing; RV: Regressive Voicing; RP(AS): Regressive Place (Alveolar Stop); RP(AN): Regressive Place (Alveolar Nasal); YC: Yod Coalescence; YR: Yod Retention; N/ASS: Non-Assimilation.
In context 1, NigE speakers overwhelmingly produced regressive devoicing assimilation, e.g. [ʧos siks] chose six, [haf tu] have to. Specifically, 2,143 (99.2%) tokens of this assimilatory process and just 17 (0.8%) of the unassimilated form were recorded. Their overall score was comparable to the controls' 100% use of the same feature. In each case, the preceding segment was devoiced in anticipation of the following voiceless sound. In context 2, however, whereas the control group articulated progressive voicing 100% of the time, NigE speakers produced only 229 tokens in 1,080 sites, constituting 21.2%, e.g. [hiz] he's, [dɔgz] dog's mine. In the remaining 851 (78.8%) cases, the progressive voicing found in RP was not observed.
In context 3, progressive devoicing, e.g. [haf d̥ɔn] half done, [nais b̥ɔi] nice boy, was substantial in the speech of NigE speakers with 937 (65.1%) occurrences out of 1,440 sites. A similar trend, though with a higher figure (100%), was found in the controls' production. The initial segment of the second word was affected by the voicelessness of the final consonants of the first. It was only in 503 cases, representing 34.9%, that the NigE speakers differed by producing regressive voicing.
Unlike the control group that produced 100% tokens of regressive place (alveolar stop) assimilation in context 4, NigE speakers’ performance in this assimilatory process was low, scoring just 366 (25.4%) tokens out of 1,440 expected, e.g. [mεp pita] met Peter, [gug gεl] good girl. However, no assimilation was recorded in 1,074 (74.6%) cases. In context 5, significant tokens of regressive place (alveolar nasal) assimilation were articulated by NigE speakers, scoring 686 (63.5%) out of 1,080 tokens, e.g. [tεm bɔis] ten boys, [iŋ kes] in case. Absence of assimilation was observed in only 394 cases (36.5%).
Finally, in context 6, participants’ overall performance in yod coalescence was very low. Only 89 (6.2%) instances of appropriate yod coalescence were observed, compared to 87.5% tokens for the controls, whereas yod was retained in 1,351 (93.8%) cases.
The above findings, as captured by Table 1 and Figure 1, show that NigE speakers were able to articulate regressive devoicing, progressive devoicing and regressive place (involving alveolar nasal) assimilatory processes in a manner that closely approximates to the RP norms, while they deviated from RP to varying degrees in progressive voicing, regressive place (involving alveolar stops) and yod coalescence.
Figure 1. Assimilation scores in NigE and RP
6.2 Elision
Table 2 shows the frequency and percentage scores for NigE speakers and the controls in elision processes. Altogether, there were 5,400 potential elision sites (1,800 in context 1; 1,440 each in contexts 2 and 4; and 720 in context 3).
Table 2: Frequency and percentage scores for elision variants
Key: N/E: Non-Elision.
The table shows that in context 1, NigE speakers realised 1,204 (66.9%) tokens of /t/ elision, e.g. [egza kɔlɔ] exact colour, [don bai] don't buy, but failed to elide /t/ in 596 cases (33.1%). This suggests that participants approximated to the RP form. In the second context, the incidence of /t/ elision produced by NigE speakers was 817 (56.7%) tokens (e.g. [ʤɔmp wel] jumped well, [kep kwaiet] kept quiet) against 623 (43.3%) instances of the non-elision variant. This performance, although lower than that recorded in the first context, was comparable to the controls' 83.3%.
In context 3, NigE speakers' performance, again, approximated to the controls' of 83.3%. They produced 436 (60.6%) tokens of /d/ elision, e.g. [ol man] old man; while they failed to elide /d/ in 284 (39.4%) instances. Context 4 also shows significant preference for /d/ elision in NigE. Participants recorded 893 (62%) cases of elision, e.g. [rɔb boθ] robbed both, [advatais ka] advertised car, compared to the 100% performance of the control group. They failed to elide /d/ in the same position in 547 cases, representing 38%.
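As a cross-check on the arithmetic, the elision and non-elision counts reported above for the four contexts can be totalled and converted to percentages. The sketch below simply re-computes the figures from the counts given in the text; the names ELISION_COUNTS, pct and summarise are hypothetical, and rounding to one decimal place is an assumption about how the percentages were reported.

```python
# (elided, not elided) counts per context for NigE speakers, as reported in the text.
ELISION_COUNTS = {
    1: (1204, 596),  # word-final /t/
    2: (817, 623),   # morpheme-final /t/
    3: (436, 284),   # word-final /d/
    4: (893, 547),   # morpheme-final /d/
}


def pct(part, whole):
    """Percentage of 'part' in 'whole', rounded to one decimal place."""
    return round(100 * part / whole, 1)


def summarise(counts):
    """Return (elided, not elided, total sites, elision percentage) over all contexts."""
    elided = sum(e for e, _ in counts.values())
    retained = sum(n for _, n in counts.values())
    sites = elided + retained
    return elided, retained, sites, pct(elided, sites)


for context, (elided, retained) in ELISION_COUNTS.items():
    sites = elided + retained
    print(context, sites, pct(elided, sites))

print(summarise(ELISION_COUNTS))
```

Run as written, the totals come to 3,350 elided tokens out of 5,400 sites, i.e. roughly 62%, with 2,050 sites showing no elision.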
The overall percentage score for NigE speakers in elision processes (Table 2) shows that 3,350 (62%) tokens of elision were produced, while 2,050 (38%) cases of absence of elision were recorded. The NigE speakers' performance, which to some extent compared with the controls' score of 93.3% (represented in Fig. 2), suggests a tendency for consonant elision at word and morpheme boundaries in NigE and thus approximates to RP.
Figure 2. Elision scores in NigE and RP.
To a great extent, this can be explained in relation to consonant cluster simplification strategy. Nigerian indigenous languages permit a limited number of syllable structures (Dunstan, 1969: 27–28); the complex consonant clusters of RP are rare and therefore pose difficulties for many NigE speakers. In a bid to resolve this linguistic dilemma, as Simo Bobda (2007: 417) claims, NigE speakers often simplify consonant clusters by vowel epenthesis or by consonant deletion (as in this case).
7. Discussion and Conclusion
The analysis of spoken educated NigE vis-à-vis RP connected speech features shows that NigE exhibits varying degrees of proximity to RP. Speakers showed considerable approximation to RP in three assimilation processes (regressive devoicing, progressive devoicing, and regressive place [alveolar nasal] assimilation) and in consonant elision. On the other hand, they deviated significantly from the RP norms in three other features (progressive voicing, regressive place [alveolar stop] assimilation, and yod coalescence).
Arising from the foregoing is the fact that NigE exhibits a pattern of connected speech features which is considerably different from RP. Speakers of the variety were able to approximate to RP in features considered phonologically natural (e.g. devoicing, homorganic nasal assimilation and deletion). Such natural features, according to Hyman (1975), are phonetically motivated, common, and usually attested in more languages, and ‘can be attributed to either articulatory or acoustic assimilations or simplifications’ (171). For instance, while a word-final devoicing rule is regarded as more normal than a voicing rule in the same position (Schane, 1973: 111), consonant-deletion processes are found to be prevalent in different languages (Hyman, 1975: 162). At the same time, homorganic nasal assimilation is a common phonological process in most Nigerian indigenous languages (Yusuf, 2010). On the other hand, NigE speakers deviated significantly in other features involving voicing and coalescence (which require more articulatory energy or greater gestural overlap).
This suggests that operations of connected speech features have restricted occurrence in NigE. Unlike in RP, where they are widespread due to the native speakers' penchant for speaking fast, with sounds (and by implication words) linked with each other, the occurrence of connected speech features in NigE is largely influenced by mother tongue transfer and articulatory exigencies, that is, the need to employ simple and natural processes that require less articulatory effort (Hyman, 1975: 138–139, 147). This fact, therefore, distinguishes NigE from RP connected speech features.
ROTIMI OLADIPUPO teaches English at the Centre for Foundation Education, Bells University of Technology, Ota, Nigeria. He obtained a BA in Education (English) from Obafemi Awolowo University, Ilé-Ifẹ̀, and MA and PhD in English Language from the University of Ibadan. His research interests cover phonetics and phonology of English, sociophonetics and sociolinguistics. Email: olarotimi2002@yahoo.com