1 Introduction
This paper presents a systematically controlled experimental investigation of the acoustic correlates of lexical stress in Moroccan Arabic (MA).Footnote 1 Whether this language has lexical stress at all has been subject to some debate and remains unresolved (see Maas Reference Maas, Kuty, Seeger and Talay2013).
The first goal of this paper, therefore, is to contribute to this debate with a detailed study of the acoustic properties of presumed stressed and unstressed syllables, providing a type of evidence that has hitherto been absent from the discussion. The specific claim that is being tested is that stress is penultimate, unless the final syllable of the word is heavy, in which case stress is final (Benkirane Reference Benkirane1998, see also Boudlal Reference Boudlal2001).
The second goal of this paper is to make explicit once more that the investigation of lexical stress requires stimuli that isolate lexical stress from phrase-level prosodic effects.Footnote 2 A confound between the two is also seen in some of the existing literature on Moroccan Arabic stress and has led to claims that we will dispel here. Specifically, acoustic correlates of stressed syllables in words produced in word lists do not reflect properties of lexical stress alone, but also of phrase-level pitch events such as the presence of a pitch accent on the stressed syllable, or durational enhancement in the form of accentual lengthening. The present experimental paradigm was designed to circumvent these effects and may serve as an example methodology for the investigation of acoustic correlates of lexical stress.
The structure of this paper is as follows: Section 2 provides a general background to this study, with Section 2.1 highlighting relevant aspects of the experimental study of stress. Section 2.2 introduces Moroccan Arabic, and Section 2.3 gives an overview of what is currently known about its lexical phonology. Section 2.4 reviews prior claims about lexical stress and its interaction with intonation, and Section 2.5 presents the aim of the experiment and some predictions. Section 3 outlines the methods of the experiment, and Section 4 the results, with each of the four acoustic correlates tested (f0, duration, spectral Centre of Gravity, vowel quality) treated separately. Section 5 forms the discussion and Section 6 concludes the paper.
2 Background
2.1 Methodological issues in the identification of lexical stress
This section serves to review a number of theoretical issues relevant to the identification and investigation of stress in general. The next section then addresses how the present experiment furthers the debate about stress in MA.
The first issue relates to the general definition of stress, as it is well known that stress is defined differently in different sources (see van der Hulst Reference van der Hulst2014a). The most important distinction is between the definition that takes stress to refer to perceived or ‘actual’ prominence (Ladd Reference Ladd2008: 53), and the definition that takes stress to refer to an abstract property that differentiates one syllable in a word from the others, where stress is typically considered to be part of a word’s lexical entry. It is this latter definition of stress which is adopted here (in line with Hayes Reference Hayes1995: 8; Gussenhoven Reference Gussenhoven2004; Ladd Reference Ladd2008: 50f.; Goedemans & van der Hulst Reference Goedemans, van der Hulst, Dryer and Haspelmath2013; Hyman Reference Hyman2014: 56).Footnote 3
The second issue relates to how stress is typically diagnosed (see Hayes Reference Hayes1995), which may include a variety of strategies such as native speaker judgments, the differential status of stressed syllables with respect to phonological rules, and, as one of the most widely used diagnostics, the relative acoustic enhancement of stressed syllables. This latter diagnostic is, however, rather problematic. First, there is no one-to-one mapping between actual (acoustic) prominence and stress, and secondly, the interpretation of null results for acoustic enhancement may present a difficult rhetorical task.
The first problem with acoustic enhancement (specifically the (mis)identification of acoustic enhancement as lexically specified inherent, phonological prominence) is best explained by the observation that stress and actual prominence are not characterised by a one-to-one mapping: acoustic prominence does not equal stress and stress does not equal acoustic prominence.
The first of these statements (acoustic prominence does not equal stress), comes down to the observation that such prominence may result from postlexical intonational movements. When these take the form of pitch accents that associate with a stressed syllable, the correct syllable might still be identified as stressed, but the enhancement should nevertheless be attributed to the pitch accent rather than to stress itself (with stress status by definition persisting irrespective of whether or not a given instance of that syllable carries a pitch accent) (see Gordon & Roettger Reference Gordon and Roettger2017). However, not all intonational prominence reflects pitch accentuation, and acoustically and perceptually salient movements such as those at phrase boundaries may co-occur with unstressed syllables. Unfortunately, the acoustic enhancement resulting from intonational movements such as final rising f0 observed in list-based elicitation has often been mistakenly identified as a correlate of lexical stress proper (see Section 2.4.1 for specific reference to Moroccan Arabic).
For the second of these (stress does not equal acoustic prominence), stressed syllables are not necessarily acoustically enhanced in all languages. Firstly, there is a lot of variation, with different languages using different acoustic parameters to mark stressed syllables. Many, though not all, use duration, while others also differentiate syllables in terms of spectral tilt, intensity or more extreme formant values for vowels (for a critical overview see Gordon & Roettger Reference Gordon and Roettger2017). The role of intensity or f0 alone in cueing stress remains controversial, the former because it is highly correlated with f0 and other spectral characteristics (Lehiste Reference Lehiste and Lass1970), and the latter because of its role in signalling postlexical prominence. Secondly, some languages may not at all reliably or perceptually differentiate stressed syllables. For example, in Hungarian, in which stress is uncontroversially word-initial, intensity may be the only correlate (Varga Reference Varga2002, Szalontai et al. Reference Szalontai, Wagner, Maády, Windmann, Draxler and Kleber2016). The existence only of subtle intensity differences casts doubt on whether stress is perceptually retrievable in Hungarian. In sum, the presence or absence of acoustic enhancement of (presumed) stressed syllables should not be taken, on its own, to provide conclusive evidence about the existence of stress in a given language.
The second reason why acoustic enhancement of stressed syllables is a problematic diagnostic concerns a more general interpretative complication. Specifically, if an experiment does not find any acoustic correlates of stress, this does not prove that stress is absent: a null result cannot be taken as evidence in support of the null hypothesis. It is therefore very difficult to show that lexical stress is absent in a given language. The most convincing approach with this aim would have to accumulate results from various diagnostics of stress and show that they all converge in failing to provide evidence in favour of the existence of stress.
The present article will contribute to the debate about the existence of lexical stress in Moroccan Arabic by investigating the acoustic enhancement of presumed stressed syllables, and by relating the findings to what is currently known about other diagnostics of stress in Moroccan Arabic.
2.2 Moroccan Arabic within the Arabic-speaking world
Before reviewing previous work on the Moroccan Arabic phonological system, including stress, this section will relate Moroccan Arabic to other varieties of Arabic. Moroccan Arabic denotes a variety of Arabic that is strictly spoken, and in this article it refers specifically to the Moroccan variety spoken in Casablanca. Morocco is characterised by a high degree of multilingualism, with many speakers being bilingual in Berber and Moroccan Arabic, and many also proficient in Modern Standard Arabic (MSA) as well as French and/or Spanish, depending on the region. The multilingual character of Moroccan society is further enhanced by the diglossic situation that characterises all modern Arabic-speaking societies, involving varying registers of Arabic ranging from local varieties of Moroccan Arabic to the supranational Standard Arabic, i.e. MSA (for the situation in Morocco specifically see Maas & Procházka Reference Maas and Procházka2012; for diglossia in the Arabic-speaking world in general see chapters in Owens Reference Owens2013, Versteegh Reference Versteegh2014). To the extent that a national variety of Moroccan Arabic can be identified, it is clear that it differs considerably from Modern Standard Arabic (Mitchell Reference Mitchell1993, Maas & Procházka Reference Maas and Procházka2012). Moroccan Arabic is not intelligible to speakers of most other varieties of Arabic, and its many divergent phonological characteristics are usually traced back to extended and intensive contact with various Berber languages (Mitchell Reference Mitchell1993, Heath Reference Heath and Kaye1997, Dell & Elmedlaoui Reference Dell and Elmedlaoui2002, Maas Reference Maas2019). In fact, Moroccan Arabic and (Tashlhiyt) Berber have been said to exhibit similar ‘surface phonologies’ (Dell & Elmedlaoui Reference Dell and Elmedlaoui2002: 227).
In this context it is worth pointing out that Tashlhiyt Berber specifically is considered to lack lexical stress (Stumme Reference Stumme1899, Dell & Elmedlaoui Reference Dell and Elmedlaoui2002, Roettger, Bruggeman & Grice Reference Roettger, Bruggeman and Grice2015, Roettger Reference Roettger2017, Bruggeman Reference Bruggeman2018), although the absence of lexical stress is possibly also a feature of other varieties of Northern Berber (Kossmann Reference Kossmann, Frajzyngier and Shay2012).
Lexical stress has been investigated in many varieties of Arabic other than MA. Phonetic investigations have been conducted on several varieties, with results typically supporting native speakers’ intuitions about stress positions, including Egyptian and Jordanian (Almbark, Bouchhioua & Hellmuth Reference Almbark, Bouchhioua and Hellmuth2014), Lebanese (Chahal Reference Chahal2003) and Tunisian (Bouchhioua Reference Bouchhioua2008; see also Ghazali & Bouchhioua Reference Ghazali and Bouchhioua2003). Several more varieties are subject to ongoing work (based on the Hellmuth & Almbark Reference Hellmuth and Almbark2017 corpus).Footnote 4 Work using Metrical Stress Theory adds a number of varieties to this list, with at least nine synchronic varieties reported to have stress in Hayes (Reference Hayes1995) alone. In most varieties of Arabic stress assignment is subject to weight and position, with stress typically targeting a final superheavy syllable (e.g. CVCC), or, in the absence of such a syllable, a penultimate heavy syllable (e.g. CVː) (Watson Reference Watson, van Oostendorp, Ewen, Hume and Rice2011).
2.3 Moroccan Arabic lexical phonology
2.3.1 Syllable structure
Most Arabic varieties distinguish between phonologically heavy and light syllables, where the presence of a coda or a long nucleus results in a heavy syllable (e.g. Watson Reference Watson, van Oostendorp, Ewen, Hume and Rice2011). Assuming that Moroccan Arabic does not have a vowel length distinction (see Section 2.3.2), the number of consonantal slots in the coda determines the weight of the syllable as light (e.g. CV), heavy (e.g. CVC), or, under some analyses, superheavy (e.g. CVCC). CəC syllables are typically considered light (Dell & Elmedlaoui Reference Dell and Elmedlaoui2002).
Moroccan Arabic syllable structure has been investigated in great detail, resulting in varying claims (Benhallam Reference Benhallam1980, Reference Benhallam1990; Benkirane Reference Benkirane1982; Dell & Elmedlaoui Reference Dell and Elmedlaoui1985, Reference Dell and Elmedlaoui2002, Reference Dell and Elmedlaoui2008; Boudlal Reference Boudlal2001). What is clear from all sources is that MA allows for more complex consonant clusters than most other varieties of Arabic, while the representation of these clusters in terms of branching or simplex onsets and codas remains disputed. For example, Benkirane (Reference Benkirane1998) lists a number of syllable types including CV, CCV, CCVC, and CCəCC, while Dell & Elmedlaoui (Reference Dell and Elmedlaoui2002) argue that syllable onsets cannot be branching, and that codas can be branching only if they consist of geminates. In order to account for what seem to be syllable-initial clusters, Dell & Elmedlaoui (Reference Dell and Elmedlaoui2002) instead propose a complex general syllabification algorithm that posits onsetless syllables and empty nuclei.
What is important is that a distinction is made in all works on MA between heavy and light syllables, and sometimes superheavy syllables. The degree of consensus is limited to CV being considered light and CVC(C) heavy (with the exception of CəC). The stimuli used in the present experiment reflect only this uncontroversial distinction between light syllables on the one hand and heavy/superheavy syllables on the other hand.
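The uncontroversial part of this weight distinction can be sketched in a few lines of Python. This is an illustrative sketch only: the function name and the encoding of syllables as skeleton strings are assumptions made here, not part of the study. The text does not say how syllables with a central vowel and two coda consonants should be treated, so counting them as heavy is likewise an assumption.

```python
# Illustrative sketch only: the function name and the string encoding of
# syllable skeletons are assumptions, not part of the original study.
# Skeletons use "C" (consonant), "V" (full vowel /i a u/) and "ə" (central vowel).

def syllable_weight(skeleton: str) -> str:
    """Classify a syllable skeleton such as 'CV' or 'CVC' as 'light' or 'heavy'."""
    if not skeleton or any(seg not in "CVə" for seg in skeleton):
        raise ValueError(f"unexpected skeleton: {skeleton!r}")
    nuclei = [i for i, seg in enumerate(skeleton) if seg in "Və"]
    if len(nuclei) != 1:
        raise ValueError(f"expected exactly one nucleus in {skeleton!r}")
    nucleus = skeleton[nuclei[0]]
    coda = skeleton[nuclei[0] + 1:]
    if not coda:
        return "light"   # open syllable, e.g. CV or CCV
    if nucleus == "ə" and len(coda) == 1:
        return "light"   # CəC, the exception noted in the text
    # CVC and CVCC; superheavy is folded into heavy. Treating schwa syllables
    # with a two-consonant coda as heavy is an assumption of this sketch.
    return "heavy"


for skeleton in ["CV", "CCV", "CəC", "CVC", "CVCC"]:
    print(skeleton, syllable_weight(skeleton))
```

Folding superheavy into heavy mirrors the two-way distinction on which the stimuli rely; a three-way classification would require committing to one of the disputed analyses.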
2.3.2 Vowel inventory
Various claims have been made about the vowel system in Moroccan Arabic. On some accounts MA has a five-vowel system consisting of /iː ə aː ʊ uː/ (Hamdi 1991 as cited in Al-Tamimi Reference Al-Tamimi2009, Reference Al-Tamimi2017). Most researchers however posit only three or four vowels, namely /i a u/ plus a central vowel (Benkirane Reference Benkirane1998, Boudlal Reference Boudlal2001, Dell & Elmedlaoui Reference Dell and Elmedlaoui2002). The central vowel is usually considered non-phonological, serving primarily to break up illicit consonant clusters (e.g. Dell & Elmedlaoui Reference Dell and Elmedlaoui2002). In addition to the number of vowels, the representation of length is also a matter lacking consensus, as can be judged from the juxtaposition of the phonological categories /iː/ and /i/, and /uː/ and /u/ by the aforementioned sets of authors. This might be caused by the existence of a surface contrast in length, with CVC syllables having longer vowels than CV syllables (Benkirane Reference Benkirane1982, Dell & Elmedlaoui Reference Dell and Elmedlaoui2002, Yeou Reference Yeou2005). Despite disagreement about the correct representation of length, it is widely acknowledged that MA lacks a phonological vowel length distinction for vowels with the same place of articulation, which sets MA apart from most other varieties of spoken Arabic. This is backed by the absence of minimal pairs of the type /sin/ ‘tooth’ ~ /siːn/ ‘the letter “sin”’ (example from Iraqi Arabic, Al-Ani Reference Al-Ani1970). It will be assumed here that the phonological vowel inventory of MA can be represented as simply /i a u/, with an additional centroid vowel which may be either phonological /ə/ or phonetic [ə].
2.4 Earlier work on Moroccan Arabic stress
2.4.1 Proposed stress generalisations
As previously mentioned, it is currently not clear whether MA has lexical stress. In most teaching materials for Moroccan Arabic no reference is made to lexical stress (including Harrell Reference Harrell1962, Harrell, Abu-Talib & Carroll Reference Harrell, Abu-Talib and Carroll1965, Andjar, Bacon & Benchehda Reference Andjar, Bacon and Benchehda2014, Peace Corps 2016, although accent assignment rules are given in Hoogland Reference Hoogland2017). At least one dictionary does not indicate stress in the entries (Harrell & Sobelman Reference Harrell and Sobelman2006).Footnote 5 Highly informative also is the review found in Maas (Reference Maas, Kuty, Seeger and Talay2013), who discusses more than 10 sources published between 1894 and 2008 that all differ to some extent in their views on the existence of word prominence in this language. Unfortunately, most of the reviewed sources are not fully clear on the phenomenon being discussed, as reflected in the choice of terminology, which includes ‘Wortakzent’, ‘Akzent’ and ‘Accent’ (by German authors), ‘accent’ and ‘accent de mot’ (in French works), ‘accento tonico’ (in Italian) and ‘stress’ (in English). Some of these terms are perhaps best interpreted as referring to postlexical pitch prominence rather than lexical stress, while others do in fact seem to refer to inherent word-level prominence. 
In reviewing the evidence in detail, Maas (Reference Maas, Kuty, Seeger and Talay2013) argues that the various positions can be allocated to two main groups: one group assumes that Moroccan Arabic has word stress, although what kind of stress remains unclear (including El Mejjad Reference El Mejjad1985, Benhallam Reference Benhallam1989; both cited in Maas Reference Maas, Kuty, Seeger and Talay2013); the other group considers MA to lack word stress (Stumme & Socin Reference Stumme and Socin1894, Brockelmann Reference Brockelmann1908, Fischer Reference Fischer1917, Cantineau Reference Cantineau1960, Durand Reference Durand1994, Aguadé Reference Aguadé and Versteegh2008).
In addition to the sources reviewed by Maas (Reference Maas, Kuty, Seeger and Talay2013), there are further proponents of the existence of word stress who posit specific stress rules and generalisations. These include, notably, Benkirane (Reference Benkirane1998), according to whom stress falls on the final syllable if it is heavy (i.e. a closed syllable such as CVC) and on the penult otherwise. This position is shared by Nejmi (Reference Nejmi1993, as cited in Boudlal Reference Boudlal2001) and in part by Boudlal himself (see next paragraph). Others assume a fixed position for stress, such as final stress (‘final prominence’ in Watson Reference Watson, van Oostendorp, Ewen, Hume and Rice2011), or penultimate stress (e.g. Benhallam Reference Benhallam1989, as cited in Maas Reference Maas, Kuty, Seeger and Talay2013). Yet others argue for a more variable stress position that may target syllables prior to the penultimate (El Hadri Reference El Hadri1993 as cited in Boudlal Reference Boudlal2001).
Finally, one particularly complicated picture is sketched by Boudlal (Reference Boudlal2001: 99), who posits that ‘the location of stress depends on whether or not the items considered occur in isolation or in context’. Accordingly, stress would be final when words are produced in context, but words in isolation would be captured by Benkirane’s generalisation. This is in fact in line with Mitchell’s (Reference Mitchell1993: 202) observation that ‘the place of prominence in a word in isolation is not carried over to its occurrence in the phrase and sentence. It is only in phrase- or sentence-final position in unemphatic affirmative sentences that the pattern of the word-isolate may be repeated, and then by no means certainly’. Assuming that lexical stress is an invariant property of a word, reflected in its ‘dictionary entry’ (Abercrombie Reference Abercrombie1976 in van der Hulst Reference van der Hulst2014a:5), the very fact that the ‘stress’ or prominence location in a word may vary suggests that the phenomenon in question is not lexical stress but rather postlexical intonational prominence. Boudlal’s (Reference Boudlal2001) interpretation of MA having final lexical stress on words produced in isolation is in fact readily interpreted as reflecting postlexical prosody. As Boudlal (Reference Boudlal2001) notes, final syllables of words produced in isolation are marked by ‘high’ (rising-falling) f0, which is the main reason why these syllables are considered stressed. This final rise-fall is not consistently present on words produced within a sentence, suggesting that the pitch movement at the right edge of words produced in isolation reflects an Intonational Phrase (IP-)final pitch effect. Words were in fact read aloud from a list, which strongly suggests that Boudlal’s right-edge high pitch reflected a continuation rise or list intonation.
2.4.2 Native speaker perception of stress
The inconsistent analyses of word stress are matched by equally incongruous judgments on the position of word stress by native speakers. A number of studies, reviewed in Boudlal (Reference Boudlal2001), investigate where the main perceptual prominence of a word lies (El Hadri Reference El Hadri1993, Fares Reference Fares1993, Nejmi Reference Nejmi1993, all as cited in Benhallam Reference Benhallam1990, Boudlal Reference Boudlal2001). In addition, Boudlal (Reference Boudlal2001) himself provides a study of his own. In all of these studies, participants were asked to indicate what they think is the most prominent syllable for words presented in written form in a list. While the authors propose different analyses of stress assignment in MA, all studies have in common that they find a great deal of disagreement between participants on the location of stress for any given word. For example, Boudlal (Reference Boudlal2001) found that the word limun ‘oranges’ was judged to have initial stress by 23 participants and final stress by 11 participants, while the numbers for likum ‘for/to you’ were 16 and 18, respectively. In any of the aforementioned stress assignment scenarios these two words would be expected to behave the same.
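The degree of disagreement in these two examples can be made concrete by expressing the reported counts as proportions. The sketch below uses only the four counts cited above; the variable and function names are hypothetical, and it assumes that the initial and final judgments together exhaust the responses for each word.

```python
# Counts as reported above for Boudlal's (2001) judgment task.
# Names are hypothetical; the sketch assumes that "initial" and "final"
# together account for all responses given for each word.
judgments = {
    "limun": {"initial": 23, "final": 11},
    "likum": {"initial": 16, "final": 18},
}


def initial_share(counts: dict) -> float:
    """Proportion of participants who judged the initial syllable stressed."""
    total = counts["initial"] + counts["final"]
    return counts["initial"] / total


for word, counts in judgments.items():
    total = counts["initial"] + counts["final"]
    print(f"{word}: {initial_share(counts):.2f} initial (n = {total})")
```

On these figures roughly two thirds of responses favour initial stress for limun, against just under half for likum, even though both words have the same CV.CVC shape.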
There are several difficulties in comparing and interpreting the results of these studies. Firstly, varying interpretations by the authors might in part be due to the different types of words and items tested. Fares (Reference Fares1993, as cited in Boudlal Reference Boudlal2001) tested nouns and conjugated verbs, El Hadri (Reference El Hadri1993, as cited in Boudlal Reference Boudlal2001) tested conjugated verbs only, some with affixes, while Benhallam (Reference Benhallam1990) and Boudlal (Reference Boudlal2001) tested cliticised forms in addition to all of the aforementioned forms. While an adequate description of stress in a language should be able to account for all types of words, it is possible that the varying morphological structure of words impacted on the different authors’ identification of general patterns. A second problem is that the studies might have tested potentially different stress systems in the first place: Judgments reported by El Hadri (Reference El Hadri1993) and Fares (Reference Fares1993) came from speakers of Tetouan Moroccan Arabic. This variety has a somewhat differing vowel system from Casablancan Moroccan Arabic, which was the native variety of subjects in some of the other studies (Boudlal Reference Boudlal2001).
Another type of difficulty relates to the variety of Arabic that was being responded to. Many of the words tested were typical Moroccan Arabic, but not all. Disagreement between native speakers might therefore in part be due to the fact that written Arabic is strongly associated with Modern Standard Arabic. Most if not all participants in the aforementioned stress identification studies were university students, which increases the likelihood that their judgments are influenced by their (good) knowledge of MSA. Despite the existence of prescriptive rules for MSA, it is not clear that it has a fixed stress system, since MSA is known to exhibit prosodic features of the national/regional varieties of the speaker (Benkirane Reference Benkirane1998). Since each native variety of Arabic has its own, slightly different rules for stress assignment, the association of MSA with written stimuli might have unpredictable effects on stress judgments by MA speakers.
In short, there are several possible explanations as to why MA stress defies a clear analysis when the analysis is based on native speaker judgments. Nevertheless, the fact that several independent studies each found that judgments differ considerably between participants, and the fact that such disagreement exists in the first place, are in line with other indications that word stress in this language is rather elusive.
2.4.3 The role of stress in the intonation system
To date there is only a handful of sources discussing aspects of MA intonation. The most comprehensive treatment is Benkirane’s (Reference Benkirane1998) qualitative analysis which involves a concise inventory, in INTSINT style (International Transcription System for Intonation, Hirst & Di Cristo Reference Hirst and Di Cristo1998a), of the prosodic properties that are characteristic of various sentence types, including yes–no questions, declaratives, imperatives and question word questions.Footnote 6 His characterisations are based on sentences read by various speakers combined with his own observations as a native speaker. In essence, Benkirane’s claim about the interaction between stress and intonation is that sentence accent (nuclear pitch accent) targets the stressed syllable of the final word. He adds to this that in non-final words, stressed syllables are not differentiated in terms of pitch pattern (Benkirane Reference Benkirane1998: 349).
A small set of experimental works on MA intonation paints a somewhat more complicated picture. Yeou, Embarki & Al-Maqtari (Reference Yeou, Embarki and Al-Maqtari2007) compare the prosodic marking of contrastively focused words in a sentence context in Moroccan Arabic with two other varieties of Arabic.Footnote 7 Assuming the same stress assignment rule for all three varieties, the authors find that in each variety, contrastive focus is marked by the presence of a rising-falling pitch contour on the relevant word (and it is taken for granted that within the word it aligns with the stressed syllable). However, when we consider the few contours they provide as examples, it appears that MA might use rising-falling contours that are less localised to the ‘stressed’ syllable than in other varieties.
Yeou et al.’s (ibid.) interpretation, which might be summarised as ‘pitch-prominence-goes-to-stress’, contrasts with Burdin et al.’s (Reference Burdin, Phillips-Bourass, Turnbull, Yasavul, Clopper and Tonhauser2015) claims. These authors look at the way different types of focus are realised within the noun phrase (in MA: a noun followed by an adjective). First, their findings fail to support the claim that focus is marked with pitch prominence in MA, as the location of pitch prominence in the noun phrase does not correspond to the intended location of focus. Secondly, they fail to find evidence in favour of the standard view in the literature that prominence-marking pitch events (in this case something like a focus-marking pitch accent) seek out stressed syllables; the presumed stressed syllables do not attract pitch movement, and any pitch movement they find is analysed as constituent edge-marking. These authors consequently argue that MA marks focus by means of phrasing only, and lacks pitch accents altogether.
Finally, a middle way is taken by Hellmuth et al. (Reference Hellmuth, Louriz, Chlaihani and Almbark2015), who look at question prosody in yes–no questions. These authors observe that the alignment of the rise-fall that consistently marks the right edge of questions is not fully predictable either with reference to the right phrasal edge or the stressed syllable alone (stress assignment following Benkirane Reference Benkirane1998). Instead, the rise-fall is aligned with reference to both of these domains simultaneously, in that it aligns consistently with the final foot of the phrase-final word.
In sum, on the basis of a small body of work on the topic, the correspondence between lexical stress and the location of pitch prominence in MA is anything but straightforward. The language clearly has rising-falling pitch movements in some contexts, but how exactly these align with the segmental string, including the possible attractor role of specific syllables (stressed or not), remains to be investigated. Section 4.1, which discusses the f0 properties of the present data, will return to this issue.
2.5 Aim of experiment and predictions
The preceding discussion has highlighted that there is presently no consensus regarding the existence of lexical stress in Moroccan Arabic. Among those who posit the existence of stress, there is moreover no agreement regarding the specific stress generalisation that would capture its assignment. In this paper, we test the most widely held assumption about Moroccan Arabic stress, namely that it is weight-sensitive (as in other varieties of Arabic) and targets either the penultimate or the final syllable (Benkirane Reference Benkirane1998, Boudlal Reference Boudlal2001). Recent work adopting or testing this view on MA stress includes Yeou, Embarki & Al-Maqtari (Reference Yeou, Embarki and Al-Maqtari2007), Burdin et al. (Reference Burdin, Phillips-Bourass, Turnbull, Yasavul, Clopper and Tonhauser2015) and Hellmuth et al. (Reference Hellmuth, Louriz, Chlaihani and Almbark2015).
We report on an experiment that looks for acoustic correlates of stress as manifested in seven target syllables. Specifically, the experiment contrasts syllables as a function of presumed stress status. Since the hypothesis that is tested involves the stress-by-weight principle, all target syllables are light /CV/ syllables in word-initial position in disyllabic words. Each syllable occurs in two target words, once as presumed stressed and once as presumed unstressed, as a function of the weight of the final syllable (e.g. mu being stressed in ˈmuka ‘owl’ and unstressed in muˈkat ‘owls’). In the following, these syllables will be referred to simply as stressed and unstressed, even if their status is based on a hypothesis.
The expectation is that if there is word-level stress, presumed stressed syllables should be differentiated from their unstressed counterparts in duration, vowel quality and/or spectral properties. A large body of literature on the topic leads us to hypothesise that under stress, syllables exhibit enhancement and may have (i) longer duration and (ii) more peripheral vowel quality. Spectral properties are likely to differ, too, but the direction will depend on the vowel in question. For f0, however, the expectation is the absence of an effect, as the elicitation context was designed to avoid the presence of postlexical pitch events of any kind on the target syllable (see Section 3.3), and few, if any, studies have convincingly shown that f0 is a correlate of lexical stress proper, i.e. in the absence of postlexical pitch prominence (Gordon & Roettger Reference Gordon and Roettger2017). We therefore present f0 results mainly for the sake of completeness.
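The hypothesised generalisation that yields these minimal stress pairs can be stated procedurally. The following Python sketch is offered for illustration only and makes several assumptions that are not part of the study: words are supplied already syllabified, transcriptions contain only the vowels /i a u/ and the central vowel, and the function names are hypothetical. It implements the weight-sensitive rule as described above (final stress if the final syllable is heavy, penultimate stress otherwise) and makes no claim about whether MA speakers actually apply such a rule.

```python
# Illustrative sketch of the weight-sensitive generalisation tested here.
# Assumptions (not from the study): input is pre-syllabified, vowels are
# limited to /i a u/ plus the central vowel, and all names are hypothetical.
FULL_VOWELS = set("iau")
SCHWA = "ə"


def is_heavy(syllable: str) -> bool:
    """A syllable counts as heavy if it has a full vowel and ends in a consonant."""
    has_full_vowel = any(seg in FULL_VOWELS for seg in syllable)
    last = syllable[-1]
    ends_in_consonant = last not in FULL_VOWELS and last != SCHWA
    return has_full_vowel and ends_in_consonant


def stressed_index(syllables: list[str]) -> int:
    """Index of the presumed stressed syllable: final if heavy, else penultimate."""
    if not syllables or any(not s for s in syllables):
        raise ValueError("expected a non-empty list of non-empty syllables")
    if len(syllables) == 1 or is_heavy(syllables[-1]):
        return len(syllables) - 1
    return len(syllables) - 2


def mark_stress(syllables: list[str]) -> str:
    """Join syllables, prefixing the presumed stressed one with the IPA stress mark."""
    target = stressed_index(syllables)
    return "".join("ˈ" + s if i == target else s for i, s in enumerate(syllables))


print(mark_stress(["mu", "ka"]))   # ˈmuka
print(mark_stress(["mu", "kat"]))  # muˈkat
```

Under this rule the initial syllable mu is identical in both words, and only the weight of the final syllable determines its presumed stress status, which is the contrast the design exploits.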
3 Method
3.1 Participants
Participants were recruited among students of the Université Hassan II in Casablanca, where they were also recorded. Two groups of speakers were recorded, each with an equal gender division. The first group consisted of 12 native speakers of Moroccan Arabic who grew up with Arabic only at home (the ‘monolingual’ group).Footnote 8 These speakers were aged between 21 years and 34 years. Ten of them were born in Casablanca and had lived there all their life, one speaker moved to Casablanca at the age of 2 years, and one speaker was born in nearby Kenitra.
The second group consisted of 12 speakers of Moroccan Arabic who were also native speakers of Tashlhiyt Berber through one or both parents (the ‘bilingual’ group). The choice to include bilinguals served to explore the possibility that their production of stress patterns in Moroccan Arabic might be different from that of the monolinguals due to the fact that Tashlhiyt lacks lexical stress (see Section 2.2). The ages of participants in this group ranged between 20 years and 32 years. Nine speakers in this group were born in Casablanca, the other three moved to the city at the respective ages of 6, 12 and 14 years. All 24 participants were also fluent in Modern Standard Arabic and French, and had received a number of years of English instruction in school.
3.2 Procedure
The present experiment was conducted as part of a larger recording session for the Intonational Variation in Arabic (IVAr) corpus (Hellmuth & Almbark Reference Hellmuth and Almbark2017). For the present experiment participants were recorded individually in a quiet university room at the Université Hassan II. Recordings were made with a Shure SM-10 headset microphone. Participants were first given oral instructions by a native speaker of MA (one of the authors) as to what the task entailed. They then received a printout of the experimental stimuli, consisting of 60 mini-monologues containing one target word each (see next section). There were no practice items in order to minimise the duration of the session as a whole, but there were three fillers at the top and bottom of each of the stimuli sheets. There was one mini-monologue for each target word and no repetitions. Participants read these monologues out loud in full at their own pace, and this part of the recording session took approximately five minutes.
3.3 Speech materials
The experimental design is based on the paradigm first used in Bouchhioua (Reference Bouchhioua2008) and subsequently in Almbark et al. (Reference Almbark, Bouchhioua and Hellmuth2014). The experiment contrasts identical initial syllables in disyllabic words that are hypothesised to form a minimal stress pair according to the aforementioned weight-sensitive interpretation of MA stress, such as mu in ˈmuka ‘owl’ and muˈkat ‘owls’. The experiment tested the initial syllables of 12 word pairs that were chosen on the basis of their comparability in the full range of dialects used in the IVAr database. Among these, seven word pairs exhibited the MA stress contrast as per Benkirane’s (Reference Benkirane1998) generalisation (Table 1) and were analysed for this paper. The observed MA pronunciation columns reflect a broad phonetic transcription.Footnote 9 Of the seven target syllables, five have an identical segmental environment across the stressed/unstressed conditions (baʃar/baʃart, marra/marrart, muka/mukat, saada/saadat, and murra/murrin although the latter pair does exhibit a vowel difference in the second syllable). sira/sinat and sura/Sudan have a different intervocalic consonant in the two stress conditions, so it should be noted that there might be segmental coarticulatory effects that could result in acoustic differences between the stress conditions. The aim for these two word pairs, then, is to find observable acoustic enhancement in the expected direction (stressed syllables being enhanced) that cannot be explained by segmental effects.
Table 1 Target syllables and their carrier words. For both syllables commencing with mu the consonant at the onset of the next syllable is given in brackets to distinguish between the word pairs.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211118100148327-0482:S002510032000002X:S002510032000002X_tab1.png?pub-status=live)
1 The transcription of the intended stressed syllable containing a schwa is unproblematic as there is no reason why a central vowel would theoretically be banned from being stressed in Moroccan Arabic (as opposed to in e.g. Germanic). Here its use reflects both native speaker intuitions and the authors' auditory impression that the vowel sounds different from [i a u].
2 The pharyngealisation in sura was unexpected given the written stimuli that lacked pharyngealisation. In any case, this word still forms a near-minimal pair with its counterpart Sudan.
Target words were placed in a carrier sentence which was in turn embedded within a short scripted monologue consisting of a total of three sentences, given below (second line representing MA pronunciation):
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211118100148327-0482:S002510032000002X:S002510032000002X_figu1.png?pub-status=live)
Context sentence 1
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211118100148327-0482:S002510032000002X:S002510032000002X_figu2.png?pub-status=live)
Context sentence 2
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211118100148327-0482:S002510032000002X:S002510032000002X_figu3.png?pub-status=live)
Target sentence
Stimuli were presented in Arabic script. As Moroccan Arabic is not usually written (Arabic script almost exclusively being used for Modern Standard Arabic), participants were explicitly instructed to produce Moroccan Arabic when reading the stimuli. Many of the lexical choices (e.g. ʕawd ‘repeat.imp’, ʒuʒ ‘two’), including some of the target stimuli, are used exclusively in Moroccan Arabic. For any words also used in Standard Arabic (such as كلمة ‘word’) Moroccan renderings were used, e.g. kəlma (Standard: kalima). Auditory impressions by two of the authors (native Arabic speakers) confirmed the authentic Moroccan Arabic nature of the speech thus produced.
Embedding the target words in a sentence that in turn formed part of a larger context served to minimise the possibility that the target words carried postlexical prominence. Specifically, target words were (i) pragmatically given, as they are mentioned in both preceding context sentences, and (ii) postfocal, occurring immediately following ʕawd ‘repeat!’, which is contrastively focused due to the occurrence of ktəbe ‘write!’ in the immediately preceding sentence (the imperative ʕawd was moreover presented in boldface). Finally, target words occurred in non-IP-final position in order to avoid phrase-final lengthening effects. An example spectrogram and waveform are given for one mini-monologue in Figure 1 with the target word muka ‘owl’. Note that only the third occurrence of the target word is analysed in the rest of this paper.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211118100148327-0482:S002510032000002X:S002510032000002X_fig1.png?pub-status=live)
Figure 1 Example waveform, spectrogram and pitch track for one mini-monologue with the word muka ‘owl’, as spoken by a female speaker (IVAr filename: moca-slb3-f5). Shaded occurrence of muka is the target token analysed.
In the target phrase (rightmost) the target word muka does not receive any pitch prominence, in contrast to the preceding two phrases. Specifically, it occurs after the main pitch event on contrastively focused ʕawd, and it is also not subject to edge-marking pitch prominence (judging from the absence of pausing and/or pitch reset). This situation matches the intended context, allowing for the examination of correlates of stress in the absence of postlexical pitch prominence. As mentioned previously, therefore, we do not in fact expect to find an effect for f0 irrespective of whether lexical stress in MA exists or not (considering that f0 is an uncommon marker of word-level stress).
3.4 Analysis
3.4.1 Data processing and measurements
The acoustic parameters analysed were f0, duration, spectral Centre of Gravity, and vowel quality. Annotation (in Praat, Boersma & Weenink Reference Boersma and Weenink2015) proceeded as follows: Automatic segmentation of utterances into words and segments was performed by means of the Prosodylab Aligner algorithm (Gorman, Howell & Wagner Reference Gorman, Howell and Wagner2011). The segmentation of target words was then manually checked and corrected where needed, and coded for preceding and following pauses (preceding/following/none/both). Pauses were defined as periods of silence in the signal. They were identified on the basis of the auditory impression of a pause, supported by visual inspection of speech discontinuity in the spectrogram. The theoretical number of target items was N = 336 (2 speaker groups × 12 speakers × 2 stress conditions × 7 target syllables). Of these, N = 317 were targetlike (the intended word being produced without disfluencies; bilinguals N = 165, monolinguals N = 152). With respect to pausing, the most frequent location for pause insertion was following the target word (N = 86; an additional N = 4 had a preceding pause, and N = 6 had both). N = 221 items were produced without any pauses. Pausing is included as a binary predictor in the models detailed below (pause presence yes/no).
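The token counts above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch in Python (not the authors' own tooling); all figures are taken from the paragraph above, and the variable names are illustrative only.

```python
# Design size: groups x speakers per group x stress conditions x target syllables.
groups, speakers_per_group, stress_conditions, target_syllables = 2, 12, 2, 7
theoretical_n = groups * speakers_per_group * stress_conditions * target_syllables

# Targetlike tokens per group, as reported in the text.
targetlike = {"bilingual": 165, "monolingual": 152}
targetlike_n = sum(targetlike.values())

# Pause coding of the targetlike tokens, as reported in the text.
pause_counts = {"following": 86, "preceding": 4, "both": 6, "none": 221}

# Binary predictor used in the models: any pause versus no pause.
with_pause = sum(n for label, n in pause_counts.items() if label != "none")

print(theoretical_n, targetlike_n, with_pause)
```

The four pause categories sum to the number of targetlike tokens, so collapsing them into a binary predictor leaves no token uncoded.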
F0 measurements (in semitones) are based on a hand-corrected version of the output of the standard pitch-tracking algorithm in Praat. Manual correction was limited to pitch-tracking errors, such as octave jumps and the tracking of pitch in cases of phonetically voiceless segments. Several static f0 measures were taken: mean f0 throughout the target vowel, maximum f0, and f0 at intensity peak, with all measures additionally being converted to z-scores. Models were run on all measures, and on absolute as well as z-scored values. As the results were very similar in all cases, only the absolute mean f0 values are reported. To allow for a more holistic and dynamic analysis of f0 movements, phrasal f0 contours were extracted by measuring f0 at 20 equally spaced timepoints throughout each word, apart from target words, in which 10 points were measured per syllable (also 20 in total).
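The three operations named in this paragraph (conversion to semitones, z-scoring, and sampling at equally spaced timepoints) can be sketched as follows. This is an illustrative Python sketch, not the procedure actually run in Praat: the 100 Hz reference, the use of the sample standard deviation, the inclusion of both interval endpoints, and the example values are all assumptions that the text does not specify.

```python
import math
from statistics import mean, stdev

REF_HZ = 100.0  # assumption: the paper does not state its semitone reference


def hz_to_semitones(f0_hz, ref_hz=REF_HZ):
    """Convert a frequency in Hz to semitones relative to ref_hz."""
    return 12.0 * math.log2(f0_hz / ref_hz)


def z_scores(values):
    """Standardise values against their own mean and sample SD (assumed)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def equally_spaced_times(start, end, n_points):
    """Return n_points time stamps spread evenly from start to end (inclusive, assumed)."""
    step = (end - start) / (n_points - 1)
    return [start + i * step for i in range(n_points)]


# Hypothetical mean f0 values (Hz) for one invented speaker; not data from the study.
hypothetical_f0_hz = [180.0, 190.0, 200.0, 210.0]
semitones = [hz_to_semitones(f) for f in hypothetical_f0_hz]
standardised = z_scores(semitones)

# 20 measurement points across a hypothetical word lasting from 0.00 s to 0.38 s.
word_times = equally_spaced_times(0.0, 0.38, 20)
```

Because semitones are a logarithmic scale, a difference expressed in semitones does not depend on the chosen reference, only the absolute values do.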
For duration, measurements of target vowels as well as syllables were taken. Vowel duration was determined as the period of time following the initial consonant with strong energy across the second and third formant. For the segmentation of intervocalic /r/, the onset of the /r/ was determined as the start of the (first) closure (virtually all /r/s were realised as either trill or tap). Statistical models were run on both absolute and z-scored values. When results between measures (vowel/syllable, absolute/z-score) are similar, only results for absolute vowel duration are reported. In cases of discrepancies in significance between models these are reported.
Spectral balance was measured to reflect the energy distribution across the vowel. If the aim is to characterise the loudness of a segment or syllable as a possible indicator of stress, spectral balance characteristics such as those termed ‘spectral tilt’, ‘spectral slope’ and ‘spectral Centre of Gravity’ have been shown to be more reliable measures than average intensity measures (van Son & van Santen Reference van Son and van Santen2005, Sluijter, van Heuven & Pacilly Reference Sluijter, van Heuven and Pacilly1997). Here, we calculated the spectral Centre of Gravity (CoG), which can be taken to represent ‘both the relative produced power and the perceived loudness’ (van Son & van Santen Reference van Son and van Santen2005: 105). Specifically, the CoG as a measure gives the frequency at which the spectral energy for a given range of frequencies is balanced. Like most other energy and intensity measurements it is sensitive to intrinsic vowel differences in being relatively low for back vowels such as /u/ (its low formants resulting in a low CoG) and being relatively high for front vowels like /i/ (with high F2 and F3 resulting in a higher CoG). The CoG was calculated at the midpoint of the steady-state portion of the vowel from the spectrogram (25 ms Gaussian window, up to 5000 Hz, timestep 50 ms).
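The notion of a frequency at which spectral energy is balanced can be made concrete with a small sketch. The Python code below is an assumption-laden illustration rather than the Praat computation used here: it weights frequencies by power (the weighting is not stated in the text), uses a naive discrete Fourier transform without the Gaussian window, and is applied to a synthetic two-component signal instead of speech.

```python
import cmath
import math


def power_spectrum(samples, sample_rate):
    """Naive DFT; returns (frequency_hz, power) pairs up to the Nyquist frequency."""
    n = len(samples)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(samples))
        spectrum.append((k * sample_rate / n, abs(coeff) ** 2))
    return spectrum


def centre_of_gravity(spectrum, max_hz=5000.0):
    """Power-weighted mean frequency over bins at or below max_hz."""
    kept = [(f, p) for f, p in spectrum if f <= max_hz]
    total = sum(p for _, p in kept)
    return sum(f * p for f, p in kept) / total


def two_tone(amp_low, amp_high, sample_rate=8000, n=160):
    """Synthetic signal: 500 Hz and 1500 Hz sinusoids (hypothetical test input)."""
    return [amp_low * math.sin(2 * math.pi * 500 * t / sample_rate)
            + amp_high * math.sin(2 * math.pi * 1500 * t / sample_rate)
            for t in range(n)]


balanced = centre_of_gravity(power_spectrum(two_tone(1.0, 1.0), 8000))
tilted = centre_of_gravity(power_spectrum(two_tone(1.0, 2.0), 8000))
```

With equal amplitudes the centre of gravity sits midway between the two components; doubling the amplitude of the higher component pulls it upwards, which is the sense in which relatively more high-frequency energy yields a higher CoG.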
Finally, vowel quality was measured by F1 and F2 values taken at the midpoint of the steady-state portion of the target vowel (which corresponded to the midpoint of the vowel with the exception of the syllables si and su which were characterised by considerable formant transitions in the stressed condition, see results). Measurements were extracted by means of the ‘Burg’ method in Praat (standard settings with timestep 10 ms). All values were verified manually, corrected where needed, or excluded where reliable formant values could not be extracted. Results are reported both on the raw F1 and F2 values and on Lobanov-normalised values. The latter were calculated with the Norm vowel normalisation suite (Thomas & Kendall Reference Thomas and Kendall2007) and the R package vowels (Kendall & Thomas Reference Kendall and Thomas2014).
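Lobanov normalisation expresses each formant value as a z-score relative to the same speaker's mean and standard deviation for that formant. The following Python sketch illustrates the idea only; the study itself used the Norm suite and the R package vowels. The speaker labels and F1 values are invented, and the use of the sample standard deviation is an assumption.

```python
from statistics import mean, stdev


def lobanov(formant_values):
    """Z-score one speaker's values for one formant (Lobanov normalisation)."""
    m, s = mean(formant_values), stdev(formant_values)
    return [(v - m) / s for v in formant_values]


# Hypothetical F1 values (Hz) for two invented speakers; not data from the study.
f1_by_speaker = {
    "speaker_a": [300.0, 450.0, 700.0, 320.0],
    "speaker_b": [350.0, 520.0, 820.0, 380.0],
}

# Each speaker is normalised against their own distribution, so that
# differences in vocal tract size do not carry over into the comparison.
normalised = {spk: lobanov(vals) for spk, vals in f1_by_speaker.items()}
```

After normalisation every speaker's values have a mean of zero and unit standard deviation, which is why the regression estimates for F1 and F2 reported below are in standard deviation units rather than Hz.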
3.4.2 Statistics
Statistical analysis was performed in R (R Core Team 2016) with linear mixed-effects regression models with the package lme4 (Bates et al. Reference Bates, Maechler, Bolker and Walker2015). Models with a very similar structure were run for each of the acoustic parameters under investigation. These always included presumed stress status (yes/no) as a fixed effect which interacted with group (monolingual/bilingual) to investigate the possibility that monolinguals and bilinguals produce different acoustic enhancement patterns. Since pausing can be expected to have an effect on duration and f0, a third, non-interacting fixed effect of pause (yes/no) was included for those two parameters. Models for f0 and spectral CoG moreover included duration (of the target vowel) as a covariate, to take into account that longer vowel length might contribute to more extreme values for other acoustic properties. To allow for speaker- and item-specific variation, random intercepts for syllable and speaker were included in all models, excepting those for CoG and vowel quality. For these, syllable was entered as a main effect interacting with stress as the effect of stress can be expected to differ for different combinations of F1 and F2 (and CoG by extension). Thus, as two examples, the model for f0 had the structure stress*group + pause + duration + (1|speaker) + (1|syllable), whereas the model for CoG had the structure stress*group + syllable + stress:syllable + pause + duration + (1|speaker). Statistical significance was calculated by means of likelihood ratio tests (LRTs) comparing main models with corresponding null models that lacked the relevant fixed effect or interaction term. When there was no significant interaction, the interaction term was dropped (this was only the case for the group:stress interaction, with almost all results giving no reason to assume bilinguals behaved differently from monolinguals).
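The logic of a likelihood ratio test can be sketched independently of the model-fitting software. The Python code below is a minimal illustration, not the lme4 workflow used in this study: it takes the log-likelihoods of two nested models as given (the values shown are hypothetical), forms the test statistic as twice their difference, and converts it to a p-value using the chi-square distribution with degrees of freedom equal to the number of parameters dropped. The function names are illustrative.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df >= 1."""
    y = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-y) * sum over j = 0 .. df/2 - 1 of y^j / j!
        terms = sum(y ** j / math.factorial(j) for j in range(df // 2))
        return math.exp(-y) * terms
    # Odd df: erfc(sqrt(y)) + exp(-y) * sum over j = 1 .. (df-1)/2
    # of y^(j - 1/2) / Gamma(j + 1/2)
    tail = sum(y ** (j - 0.5) / math.gamma(j + 0.5)
               for j in range(1, (df - 1) // 2 + 1))
    return math.erfc(math.sqrt(y)) + math.exp(-y) * tail


def likelihood_ratio_test(loglik_null, loglik_full, df_diff):
    """Return the LRT statistic and its p-value for two nested models."""
    statistic = 2.0 * (loglik_full - loglik_null)
    return statistic, chi2_sf(statistic, df_diff)


# Hypothetical log-likelihoods for a null model and a model with one extra term.
statistic, p_value = likelihood_ratio_test(-520.0, -516.965, df_diff=1)
print(round(statistic, 2), p_value < 0.05)
```

A term is retained when dropping it produces a significantly worse fit, which is the comparison behind each LRT reported in Section 4.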
Multiple comparisons in the case of complex interactions (CoG and vowel quality) were performed using the Tukey method with the package lsmeans (Lenth Reference Lenth2016).
4 Results
4.1 F0
In this section we will first report on the global f0 contours characterising the utterances in which target words were embedded. After this we will consider static scaling properties of the f0 contours in terms of mean values in target vowels.
Figure 2 shows the time-normalised mean contours for all utterances in which target words were not surrounded by any pauses (N = 221, as described above). Contours on the target words do not differ much in overall shape, suggesting that there is little if any effect of stress on the f0 characteristics of target words. This is to be expected under the assumption that the words in this dataset should not be subject to postlexical prominence marking.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211118100148327-0482:S002510032000002X:S002510032000002X_fig2.png?pub-status=live)
Figure 2 Phrasal intonation contours (averaged and normalised for duration) across N = 221 target utterances without pauses, male (bottom) and female (top) contours separated. Syllables of target words (σ1: target syllable) indicated by vertical lines.
It can also be observed that all target words, irrespective of stress status, exhibit a small rise on the second syllable, near the right word edge. This is probably best interpreted as an edge-marking tonal event rather than a prominence-marking event, given its occurrence in a context where the target words are both postfocal and given. Crucially, though, the presence of a marginal pitch movement, which is consistent in location and does not map onto presumed stress status, highlights the difficulty in obtaining pitch-neutral stimuli despite careful experimental design. In the present experiment target syllables are found in word-initial position and thus do not themselves carry this rise. It would have been problematic, however, if presumed stressed syllables had occurred in word-final position, in which case there would have been a confound between positional and stress-related f0 characteristics. Such a scenario might have been the case in Boudlal’s (Reference Boudlal2001) experiment. As mentioned previously, Boudlal proposes that stress targets the final syllable in cases where a word is produced in isolation. Words in isolation in his study were characterised by high pitch on the final syllable, most likely due to list intonation. Therefore, what Boudlal (Reference Boudlal2001) considers a pitch correlate of ‘stress’ rather seems to reflect a positional effect, and not necessarily a correlate of stress proper.
Having established that in the present case, target syllables are comparable in terms of their pitch properties, we now turn to more localised measures, applied to the full set of target words of N = 317. Firstly, there was no main effect of pause on mean absolute f0 (as measured in semitones (ST)) (LRT χ2(1) = 0.86, p = .35; β = −0.1964, SE = 0.21, t = −0.93). There was, however, an interaction between group and stress (LRT χ2(1) = 6.07, p = .013; β = −0.56, SE = 0.23, t = −2.5). This meant that the bilingual group produced a slightly larger difference in pitch as a function of stress (stressed vowels had 1.1 ST higher pitch than unstressed vowels) than monolinguals (0.5 ST). There is therefore a general effect of stress with presumed stressed vowels having overall higher pitch than unstressed vowels. The size of these predicted differences, however, suggests that an attempt at a further explanation might be superfluous. Predicted differences of the above-mentioned magnitude are unlikely to translate into a robust perceptual cue to stress, as differences of 1 ST have been reported to be an absolute minimum in order for listeners to distinguish dynamic pitch movements (for example, exceptionally good listeners in ’t Hart Reference ’t Hart1981, but see ’t Hart Reference ’t Hart1976, d’Alessandro & Mertens Reference d’Alessandro and Mertens1995 for the suggestion that greater differences are required). Of course, perceptual retrievability is not a prerequisite for something to be a robust correlate of stress, but it does cast doubt on whether this correlate would be able to play a role in determining native speakers’ judgment of stress at all. In short, the f0 results are more or less as expected, with no strong evidence in favour of the interpretation that presumed stressed syllables are enhanced in terms of f0. There was only a small, potentially negligible effect.
4.2 Duration
The total number of target words analysed for duration was the full dataset of N = 317 tokens, as mentioned above. Firstly, there was no main effect of pause on any of the measures (for absolute vowel duration: LRT χ2(3) = 4.68, p = .20; β = 0.007, SE = 0.07, t = 0.11). The stress/group interaction was significant only for z-scored syllable duration (LRT χ2(1) = 4.46, p = .03; β = −0.0091, SE = 0.005, t = −1.77). Since none of the other three duration measures (absolute syllable duration, and absolute/z-scored vowel duration) were significant this effect will not be further considered. There was, however, a main effect of stress on vowel duration (z-scored and absolute), although the predicted difference involves ‘stressed’ vowels being 3 ms shorter than ‘unstressed’ ones (LRT χ2(1) = 4.70, p = .03; β = −0.003, SE = 0.0017, t = −2.18). This change is not in the expected direction, but more importantly a change of 3 ms on an average vowel duration of around 77 ms does not reflect a meaningful change.
Figure 3 shows the distribution of absolute vowel duration, for each syllable separately and pooled across the groups. It does appear that differences exist between stressed and unstressed tokens of the syllables si and su. An explanation for this observation might be found in the segmental make-up of the word pairs involved. In the target words in which these syllables are stressed (sira and sura), there is a high vowel followed by [r] or [rˤ]. The rhotic considerably affected the preceding vowel formant structure and resulted in longer duration than in the unstressed counterparts sinat and Sudan, whose initial vowels consisted of a steady state only (note that the other target syllables followed by /r/, i.e. mu and ma in murra [mərːa] and marra [mərˤːa], respectively, are much shorter, but the target vowel in these cases is central rather than high). Lengthening of high vowels preceding rhotics is also observed in other languages, including Dutch (Rietveld, Kerkhoff & Gussenhoven Reference Rietveld, Kerkhoff and Gussenhoven2004).
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211118100148327-0482:S002510032000002X:S002510032000002X_fig3.png?pub-status=live)
Figure 3 Duration of vowels as a function of presumed stress status. Lines link productions by the same speaker, large dots and triangles represent means.
In short, both statistics and the above observations suggest that there is no evidence to support an interpretation in terms of stress-induced vowel or syllable lengthening in MA across the board. To confirm that there are also no individual speakers who produce consistent durational enhancement of stressed vowels, Figure 4 shows speaker-specific behaviour (tokens of si and su are removed). Unstressed and stressed tokens of the same syllable are connected by lines (e.g. unstressed mu in mukat and stressed mu in muka). The varying direction of these lines for almost all speakers, and the large overlap between the presumed stressed and unstressed categories within each speaker, indicate that speakers did not systematically differentiate between stressed and unstressed vowels in terms of duration.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211118100148327-0482:S002510032000002X:S002510032000002X_fig4.png?pub-status=live)
Figure 4 Duration of vowels in matched pairs of stressed/unstressed syllables, per speaker, tokens of si and su excluded.
4.3 Spectral Centre of Gravity
The mean spectral Centre of Gravity for all items is shown in Figure 5, with matched pairs of vowels (productions by the same speaker) connected by lines (N = 309, four outliers with values above 1000 Hz were removed, and four could not be determined).
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211118100148327-0482:S002510032000002X:S002510032000002X_fig5.png?pub-status=live)
Figure 5 Spectral Centre of Gravity (absolute values) as a function of presumed stress status. Lines link productions by the same speaker, large dots and triangles represent means.
Firstly, there was no interaction between stress and group (LRT χ2(1) = 0.73, p = .39; β = −17.73, SE = 20.79, t = −0.85), indicating that we did not find any production differences between monolingual and bilingual speakers. There was, however, as expected, an interaction between stress and syllable (LRT χ2(6) = 18.253, p = .006), meaning that the effect of stress status was significantly different for the different vowel pairs. Posthoc multiple comparisons on this interaction revealed that the only significant differences as a function of stress occur with the syllables si and su, whose ‘stressed’ CoGs are higher than their ‘unstressed’ counterparts by an estimated 112 Hz (SE = 28.86, df = 293.09, t = 3.87, p = .0001) and 120 Hz (SE = 27.18, df = 284.35, t = 4.41, p < .0001), respectively. These are considerable differences, but might be in part explained by appealing to vowel quality differences which in turn might be the result of coarticulation and/or pharyngealisation (the sound following the ‘stressed’ vowels was a rhotic in both cases, see also Table 1). We will return to this finding in the next section on vowel quality.
In sum, for five out of seven syllable pairs tested here there was no evidence that either speaker group or presumed stress status had any effect on the distribution of the spectral Centre of Gravity.
4.4 Vowel quality
Vowel quality was measured by F1 and F2. For N = 14 items the formants could not be determined, so that analysis was performed on N = 303. Given the aforementioned observations about formant transitions preceding the rhotic in target words sira and sura, formant measurements for these particular words are reported for the midpoint of the initial, steady-state part of the vowel.
As for the CoG, if there is an effect of presumed stress status, it is expected to affect different vowels differently, in the sense that stress might result in more extreme realisation in a relevant articulatory dimension (fronting for /i/ as opposed to backing for /u/, for example). In the following, results are shown for Lobanov-normalised values. Firstly, matching previous results, there was no interaction between stress and group for either F1 or F2 (F1: LRT χ2(1) = 0.06, p = .8; β = −0.03, SE = 0.10, t = −0.25, and F2: LRT χ2(1) = 1.66, p = .20; β = 0.06, SE = 0.05, t = 1.29).
Moving on to the effect of stress on individual vowel pairs, Figure 6 shows the stressed and unstressed vowels within a Lobanov-normalised vowel space. A first observation is that stressed vowels are generally not realised in the more extreme regions of the vowel space. There is a high degree of overlap in the distribution of most of the matched pairs of stressed/unstressed vowels: (i) sada~sadat, (ii) baʃar~baʃart, (iii) marra~marrart, and (iv) muka~mukat. The only clear differences between matched vowels occur in the syllables si and su, but those differences do not reflect hyperarticulation under stress. Stressed si appears to be less fronted than its unstressed counterpart, and stressed su appears to be less high. The differences within both vowel pairs are confirmed statistically by posthoc multiple comparisons: si has a lower F2 when stressed (β = −0.56, SE = 0.07, df = 295.37, t = −8.42, p < .0001), while su has both a higher F1 when stressed (β = 1.07, SE = 0.14, df = 303, t = 7.90, p < .0001) and a lower F2 (β = −0.49, SE = 0.07, df = 280.67, t = −7.33, p < .0001). While an effort was made to measure vowel quality in the steady-state portion of the vowel, which presumably is less prone to anticipatory coarticulation with the following rhotic than the exact midpoint, it is likely that the attested differences are the result of differing neighbouring segments rather than stress per se. We will return to this finding below.
A second observation concerns the overlap in the distribution of vowels in murra~murrin and marra~marrart. We impressionistically transcribed both these vowels as [ə] in Table 1. The attested overlap in the vowel space confirms this similarity, despite the presence of vowel diacritics in the Arabic stimuli that should have made the distinction clear.
There were two remaining statistical differences: F2 was different between murra and murrin (lower F2 when the syllable is stressed, β = −0.25, SE = 0.06, df = 277.43, t = −3.91, p < .0001). We have no meaningful explanation for this effect and given its small magnitude (see overall distributional overlap in Figure 6) we will not discuss it further. A final observation concerns the differences between stressed and unstressed sa, with stressed sa having higher F1 (β = 0.33, SE = 0.13, df = 303, t = 2.43, p = .02).
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211118100148327-0482:S002510032000002X:S002510032000002X_fig6.png?pub-status=live)
Figure 6 Mean formant values (Lobanov-normalised) for N = 303 target vowels, ellipses indicate 1 SD.
In the following, we return to the significant differences that appeared to occur as a function of stress status and required some further explanation. Figure 7 presents an overview and a closer examination of the four vowel pairs that apparently differed as a function of stress: F1 differed for su and sa, F2 in si, su and mu(rː). Judging from the individuals’ patterns, the only consistent differentiation happens with stressed and unstressed vowels in su (sura~Sudan). This, incidentally, is also a vowel pairing that involved a non-identical segmental environment, where sura was realised with both consonants pharyngealised. The effect on a vowel of a pharyngealised consonant in its vicinity is F1 raising and F2 lowering (e.g. Al-Tamimi Reference Al-Tamimi2017). This is exactly what is observed for the vowel in sura (stressed) compared to the vowel in Sudan (unstressed). As was observed previously, su had a higher spectral CoG under stress, which can now be explained by its higher F1 under stress.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211118100148327-0482:S002510032000002X:S002510032000002X_fig7.png?pub-status=live)
Figure 7 All individual tokens of target vowels for the four syllables mu(rː), sa, si and su with lines linking individual speaker’s stressed/unstressed renditions of the same vowel. Lobanov-normalised values.
The differences in the other three syllables seem to be of a different kind. In murra~murrin and sira~sinat most of the effect seems to be carried by a small number of speakers (the few longer lines). For sira~sinat, moreover, anticipatory coarticulation with the following rhotic in sira might be the cause of the ‘stress’ effect. Despite the posthoc significant difference, the vowels in the words sada~sadat exhibit a great deal of overlap, suggesting that no robust differentiation is made by most individual speakers. Additionally, in murra~murrin, where stressed vowels are somewhat more peripheral, the ‘stress’ effect concerns a front/backness distinction, which might be explained away by coarticulation with the following vowel. The unstressed target vowel in murrin is followed by a high front vowel, while the stressed target vowel in murra is followed by a low vowel, so that in the former (unstressed) case, we might have already expected to see some coarticulatory fronting.
In sum, for three out of seven syllable pairs there were no F1 and F2 differences between stressed and unstressed vowels. For the other four pairs, we argued that any apparent effects of stress on formant values could be explained by factors unrelated to stress, including pharyngealisation, coarticulation effects and/or speaker-specific behaviour. These results together do not provide evidence to support the hypothesis that vowel quality reliably distinguishes between ‘stressed’ and ‘unstressed’ positions, or indeed that individual speakers have ‘stressed’ and ‘unstressed’ realisational categories for vowels.
5 Discussion
For none of the acoustic correlates measured in this experiment (f0, duration, spectral CoG and vowel quality) were there convincing differences between presumed stressed and unstressed syllables, nor consistent patterns across speakers or speaker groups. No acoustic parameter was used consistently across syllable pairs to mark the distinction between stressed and unstressed syllables, and no syllable was consistently enhanced by multiple acoustic parameters. Additionally, in no case was there a meaningful difference between the speaker groups, which means this study provides no evidence that speakers who have Tashlhiyt as a first language in addition to Moroccan Arabic produce different lexical prominence patterns.
In order to conclude that stressed syllables stand out acoustically, there would have to be consistent differences across the board. A lexicon-wide effect in terms of acoustic enhancement of stress would require stressed syllable members to stand out from unstressed ones in most if not all words. In the present experiment, the only differences that were near-consistent were found for the syllables si and su, which appeared to have different duration and were subject to a CoG effect. Vowel quality in these syllable pairs was also different between stressed and unstressed members, but not in the expected direction (i.e. a more peripheral realisation under stress). The attested quality difference could be explained in terms of coarticulation rather than stress status, and it was argued that differences for other acoustic parameters are best interpreted as parasitic on this vowel difference.
Additionally, for the differentiation of stressed and unstressed syllables to be robust, individual speakers would be expected to systematically produce this distinction. In this experiment, different speakers did not make a consistent distinction: they did not use a combination of cues to enhance stressed syllables, nor did they consistently use any single cue (with the exception of those acoustic differences that could be explained in terms of an effect of neighbouring segments). This suggests that speakers do not produce two acoustically distinct categories ‘stressed’ and ‘unstressed’.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211118100148327-0482:S002510032000002X:S002510032000002X_fig8.png?pub-status=live)
Figure 8 Spectral Centre of Gravity, duration, F1 and F2 values for /a/ in initial position (sada) compared with final position (muka, sada and sira) across all tokens produced without pausing (N = 51). Large triangles reflect means.
While the present findings can thus not be considered to provide evidence in favour of the existence of (acoustic correlates of) lexical stress in MA, they also cannot simply be taken to provide evidence against it. On the one hand, this is a problem inherent to null results, as detailed earlier. A further complication is that even if MA has no lexical stress according to the rule tested here (stress the final syllable when it is heavy and the penultimate otherwise), lexical stress could still, potentially, be captured by appealing to another stress rule. In order to test different predictions, a different experimental setup would be needed. For example, when testing the claim that stress is final (e.g. Watson 2011), stimuli should contrast the same syllables in final and non-final positions. Although the present experiment was not designed to test for fixed final stress, the present data in fact allow for a quick comparison along these lines, contrasting the vowel /a/ in CV syllables in different positions, specifically in word-initial position in sada, versus in word-final position in sada, muka, and sira. If stress were consistently final, the vowel in da, ka and ra would be expected to be enhanced, or at least differentiated from initial sa. Figure 8 shows the distribution of measurements for /a/ in our dataset, comparing its spectral Centre of Gravity, duration, F1 and F2 in these syllables. This set (N = 51) excludes those syllables that were followed by pauses, to avoid the confound of phrasally induced lengthening effects that would likely result in disproportionate enhancement of final syllables. Potential side-effects caused by different segmental make-up of the onset (even if not measured) cannot be excluded: /s/ is used in initial position where /d k r/ occur as onsets in final position.
On the whole, however, given these very basic results, it does not appear that vowels in word-final position are systematically differentiated from vowels in initial position.
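The positional comparison described above (grouping /a/ tokens by word position after excluding tokens followed by a pause) could be sketched as follows. This is a minimal illustration only and not the analysis script used in the study: the record layout, the field names (`position`, `followed_by_pause`, `duration_ms`) and the function `mean_by_position` are assumptions, and the numeric values are invented placeholders rather than measurements from the experiment.

```python
from statistics import mean

# Hypothetical token records. The values are made-up placeholders,
# NOT measurements from the study.
tokens = [
    {"word": "sada", "position": "initial", "followed_by_pause": False, "duration_ms": 80.0},
    {"word": "sada", "position": "initial", "followed_by_pause": False, "duration_ms": 90.0},
    {"word": "muka", "position": "final", "followed_by_pause": False, "duration_ms": 75.0},
    {"word": "sira", "position": "final", "followed_by_pause": False, "duration_ms": 85.0},
    # A pre-pausal token: excluded to avoid phrase-final lengthening.
    {"word": "sada", "position": "final", "followed_by_pause": True, "duration_ms": 140.0},
]


def mean_by_position(records, measure):
    """Return the mean of `measure` per word position,
    skipping every token that was followed by a pause."""
    groups = {}
    for rec in records:
        if rec["followed_by_pause"]:
            continue  # drop tokens subject to phrasal lengthening
        groups.setdefault(rec["position"], []).append(rec[measure])
    return {pos: mean(values) for pos, values in groups.items()}


means = mean_by_position(tokens, "duration_ms")
for pos, value in means.items():
    print(pos, value)
```

The same function could be applied to any of the other measures (spectral Centre of Gravity, F1, F2) by passing a different field name, provided the records carry that field. Group means of this kind only summarise the distributions; they do not control for the onset differences noted above.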
To return to the question of what, if any, stress generalisation might hold in MA, currently there is no concrete evidence from any diagnostic of stress (acoustic correlates, native speaker judgments, phonological rules, etc.) that supports an interpretation in favour of MA having a stress-by-position system, or in fact in favour of any other interpretation of stress assignment.
Thus, the present null results are likely to accurately reflect a situation in which MA lacks lexical stress altogether. Such an interpretation is compatible with prior work, including the general lack of success in identifying the position of lexical stress, and the apparent absence of other exponents of stress in the language. Firstly, this lack of success in diagnosing stress concretely involves native speakers' varied judgments on stress position in MA and the century-long disagreement among scholars on the proper representation of stress. Both of these types of evidence highlight that the concept of stress in MA is an elusive notion, which is a strong indication that it might not play much of a role in the phonology of the language. Secondly, the apparent absence of other exponents of stress has to do with the intonational phonology of MA and the generally held assumption that stressed syllables serve as docking sites for postlexical pitch accents. As reviewed in Section 2.4.3, all experimental studies to date on the alignment of prominence-lending intonational events in MA suggested that similar movements in MA are either absent or difficult to characterise as pitch accents. A different argument to support an interpretation along the lines of the absence of lexical stress in MA comes from one of its contact languages, Tashlhiyt Berber. This variety of Berber specifically is considered to lack lexical stress (see Section 2.2), although other, closely related varieties of Berber spoken in Morocco might also lack it (Kossmann 2012). It is well known that the segmental phonology of MA exhibits features that can be traced back to prolonged contact with Berber (e.g. Heath 1997, Dell & Elmedlaoui 2002, Maas & Procházka 2012; see also Zellou 2010).
It is conceivable therefore that not only segmental phonological structure, but prominence structure too has been influenced by contact with the Berber languages, supporting the possibility that stress is absent in both Moroccan Arabic and Moroccan Berber.
In order to claim convincingly that Moroccan Arabic lacks lexical stress, in the sense of designating one syllable in each word as being marked by culminative prominence, further evidence will be needed. The absence of acoustic enhancement of presumed stressed syllables, as shown in the present experiment, is but one of several diagnostics that might serve to support the claim that stress is absent. Nevertheless, most if not all experimental evidence available to date is compatible with the absence of stress in MA.
6 Conclusion
The experiment reported in this paper investigated acoustic correlates of lexical stress in Moroccan Arabic by contrasting presumed stressed syllables with unstressed ones. According to the view tested, the penultimate syllable of a word is stressed, unless the final syllable is heavy, in which case stress is final (Benkirane 1998, Boudlal 2001).
The results from the present experiment could not provide evidence in favour of acoustic enhancement of ‘stressed’ syllables according to this rule. Acoustic properties of stressed syllables – f0, duration, spectral Centre of Gravity or vowel quality – did not consistently or meaningfully differ from those found in unstressed counterparts. This lack of consistent enhancement was observed across the board. Specifically, effects were absent for most syllable comparisons (suggesting that the effect does not hold across the lexicon), and, where these could have been expected, systematic effects were also absent for individual speakers (suggesting that speakers do not produce categorical differences between stressed and unstressed syllables).
While these null findings do not provide conclusive evidence in favour of the absence of lexical stress in MA, they are very much compatible with it. The present results are also compatible with earlier claims and experimental results suggesting that lexical stress does not play a large role, if any, in the phonology of the language.
Acknowledgements
The authors would like to thank Basma Chlaihani for help with the recordings and the Department of English Language and Literature at Casablanca’s Hassan II University (Ain Chock Faculty of Letters and Humanities) for generously letting us record, as well as all participating students for their time and patience. We thank three anonymous reviewers and the editors for their helpful comments. Funds for this research include a DAAD Ph.D. scholarship to the first author and an ESRC grant supporting the IVAr project (ES_I010106).