Introduction
This study is a preliminary investigation of word order changes in Arabic dialects spoken on the Iranian side of the Persian Gulf in the southern provinces of Bushehr and Hormozgan. It is part of a larger project to document and describe the Arabic varieties of these two provinces, which have not yet received much scholarly attention. Apart from some anecdotal observations, mainly on phonological features,Footnote 1 there is no published work on these dialects. Currently, a first description of the basic phonological, lexical and morphosyntactic traits of the Arabic dialect in Bandar-e MoqamFootnote 2 based on elicited data and questionnaires designed for the large project “Atlas of the Languages of Iran”Footnote 3 is underway. Thus, our knowledge of the dialects spoken in that area is still very limited. Judging from the history of the Arab settlements in the regionFootnote 4 and the preliminary analysis of the data we have already gathered in a first field trip and the data from Bandar-e Moqam, it seems safe to say that the South Iranian Arabic (SIA) dialects are closely related to the Gulf dialects of the Arabian peninsula. Our project, which is a collaboration of Arabic Language Studies and General Linguistics, aims not only to provide a detailed description of the Arabic varieties in this part of Iran from a dialectological and sociolinguistic viewpoint, but also to identify the wealth of language contact phenomena in these dialects, which are spoken in close contact with Persian as the dominant language.Footnote 5 The sociolinguistic situation seems to differ between villages with a higher percentage of Arab population and those where the use of Arabic has already receded.
Among the four sites (see Figure 1) we have chosen for the present study, Nakhl-e Taqi (Asaluyeh County, Bushehr) and Kish (Hormozgan) exhibit a larger proportion of Arabic speakers than Bandar-e Kangan and Jazire-ye Shomali (Bushehr).Footnote 6 However, Persian generally seems to compete with Arabic as a language of communication between family members and friends in the entire region. Code-switching is commonplace between ethnic Arabs, and although the children we interviewed were still fluent speakers of Arabic, their dominant language seems to be Persian.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211210101412786-0863:S0021086200040482:S0021086200040482_f0001.png?pub-status=live)
Figure 1. Four fieldwork sites: Jazire-ye Shomali, Kangan and Nakhl-e Taqi in the province of Bushehr and the island of Kish in the province of Hormozgan.
Source: Google Maps, 2017.
We already have ample evidence for contact phenomena in our small database, including the expected massive lexical borrowing from Persian. Examples of phonological phenomenaFootnote 7 are the loss of the interdentals, which have mostly shifted to sibilants (e.g. misil Footnote 8 “like” < miṯil), and a back quality of the long a-vowel as [ɒ] for some speakers. Morphosyntactic examples are the frequent coinage of Persian-style light verb constructions such as maǧbūr ṣirna “we were forced” with the light verb ṣār “become,” but without number agreement in the participle (i.e. maǧbūr (sg) instead of maǧbūrīn (pl)), as also observed for Khuzestani Arabic,Footnote 9 and the use of ke as a complementizer. Precursors of a general loss of gender distinction also appear in the speech of some speakers, such as missing agreement with feminine subjects, as in immi mā ḫallāni … yibki “my mother didn’t let me … (s)he used to weep” instead of ḫallatni “she let me” and tibki “she used to weep.” Finally, and this is the topic of this paper, we find the beginnings of a contact-induced shift in word order from Semitic verb-initial to Persian verb-final.
Prior studies on word order shift in Arabic dialects in contact with Persian
Arabic-Persian language contact goes back to pre-Islamic times, at least to the Sassanid period. The amount of influence exerted by one language on the other has varied through history according to which of the two cultures and languages was dominant.Footnote 10 The strongest influence has been on the lexicon, which has received most attention in the literature.Footnote 11 Structural or pattern borrowing Footnote 12 in Arabic-Persian language contact has been less extensively investigated than matter borrowing. Structural convergence phenomena have been described for Khuzestani Arabic, among them a beginning shift to OV word order. In Khuzestani Arabic, verb-final structures seem to be restricted to copulae and auxiliaries, specifically after an active participle, as shown in example (1).Footnote 13 In fact, auxiliaries seem to be more prone to word order restructuringFootnote 14 than lexical verbs, and clause-final copulae and auxiliaries are one of the areal features identified for western Asia.Footnote 15
(1a) Khuzestani Arabic
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211210101412786-0863:S0021086200040482:S0021086200040482_tab1.png?pub-status=live)
(1b) Persian
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211210101412786-0863:S0021086200040482:S0021086200040482_tab2.png?pub-status=live)
The other type of supposedly restructured constituent order noted by Matras and Shabibi involves a pre-verbal object with clitic doubling on the clause-final verb, as in example (2).Footnote 16
(2) Khuzestani Arabic
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211210101412786-0863:S0021086200040482:S0021086200040482_tab3.png?pub-status=live)
Although clitic-left dislocation is a very common topicalization construction in all Arabic varieties,Footnote 17 Matras and Shabibi emphasize that these cases do not involve topicalization. However, Arabic varieties typically allow more than one clitic-left dislocated noun phrase (NP)Footnote 18 and multiple topics.Footnote 19 An analysis of the clitic in terms of agreement with the object was offered by Ratcliffe for equivalent structures in Bukhara Arabic.Footnote 20 Benmamoun presents morphological, syntactic, and phonological arguments that in Arabic varieties (Standard, Egyptian, Jordanian and Moroccan Arabic), non-subject clitics are weak pronouns, as opposed to subject marking on the verb, which is analyzed as a genuine case of agreement.Footnote 21
In any case, the Khuzestani Arabic data suggests the beginnings of a shift in word order at best. Leitner reports the same instances in her data and explicitly notes that apparent instances of OV with a lexical verb always involve a resumptive enclitic pronoun.Footnote 22 Similarly, Arabic dialects in Anatolia exhibit an enclitic copula despite the retention of the basic VO order.Footnote 23 Similar facts are found in the dialects east of the Tigris in Northern Iraq,Footnote 24 when it is not inherited from Arabic.
In Bukhara Arabic, however, the shift seems to have been completed.Footnote 25 Ratcliffe notes that these dialects—although conservative as far as phonology and lexicon are concerned—exhibit a word order that radically differs from non-peripheral Arabic varieties and resembles that of the neighboring languages, Uzbek (Turkic) and Tadjik (Iranian), specifically in their syntax.Footnote 26 Additionally, these dialects have acquired relative clause–noun order on the model of the contact languages and use it alongside the inherited noun–genitive and noun–adjective word order.
To the best of our knowledge, there is no study on the word order in the varieties spoken in Khorasan, which were preliminarily described by Ulrich Seeger. Seeger lists a number of phonological, morphological and syntactic peculiarities of these dialects and compares them to those found in the Central Asian varieties of Uzbekistan and Afghanistan.Footnote 27 Seeger does not comment on syntactic phenomena. However, judging from the texts he published, the unmarked word order in Khorasani Arabic, just like in the other Central Asian varieties, is clearly OV. There is a pronominal clause-final copula as in the qǝltu dialects of Anatolia and Northern Iraq.Footnote 28 And the handful of VO clausesFootnote 29 with a lexical object that occur may be explained in line with the factors mentioned in the following sections.
Other factors affecting word order
Word order is affected by a number of semantic, syntactic and pragmatic factors. We will discuss three such factors that have been observed to affect word order in the languages of the world in general and in Arabic in particular, specifically in the Gulf dialects that presumably are the varieties most closely related to the SIA dialects.
Constituent order seems to be influenced by the semantic type of the constituents involved. In their recent volume on Western Asian languages, Haig and Khan state that, in general, the inherited order of O and V in a certain language remains stable.Footnote 30 There are, however, some cases of radical change found in Northeastern Neo-Aramaic (NENA) dialects, particularly the trans-Zab Jewish dialects of northern Iraq, and less radical changes in neo-Mandaic dialects.Footnote 31 Evidence from western Asia thus suggests that OV is largely retained if it was the original constituent order of the language and may be the outcome of contact-induced change, whereas the opposite direction of change is not attested.Footnote 32 These facts only pertain to lexical noun phrases; however, pronominal objects are necessarily placed post-verbally, as for example in Kumzari,Footnote 33 a mixed language with western Iranian and Semitic roots. While lexical objects are obviously prone to a change from VO to OV, goals seem to resist such a change.
According to Haig, goals of motion verbs and verbs of caused motion mostly follow the verb in the languages of Western Asia, whether the basic word order is VO or OV.Footnote 34 A post-verbal goal argument is the statistically preferred option in colloquial Persian.Footnote 35 Haig also mentions other argument types such as “recipients, addressees of ‘tell’ and final states of change-of-state predicates” that are placed after the verb in a northern Kurdish dialect despite its basic OV order.Footnote 36 Finally, as already noted above, the clause-final position of copulae is a prominent areal feature of western Asian languages, whether the feature be inherited or acquired through contact, and copular constructions seem to be the syntactic structure that is most readily borrowed.Footnote 37 In sum, judging from the descriptions of western Asian languages collected by Haig and Khan, argument and predicate type seem to exert a major influence on word order and word-order change.
In Classical Arabic, definiteness is a strict grammatical constraint on the order of an indefinite noun phrase and a predicate that consists of a prepositional phrase. While subject-predicate order as in zaydun fī l-masǧidi (Z. in the-mosque) “Zayd is in the mosque” is the unmarked option, an indefinite noun phrase obligatorily induces locative inversion: fī d-dāri mra’atun (in the-house woman) “a woman is in the house.”Footnote 38 Definite noun phrases typically refer to identifiable referents and, as it is cognitively more plausible to start a message with an identifiable referent than with an unidentifiable one, the above order is also the unmarked one in modern spoken Arabic varieties (see below).
Definite marking and identifiability are intimately related to information structure, specifically the cognitive accessibility of referents.Footnote 39 Information structure is another major factor that we believe to influence sentence form in every language. The linear order of constituents is an important aspect of the interaction between information structure and syntax.Footnote 40 We point out three aspects of information structure that influence constituent order: information status, theme-rheme partitionFootnote 41 and focus. The information status of discourse referents, that is whether they are given or new, is not only related to their identifiability, but additionally to their activation state,Footnote 42 which in turn largely determines definiteness marking and referential choice. There is a universal tendency to place given before new information in the natural flow of discourse and a tendency to express only “one new idea” per information unitFootnote 43—at least in spoken discourse. An independent but related principle is the theme-rheme partition of sentences, in the sense that the theme or topic of a sentence (or what the sentence is about) is mostly given. These two principles predict that (i) the theme will usually precede the rheme. The fact that subjects are prototypical topics in turn predicts that (ii) the object of an SVO or SOV clause will typically constitute the rheme together with the verb and will usually carry the sentence accent. Thus, the object argument also typically constitutes the focusFootnote 44 of a clause. If the focus is the only new information in a clause, which is usually called narrow focus, some languages use a specific focus position in the clause or a marked constituent order. 
It has been shown cross-linguistically that focus tends to occur in the peripheries of a clause (or intonation phrase, for that matter) coinciding with the prosodic nucleus,Footnote 45 frequently inducing deviations from unmarked word order such as focus fronting.Footnote 46 Some languages even seem to have a dedicated syntactic focus position, an example being the pre-verbal slot in Hungarian.Footnote 47 As the unmarked word order in Hungarian is verb-subject-object (VSO), the pre-verbal position is a deviation from unmarked word order; it additionally induces the typical marked “eradicating” prosody, that is, a low flat intonation contour after the focus with no noteworthy prominences.Footnote 48 In some languages, nouns are more readily accented than verbs, or in other terms, verbs constitute a prosodic phrase together with an adjacent nominal element, usually the object of the clause,Footnote 49 whereas other languages do not make such a distinction between word categories.Footnote 50 Standard Persian is a language of the former type and has been described as assigning the nucleus to the final prosodic phrase of the clause.Footnote 51 In Persian, the distinction between specific/definite and non-specific/generic object determines whether object and verb constitute one phrase or two separate phrases: (ketáb xund)ϕ “s/he read a book” and (ketabo)ϕ (xúnd)ϕ “s/he read the book” with the nuclear accent on the verb.Footnote 52 In the pre-verbal domain, every constituent can be narrowly focused and realized solely by prosodic prominence.Footnote 53
Holes,Footnote 54 following Ingham’s proposal for Najdi Arabic,Footnote 55 identifies two basic sentence types for Gulf Arabic: the uninodal and the binodal sentence. The distinction is in fact one of information structure. In the light of the system described above, we may conceive of uninodal sentences as fully rhematic and of binodal sentences as expressing a theme-rheme partition. In what follows, we use the current terminology of general linguisticsFootnote 56 to characterize the sentence types. The rhematic sentence is typically mapped to a single intonation phrase and exhibits VSO order, which is the unmarked word order in Gulf Arabic for narrative predications. When the sentence is noun-initial, the first NP is usually unspecific and frequently indefinite, i.e. not a topic (or theme). Further variants of the rhematic sentence type are existential phrases with a preposed prepositional phrase functioning as a “dummy verb”Footnote 57 as in cid-na bagar (at-1pl cows) “we have cows.”Footnote 58 An antitopic,Footnote 59 that is, a post-posed topic, may occur as the “tail” after the rheme as in ǝmḥaḍrat-inn-ǝh n-naḫīl (shield.ap-f-inn-3m.sg def-palm tree) “the palm trees shield it.”Footnote 60 By contrast, the hallmark of a theme-rheme sentence is its bipartiteness. It usually involves two intonation phrases, a rising one (theme) and a falling one (rheme). Holes also notes the occurrence of “marked” word order in verbless sentences to indicate focus as, for instance, in galīl hāḏi s-sawālif (few dem.f def-things) “Such things were RARE.”Footnote 61
Hypotheses, Speakers, Data and Annotation
Given the small database, our study can only be viewed as a first approximation to the topic of word order in the SIA dialects. Due to the stronger influence of Persian in most regions than in Khuzestan, which is still more profoundly Arab, we expect to find more XV structures in our data than in Khuzestan. At the same time, it might turn out that there are differences between our speakers that reflect their socio-linguistic situation, which will be briefly described in the next paragraph and according to which the speakers in Kish and perhaps Nakhl-e Taqi should be more conservative than in Jazire-ye Shomali. On the other hand, age may likewise play a decisive role, predicting a stronger Persian influence in the speech of younger people than in the speech of older ones.
The study is based on four short conversations of varying length recorded at the four sites mentioned above (see Figure 1) in March 2017. Although the speech data was obtained from six different speakers, we will only refer to the four sites in our analysis because the number of tokens from two speakers (one in Nakhl-e Taqi and one in Kish) is far too small to permit reliable observations. The first text is a short interview with an old fisherman in Jazire-ye Shomali (HJM), who reports that his family is originally from Kangan. When he was young he lived in Kuwait for a while, but couldn’t find work and came back to work as a fisherman. The second text is an interview with an elderly female speaker in Kangan (MKF) who had been raised and lived all her life in the same place. Although her Arabic is fluent, she considers Persian to be her dominant language. It is also the language of intimate situations. For instance, the words of the lullabies and wedding songs she sang were Persian. While these two interviews produced longer monologic texts answering our questions, the other two recordings, specifically the one from Kish, are more dialogic. The interview in Nakhl-e Taqi was with an elderly male speaker (INM); his son was also present during the recording. The father, who had worked as a pearl diver in his youth but switched to trade after the revolution, is the main speaker of our recordings, but the son occasionally contributes to the conversation. These speakers seem to exhibit a stronger Arab identity and the father apparently only became a fluent speaker of Persian as a young soldier in the army. The fourth text is a conversation between a middle-aged and a fairly young male Arab in Kish (JKM). One of the speakers (the older one) is the main narrator and the other one mainly asks questions about the history of a museum they had established to preserve local traditions, which was also the recording site. 
The speakers are either illiterate (HJM and MKF) or have only reached some basic level of education. In this study, we are not yet able to draw any conclusions about the effect of the social factors due to the small number of speakers and amount of data we have. The whole database comprises 905 intonation units. We made 505 annotations of all verbal complements, adverbial phrases, modifiers and nominal predicates of copular clauses and their occurrences before the verb (XV) or after the verb (VX). Based on the factors outlined in the introduction, we annotated syntactic functions, semantic roles, definiteness, information status, word class and verbal aspect. Then we labeled each of the target items according to whether it occurred before or after the verb. For the frequency counts, which are presented in the following sections, we collapsed some of the labels. This simplified annotation is presented in Table 1. The annotation labels that are not listed here were, however, taken into consideration for the analysis.
Table 1. Simplified annotation scheme for the data
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211210101412786-0863:S0021086200040482:S0021086200040482_tab4.png?pub-status=live)
Syntactic functions. Object refers to direct objects, whether lexical NPs or pronouns; the latter always occurred as suffixes. Oblique comprises all other arguments (indirect objects and oblique objects). Because it is in some cases difficult to decide whether a prepositional phrase is an oblique object or an adjunct, we did not differentiate between these in the simplified annotation. Complement is a miscellaneous category comprising heterogeneous constituents like the nominal predicate of a copular clause with the verb kān/čān “be” or change-of-state verbs such as ṣār/istiwa “become,” the complement of the pseudo-verb of a possessive construction like cind-Suffix “at-person” and of an existential marker, which in the present data was always fīh (EXT), adverbs and the like.
Semantic annotation. Undergoer stands for a semantic macrorole that comprises patient, theme and stimulus in our data. Goal refers to complements of verbs of motion or caused motion, source refers to the origin of movement. Other roles, among which beneficiary/recipient was the most frequent, were grouped together under the heading other. Locative arguments like “in Iran” in “I live in Iran” and locative adjuncts were labeled as locative. Constituents labeled as temporal express the duration or point of time of an event. Modifiers were all types of adverbials that modified the semantics of the predicate.
Verbs. The labels for the verbs are relevant for the investigation of the internal order of the two components auxiliary and lexical verb in periphrastic constructions.
Information status. Following Chafe, three labels were chosen to describe information status.Footnote 62 Given indicates that the referent was mentioned in the previous sense unit; accessible is used for any prior mention as well as for very easily inferable referents such as hyperonyms from hyponyms. Every other concept that was mentioned for the first time was labeled as new. Information structural categories such as theme, rheme and focus are hard to annotate and in our opinion can only be interpreted. To keep the annotation operationalizable, we decided only to annotate information status, but we will come back to the other information structural categories in the course of the interpretation of our results.
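The three-way labelling rule for information status can be made concrete with a small sketch. The following Python fragment is only an illustration under simplifying assumptions: referents are represented as plain strings, each sense unit as a set of referents, and inferability (e.g. a hyperonym from a hyponym) is supplied by hand as a flag, since it cannot be read off a transcript mechanically. The function name `information_status` and the toy units are hypothetical and are not part of the project's annotation tooling.

```python
from typing import List, Set


def information_status(referent: str,
                       previous_units: List[Set[str]],
                       inferable: bool = False) -> str:
    """Return 'given', 'accessible' or 'new' for one referent.

    previous_units: referents mentioned in each earlier sense unit,
    oldest first (hypothetical representation).
    inferable: set by the annotator for easily inferable referents.
    """
    # Given: mentioned in the immediately preceding sense unit.
    if previous_units and referent in previous_units[-1]:
        return "given"
    # Accessible: any earlier mention, or judged easily inferable.
    if inferable or any(referent in unit for unit in previous_units[:-1]):
        return "accessible"
    # Everything else counts as a first mention.
    return "new"


# Toy discourse, invented for illustration only.
units = [{"fish"}, {"meat", "spices"}]
print(information_status("meat", units))    # given
print(information_status("fish", units))    # accessible
print(information_status("garlic", units))  # new
```

The flag-based treatment of inferability reflects the point made above that such judgments remain a matter of interpretation by the annotator.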
Results
Syntactic functions and semantic roles
As a first step, we counted pronominal objects, which amounted to 131 cases, and excluded them from further analysis as pronominal objects are necessarily post-verbal, being suffixed to the verb. This leaves us with only independent lexical NPs and adverbials for the frequency counts. Among these, a first count showed a preponderance of VX structures for all speakers. However, while the three speaker sites HJM, MKF and JKM exhibit XV in 35–40 percent of the cases, INM only has 11 percent XV structures. There are also speaker- or dialect-specific differences within the individual categories, which will be discussed in due course. The separate count of syntactic constituents (Figure 2) shows that the relationship between VX and XV is approximately the same for direct and oblique objects/adjuncts across speakers. In the complement category, the proportion is roughly equal. This is mainly due to the final position of the copulae and to a lesser extent to XV order in existential constructions (Figure 3), but also to some other factors that will be discussed in the semantic analysis of the data.
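The counting procedure described here amounts to a simple filter-and-tally step. The sketch below, in Python, assumes a hypothetical record type `Token` with fields for site, syntactic function, pronominal status and position; the four sample tokens are invented for illustration and do not reproduce the study's data or results.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class Token:
    site: str          # e.g. "HJM", "MKF", "INM", "JKM"
    function: str      # "object", "oblique" or "complement"
    pronominal: bool   # True for suffixed pronominal objects
    position: str      # "XV" (pre-verbal) or "VX" (post-verbal)


def xv_share_by_site(tokens: Iterable[Token]) -> Dict[str, float]:
    """Percentage of XV tokens per site, pronominal objects excluded."""
    totals: Counter = Counter()
    xv: Counter = Counter()
    for tok in tokens:
        if tok.function == "object" and tok.pronominal:
            continue  # necessarily post-verbal, so left out of the count
        totals[tok.site] += 1
        if tok.position == "XV":
            xv[tok.site] += 1
    return {site: 100.0 * xv[site] / n for site, n in totals.items()}


# Invented toy tokens, not the study's data.
sample = [
    Token("MKF", "object", False, "XV"),
    Token("MKF", "oblique", False, "VX"),
    Token("MKF", "object", True, "VX"),   # excluded as pronominal
    Token("INM", "complement", False, "VX"),
]
print(xv_share_by_site(sample))
```

Grouping on a different field (for instance `function` instead of `site`) would give the kind of per-category breakdown shown in the figures.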
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211210101412786-0863:S0021086200040482:S0021086200040482_f0002.png?pub-status=live)
Figure 2. Distribution of VX or XV order over syntactic constituents.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211210101412786-0863:S0021086200040482:S0021086200040482_f0003.png?pub-status=live)
Figure 3. Distribution of VX or XV order in copular, existential and possessive clauses.
Interestingly, the number of preposed direct objects differs among speakers. As expected from the general picture, speaker INM did not prepose objects at all. On the other hand, speaker MKF exhibited the highest number of preposed objects (30 percent). The most interesting observation concerns the semantics of the verbs that allow preposing of the object. Apart from two instances of (4), the only predicates with pre-verbal objects were semantically light verbs such as sawwa “make” or ḥaṭṭ “put” (3). Example (4) comes from the second fairly young speaker in Kish, whose speech shows a very strong Persian influence on all levels, as exemplified by the fricative reflex of /q/Footnote 63 and the missing glottal stop in the name of the Holy Book.
(3) MKF_151
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211210101412786-0863:S0021086200040482:S0021086200040482_tab5.png?pub-status=live)
(4) JKM_B_64
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211210101412786-0863:S0021086200040482:S0021086200040482_tab6.png?pub-status=live)
As the “oblique” category also contained a small number of adverbs (n = 19), we analyzed them separately from the prepositional phrases. The result shows that adverbs appear in pre-verbal position (approx. 30 percent of the cases) relatively more frequently, while prepositional phrases containing a lexical noun occur in pre-verbal position only in 13 percent of the items. This may be due to phonological weight, as according to Behaghel’s law phonologically heavier constituents tend to occur toward the end of a sentence.Footnote 64 But it may also be due to the high number of adverbs in the speech of one speaker (HJM), half of whose “obliques” exhibit an XV structure, regardless of whether they are adverbs or prepositional phrases.
Speaker-specific differences can also be observed regarding copula position. The diagram in Figure 3 suggests that a copula is preferentially realized clause-finally. This, however, blurs the distinction between an even stronger preference for two speakers/sites (JKM and MKF) and the fact that speaker INM again prefers the copula in initial position. In general, however, there seems to be a shift to clause-final position. Examples (5) and (6) show instances of a clause-final copula (5) and a clause-final instance of istiwa “become,” which is frequently used synonymously with kān/čān “to be” and preferred to the latter, when the verb is used in the imperfective.
(5) MKF_04
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211210101412786-0863:S0021086200040482:S0021086200040482_tab7.png?pub-status=live)
(6) MKF_17
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211210101412786-0863:S0021086200040482:S0021086200040482_tab8.png?pub-status=live)
The diagram in Figure 4 shows that the distribution of XV/VX structures is not uniform across speakers regarding possessive and existential constructions. Whereas JKM exhibits a high number of XV both in copular constructions and in the other two types (8), MKF uses more VX in the other two construction types (7) despite her clear preference for XV in copular clauses, and INM and HJM never prepose the prepositional phrase. In the next section we will argue that XV is still the marked option in this construction type and may be attributed to discourse considerations.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211210101412786-0863:S0021086200040482:S0021086200040482_f0004.png?pub-status=live)
Figure 4. Distribution of VX and XV structures in copular, existential and possessive constructions across the four fieldwork sites.
(7) MKF_04
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211210101412786-0863:S0021086200040482:S0021086200040482_tab9.png?pub-status=live)
(8) JKM_114
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211210101412786-0863:S0021086200040482:S0021086200040482_tab10.png?pub-status=live)
Finally, the internal order of an auxiliary (including the pseudo-verb for “have” and the existential) and the lexical verb (finite or non-finite) in periphrastic predicates also points to a shift to auxiliary-final order, as the proportion of the order lexical verb + auxiliary is 40 percent (see (17) for an example).
As this preliminary analysis shows, there seems to be a shift towards copula-final constructions, especially for those speakers whose dominant language is Persian (MKF) as well as young speakers.
Turning now to the analysis of semantic roles, our data is in accordance with the observation made in Haig and Khan concerning the preferred post-verbal position of recipients and specifically goals. We find no incidence of a motion verb with a preposed goal like a hypothetical ?giṭar riḥna (Qatar go:1pl.pfv) “We went to Qatar,” but only post-verbal occurrences after verbs of motion (9a,b).
(9a) HJM_02 (9b)
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211210101412786-0863:S0021086200040482:S0021086200040482_tab11.png?pub-status=live)
Recipients and goals also often appear with three-place predicates such as ḥaṭṭ “put” or čabb “pour.” In these cases, the direct object sometimes, if rarely, appears in front of the verb (10). It is, however, more common that both constituents follow the verb. By contrast, a reference to the source, which is usually not an obligatory complement, typically precedes the verb. Example (11) shows a pre-verbal topicalized objectFootnote 65 and a pre-verbal source with the recipient expressed by a pronominal suffix post-verbally which is co-referent with the topic.
(10) MKF_42
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211210101412786-0863:S0021086200040482:S0021086200040482_tab12.png?pub-status=live)
(11) MKF_159
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211210101412786-0863:S0021086200040482:S0021086200040482_tab13.png?pub-status=live)
The diagram in Figure 5 also points to a certain tendency to prepose modifying expressions. Most instances of these were again from speaker MKF (see 12); (13) gives a rare instance from the younger speaker in Nakhl-e Taqi, who also uses final copulae, unlike his father.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211210101412786-0863:S0021086200040482:S0021086200040482_f0005.png?pub-status=live)
Figure 5. Distribution of XV/VX structures across semantic roles.
(12) MKF_55
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211210101412786-0863:S0021086200040482:S0021086200040482_tab14.png?pub-status=live)
(13) INM_B_55
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211210101412786-0863:S0021086200040482:S0021086200040482_tab15.png?pub-status=live)
The analysis of the locative arguments and adjuncts and the temporal phrases is not informative without considering the information structure and will therefore be discussed in the next section.
Definiteness, information status and information structure
As outlined above, definiteness, information status and information structure are independent but strongly interacting features. A bare count of definite/indefinite and given/accessible/new items does not reveal any interesting aspect as the values of both categories are evenly distributed over the VX/XV instances. However, when we look into some of the unexpected results from above, for instance the marginal cases of preposed goals, we find that they are all given, meaning that they had been just mentioned. The context of example (14) is the following: the speaker had told us how she cooks the fish and has now started to talk about cooking the meat. Now she says that she puts all kinds of herbs and spices, among them garlic, in with the meat. Upon saying this, she immediately notices her mistake and utters the sentence with a negation, which is the focus of the sentence, being the only new (corrective) information in the message.
(14) MKF_60
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211210101412786-0863:S0021086200040482:S0021086200040482_tab16.png?pub-status=live)
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211210101412786-0863:S0021086200040482:S0021086200040482_tab17.png?pub-status=live)
This utterance is a good example of how an interpretation in terms of information structure sheds light on the strategies behind word-order phenomena. It illustrates the strategies mentioned in the introduction. One is theme-rheme partition, which is also related to what the Prague School called communicative dynamism,Footnote 66 namely moving forward from the less informative to the more informative. The theme here consists of the object and the prepositional phrase, both given constituents, associated with a rising contour,Footnote 67 followed by the rhematic part with a narrow focus on the negation, expressed by strong accentuation of the negative marker: MAnḥaṭṭīh. Furthermore, the utterance contains a topicalized object baṣalsūm with clitic doubling on the verb.
As in every Arabic variety, such clitic left dislocations are commonplace in the data. Such utterances constitute about a quarter of the instances of suffixed pronouns excluded above. Due to their frequency, Brustad called Arabic a “topic-prominent language.”Footnote 68 Given the high representation of this construction in the data and the marginal and restricted occurrences of preposed objects, we conclude that the clitic is an object pronoun in its own right and not an agreement marker. It represents the object of the clause, with the lexical noun serving as an extra-clausal topic. Furthermore, the presence of the clitic is not obligatory and should thus not be regarded as fully grammaticalized agreement, as convincingly argued by Haig.Footnote 69 As in other Arabic varieties, the clitic is an unambiguous index of a topic object, whereas pre-verbal objects without a clitic are predominantly rhematic foci. That is, the choice between indexed and unindexed objects is pragmatic. What distinguishes SIA from non-peripheral Arabic varieties is the occurrence, albeit marginal, of non-rhematic unindexed pre-verbal objects as in (11).
Another example is given in (15) with a new topic that had not been mentioned before. Note that the resumptive pronoun does not agree with the topic in the expected way (i.e. feminine singular for non-human heads) but takes on a default masculine form.Footnote 70
(15) JKM_22
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211210101412786-0863:S0021086200040482:S0021086200040482_tab18.png?pub-status=live)
The other information-structural feature we introduced in the introductory section is narrow, specifically contrastive, focus. On close inspection, most of the preposed objects and other pre-verbal constituents in the data may be interpreted as cases of narrow focus, often with a contrastive function. This is also true for some of the copula-final utterances, as in example (6) above. Example (16) presents a series of utterances that provide unambiguous evidence for a contrastive focus interpretation of the pre-verbal object: the big pot is contrasted with the small pot. Similarly, riḥna (“ourselves”) is a narrow focus in (17), as it constitutes the only new and relevant information of the clause.Footnote 71 This interpretation is also supported by the fact that in all these cases the nominal constituent is phrased together with the clause-final verb (see Figure 6), with an intonation contour that is obviously influenced by the intonation of equivalent cases in Persian, as we have noted in the introductory section.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211210101412786-0863:S0021086200040482:S0021086200040482_f0006.png?pub-status=live)
Figure 6. F0-track of example (17) with narrow focus on second occurrence of riḥna “we/ourselves.”
(16) MKF_79
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211210101412786-0863:S0021086200040482:S0021086200040482_tab19.png?pub-status=live)
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211210101412786-0863:S0021086200040482:S0021086200040482_tab20.png?pub-status=live)
(17) MKF_9
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211210101412786-0863:S0021086200040482:S0021086200040482_tab21.png?pub-status=live)
Finally, we turn to the discussion of locative and temporal phrases. As noted above, it was not always easy to differentiate between oblique arguments and adjuncts, which is why we initially collapsed the two categories. In this section, we venture an analysis in terms of information structure. Informationally speaking, a temporal or locative adjunct may either be used as a frame for the predication (theme) and consequently appear clause-initially, or as a tail of the rheme, that is, in final position. It may, however, also constitute the rheme itself. Note that in spontaneous speech speakers usually make short utterances that do not contain much new information, preferably only one new item. Thus, it is frequently possible to identify the cases where the locative or temporal information is the relevant one, which is what we did in a second annotation procedure. It has been shown for Egyptian Arabic that frames are more often temporal than locative, whereas locative phrases usually occur at the end of the utterance.Footnote 72 We thus predict two things: (i) that temporal phrases are more often pre-verbal than locative phrases and (ii) that within the rhematic phrases the proportion of XV structures should be higher, because the information contained in the adverbial phrase is often the only new information in the clause and thus can be regarded as a narrow focus. These hypotheses are indeed borne out by the results. Firstly, the only frames that occur are temporal, and locative phrases are relatively more often post-verbal; secondly, being rhematic and potentially narrowly focused raises the likelihood that an adverbial is realized pre-verbally (Figure 7). Example (18) shows a (contrastively) focused adverbial in pre-verbal position: the speaker had been talking about settling down in various places in the previous utterances and now asserts that finally, he settled THERE.
By contrast, example (19) presents a temporal frame that expresses the presupposition of the utterance. That is, the speaker who recounted a chronological sequence of events, among them military service, in the previous utterances now asserts that they FINISHED military service, taking the year 54 (1975) as a starting point. The main focus is indicated by capital letters in examples (18) and (19).
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211210101412786-0863:S0021086200040482:S0021086200040482_f0007.png?pub-status=live)
Figure 7. Distribution of VX/XV structures over all locative and temporal constituents compared to rhematic locative and temporal constituents.
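The comparison summarized in Figure 7 can be thought of as a simple tally over annotated tokens. The following Python sketch is only an illustration of that kind of count, not the procedure actually used in this study: the record layout, the function name `xv_share` and the six toy records are all invented for the example and are not data from the corpus.

```python
from collections import Counter

# Invented toy annotations, NOT data from the study.
# Each record: (phrase type, is the phrase rhematic?, order relative to the verb)
tokens = [
    ("temporal", False, "XV"),
    ("temporal", True, "XV"),
    ("temporal", False, "VX"),
    ("locative", False, "VX"),
    ("locative", True, "XV"),
    ("locative", False, "VX"),
]


def xv_share(records):
    """Return the proportion of pre-verbal (XV) tokens among the records."""
    counts = Counter(order for _, _, order in records)
    total = counts["XV"] + counts["VX"]
    return counts["XV"] / total if total else 0.0


temporal = [t for t in tokens if t[0] == "temporal"]
locative = [t for t in tokens if t[0] == "locative"]
rhematic = [t for t in tokens if t[1]]

# Prediction (i): temporal phrases are more often pre-verbal than locative ones.
print("temporal:", round(xv_share(temporal), 2), "locative:", round(xv_share(locative), 2))
# Prediction (ii): the XV share is higher among rhematic phrases than overall.
print("all:", round(xv_share(tokens), 2), "rhematic:", round(xv_share(rhematic), 2))
```

With real annotations in place of the toy records, the same two comparisons correspond to predictions (i) and (ii) stated above.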
(18) HJM_26
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211210101412786-0863:S0021086200040482:S0021086200040482_tab22.png?pub-status=live)
(19) INM_64
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211210101412786-0863:S0021086200040482:S0021086200040482_tab23.png?pub-status=live)
To sum up this section, we may say that information structure seems to be one of the main factors in determining constituent order in SIA dialects. The qualitative analysis of the data suggests that VO is still the basic word order in these varieties, which may be overridden by narrow focus and/or contrast. The theme-rheme partition may be responsible for positioning given material at the beginning and new information at the end of the utterance. In the case of pronominal objects, this is counteracted by their morphosyntactic properties. In the case of lexical objects, SIA still preferentially resorts to clitic left dislocation, although some cases of thematic objects without a resumptive pronoun occur. Temporal and locative constructions may occur clause-initially and serve as frames; they may also follow the rheme as a tail. Alternatively, they may constitute the rheme of an utterance. In these cases, the phrase might also be realized in the pre-verbal slot to indicate narrow focus.
Discussion and Conclusion
A tentative answer to the question of the extent to which word order has shifted from VO to OV in SIA dialects as a result of language contact is that the basic word order is still VO at all four field sites we have investigated. This is true for direct objects as well as oblique objects and adjuncts. We thus do not consider the high occurrence of object-initial clauses with clitic doubling as instances of OV word order, but rather as topic-comment structures with an extra-clausal topic and the resumptive pronoun as the regular syntactic object of the clause. It could, however, be observed that OV structures do occur, specifically in the context of contrast and/or narrow focus. Such constructions have been reported for Gulf Arabic by HolesFootnote 73 and also for other spoken varieties as remote as Morocco,Footnote 74 but are not an option in Egyptian Arabic, for instance.Footnote 75 Furthermore, the pre-verbal position of the object can be interpreted as an instance of deviation from default word order, an information-structural strategy that is often employed cross-linguistically, as the example from Hungarian in the introductory section illustrates. Thus an explanation in terms of contact-induced word order change does not seem to be necessary in these cases.
However, as not all of the OV occurrences in the data can be explained as narrow foci, it is conceivable that the construction is gaining ground in SIA. Cases in point are the instances of O-V-GOAL order, which may well be influenced by Persian, a language with basic OV order and post-verbal goals as the statistically preferred option.Footnote 76 Whether OV is actually more common in SIA varieties than in the Gulf dialects on the Arabian peninsula, in terms of frequency and pragmatic contexts, and whether this is due to Persian influence, could only be assessed by comparing the frequency and contexts of OV in both groups of dialects.
At any rate, two phenomena clearly point to an influence of Persian on constituent order in SIA: clause-final copulae (and pseudo-verbs) and clause-final auxiliaries. A comparison with data from Khuzestani Arabic shows that our prediction is indeed confirmed. The use of the copula seems to be more widespread in SIA than in Khuzestani dialects.Footnote 77 SIA shows final auxiliaries after participles and after imperfective lexical verbs. Although this is also the case in Khuzestani Arabic,Footnote 78 the construction seems to be more widespread in SIA. Finally, the SIA data exhibit the characteristic prosody of verb-final languages, with a sentence accent on the last rhematic nominal and a de-accented verb, a salient property that is likely to spread and to promote the spread of the associated syntactic structure.
Finally, our results suggest that a change in word order might actually be more advanced in those areas where the proportion of Arabic-speaking population is smaller, but age is most probably also an important factor. Unfortunately, the data sample of this pilot study is too small to permit any reliable statements concerning the influence of social factors like regional provenance, age, gender or education. In future research we intend to investigate these issues thoroughly.
By way of conclusion, this study has yielded a number of interesting preliminary findings that will be a good starting point for a quantitative study of word order phenomena in SIA dialects on a larger scale and a comparative study with other Arabic varieties. Such a study would not only permit us to assess the influence of the socio-linguistic variables, but also to explore the effect of the interacting semantic, pragmatic and syntactic factors by way of statistical analysis.
We would like to thank Stephan Procházka, Bettina Leitner and two anonymous reviewers for their valuable comments and suggestions. All remaining errors are, of course, our own.
Appendix: Abbreviations
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211210101412786-0863:S0021086200040482:S0021086200040482_tab24.png?pub-status=live)