1. INTRODUCTION
The Danish speech community is highly standardized. During the last hundred years a strong standardization process has taken place in all areas of the country. The change has happened rather fast, and in many dialect areas there has been a rapid shift from one generation to the next, in which local linguistic forms have been replaced by standard forms. This standardization process is almost entirely led from Copenhagen, the capital of Denmark (e.g. Kristensen 1977; Jørgensen 1983; Nielsen & Nyberg 1992; Jørgensen & Kristensen 1994; Pedersen 2003, 2005; Schøning & Pedersen 2009; Maegaard et al. in press).
Even though Denmark is a very standardized speech community, there are still differences between linguistic practices in different parts of the country. Dialect variants of the past participle are among the most commonly noted linguistic differences between speakers from the western and eastern parts of Denmark, and they are also considered to be among the most resistant dialect features in Jutland, the peninsula constituting the western part of Denmark.
In many Jutland dialects, the past participle is formed by means of the suffix -en for a group of strong verbs (e.g. blive ‘become’, komme ‘come’, and løbe ‘run’; the rest of the verbs have the suffix -et), whereas in Zealand dialects and in modern standard Danish past participles of all verbs have the suffix -et when functioning as part of composite past forms such as the various perfect forms, e.g. har løbet ‘has run’.
According to the structuralist descriptions of the traditional dialects, the -en and -et suffixes are combinatorial allomorphs in complementary distribution determined by the verb stem, but today all verbs occurring in -en form also occur in -et form, even among Jutland speakers. Even though the -en form is undoubtedly still used in Jutland, it is under pressure from standard Danish -et, and, as we will show in this paper, it has undergone change during the last generations.
The -et suffix has (at least) two phonetic realizations in modern Danish: [əð] and [əd̥]. The [əd̥] pronunciation is generally considered an eastern Jutland variant, whereas the [əð] pronunciation is considered part of the standard language (and Zealand dialects) (see Section 2 below). A study of past participle forms in Jutland is therefore also a study of potential processes of regionalization or standardization, as the speakers have to choose not only between -en and -et but also between a regional and a standard pronunciation of the -et suffix.
This paper builds on data from two Jutland locations, supplemented by data from Copenhagen. The Jutland locations are the small town of Vinderup in western Jutland (3,000 inhabitants), and the larger town of Odder in eastern Jutland (12,000 inhabitants). We will take a closer look at these two geographic areas in the following.
2. THE PAST PARTICIPLE IN WESTERN AND EASTERN JUTLAND, AND IN STANDARD DANISH
In Old Danish, the strong verbs formed masculine/feminine past participles by means of the suffixes -in, -en, and -æn, and neuter past participles by means of -it, -et, and -æt. Originally, past participles showed agreement in case, gender and number, but quite early the nominative took over most of the functions of the other case forms.Footnote 1 Later, a further simplification took place: The original feminine/common gender singular form -en replaced the neuter and plural forms in Jutland, whereas the neuter singular form -et became dominant in the Zealand dialects (Brøndum-Nielsen 1973:189f.; Pedersen & Sørensen forthc.). Though standard Danish and the Copenhagen vernacular eventually followed the same trend as the rural Zealand dialects, the former varieties have had a stronger tendency towards -en forms than the latter (Skautrup 1947:356; Brink & Lund 1975:662; Pedersen 2004:31). Whereas the change towards using the neuter form -et with all verbs was completed in Zealand dialects around the mid-19th century, Copenhagen speakers continued to use the -en form for a longer period. However, in their study of spoken Copenhagen (including ‘standard Danish’), Brink & Lund (1975:662) find that the change was almost completed at the end of the period they investigate (speakers born in the years 1870–1950). The language historian Skautrup (1968:90) also notes that in the standard language, the past participle is always expressed in -et form in ‘recent Danish’, i.e. the years 1870–1950. To our knowledge, no larger empirical investigations of the use of participle forms in Copenhagen/standard Danish speech have been carried out since Brink & Lund (1975).
Judging from the studies mentioned, -en forms are most likely no longer a significant part of Copenhagen speech, but to test this impression we have included a group of Copenhagen informants in the study presented in this article (see Section 4 below).
The verbs that do not form the past participle with -en in traditional Jutland dialects follow regularities that are shown on the map in Figure 1. Vinderup is situated at the border between two dialect areas: one has the form [əɾ] (with a flapped r), the other the form [ə].Footnote 2 Odder is situated in the large eastern area which has the form [əd̥].
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160712072402-18456-mediumThumb-S0332586512000182_fig1g.jpg?pub-status=live)
Figure 1. Dialect map of the pronunciation of the past participle form -et in different Jutland areas (Rasmussen et al. 2000–: Map K 6.1).
The origin of the [əd̥] form has been discussed by several scholars. Figure 1 shows that the most reduced form, [ə], is found in the periphery, both to the north and to the south, while the [əd̥] form is found in the large central, eastern area of Jutland. This pattern of the distribution of [əd̥] resembles an innovation, and this is how it has been interpreted by several researchers (Gudiksen 2000:135; Horneman Hansen 2001:61; Pedersen 2002). Thus, the existence of the [əd̥] form in eastern Jutland has been interpreted as the result of a restitution process, since Old Danish -t had been reduced prior to this development. Restitution in this context is not to be understood as a conscious act of speakers to return to an old form, but a development that nevertheless led to the [əd̥] form. It has been common to interpret the existence of the [əd̥] form as a strengthening (ð → Þ → t) (Lyngby 1863; Brøndum-Nielsen 1968:Section 289). This strengthening has, in the words of Pedersen (2002:274), been seen as ‘a regular sound change, a case of final sharpening’, but according to Pedersen, this is a very unlikely development. The matter has been discussed in several papers (mentioned above), and we refer to these for more details on the subject. Our purpose in this paper is to examine variation and change in use of the en form in Jutland within the last generations, which makes it sufficient to conclude that the [əd̥] form can be regarded as part of the traditional dialect in a large area in eastern Jutland, regardless of the fact that it is probably the result of a restitution process at some earlier point in time.
The map in Figure 1 also shows two areas where the form [əð] is found. These areas, like the [ə] areas, are peripheral and not coherent, indicating that [əð] in these areas should probably be seen as relic forms, representing a reduction that happened before the restitution to [əd̥] in eastern Jutland.
Variation between [əð] and [əd̥] in Odder has been described in an earlier study (Nielsen & Nyberg 1988). In this study, the [əð] form is seen as the standard form, and [əd̥] as the original dialect form. Nielsen & Nyberg (1992:107–108) find that the standard form is used more by female than male speakers, and, in certain subgroups of their corpus, more by speakers living in urban areas than by speakers living in the countryside. However, these results are not directly comparable to our analysis, since Nielsen & Nyberg have analysed all instances of the suffix [əð] or [əd̥], not distinguishing between instances of the past participle suffix and the neuter form of the postfixed definite article.
In his study of the spoken language in Århus, Denmark's second largest city, which is located 20 kilometres from Odder, Nielsen (1998) finds a relatively high frequency of [əd̥], also among younger informants. More than half of the informants use this variant, and again, the frequency of the standard form [əð] is higher among female than male speakers, and higher among speakers of higher than lower socio-economic status.
Variation between [əð] and [əd̥] has not previously been examined in Vinderup, since the [əd̥] form has not been considered a local dialect form (Kristensen 1977, 1980).
While the [əd̥] form can be seen as eastern Jutlandic, [əð] is Zealandic and the standard Danish pronunciation. The [əd̥] form is not part of any Zealand dialects, and it does not appear in descriptions of Copenhagen dialect. However, there is one exception: Some scholars hold that if the preceding sound is [ð], the -et will be pronounced [əd̥], also in standard Danish: ‘The rule is: əd̥ after ð, əð in other cases’ (Hansen 2006:14; our translation). This would apply to participle forms of verbs like sidde ‘sit’, ride ‘ride’, and tude ‘howl, cry’. The dictionary of Danish pronunciation (Brink 1991) also mentions this pronunciation but reports it as ‘older pronunciation’. In the present study, this concerns the participle forms gledet ‘slid’, gnedet ‘rubbed’, redet ‘ridden’, skredet ‘slipped’, svedet ‘scorched’, and vredet ‘twisted’ (see Section 5, on methodology, below). Thus, the pronunciations of these words with [əd̥] are not necessarily Jutland forms; they might in fact be standard Danish forms. However, none of these words occur in our material, and they consequently have no bearing on the results presented here.
3. REGIONALIZATION
On the basis of the above review, we consider the [əd̥] form an eastern Jutland form. At the same time, this form is apparently a feature that has emerged as the result of a restitution process. In view of the large distribution area in Jutland (see Figure 1 above) it would not be unreasonable to imagine this feature having the potential to spread to other areas in Jutland. The relatively high frequency of this feature in the Århus study (Nielsen 1998) also indicates that it might be a dialect feature that is not decreasing in use. If the use of the [əd̥] form is in fact increasing in Jutland, then the change would be a very interesting process of regionalization. In modern Danish variation research, there are very few cases where regional or dialectal features have been documented to spread from one region to another (Pedersen 2003:26), and in these cases they only spread to other neighbouring regions, never to the speech in the rest of the country.
If, on the other hand, we find that the [əð] forms are increasing in use in Odder and/or Vinderup, we will interpret this as evidence of standardization, regardless of the fact that [əð] is also part of the dialects in some areas in north-western and north-eastern parts of Jutland (see above). We argue for this interpretation due to the history of the [əð] forms in these areas (relic forms) and especially due to the fact that these areas are rather small and sparsely populated (especially the western area). There is a theoretical possibility that the [əð] forms spread from these areas, but we find it unlikely that linguistic change should take place with these smaller, sparsely populated areas acting as centres of change.
The processes of regionalization have been modelled by Auer & Hinskens (1996). The model illustrates dialect convergence and distinguishes between horizontal dialect–dialect convergence and vertical dialect–standard convergence. Regionalization is a horizontal process since the influence is from one traditional dialect to another, whereas standardization is a vertical process since the two involved varieties form a hierarchical relationship, with the standard variety being more prestigious than the traditional dialect. The important point that distinguishes horizontal from vertical convergence is whether or not there is an asymmetrical relationship between the two involved varieties. However, this definition is problematic if language change is regarded as primarily driven by socio-psychological factors (Kristiansen, Garrett & Coupland 2005; Maegaard et al. in press). In that case, most instances of convergence would be motivated by perceived prestige differences. As Røyneland (2010:259–260) writes: ‘given that most cases of convergence may be traced back to asymmetrical relationships between varieties – that is, to differences in status or prestige – one could argue that all cases of convergence might in that sense be labelled vertical’. In spite of this, Røyneland maintains that the distinction between the two types of convergence is useful in analyses of processes of change. For the purpose of this paper, the important point is the distinction between dialect–dialect convergence and dialect–standard convergence, where the former was probably a very common type of convergence around 1900, but is hardly ever seen in descriptions of recent linguistic change in Danish. Lisse (1964) and Pedersen (1987) describe developments where speakers in one dialect area take over a non-standard form from another dialect area.
This seems to have happened to the past participle forms, where the [əd̥] forms were probably taken over from Jutland by speakers from the island of Funen (Pedersen 1987). Standardization processes seem to have happened differently around 1900 than they have during the last decades. The kind of standardization that took place around 1900 was probably more horizontal than vertical in nature, whereas the standardization processes seen in Denmark during the last decades seem to be more vertical in nature. Therefore, the development of the [əd̥] form is especially interesting in a regionalization/standardization discussion.
4. THE DATA
The results presented in this paper are based on analyses of the use of the past participles among different groups of informants from the LANCHART corpus. The LANCHART corpus is structured by generation: generation 1 (born 1942–1963), generation 2 (born 1964–1973) and generation 3 (born 1987–1996). For each of these subgroups at each site (including Copenhagen, Odder and Vinderup), the informants are distributed evenly with regard to gender and social class, i.e. working class (WC) vs. middle class (MC) (see Gregersen 2009 for a detailed description of the LANCHART corpus).
In this paper we use interview data from three out of the six different LANCHART sites: Copenhagen, Odder and Vinderup. The sites differ not only with respect to location (see Figure 2) but also size: The greater Copenhagen area covers most of north-eastern Zealand comprising up to 2 million inhabitants, Odder is a small town with a population of 12,000, and Vinderup is an even smaller rural town, with 3,000 residents.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160712072402-16668-mediumThumb-S0332586512000182_fig2g.jpg?pub-status=live)
Figure 2. The LANCHART communities Copenhagen, Odder, and Vinderup.
In the Odder (and Copenhagen) data, all three generations are represented, whereas generation 1 is not represented in Vinderup. The distribution of informants in Odder and Vinderup is shown in Table 1. The informants in generation 1 and 2 have been recorded twice, in the period 1978–89 and again in 2005–07 while generation 3 has only been recorded once, in 2005–07. In total, the Jutland data comprises 172 transcribed recordings in which the informants produce 1.124 million words (including incomplete words and filled pauses). The mean age of the generation 2 informants is a little higher in Vinderup than in Odder (the mean year of birth in Odder is 1967 while it is 1964 in Vinderup), and as generation 1 is furthermore not represented in Vinderup, we treat age, expressed by year of birth, as a continuous variable in this study.
Table 1. Informants in Odder and Vinderup.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20151127084719138-0814:S0332586512000182_tab1.gif?pub-status=live)
The early studies were carried out as separate studies, by different researchers, and with different goals and research interests. Consequently, data from these studies differ regarding both informant types (especially with respect to age) and conversation types (especially with respect to length, and the diversity of types of conversation). On the other hand, all studies systematically include younger and older, male and female informants, of different socio-economic background, recorded in sociolinguistic interviews. For the LANCHART corpus, informants have been chosen from the older studies to represent an equal distribution, both regarding gender and socio-economic background. Thus, even though the corpus is not enormous, the speakers are selected in a way that makes it reasonable to see their linguistic behaviour as a representative pattern of the linguistic community they are part of. In the analyses we present in this paper, some speakers (generation 1 and 2) are included twice, once in a recording from the 1970s or 1980s and once in a recording from the 2000s. This furthermore strengthens our claims about developments in the community.
Copenhagen is only included in the study in order to test empirically the assumption that en forms and [əd̥] forms are very infrequent in Copenhagen. We therefore only include the informants from generation 1 (i.e. the oldest informants) and only the recordings from the 1980s in the study, as this is the part of the Copenhagen data where we would expect the occurrence of en forms to be highest. The data from Copenhagen included in the study comprises 24 recordings with the corresponding number of informants.
The composition of informant groups makes it possible to examine changes in language use in (at least) three ways: First of all, in generations 1 and 2, we are able to study change across the lifespan of the individual speaker by comparing language use in the old recordings to language use in the new recordings. This is a panel study, where the same group of speakers is recorded twice with a relatively long time span in-between. Secondly, we can examine differences between the generations at a certain point in time (an apparent time study). Finally, we can compare the variation in past participles in Vinderup and Odder in order to study how changes spread or have spread geographically.
5. METHOD
On the basis of the descriptions of the traditional Jutland dialects (Lyngby 1863; Sandvad 1931; Lund 1932; Bjerrum 1942; Kristensen 1942; Jensen 1956), we compiled a maximal list of lemmas which may be expected to occur with the past participle suffix -en. The list comprises 58 lemmas. All occurrences of the past participle forms of these lemmas (et as well as en forms) were marked up automatically in the Odder and Vinderup data, and the tokens were later categorized by hand with respect to the grammatical and phonetic criteria described below.
It turned out that only 26 of the 58 lemmas actually occurred in en form in the data. Therefore, only the occurrences of these lemmas will be described in the following, as the other lemmas must be considered to be outside the envelope of variation in modern Jutlandic.
In total there were 5,414 occurrences of past participle forms of the 26 lemmas in the Jutland data. Some of these were excluded as they occur in non-completed constructions (i.e. clauses interrupted before an interpretable intentional meaning has been expressed); the rest were categorized as to whether their syntactic function is verbal or not. Verbal function (i.e. part of periphrastic perfect, pluperfect, få- or passive construction) is illustrated in (1):
(1)
Non-verbal function (e.g. predicative or noun phrase adjunct) is illustrated by the following examples:
(2)
In a few cases it was not possible to determine whether the function was verbal or not; these tokens were assigned to a separate category. Table 2 shows the results.
Table 2. Grammatical functions of past participle forms, Jutland data.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20151127084719138-0814:S0332586512000182_tab2.gif?pub-status=live)
The 5,059 tokens with verbal function were categorized phonetically (irrespective of whether they were originally transcribed as et or en forms). Each token was independently coded by two coders and assigned to one of the following seven categories:
(i). -et with ‘soft’ d ([əð]), e.g. ‘kommet’ [ˈkhʌməð]
(ii). -et with ‘hard’ d ([əd̥]), e.g. ‘kommet’ [ˈkhʌməd̥]
(iii). -en, e.g. ‘kommen’ [ˈkhʌmən], [ˈkhʌmnˌ]
(iv). zero or the suffix schwa ([ə]), e.g. ‘kommet’ [ˈkhʌm], [ˈkhʌmmˌ], [ˈkhʌmə]
(v). indeterminable because of the phonetic context, e.g. ‘kommet tilbage’ [ˌkhʌmətseˈbɛː], which can be either (ii) or (iv), or ‘kommet ned’/‘kommen ned’ [ˌkhʌməˈneð], which can be either (iii) or (iv)
(vi). combinations of (i) and (ii), e.g. [ˈkhʌməðd̥]
(vii). combinations of (iii) and (ii), e.g. [ˈkhʌmənd̥]
In case of the two coders not agreeing, the final coding was determined by a third coder.Footnote 3
The results are shown in Table 3. As can be seen from Table 3, in more than 3% of the cases the pronounced form is either indeterminable or ‘double’. These occurrences are of course interesting, but as they are very infrequent in the data, they will not be included in the analyses presented in this paper. The analyses will thus only include past participles with verbal function with the suffix -et (pronounced [əð] or [əd̥]), -en or zero/schwa.
Table 3. Pronounced forms, Jutland data.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160712072402-05736-mediumThumb-S0332586512000182_tab3.jpg?pub-status=live)
The data from Copenhagen were treated just like the data from Odder and Vinderup, with one exception: in addition to the lemmas on the list used for the Jutland data, the list of automatically tagged lemmas also included a number of extra lemmas which occur in Høysgaard's (1979 [1747]:350–360) list of verbs with past participle suffix -en. This was done in order to take into account the possibility that lemmas that do not occur in en form in Jutlandic dialects may occur in that form in Copenhagen/standard Danish. It turned out that this is not the case, however. The results from Copenhagen will be described in Section 6.5 below.
6. RESULTS
6.1 En forms
Figure 3 shows the distribution of en forms on the 26 lemmas occurring in en form in the corpus. There is clearly huge variation with respect to the share of en forms; for example, the lemmas knibe ‘pinch’, slippe ‘let go’, and springe ‘jump’ more or less always occur in en form, while the lemmas gå ‘walk’, tage ‘take’, and være ‘be’ almost never occur in en form. It is also obvious that the lemmas occur with very different frequencies in the corpus, and in this respect, the lemma være ‘be’ constitutes a special problem: it almost never occurs in en form (there are only six tokens of en forms, uttered by only three different speakers), while at the same time it is by far the most frequent lemma, comprising more than half of the tokens of past participles in the data. As the lemma være is therefore outside the envelope of variation for the overwhelming majority of speakers in the study (and is used only very sparsely in en form even by the three speakers who use it), we have chosen to exclude it from the quantitative analyses presented below.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160712072402-46138-mediumThumb-S0332586512000182_fig3g.jpg?pub-status=live)
Figure 3. Distribution of en forms on lemmas, Jutland data. The plot is made using the Design package for R, version 2.3–0.
Figure 4 shows the distribution of en forms with respect to the non-linguistic factors included in the study (excluding tokens of the lemma være ‘be’). The order in which the factors are shown does not reflect their importance as predictors, and, as will be explained below, the proportions should only be seen as a very rough (and possibly misleading) estimate of the effect of the factor.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160712072402-22920-mediumThumb-S0332586512000182_fig4g.jpg?pub-status=live)
Figure 4. Distribution of en forms on non-linguistic factors, Jutland data. The plot is made using the Design package for R, version 2.3–0.
The bottom row in Figure 4, ‘Overall’, shows that the total proportion of en forms is 0.28, i.e. 28%. This is a very low share when taking into account that we have excluded those lemmas that do not occur in en form in the corpus even though they are reported to occur in en form in the traditional dialects of the areas. In other words, the overall result indicates in itself a dedialectization of the language use in the areas compared with the traditional dialects. The results with respect to Generation and Recording indicate that this process is continuing, as the younger informants clearly use fewer en forms than the older informants, and as there are more en forms in the recordings from the 1980s than from the 2000s.
However, the raw results may be misleading in three respects. First, when we look at each factor one by one, we cannot see how they influence each other. For example, the relatively high share of en forms in Vinderup may primarily be caused by a very high use in the old recordings, hiding that the Vinderup informants actually have a lower use than the informants from Odder in the new recordings. Multivariate analyses that take potential interactions between the different factors into account are therefore necessary in order to assess the distributional patterns of en forms in time and space. A second possibly misleading aspect of the simple picturing of proportions is that the individual informants do not contribute equally with tokens. The contributions actually vary between two and 53 tokens (with a mean of 19), partly due to the fact that some speakers are represented in both the old and the new recordings, while others (the youngest) are represented only in the new recordings. In other words, the number of tokens is unbalanced with respect to individuals, and as some individuals might favour a linguistic outcome while others might disfavour it, over and above what their gender, age, social class, geographical origin and the time they were recorded would predict, this may cause us to under- or overestimate the external factors (Johnson 2009). A third, similar problem relates to the lemmas. As Figure 3 above shows, they seem to have very different individual tendencies for occurrence in en form, they occur with very different overall frequencies, and they are probably not distributed evenly with respect to the non-linguistic factors.
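The imbalance problem can be made concrete with a minimal sketch. The token counts below are invented purely for illustration and are not taken from the LANCHART data; the function names are likewise hypothetical. The sketch shows how a single speaker who contributes many tokens can pull a pooled proportion far away from the typical speaker's behaviour.

```python
# Hypothetical token counts per speaker, invented for illustration only
# (not LANCHART data): "en" = tokens in en form, "total" = all tokens.
tokens = {
    "speaker_A": {"en": 40, "total": 50},  # many tokens, strong en user
    "speaker_B": {"en": 0, "total": 5},
    "speaker_C": {"en": 1, "total": 5},
}


def pooled_proportion(data):
    """Share of en forms when all tokens are pooled across speakers."""
    en = sum(s["en"] for s in data.values())
    total = sum(s["total"] for s in data.values())
    return en / total


def mean_speaker_proportion(data):
    """Unweighted mean of each speaker's own share of en forms."""
    shares = [s["en"] / s["total"] for s in data.values()]
    return sum(shares) / len(shares)


pooled = pooled_proportion(tokens)             # 41 of 60 tokens
per_speaker = mean_speaker_proportion(tokens)  # mean of 0.8, 0.0 and 0.2

print(f"pooled: {pooled:.3f}, mean of speakers: {per_speaker:.3f}")
```

In this toy case the pooled share is roughly twice the mean of the individual shares, which is the kind of distortion that motivates treating the speaker as a random effect.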
We will therefore assess the factors described above via multivariate analyses, more specifically, generalized linear mixed effects models, including the speaker (Participant) and the lemma as random factors/effects.Footnote 4 A generalized linear model is a formalized mathematical model of the relationship between the (binary) dependent variable, in this case en vs. non-en past participle suffix, and the independent variables, such as the factors shown in Figure 4 above. The term ‘mixed effects’ refers to the fact that this type of modelling in addition to the so-called ‘fixed effects’ or ‘non-random variables’ such as Gender and Recording, also includes ‘random effects’, such as the individual speaker (Participant) and the lemma (Lemma). The latter effects are characterized by being non-repeatable: They are sampled randomly from populations of speakers and lemmas (i.e. they are intended to represent a larger population), and if we were to replicate the sampling we would have to choose other speakers and lemmas. Hence, this type of statistical model takes into account the non-repeatable effects of the individual speaker and lemma by assigning a baseline mean (called an intercept) to each speaker and lemma with respect to the dependent variable, as an adjustment for the fact that the behaviour of individuals cannot be expected to be completely determined by the (non-random) social and linguistic factors. A mixed effects model is therefore more conservative than other types of multivariate analyses as the social and linguistic factors (the fixed effects) are only chosen as significant when they are strong enough to rise above the inter-speaker and inter-lemma variation (Baayen 2008; Johnson 2009).
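A model of this type can be written out as follows. This is a sketch in generic notation, not a formula taken from the paper: the symbols are our assumptions, with $i$ indexing speakers, $j$ indexing lemmas, $\mathbf{x}_{ij}$ collecting the fixed-effect predictors for a token, and $u_i$ and $v_j$ denoting the random intercepts for Participant and Lemma.

```latex
\operatorname{logit} \Pr(y_{ij} = \text{-en})
  = \beta_0 + \mathbf{x}_{ij}^{\top}\boldsymbol{\beta} + u_i + v_j,
\qquad
u_i \sim \mathcal{N}(0, \sigma_u^2),
\quad
v_j \sim \mathcal{N}(0, \sigma_v^2)
```

Under this reading, the variances listed for the random effects of a fitted model correspond to $\sigma_u^2$ (Participant) and $\sigma_v^2$ (Lemma).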
In addition to the factors included in Figure 4, we also included in our analysis the frequency of the lemma (in past participle form, see Figure 3) in the corpus as a fixed effect. Figure 3 shows that the distribution of this variable is skewed, also after excluding the lemma være ‘be’, with a few high-frequency outliers and most of the rest occurring rather infrequently. To remove some of this skewness and transform the data to resemble a normal distribution, the frequencies were therefore logarithmically transformed (logFrequency) (Baayen 2008:71).Footnote 5 The variable ‘speaker age’ is included as the numeric variable year of birth of the speaker (Birthyear).Footnote 6
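The effect of the logarithmic transformation can be illustrated with a small sketch. The frequencies below are invented (a few frequent lemmas and many rare ones), and the use of the natural logarithm is our assumption; the text does not state which base was used.

```python
import math

# Hypothetical lemma frequencies, invented for illustration only:
# most lemmas are rare, one is a high-frequency outlier.
frequencies = [2, 3, 3, 5, 8, 12, 40, 400]


def skewness(values):
    """Third standardized moment (population form) of a list of numbers."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    third_moment = sum((v - mean) ** 3 for v in values) / n
    return third_moment / variance ** 1.5


# Natural logarithm assumed here; any base rescales the values linearly
# and therefore leaves the skewness unchanged.
log_frequencies = [math.log(f) for f in frequencies]

raw_skew = skewness(frequencies)
log_skew = skewness(log_frequencies)
print(f"skewness before: {raw_skew:.2f}, after log transform: {log_skew:.2f}")
```

The transformed values are still right-skewed in this toy case, but markedly less so, which matches the stated aim of removing some of the skewness.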
Through the analysis and comparison of possible models and the use of model criticism (see Baayen 2008) we reached the model which most accurately describes the variation in the data material given the factors we included in the study (see Table 4). The analysis shows that all the factors considered except the gender of the speaker have a significant influence on the choice of past participle suffix, i.e. social class and year of birth of the informant, time and place of the recording and the overall frequency of the lemma. Figure 5 shows the partial effects, i.e. the probability of en form estimated by the model as a function of a given (statistically significant) factor when all the other factors are kept constant. Where the model includes an interaction between two factors, they are shown together.
Table 4. Best model, -en vs. other variants.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160712072402-84255-mediumThumb-S0332586512000182_tab4.jpg?pub-status=live)
Number of observations: 2233; Participants: 115; Lemmas: 25
Random effects:
Participant (intercept), Variance 2.8252, Standard deviation 1.6808
Lemma (intercept), Variance 1.6757, Standard deviation 1.2945
The goodness of fit of the model is very good with a concordance statistic C of 0.9378599 and a Somers’ Dxy of 0.8757198.
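For readers unfamiliar with these two measures, the sketch below shows how they can be computed for a binary outcome. The concordance statistic C is the share of (en, non-en) pairs of observations in which the en observation receives the higher predicted probability, and Somers' Dxy rescales it as Dxy = 2(C − 0.5). The toy predictions are hypothetical; the function names are ours and not those of any particular R package.

```python
def concordance(probabilities, outcomes):
    """Share of (event, non-event) pairs ranked correctly; ties count as half."""
    events = [p for p, y in zip(probabilities, outcomes) if y == 1]
    non_events = [p for p, y in zip(probabilities, outcomes) if y == 0]
    score = 0.0
    for p_event in events:
        for p_non_event in non_events:
            if p_event > p_non_event:
                score += 1.0
            elif p_event == p_non_event:
                score += 0.5
    return score / (len(events) * len(non_events))

def somers_dxy(c):
    """Rescale C from the 0.5..1 range to the 0..1 range."""
    return 2.0 * (c - 0.5)

# Hypothetical predicted probabilities and observed outcomes (1 = en form).
predicted = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
observed = [1, 1, 0, 1, 0, 0]

c_toy = concordance(predicted, observed)
print(f"toy C = {c_toy:.4f}, toy Dxy = {somers_dxy(c_toy):.4f}")

# The C reported for the en model, rescaled to Dxy.
dxy_reported = somers_dxy(0.9378599)
print(f"Dxy from reported C = {dxy_reported:.7f}")
```

Applying the rescaling to the reported C of 0.9378599 gives the reported Dxy of 0.8757198, so the two figures carry the same information on different scales: C = 0.5 (Dxy = 0) corresponds to chance-level discrimination and C = 1 (Dxy = 1) to perfect discrimination.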
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160712072402-26876-mediumThumb-S0332586512000182_fig5g.jpg?pub-status=live)
Figure 5. Mixed model effects, en forms vs. other variants. The predicted probabilities presuppose that all the other factors are kept at their default levels: Locality = Odder, Recording = 1980s, Socialclass = MC. The plots are made using the plotLMER.fnc-function (Package languageR version 1.2).
Figure 5 shows that the most powerful predictor in the model is the frequency of the lemma, and its influence is negative: The less frequent the lemma, the higher the probability that it will occur in en form. We will discuss this phenomenon in Section 6.4. With respect to the non-linguistic factors, the model confirms that there was a dedialectization process going on in the period from the 1980s to the present: The effect of increasing the year of birth is negative, showing that the younger informants use fewer en forms than the older informants. This suggests an ongoing change in the direction of less use of en forms, and this is confirmed by the effect of time of recording, which shows that there is a higher probability of en forms in the recordings from the 1980s than in those from the 2000s. The factor Recording interacts with the factor Locality, showing that even though the use of en forms has declined both in Odder and Vinderup, the decline is steeper in Vinderup than in Odder. This has the effect that the informants from Vinderup in the 2000s have caught up with the informants from Odder: In the 1980s, the informants from Odder use considerably fewer en forms than the informants from Vinderup, but in the 2000s there is no statistically significant difference between Odder and Vinderup.
The Odder informants were, in other words, ahead of the Vinderup informants with respect to the decline in en forms in the 1980s. This may have at least two explanations. According to the ‘wave model’, diffusion of linguistic innovations happens in more or less concentric waves from a centre to increasingly peripheral environments (e.g. Bailey et al. 1993), and as Vinderup is located further away from Copenhagen than Odder, this would explain the delay with respect to the development in question. However, the reason for Odder's lead may also be its greater population density and its location closer to urban centres other than Copenhagen, most notably Århus. According to the ‘cascade’ or ‘urban jumping’ model, innovations descend down a hierarchy of large city > small city > large town > small town > village > country (Trudgill 1974; Britain 2010:148). With the data available in this study we cannot determine whether physical distance or population density (and related factors such as degree of urbanization) is most important with respect to the pace at which en forms disappear, as the factors are strongly collinear in the case of Odder and Vinderup: Vinderup has fewer inhabitants, is less urbanized and is located further away from Copenhagen (and other places of high population density) than Odder.
The factor Socialclass also interacts with Recording: the model indicates that working class speakers have remained more or less stable during the period, and that the decline in the use of en forms is driven by the middle class speakers.
The model presented above is not intended to be exhaustive with respect to what governs the choice of suffix. As regards linguistic factors, it only takes into account the lemma of the verb, but of course past participles do not occur in isolation. The syntactic construction may therefore also play a role, for example which auxiliary verb constitutes the finite verb of the clause, whether or not the clause has an object, and where it is located in relation to the participle (Veirup 1964). This may be further complicated by the occurrence of more or less fixed expressions with an affinity to one or the other suffix form. Further studies of these matters would definitely be interesting and valuable, but the purpose of the study presented in this article has primarily been the role of time and place, and we have therefore only included the linguistic factor we believe to be (by far) the most important, namely the verbal lemma.
6.2 What replaces en forms?
The next question is this: Which forms take over when the en forms disappear? A first approximation can be seen in Figure 6, which shows the distribution of forms over time of recording and locality. En forms are excluded, and only forms of the same 25 lemmas as in the analysis of en forms vs. other forms described above are included.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160712072402-83024-mediumThumb-S0332586512000182_fig6g.jpg?pub-status=live)
Figure 6. Distribution of past participle forms (excluding en forms) over time and locality.
It is clear from the diagram in Figure 6 that it is primarily [əð] forms which are used instead of en forms, and that the use of both schwa/zero and [ə] forms is decreasing during the period from the 1980s to the 2000s. Definite conclusions with respect to time and locality cannot be drawn from the raw figures, however, as the age distribution of the informants is not the same at the two locations (the mean age is higher in Odder than in Vinderup), and as the different lemmas, which have very different dispositions for occurring in e.g. [əð] forms, are probably not evenly distributed with respect to time and locality. The question of what replaces the disappearing en forms will therefore be assessed via multivariate analyses taking the lemma (and the speaker) as random effects. As there are three possible forms, schwa/zero, [ə] and [əð], we will assess this question in two separate analyses: first schwa/zero vs. other forms ([ə] and [əð]), and then [ə] vs. [əð].
6.3 Schwa/zero forms
The best model as regards the use of schwa/zero form in the data material does not include the frequency of the lemma (i.e. this factor does not have a significant effect on the choice of schwa/zero vs. [ə]/[əð] suffixes), but in contrast to the analysis above, it does include the gender of the speaker as a significant factor (see Table 5). Figure 7 shows the model effects.
Table 5. Best model, schwa/zero forms vs. [ə] and [əð] forms.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160712072402-36880-mediumThumb-S0332586512000182_tab5.jpg?pub-status=live)
Number of observations: 1617; Participants: 115; Lemmas: 24
Random effects:
Participant (intercept), Variance 0.87403, Standard deviation 0.9349
Lemma (intercept), Variance 1.23940, Standard deviation 1.1133
The goodness of fit of the model is good with a concordance statistic C of 0.8638805 and a Somers’ Dxy of 0.7277610.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160712072402-67348-mediumThumb-S0332586512000182_fig7g.jpg?pub-status=live)
Figure 7. Mixed model effect, schwa/zero forms vs. [ə] and [əð] forms. The predicted probabilities presuppose that all the other factors are kept at their default levels: Locality = Odder, Recording = 1980s, Socialclass = MC. The plots are made using the plotLMER.fnc-function (Package languageR version 1.2).
Figure 7 shows that the variation with respect to schwa/zero forms is very similar to that of en forms. The variant is conspicuously in decline: although the change from the old to the new recordings is statistically significant only in Vinderup and not in Odder, the younger informants use it less than the older informants in both localities. It is also used much more in Vinderup than in Odder, though the difference between the two localities decreases during the period.
The schwa/zero variant cannot unambiguously be seen as dialectal (representing the northwestern Jutlandic [əɾ]/[ə] suffix of the weak verbs, see Figure 1 above) as it may also be a reflex of general spontaneous reduction processes, and this probably explains why it is also present in Odder, where it is not part of the traditional dialect according to the earlier descriptions (see footnote 7). But the results clearly show that it is at least partly geographically conditioned, and that the pattern of variation looks very much like a pattern of dedialectization, also when it comes to the social distribution, where we see that the development is spearheaded by middle class women. This pattern is generally seen in studies of language change involving standard and non-standard variants (see e.g. Hudson 1996:193f.), and previous studies in both Vinderup and Odder have also shown that women and middle class speakers generally speak closer to the standard norm (and use fewer dialect features) than men and working class speakers (Kristensen 1977:64–68, 1980:86f.; Nielsen & Nyberg 1992:65f., 75f.; Schøning & Pedersen 2009).
6.4 [ə] forms
The second analysis regarding what replaces the en forms concerns the choice of [ə] versus [əð] forms (i.e. excluding both en and schwa/zero forms). Here, all the factors included in the analysis are selected as significant (see Table 6 below). Figure 8 shows the model effects. The model shows that [ə] forms are also in decline: The younger speakers use them less than the older speakers, and in Vinderup there is a significant decrease during the period also in real time. In Odder there is no change in real time.
Table 6. Best model, [ə] versus [əð] forms.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160712072402-73278-mediumThumb-S0332586512000182_tab6.jpg?pub-status=live)
Number of observations: 1097; Participants: 114; Lemmas: 23
Random effects:
Participant (intercept), Variance 2.2842, Standard deviation 1.5113
Lemma (intercept), Variance 1.9906, Standard deviation 1.4109
The goodness of fit of the model is very good with a concordance statistic C of 0.9657656 and a Somers’ Dxy of 0.9315312.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160712072402-94318-mediumThumb-S0332586512000182_fig8g.jpg?pub-status=live)
Figure 8. Mixed model effects, [ə] versus [əð] forms. The predicted probabilities presuppose that all the other factors are kept at their default levels: Locality = Odder, Recording = 1980s, Socialclass = MC. The plots are made using the plotLMER.fnc-function (Package languageR version 1.2).
The fact that the [ə] forms are present in Vinderup at all must be a reflex of earlier regionalization as they are not part of the traditional dialect in Vinderup, but there is no sign of ongoing regionalization during the period from the 1980s to the 2000s. On the contrary, the [ə] forms are ousted by the Copenhagen/standard Danish [əð] forms. This means that in a period before the time of the first recordings in our study, a regionalization process must have taken place, which was later overcome by the Copenhagen influence. Thus, the investigation of a potentially ongoing regionalization process shows a pattern where it is clear that the [ə] forms were increasing in use earlier on, but during the last 20–30 years they have been decreasing in use, and the standard [əð] has instead taken over their domain.
The effects of Recording time and Locality are not as pronounced as for en and schwa/zero forms, however, and the effects of the frequency of the lemma and of gender and social class of the speaker are clearly more important. Regarding the former, the results show that the less frequent lemmas are more disposed towards occurring in [ə] forms than the more frequent lemmas, which is the same pattern as with en forms: The more frequent lemmas are the first to be standardized.
Previous research on frequency effects has shown that, in general, high frequency is a conservative force in analogical change, such as regularization processes: High-frequency structures are often resistant to analogical levelling (e.g. Bybee 2006; Diessel 2007). Arguably, the reason is that high frequency strengthens the representation of a linguistic expression in memory, a phenomenon labelled ‘entrenchment’:
High-frequency sequences become more entrenched in their morphosyntactic structure and resist restructuring on the basis of productive patterns that might otherwise occur. Thus among English irregular verbs the low-frequency verbs are more likely to regularize (weep, weeped) while the high-frequency verbs maintain their irregularity (keep, kept). My proposal to explain this tendency. . . is that frequency strengthens the memory representations of words or phrases, making them easier to access whole and thus less likely to be subject to analogical reformation. (Bybee 2006:715)
The disappearing en forms may be analysed as a regularization in which the strong verbs, with respect to past participle suffixes, come to follow the productive pattern of the weak verbs. One should therefore expect the most frequent lemmas to have the highest share of en forms. However, the results of this study clearly show that this is not what happens in the case of past participle suffixes in Danish. What we see is exactly the opposite: High-frequency sequences seem to be less conservative than low-frequency sequences. A likely reason for this is that the development, both in the direction of using et forms instead of en forms and, when using et forms, of choosing the [əð] variant instead of the [ə] variant, is not an ‘intra-linguistic’ development but a standardization process in which the speakers accommodate to the language use of a norm centre. In line with Thelander (1979:106), this may imply that the strength of the influence from the standard language is proportional to the frequency of the individual linguistic features in the norm source. In her study of dialect levelling in the two Norwegian dialect areas Røros and Tynset, Røyneland (2005:259, 288, 300, 338) reports frequency effects with respect to four of the 16 linguistic variables studied: For three of the variables high-frequency words are less resistant to levelling than low-frequency words, similar to the results presented in this article, whereas high-frequency words are more resistant to levelling with respect to the fourth variable. The effect of frequency in standardization or levelling processes thus seems to be complex, and the reason may be that ‘intra-linguistic’ analogical reformation, in which frequency acts as a conservative force, may take place at the same time as contact-induced change, in which frequency (in the norm source) may play a progressive role.
With respect to gender and social class, the analysis shows that there is an interaction between the two factors: Among middle class speakers, male speakers use more [ə] forms than women, while there is no statistically significant gender difference among the working class speakers (see footnote 8).
6.5 Language use in Copenhagen
In the Copenhagen data, only three lemmas occur in en form. The en forms thus have a very restricted distribution in Copenhagen compared with the Jutland data, and even among the three lemmas the proportion of en forms is much lower than in Odder and Vinderup (compare Figure 9 with Figure 3 above).
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160712072402-72724-mediumThumb-S0332586512000182_fig9g.jpg?pub-status=live)
Figure 9. Distribution of en forms on lemmas, Copenhagen 1980s. The plot is made using the Design package for R, version 2.3–0.
When including all the 25 lemmas which occur in en form in Odder and Vinderup (excluding være), the share of en forms in Copenhagen is 2%, which is very low compared with the overall proportion of 28% en forms for the same lemmas in the Jutland data. This result is even more salient when taking into account that the Copenhagen data are based on informants of the same age as the oldest informants from Odder and Vinderup (born 1942–63) and only on the recordings from the 1980s. This is the part of the Jutland data where the proportion of en forms is highest (see Figure 4 and Figure 5 above), and thus considerably higher than the 28% mean.
Figure 10 shows the proportions of forms other than en forms in the Copenhagen data. Comparing Figure 10 with Figure 6 above, the most important difference is that [ə] forms are virtually non-existent in Copenhagen. This confirms that this form is indeed a Jutland variant, and that its occurrence in Vinderup must be seen as the result of a regionalization emanating from eastern Jutland. Schwa/zero forms, on the other hand, do occur in Copenhagen at a level comparable to the new recordings from Odder (but they are much less frequent than in the old recordings from Vinderup).
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160802154007-44115-mediumThumb-S0332586512000182_fig10g.jpg?pub-status=live)
Figure 10. Past participle forms in Copenhagen (excluding en forms).
Overall, the language use of the oldest Copenhagen informants in the recordings from the 1980s is quite similar to the language use of the youngest informants from Jutland in the recordings from the 2000s. This further supports the hypothesis that the changes with respect to past participle forms which have taken place during the last three decades must be seen as a standardization process with Copenhagen as the norm centre.
7. CONCLUSION
In this study, we have investigated the use of the traditional past participle form -en in two locations in Jutland over time. We have shown that the dialect form has decreased in use in the most recent generations, so that young people in Odder and Vinderup in the 2000s use very few en forms. In Vinderup the change started later (or happened at a slower pace) than in Odder, but in the 2000s there is no longer a statistically significant difference between the levels of en forms used in the two localities. By comparing the Jutland data with the 1980s recordings of the older speakers in the Copenhagen dataset, we have confirmed that en forms are indeed very rare in Copenhagen Danish, as these speakers use en forms even less than the youngest informants in Odder and Vinderup in the new recordings. This supports the hypothesis that the change is led from Copenhagen (see Maegaard et al. in press for more details on similar patterns of change).
We have interpreted the linguistic change under study as a standardization process, with its centre in Copenhagen. However, in Vinderup we saw evidence of an earlier process of regionalization, where the [ə] form was used in spite of the fact that this form is not part of the traditional dialect in the area. The use of [ə] forms is also in decline in the period studied, however, and thus the present study documents that the previous regionalization with respect to past participle forms has stopped. The use of forms other than en forms in the speech of the youngest informants in the new recordings from Jutland is actually quite similar to the use of these forms by the oldest Copenhagen informants in old recordings. This further supports the hypothesis that the changes with respect to past participle forms which have taken place during the last three decades must be seen as a standardization process with Copenhagen as the norm centre.
In all, the analyses have shown a pattern of dedialectization, where local dialect forms are substituted by standard forms more or less from one generation to the next (but also during the life span of the individual informants), and where factors such as age, gender, social class and the frequency of the lemma all play a part in the process of change.
ACKNOWLEDGEMENTS
We would like to thank Kristina Kolling-Wedel, Cecilie Meldgaard Goth, Christina Storm Hansen, Albert Worsøe, Nicoline Christine Olsen, Sara Andersen, Vanessa Wolter and Kirsten Sif Lundholm for their work at different stages of the coding process. Thanks also to Inge Lise Pedersen, Karen Margrethe Pedersen and three anonymous peer reviewers for valuable suggestions and comments on the manuscript. This article is based on research funded by The Danish National Research Foundation.