INTRODUCTION
Over the last 50 years, a wave of municipal mergers has swept the developed world. From Scandinavia to New Zealand, reforms have redrawn the map of local government, combining small units to form larger ones. Reformers have had several objectives, including reinforcing democracy and building local government capacity (Baldersheim and Rose 2010b, 242–5). But the main motivation has been economic: to reduce costs by capturing economies of scale. Among the industrialized democracies, this trend has affected all types of regimes, from decentralized federations to unitary states, and countries of all sizes, from Luxembourg to the United States.
For such a widespread phenomenon, municipal amalgamation has undergone surprisingly little systematic evaluation. In part, this reflects the difficulty of disentangling effects given endogeneity in the process. In most cases, the choice of which local governments are merged is not random: sometimes central politicians decide, sometimes leaders of the municipalities themselves. Either way, this may cause the merged units to differ from the unmerged ones, complicating the evaluation.
The enthusiasm for enlarging local districts is surprising given the weakness and conditionality of the theoretical rationale. Economies of scale are only one of the likely consequences of increased jurisdiction size. Such benefits may be offset by the loss of effects that favor small units—greater ease of local monitoring, more effective accountability mechanisms, or greater Tiebout-style competition for mobile voters and capital. At the same time, the savings from economies of scale will depend on the initial and postamalgamation sizes of the units and will also vary across the types of public services supplied, which have different cost functions. The net benefits are likely to be indeterminate.
In this article, we examine how an amalgamation reform that occurred in Denmark in 2007 affected the cost of providing public services. In this reform, 239 municipalities (essentially all those with populations under 20,000) were combined to form 66 new units. An additional 32 municipalities were left untouched (Mouritzen 2010).
For several reasons, the Danish reform is particularly well-suited to test the effects of increasing jurisdiction size. First, the universal nature of the change effectively ruled out selection: all municipalities below a certain size were required to merge with others, and 98 percent complied. Second, the 32 municipalities that were left untouched (and which had populations similar to those of the 66 new units) constitute a control group for comparisons. Third, the governments in question matter: Danish municipalities play important roles in managing schools, child care, infrastructure, environmental regulation, social spending, and culture. Finally, Denmark's official statistics are accurate and detailed, with broad coverage of local unit characteristics.
A previous article examined the effect of this reform on administrative costs (mostly wages of municipal employees and maintenance of administrative buildings) and found that these fell after consolidation (Blom-Hansen, Houlberg, and Serritzlew 2014). That might seem at first to vindicate the enthusiasm for mergers. However, administrative costs amount to less than 10 percent of total municipal spending. We focus here on the other 90 percent and ask: Do municipal mergers decrease the costs of provision of public services such as schools, roads, and infrastructure?
We find no clear and systematic effects from amalgamations. We replicate the finding of Blom-Hansen, Houlberg, and Serritzlew (2014) that administrative costs declined. We find also that spending on road maintenance per kilometer of road fell in the merged units, although we cannot say whether this represents greater efficiency or skimping on repairs. However, the economies of scale in administration and (possibly) road maintenance were offset by diseconomies of scale for labor market programs. In most policy areas (including elder care, schools, daycare, and caring for children with special needs), jurisdiction size did not seem to matter at all. Aggregating the effects, the net impact was null. If the pattern in Denmark holds more generally, the global amalgamation wave is unlikely to yield the savings its proponents anticipate. We interpret our null finding as supporting the position of skeptics who contend, on theoretical grounds, that the quest for an optimal jurisdiction size is futile (Dahl and Tufte 1973; Treisman 2007).
The article is organized as follows. The next section provides background on the global wave of municipal amalgamations of recent decades. The third section discusses theoretical arguments about the effects of jurisdiction size. The fourth section outlines the Danish reform. The fifth section describes the data and methods used in the analysis. The sixth section presents results, and the final section concludes.
THE GLOBAL MERGER WAVE
Since the 1950s, reforms to enlarge jurisdictions have transformed the structure of local government across the developed world. As societies modernized and built more extensive welfare states, the local government units inherited from earlier periods were often thought too small to capture economies of scale in service provision (Baldersheim and Rose 2010a; 2010b, 242; Fox and Gurley 2006, 8; Keating 1995, 118; Newton 1982, 191; Vetter and Kersting 2003, 19). [Footnote 1] Almost everywhere, projects to merge municipalities were debated and, in most cases, adopted.
These reforms spanned the globe. Table 1 briefly reviews the main cases, the dramatic scope of which may have escaped nonspecialists.
TABLE 1. Local Government Amalgamations in Developed Countries since 1950

From such a survey, the extent of the phenomenon becomes obvious: municipal merger mania has swept the developed world. Reforms have varied in their radicalism: in some nations, e.g., the UK, the local government system has been comprehensively restructured; in others, e.g., France, the changes have been more limited. Countries started, and ended, at quite different points. While in Mexico, Ireland, New Zealand, Denmark, and Japan the average municipal population is now more than 40,000 residents, in France, Turkey, Switzerland, Austria, and Iceland it is still below 5,000 (OECD 2010, 207). Even where mergers were not rapidly implemented, demands for them dominated the intellectual agenda. This is all the more intriguing given an opposite tendency among many developing and postcommunist countries, where democratization has often prompted the division of administrative units into ever smaller pieces (Swianiewicz 2010). In Sub-Saharan Africa, for instance, 29 countries saw the number of administrative units grow by at least 20 percent between 1990 and 2012. Brazil's roster of municipalities also increased by 50 percent after the transition from military rule, and there were major increases in Indonesia and Vietnam (Grossman and Lewis 2014, 196).
LOCAL JURISDICTION SIZE: THEORY AND EMPIRICAL SURVEYS
The optimal scale of local government jurisdictions—or of government jurisdictions in general—has been debated since the time of Plato. Although the search for an ideal size that can be identified on theoretical grounds, independent of context, has consumed enormous intellectual energy over the years, we believe that, for several reasons, it is a vain quest. We briefly review the main arguments and explain why they fail to yield general implications. We suggest that, without knowing the particular mix of tasks assigned to local governments and their technologies, it is impossible to predict whether, on balance, enlarging municipalities will have positive or negative effects.
Most scholars have conceptualized the optimal scale of local government as a tradeoff between certain effects that favor large size and others that favor smaller units (Dahl and Tufte 1973; Hooghe and Marks 2009; Treisman 2007). Oates (1972), in a famous analysis, saw the main conflict as that between the more precise matching of services to local tastes that is possible when jurisdictions are small and the economies of scale attainable when they are large.
Since economies of scale are the most commonly cited advantage of large size (and the dominant argument for amalgamations), we discuss them in some detail. In both the private and the public sector, returns to scale are thought to increase for two main reasons (Boyne 1995; Hirsch 1959; Sawyer 1991, 47–70). First, there are fixed costs associated with providing various kinds of public service, so the average cost will fall with output, at least up to a certain point. Some public goods have elements of nonrivalry in consumption, so the marginal cost of serving an additional resident is zero (Bergstrom and Goodman 1973; Borcherding and Deacon 1972). For instance, disease surveillance, water quality control, and restaurant inspections may not cost more to provide for multiple residents than for just one (Santerre 2009). Second, increasing the scale of service provision makes possible a more fine-grained division of labor, yielding the associated benefits of specialization.
However, above a certain level, such benefits of larger size are offset by problems of communication and control. As output grows, so does the need to transmit information through more layers of management. Large production processes often suffer from bureaucratic congestion (Williamson 1967). Consequently, production processes normally exhibit first increasing, then constant, and finally decreasing returns to scale: the typical cost curve is U-shaped. It follows that there is an optimal size, at the bottom of the U-shaped curve, at which unit costs are lowest. Advocates of municipal amalgamation usually suppose that this optimum occurs at a relatively high local population.
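The U-shaped cost curve can be made concrete with a minimal illustrative sketch. The functional form and the symbols F, c, γ, and N are assumptions introduced here for illustration only; they are not taken from the studies cited above.

```latex
% Illustrative average cost per resident for a jurisdiction of population N.
% F: fixed cost; c: constant per-resident cost; \gamma: congestion parameter.
% All three are hypothetical and strictly positive.
AC(N) = \frac{F}{N} + c + \gamma N, \qquad F,\ c,\ \gamma > 0

% First-order condition for the cost-minimizing population N^{*}:
\frac{dAC}{dN} = -\frac{F}{N^{2}} + \gamma = 0
\quad\Longrightarrow\quad
N^{*} = \sqrt{F/\gamma}

% The second derivative 2F/N^{3} is positive, so N^{*} is a minimum.
```

Under this assumed form, unit costs fall for populations below N*, rise above it, and the location of the optimum depends entirely on the ratio of fixed costs to congestion costs.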
Influential as this approach has been, it does not in fact yield any clear implication about the optimal size of municipalities. There are two key problems. First, most local authorities provide a range of services, each with unique production characteristics. Economies of scale are specific to the particular technologies and goods or services produced. Thus, there is not one optimal size but many, one for each of the services provided. Of course, if all municipal services had minimum cost points at high population levels, then amalgamating small units might improve things on average. But, in fact, the technologies for different common local services differ a great deal (Bish 2001). To produce all at optimal scale, one would need to replace municipalities with multiple, overlapping single-purpose units, an arrangement that, besides being highly complex, would itself lead to redundancy of administrative personnel (Ostrom 1972). For municipalities that provide multiple services, the efficiency consequences of amalgamation will depend on the initial and final size of their jurisdictions and on the particular portfolio of tasks assigned to them and their associated production technologies. Efficiency might either increase or decrease, and a great deal of information is needed to predict which it will be in a particular case.
The second problem is even more fundamental. Most debates relate the size of municipal districts to the cost structure for provision of particular services, for example, primary education. But it is not municipal governments that educate children; it is schools that do so. The most relevant cost effects relate to the size of the school, not that of the school district. The same is true of child care centers, libraries, and residential homes for the elderly: in each case, smaller organizations are the direct providers of services, and it is primarily the scale of these smaller organizations that determines efficiency. The distinction parallels that in the private sector between plant-level and firm-level returns to scale (Boyne 1995, 220; Sawyer 1991, 50–1; Scherer and Ross 1990). Any scale economies at the level of direct service providers such as schools and child care centers can be harvested without altering local government jurisdictions, since one can resize the organizations and their service areas within, and even across, existing municipal boundaries. Such economies seem, in any case, to be meager at best, according to a review of the empirical literature by Walker and Andrews (2015, 111–2). For a subset of local government functions whose costs occur at the firm level (most notably administration), increasing jurisdiction size may confer economies of scale (see Blom-Hansen, Houlberg, and Serritzlew 2014). But, since enlarging municipal districts does not in itself affect the size of individual schools, hospitals, or other plant-level organizations, amalgamation will not affect plant-level efficiency at all.
In short, even setting aside Oates’ (1972) argument that scale economies are offset by less precise matching of services to local tastes, the existence of economies of scale does not imply any direct and universal prescriptions for the design of local government systems, except perhaps in the case of certain single-purpose service providers. For municipalities, or other multipurpose entities, there is simply no good reason to expect that larger size will generally lead to cost savings.
A second argument in favor of amalgamations is that larger jurisdictions may be able to capture not just economies of scale but also economies of scope. It may be more efficient to produce certain related services (say, sewerage and recycling of water; cf. Dollery and Fleming 2006) jointly than to produce them separately. This does not in itself dictate larger jurisdictions, since it concerns the range of services produced, not the scale of production. But if some of the services have a minimum efficient scale, then achieving the bundle of economies could require increasing government size. In fact, the relationship between economies of scale and scope is far from clear. They may complement each other or conflict. But they may also be unrelated (Dollery and Fleming 2006). Given this, we should not expect increased size to lead to cost reductions for this reason either.
A third effect traditionally seen to favor larger size concerns externalities—the imposition by one individual of costs or benefits on others that are not compensated via the market. Allocative efficiency is increased when government regulates, taxes, or subsidizes activities so that individuals internalize such effects. However, if the externalities affect mostly individuals outside the given government's jurisdiction—which is more likely to be the case when jurisdictions are small—the government's incentive to address them is weaker. When units are larger, local governments will be motivated and able to tackle more of the prevailing externalities. A similar problem affects not acts of individuals but government policies. If the positive effects of a local government's policies spill over into the neighboring jurisdictions rather than accruing to the citizens that the given government represents, the government will undersupply this policy.
The only way to eliminate all such cross-border influences would be to expand jurisdictions without limit, not just enlarging local governments but merging them into the central government. Of course, such a “solution” would forgo all benefits of smaller size. A more sensible approach is to assign service responsibilities to tiers of government in a way that balances the benefits of small size against the cost of externalities. The optimal balance will be specific to particular services. As pointed out by Olson (1986) and Tullock (1969), among others, different public services produce different externalities. Consequently, any attempt to address externalities, like attempts to capture scale economies, will involve tradeoffs.
Thus, on close examination, the arguments that favor large municipal jurisdictions will only hold in particular contexts. At the same time, other effects could render smaller jurisdictions more efficient (Boyne 2003, 370–2). Various scholars argue that citizens will monitor government more actively in smaller communities, resulting in greater bureaucratic effort and less waste (Dahl and Tufte 1973; Denters, Goldsmith, Ladner, Mouritzen, and Rose 2014). If yardstick competition is part of the system for evaluating local governments, this may work best when there are more competing units (Allers 2012), although some studies have failed to find empirical confirmation for this (Boyne 2003, 382). Meanwhile, if the costs of moving to another jurisdiction increase with distance, Tiebout-style competition (Tiebout 1956) among local governments to attract residents or mobile capital through government efficiency and responsiveness will be stronger when units are smaller. Competition among a large number of small jurisdictions may also serve to constrain them fiscally, forcing them to supply services efficiently (Brennan and Buchanan 1980, 168–86). Finally, Oates’ argument that smaller jurisdictions enable governments to more precisely tailor public services to local tastes has found echoes in subsequent analyses (Alesina and Spolaore 2003; Oates 1972).
Just as with the arguments for large scale, the logic behind these various effects is not always as clear as it might seem (Treisman 2007). But even ignoring this, it is clear that the advantages of large and small size will aggregate and offset each other in context-specific ways. Rather than a presumption that amalgamation will generally increase efficiency, we hypothesize that amalgamation should have no general effects: it will increase efficiency in some contexts and decrease it in others (Fox and Gurley 2006; Treisman 2007, 53–73). In short, the most plausible hypothesis is a null one. [Footnote 2]
If the theoretical literature in public finance and political science provides no compelling, general reason to expect efficiency gains from municipal mergers, does the empirical literature detect such gains in practice? Numerous studies have sought to estimate the cost functions for local services. A number of articles have surveyed their results (Bish 2001; Boyne 1995; Byrnes and Dollery 2002; Derksen 1988; Fox and Gurley 2006; Holzer et al. 2009; Martins 1995; Ostrom 1972). The main conclusion from these reviews is that there is no consistent evidence on economies of scale in local government. Some studies detect a tendency for very small municipalities to be inefficient (e.g., Breunig and Rocaboy 2008; Solé-Ollé and Bosch 2005), and some have found administrative efficiency gains from larger size (Blom-Hansen, Houlberg, and Serritzlew 2014), but the general finding is that the evidence is inconclusive. Most studies report that optimal scale varies across different services: while a few, such as water and sewage, have considerable economies of scale, others, such as schools, may exhaust such economies at populations under 10,000 (e.g., Fox and Gurley 2006).
To examine the findings of these reviews in more detail, we look more closely at two of the most recent and comprehensive ones. The first is Byrnes and Dollery (2002), who review 24 international studies and eight Australian ones. They find that, among the international studies, 29 percent find evidence of U-shaped cost curves, 39 percent find no statistical relationship between per capita expenditure and size, 8 percent find evidence of economies of scale, and 24 percent find diseconomies of scale. The eight Australian studies they survey also reach mixed findings. On this basis, Byrnes and Dollery (2002, 405) conclude that “considerable uncertainty exists as to whether economies of scale do or do not exist.”
The second review study is Holzer et al. (2009), who examine 65 studies from a broad range of countries. They find that there is little evidence for a relationship between size and efficiency for municipalities with populations between 25,000 and 250,000. Among municipalities with populations under 25,000, they find some suggestions that efficiency increases with size, but only in certain contexts. At the same time, they note that much of the literature argues that small municipalities are not less efficient, except in specialized services. On this basis, they conclude that “[t]he literature provides little support for the size and efficiency relationship, and, therefore, little support for the action of consolidation, except as warranted on a case-by-case basis” (Holzer et al. 2009, 1).
In sum, the empirical literature on the effects of municipal mergers has failed to identify systematic patterns that hold across time and space. From our vantage point, this state of affairs is unsurprising. Since the advantages of large and small size depend on context, and since plant-level and firm-level scale effects are, at best, weakly related, the absence of systematic consequences of jurisdiction size is what one should expect. Our re-examination of the theoretical arguments suggests why empirical researchers have come up empty-handed.
Another lesson from the existing studies is that it is difficult to study scale effects. Even a strong correlation between size and costs must be treated with caution when studies are based on observational data (Boyne 2003, 388). A problem with observational studies is that the size of jurisdictions is nonrandom. Their scale is determined by a variety of factors that also affect the cost of public services. Regional subcultures and local political histories will influence both jurisdiction size and also levels of corruption and bureaucratic efficiency. When large cities are poorly run, districts sometimes secede to form smaller autonomous municipalities (Anderson 2012). At the same time, central reformers, eager to see a successful outcome to their reform, may choose to amalgamate municipalities that are already, for other reasons, more efficient, leading to an association between size and performance.
A solution to this endogeneity problem is the experimental approach (Walker and Andrews 2015, 126). We use a recent Danish municipal reform, which we introduce in greater detail in the next section, to address this problem. As will become clear, we find evidence consistent with our hypothesis that no general relationship exists between jurisdiction size and public service spending. Even after accounting for endogeneity far more precisely than is usually possible, the finding is, as expected, null.
THE DANISH MUNICIPAL REFORM
On January 1, 2007, a major reform of Danish local government changed the size of most of the country's municipalities. [Footnote 3] Denmark, a small unitary state with a large welfare state (see Arter 2012), has three levels of government. Before the reform, the lowest level consisted of 271 municipalities. From 2007, large-scale mergers left just 98 municipalities, with an average population of 57,000 inhabitants. [Footnote 4]
Each municipality is governed by a city council, elected every four years, with day-to-day administration left to standing committees under the city council and to the mayor, who is elected by the city council. The municipalities provide basic welfare services, distribute various social transfers, and administer aspects of utilities, culture, and recreation. In our analysis, we focus on eight major policy areas: schools, daycare, elder care, children with special needs, roads, culture, administration, and labor markets. In Lowi's (1972) terms, all of these involve distributive policies.
Municipal spending accounts for more than half of all public expenditure in Denmark. The local governments fund their activities from various income sources, the most important of which is the local income tax. This tax finances about half of all municipal spending, with the remainder coming from user charges and central government grants. The average local income tax rate was 24.9 percent of citizens’ personal income in 2014. In principle, the municipalities are free to decide their own income tax rate, but in practice the central government has imposed a number of controls over local taxation. Nevertheless, compared to other countries, Danish municipalities still enjoy considerable autonomy (Blom-Hansen and Heeager 2011).
The 2007 reform was quick and radical. Before 2002, municipal restructuring had not made it onto the Danish political agenda. When the idea of a centrally imposed reform was floated in a parliamentary committee discussion, the government firmly rejected it. Yet, in 2004 a government-commissioned report recommended amalgamations. One year later, in the spring of 2005, the national parliament approved a semivoluntary merger program, which had been forced through with the backing of a narrow majority (Bundgaard and Vrangbæk 2007; Christiansen and Klitgaard 2010; Mouritzen 2010).
The reform had two main elements. The first was a reshuffle of functions across tiers involving income tax assessment, services for the handicapped, rehabilitation, health promotion, primary education for children with special needs, environmental protection, and regional roads. Although this list may sound impressive, spending on the new functions amounted to only about 8 percent of the municipalities’ previous budgets. The reallocation of functions did not involve the traditional municipal core tasks related to welfare and public utilities.
While the reshuffle of functions included all municipalities, the second element—the municipal amalgamations—did not. This part of the reform left 32 municipalities that were already above the size threshold intact, but required the other 239 to merge into 66 new larger entities. The reform stipulated that municipalities with fewer than 20,000 citizens were to be combined with neighbors to form new units that should aim for the target size of about 30,000 citizens. The only way that municipalities with fewer than 20,000 inhabitants could avoid amalgamation was by concluding a cooperative arrangement on service provision with a large neighboring municipality. This proved very difficult in practice, and only five of the 239 units took this path. Three small municipalities—Farum, Holmsland, and Hvorslev—failed to make arrangements for themselves and were subjected to intervention by the central government, which then organized their amalgamations.
METHODS AND DATA
We use the 2007 Danish municipal amalgamation reform as a source of exogenous variation in jurisdiction size to address the problem of endogeneity. We treat the case as a quasi-experiment. A quasi-experiment shares many features with other types of experiment (Cook and Campbell 1979, 56; Dunning 2012, 15–21). It has, at least in the ideal situation, experimental and control groups as well as pre- and post-treatment measures of relevant variables. In this case, the “control group” consists of the 32 municipalities that were already above the size threshold and so did not undergo amalgamation. Their jurisdictions experienced only negligible demographic changes. The “treatment group” consists of the 66 municipalities formed by the exogenously decreed amalgamation of smaller units.
In contrast to other experiments, assignment to experimental and control groups is not randomized in quasi-experiments. This raises the possibility that differences in results might be caused by preexisting differences between the groups, rather than by the experimental intervention, so such differences need to be carefully controlled. Still, compared to traditional observational studies, quasi-experiments have the great advantage that the main independent variable is determined by some process that is exogenous to the one under study.
Although the impetus for amalgamation in the Danish program was clearly exogenous to the individual municipalities—all small ones were required to undergo reform—the precise choice of partner, and thus the exact size of the new merged unit, were left to local decisions. The reform gave the local governments six months to settle the amalgamations. The key issue for our research design is whether service provision costs played any significant role in shaping the individual municipalities’ choices.
In fact, the evidence clearly suggests that costs of administration and services were not very important to amalgamation patterns. Case studies of specific amalgamations reported in Mouritzen (2006) demonstrate that other factors, such as local identity and local politicians’ ambitions for future office, affected how municipalities were amalgamated. Bhatti and Hansen (2011) show in a quantitative study of all municipalities that social connections (measured as commuting patterns) between municipalities had a significant effect on the chance of amalgamation. All this increases confidence that considerations of service provision costs played little role in the outcomes. We therefore proceed on the assumption that service provision costs were exogenous to the amalgamations.
In Table 2 we compare the growth in size for amalgamated (treated) and nonamalgamated (control) municipalities. The size of the nonamalgamated municipalities in the control group changed little, but in the amalgamated municipalities the changes were dramatic.
TABLE 2. Size of Municipalities in Control Group and Treatment Group, before and after Reform (percent)

The reform took effect in 2007. Our data span 2003–2014, i.e., four years before the reform and eight years after. To allow for pre- and postreform comparison, we impose the postreform structure on the prereform structure by aggregating prereform municipalities that would eventually be amalgamated to their postreform size. [Footnote 5] The municipalities of København, Frederiksberg, and Bornholm had prereform status as both county and municipality and were therefore excluded. This leaves us with 1,140 observations (95 municipalities over 12 years). Of these 95 municipalities, 29 did not experience a change in borders (the control group), and 66 resulted from mergers (the treatment group). [Footnote 6]
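The aggregation step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' procedure: the crosswalk, the unit names, the figures, and the choice to sum spending and population before recomputing a per-capita value are all hypothetical.

```python
from collections import defaultdict

# Hypothetical crosswalk: prereform unit -> postreform unit (names invented).
crosswalk = {"A": "New1", "B": "New1", "C": "New1", "D": "Big1"}

# Hypothetical prereform records: (old unit, year, total spending, population).
records = [
    ("A", 2005, 50.0, 5000),
    ("B", 2005, 80.0, 10000),
    ("C", 2005, 120.0, 15000),
    ("D", 2005, 400.0, 40000),
]


def aggregate_to_postreform(records, crosswalk):
    """Sum spending and population within each (postreform unit, year) cell."""
    totals = defaultdict(lambda: [0.0, 0])
    for old_unit, year, spending, population in records:
        key = (crosswalk[old_unit], year)
        totals[key][0] += spending
        totals[key][1] += population
    # Recompute per-capita spending from the aggregated totals (an assumption).
    return {
        key: {"spending": s, "population": p, "per_capita": s / p}
        for key, (s, p) in totals.items()
    }


panel = aggregate_to_postreform(records, crosswalk)
```

Summing totals before dividing weights each prereform unit by its population; averaging the per-capita figures of the old units instead would give small municipalities disproportionate influence.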
Hence, we have 116 prereform and 232 postreform observations for the control group (29 units over four and eight years, respectively), and 264 prereform and 528 postreform observations for the treatment group (66 municipalities over four and eight years, respectively). Studying changes in service costs for the treatment group alone would confound the effect of changes in size with the general trend in service costs over time. Following Blom-Hansen, Houlberg, and Serritzlew (2014), we use the difference-in-difference (DiD) approach to isolate the causal effect of size, comparing data for the treatment group and the control group.
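The observation counts above follow directly from the panel dimensions. A minimal sketch of the arithmetic, using only the figures reported in the text (the function and variable names are illustrative):

```python
# Panel dimensions as reported in the text.
N_CONTROL = 29                   # municipalities with unchanged borders
N_TREATED = 66                   # municipalities created by mergers
PRE_YEARS = range(2003, 2007)    # 2003-2006, four prereform years
POST_YEARS = range(2007, 2015)   # 2007-2014, eight postreform years


def cell_counts(n_units, pre_years, post_years):
    """Return (prereform, postreform) municipality-year observation counts."""
    return n_units * len(pre_years), n_units * len(post_years)


control_pre, control_post = cell_counts(N_CONTROL, PRE_YEARS, POST_YEARS)
treated_pre, treated_post = cell_counts(N_TREATED, PRE_YEARS, POST_YEARS)
total = control_pre + control_post + treated_pre + treated_post
print(control_pre, control_post, treated_pre, treated_post, total)
```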
The logic is this: The difference in service costs for the treatment group, before and after the reform, is an estimate of the combined effect of changes in size and time. The difference in service costs for the control group, before and after the reform, is an estimate of the effect of time, but not of changes in size. The difference between these two differences constitutes the DiD estimator, which estimates the average effect of the changes in size on service costs for the treated units (or, the average treatment effect for the treated, ATT). The DiD-estimator can be obtained from the following regression analysis:

$$Y_i = \beta_0 + \beta_1 TG_i + \beta_2 T_i + \beta_3 (TG_i \times T_i) + \varepsilon_i \qquad (1)$$
where Yi is a measure of service costs for municipality i, TGi is a dummy variable taking the value 1 if municipality i belongs to the treatment group (0 otherwise), Ti is a dummy variable taking the value 1 if the observation is measured post reform (0 otherwise), and TGi × Ti is an interaction term. It can easily be shown that β3 is the DiD estimator (see Wooldridge 2009, or Lassen and Serritzlew 2011, or Blom-Hansen, Houlberg, and Serritzlew 2014, for a similar application). Furthermore, β1 is an estimate of the differences between the treatment and control groups, before the reform. If municipalities were assigned randomly (which, of course, they are not), this should be close to zero. β2 is an estimate of the general trend in service costs over time. This may be positive or negative, depending on factors such as the development in available technology, changes in prices and wages, or changes in service provision.
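The claim that β3 equals the difference of the two before-and-after differences can be checked numerically. The sketch below is illustrative only: the observations are hypothetical, not data from the article, and the small `ols` helper is a stand-in written for this example rather than the estimation routine used by the authors. It computes the DiD once from the four group means and once as the interaction coefficient of the two-period regression.

```python
def mean(values):
    values = list(values)
    return sum(values) / len(values)


def did_from_means(rows):
    """DiD from the four group-by-period means; rows are (y, tg, t) tuples."""
    cell = {
        (g, p): mean(y for y, tg, t in rows if tg == g and t == p)
        for g in (0, 1) for p in (0, 1)
    }
    return (cell[1, 1] - cell[1, 0]) - (cell[0, 1] - cell[0, 0])


def ols(X, y):
    """Least-squares coefficients via the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(k)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(k)
    ]
    for col in range(k):  # Gauss-Jordan elimination with partial pivoting
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        scale = A[col][col]
        A[col] = [v / scale for v in A[col]]
        for r in range(k):
            if r != col:
                factor = A[r][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]


# Hypothetical (y, TG, T) observations, not data from the article.
rows = [
    (100.0, 0, 0), (102.0, 0, 0),   # control, prereform
    (110.0, 0, 1), (112.0, 0, 1),   # control, postreform
    (90.0, 1, 0), (94.0, 1, 0),     # treated, prereform
    (101.0, 1, 1), (105.0, 1, 1),   # treated, postreform
]
X = [[1.0, tg, t, tg * t] for _, tg, t in rows]  # columns: 1, TG, T, TG x T
y = [obs[0] for obs in rows]
b0, b1, b2, b3 = ols(X, y)
did = did_from_means(rows)
print(round(did, 6), round(b3, 6))
```

In this toy data the treated group rises by 11 and the control group by 10, so both routes give a DiD of 1, while β1 (the prereform gap) is -9 and β2 (the common time trend) is 10.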
Equation (1) operates with only two periods, one pre- and one postreform. However, reforms have an inherent temporal component. Reaction to shocks can be slow (O'Toole and Meier 1999, 514), and there may be a delay between the time at which a change is implemented and that at which employees and organizations perform differently (Oberfield 2014). To see how effects develop over time, we expand (1) with year dummy variables T2004i through T2014i (2003 is the reference year) and corresponding interaction terms to estimate changes in service costs over time for the span of data available. We also include a set of control variables that capture changes in factors relevant to service costs (other than size) that may change differently for the control and the treatment group.
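One way to write the expanded specification is sketched below. The notation is an assumption made for illustration, not taken from the article: the subscript t, the coefficients γ, δ, and θ, and the control vector x are introduced here. The sketch treats 2003 as the omitted reference year, in line with the later discussion of the year dummies, and includes interactions only for the postreform years, which are the DiD estimators listed for Table 5.

```latex
Y_{it} = \beta_0 + \beta_1\, TG_i
       + \sum_{s=2004}^{2014} \gamma_s\, T^{s}_{t}
       + \sum_{s=2007}^{2014} \delta_s \,\bigl(TG_i \times T^{s}_{t}\bigr)
       + \mathbf{x}_{it}'\boldsymbol{\theta} + \varepsilon_{it},
\qquad T^{s}_{t} = 1 \text{ if } t = s,\ 0 \text{ otherwise}
```

Under this reading, each δ_s is the estimated difference between treated and control municipalities in year s, net of the prereform gap β1, the common year effect γ_s, and the controls.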
Our dependent variables are different specifications of spending per capita. As noted by Holzer et al. (2009, 19) and Boyne (1995, 219–20), this measure is used throughout the literature, and from the taxpayer's perspective it is probably the most relevant concept to focus on. But it should be treated with caution. It does not measure effectiveness or efficiency (cf. Boyne 2002, 17–8). No valid general indicators of service quality or effects on formal policy objectives are available and, accordingly, our analysis cannot estimate size effects on quality or effectiveness. Furthermore, spending per capita does not measure efficiency, since population is a poor proxy for service outputs (Boyne 1995, 219). To facilitate comparison with previous literature, we nevertheless use spending-per-capita measures in our main analysis, but we also present a robustness analysis that breaks down spending per capita into its two components: quantity of output and unit costs. The latter is closer to measuring efficiency.
To be more precise, the dependent variable is net current expenditure per user in eight policy areas, measured in DKK in 2014 prices. These eight policy areas include all major services that the municipalities provided both before and after the 2007 reform. New functions transferred to the municipalities as part of the reform as well as some minor functions are excluded.Footnote 7 We include only current expenditure, since capital expenditure in Denmark is fully accounted in the year of investment (the cash flow principle). We use net expenditure in order to focus on the expenditures financed by the municipality itself. Hence conditional grants from the central government, user fees, and cross-municipal payments for services provided to other municipalities are subtracted. Table 3 presents the eight policy areas in more detail. For precise operationalizations, please refer to Appendix Table A1 in the online supplementary material.
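The construction of the dependent variable can be sketched as follows. This is a hedged illustration, not the authors' operationalization (for that, see Appendix Table A1): the function name, the argument names, and the use of a simple price-index ratio to convert to 2014 prices are assumptions, and the figures in the example are hypothetical.

```python
def net_current_expenditure_per_user(gross_current, conditional_grants,
                                     user_fees, cross_municipal_payments,
                                     users, price_index, base_index):
    """Net current expenditure per user, expressed in base-year prices.

    Net expenditure subtracts the three externally financed items named in
    the text. The deflation step (base_index / price_index) is an assumed,
    simplified way of converting to 2014 prices.
    """
    net = (gross_current - conditional_grants
           - user_fees - cross_municipal_payments)
    real_net = net * base_index / price_index
    return real_net / users


# Hypothetical municipality-year, amounts in DKK.
example = net_current_expenditure_per_user(
    gross_current=120_000_000,
    conditional_grants=10_000_000,
    user_fees=8_000_000,
    cross_municipal_payments=2_000_000,
    users=2_500,
    price_index=80.0,    # hypothetical index value for the observation year
    base_index=100.0,    # hypothetical index value for the base year
)
print(example)
```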
TABLE 3. Policy Areas

As is evident from Table 3, total expenditures included in the analysis amounted to 245.5 billion DKK in 2014. This constitutes 85 percent of all municipal expenditure that year.Footnote 8 Daycare, schools, elder care, and labor market activities (including income transfers) are the major expenditure areas, while roads, culture, and children with special needs constitute minor expenditure areas.
Since assignment of municipalities to treatment and control groups is not randomized, we include a set of social, economic, environmental and political control variables (Andrews et al. 2005) used in previous policy analyses of Danish municipalities (Blom-Hansen, Houlberg, and Serritzlew 2014; Serritzlew 2005; Økonomi- og Indenrigsministeriet 2012). First, we include two indicators for spending needs: dispersed settlements and socioeconomic expenditure needs. Dispersal of settlements is a potentially time-variant structural condition influencing costs. Socioeconomic expenditure needs is an index measure used in the national equalization scheme for municipalities, constructed from a number of objective indicators, such as the number of unemployed, the number of children of single parents, etc. We also control for location on an island; this is a time-invariant, but very important, determinant of spending needs. Second, an indicator of fiscal pressure (an estimate of expenditure needs relative to the tax base) controls for variations in economic potential among the municipalities. Finally, we control for two political factors that might influence local policy. Greater political fragmentation, as captured by the effective number of political parties, could increase government spending if government resources are seen as common property, subject to overuse by fragmented decision-makers (Velasco 2000). Meanwhile, a higher proportion of socialist seats in the council might predispose the municipality to spend more (Boyne 1996). The precise specifications of the control variables also appear in Appendix Table A1 in the online supplementary material.
RESULTS
Before turning to the DiD-based regression analyses, we present a first view of the data in Figure 1, which shows the development over time in expenditure per user in different functional areas for amalgamated and nonamalgamated municipalities. The first eight panels in the figure are the eight expenditure areas, while the last panel shows the sum of all expenditures (per inhabitant). These graphs present the raw data, without any control for factors other than amalgamations. Still, they illustrate findings that we later confirm.

FIGURE 1. Group Means on Dependent Variables, by Year
First, Figure 1 shows parallel trends for amalgamated and nonamalgamated municipalities before the reform. This is crucial for the DiD-analyses presented below. The different groups of units were evolving along similar paths. Second, if the amalgamations affected spending, we should expect to see different trends for amalgamated and nonamalgamated municipalities after the reform. In fact, we see no consistent differences. For example, in the school area, amalgamated municipalities spent less per pupil than nonamalgamated ones, both before and after the reform. But the trends over time appear to be the same for the two groups. Municipalities that were merged in 2007 neither converged with—nor diverged from—the unmerged units. Indeed, the 2007 reform seems to have left no mark.
This makes sense, given the distinction we noted between firm level and plant level characteristics—here, the size of the municipality and the size of schools within it. Even if larger schools were more efficient, amalgamating municipalities would not in itself decrease spending unless it somehow led to the amalgamation of schools. A similar pattern is found for spending per user on daycare and elder care. These policy areas are in many ways comparable to public schools in the Danish system. Daycare is provided mainly in public kindergartens, and elderly care in nursing homes and sheltered housing. Each municipality has several of these institutions to serve different geographical areas. Amalgamating a municipality does not in itself increase the size of the plant level institutions. Culture and total expenditure per inhabitant also follow this pattern.
In some areas, the time trends for the two groups of municipalities do diverge after 2007. For instance, in the road area, amalgamated and nonamalgamated municipalities had similar expenditure trends until 2007. But then a gap appears, and the amalgamated municipalities start to spend less than the nonamalgamated ones until 2012, before converging in 2013, but then diverging again in 2014. Danish municipalities are responsible for the maintenance of local roads and make decisions about quality levels. Some of the work is carried out by municipal maintenance divisions, some is contracted out to private providers (Blom-Hansen 2003). The same time pattern is also seen in the area of administration, where no subsequent convergence occurs.
The opposite pattern—in which amalgamated municipalities start to spend more than nonamalgamated ones after 2007—is found in two other areas: care for children with special needs (municipalities are responsible for preventive activities such as counseling and pedagogical support of families at risk, as well as for the forcible removal of children from their homes) and labor market policy (municipalities distribute income transfers such as sickness benefits, run job centers, and administer eligibility for social benefits).
Based on the graphs, it appears that in most functional areas the municipal amalgamations had no effect on spending per potential user. In other areas, mergers seem to have either reduced or increased spending relative to the control group. However, these conclusions are preliminary. One needs to check that the same results obtain holding constant other factors that might have influenced expenditure trends.
We therefore now turn to the results of the DiD analyses. Table 4 first compares the average prereform expenditure levels to the average postreform levels in, respectively, the amalgamated and nonamalgamated municipalities. This table contains only one prereform and one postreform observation for each municipality. The estimation method is OLS with clustered standard errors. The upper panel in Table 4 includes only a dummy indicating units that underwent amalgamation in 2007 (the treatment variable) and a time dummy indicating whether observations are made pre- or postreform. According to the DiD logic, the reform effect is identified by the interaction of the treatment variable and the post-reform time measure. The variable post-reform*amalgamated is therefore our DiD estimator. Since no controls are included in the upper panel in Table 4, it basically reproduces the graphs in Figure 1. It confirms that, in most areas, the amalgamations left no mark, but in some areas they seem to have induced either increases or reductions in spending.
TABLE 4. Two-period Estimates for Eight Policy Areas, With and Without Controls

Notes: Robust standard errors in parentheses (clustered at each municipality).
*** p<0.01, ** p<0.05, * p<0.10.
The lower panel in Table 4 introduces our control variables. None of them have effects in all analyses, but several are important for understanding expenditure developments in individual areas (note the jump in R-squared in all cases). However, the DiD estimator still indicates that in most areas, the amalgamations left no mark. But, again, in some areas they seem to have either increased or reduced spending. More precisely, in the areas of children with special needs, daycare, schools, and elder care there is no evidence that the amalgamation reform mattered. In the areas of roads and administration, the impression from the graphs in Figure 1 is confirmed: Amalgamations seem to have led to lower spending. In the area of labor market services (and to a limited extent culture) the opposite is the case. Summing across all policy areas, no amalgamation effect is found for total spending. Our results thus parallel those of Allers and Geertsema (2014), who also failed to find any systematic effects on spending of municipal amalgamations in the Netherlands.
Table 5 presents a more detailed analysis. While Table 4 compared average pre- and postreform expenditure levels, Table 5 includes all our yearly observations, that is, four prereform years and eight postreform years for all municipalities. This analysis thus makes it possible to identify the exact timing of a reform effect. Since a reform effect is not likely to materialize immediately after the reform, Table 5 can show whether it occurs with a time lag. In addition, we introduce one more methodological adjustment. Since our data are expenditure allocations from the same overall budget to different policy areas, they are not likely to be completely independent across policy areas. We therefore run the analyses as seemingly unrelated regressions (SUR) (Zellner 1962). Table 5 is therefore also a robustness check of the results in Table 4.
TABLE 5. Single Year Estimates in Eight Policy Areas, SUR Regressions (except model 9 which is an additive of the eight areas)

Notes: Standard errors in parentheses. For model 9, standard errors are robust (clustered at each municipality) and R-squared is the adjusted R².
Level of significance is marked by asterisks after the parameter estimate: *** p<0.01, ** p<0.05, * p<0.1.
Level of significance, Bonferroni-corrected for eight simultaneous tests: ††† p<0.01, †† p<0.05, † p<0.1.
Again, according to the DiD logic, reform effects are identified by interaction terms of the treatment variable (amalgamation) and post-treatment time measures. In Table 5 the DiD estimators are, consequently, Amalgamated*2007, Amalgamated*2008, Amalgamated*2009, Amalgamated*2010, Amalgamated*2011, Amalgamated*2012, Amalgamated*2013, and Amalgamated*2014.
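Because several interaction terms are tested at once, the table notes report Bonferroni-corrected significance levels alongside the uncorrected ones. The sketch below shows the generic correction, in which each p-value is multiplied by the number of simultaneous tests and capped at 1. The p-values are hypothetical and the helper names are illustrative; in practice `n_tests` would be set to the number of simultaneous tests stated in the table note.

```python
def bonferroni(p_values, n_tests=None):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(p_values) if n_tests is None else n_tests
    return [min(1.0, p * m) for p in p_values]


def marks(p, symbol):
    """Significance marks using the thresholds from the table notes."""
    if p < 0.01:
        return symbol * 3
    if p < 0.05:
        return symbol * 2
    if p < 0.10:
        return symbol
    return ""


# Hypothetical p-values for four interaction terms, not results from Table 5.
raw = [0.001, 0.004, 0.02, 0.3]
adjusted = bonferroni(raw)
for p_raw, p_adj in zip(raw, adjusted):
    print(p_raw, marks(p_raw, "*"), round(p_adj, 4), marks(p_adj, "†"))
```

The correction is conservative: in this example an estimate with an uncorrected p-value of 0.02 keeps only one mark after adjustment, since its adjusted p-value is 0.08.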
Table 5 confirms the results from Table 4. In the areas of daycare, schools, elder care, and children with special needs, there is no evidence that the amalgamation reform made a difference to spending. In the areas of roads and administration, mergers seem to have led to lower spending, while the opposite is the case in the area of labor market services. The suggestion in Table 4 of higher spending on culture is not reproduced. In contrast to Table 4, Table 5 allows the timing of these reform effects to be identified. In the road area, reform effects start in 2008 and grow over the following years, until the effect ceases to be statistically significant in 2013. In the administrative area, they do not materialize until 2009, but then also grow over the following years.Footnote 9 In the labor market area, permanent positive (spending-increasing) reform effects appear already in 2007.
To briefly comment on the remaining findings in Table 5, the year dummies estimate the general time trend, including changes in how functional responsibilities are assigned, for each year relative to the initial year, 2003. As is evident, these dummies are statistically significant in most analyses indicating that the municipalities experience common influences over time. This confirms the impression from the graphs in Figure 1, which showed parallel expenditure trends for the amalgamated and nonamalgamated municipalities. Turning to the control variables, municipalities on small islands face extraordinary diseconomies of scale in the provision of services for daycare, schools, roads, children with special needs, and administration. The variable dispersal of settlement shows that thinly populated municipalities spend more on elder care, roads, and administration, but less on all other areas. Fiscal pressure leads to lower spending in all policy areas—except the labor market, probably because fiscal pressure is partly caused by unemployment. Next, socioeconomic expenditure needs are cost drivers in all policy areas. Finally, expenditure in Danish municipalities may also reflect political factors. Both party fragmentation and party ideology measured as the share of socialist seats have nontrivial, but unsystematic, effects across policy areas.
The results reported in Figure 1 and Tables 4 and 5 constitute our core findings. However, before drawing final conclusions, we conduct three robustness checks. First, in Appendix Table A2 in the online supplementary material, we break down our dependent variable—spending per potential user—into its two components—the quantity of outputs supplied (per potential user) and the cost of each unit of output. Lower spending per user might indicate either a reduction in supply (fewer units) or an increase in efficiency (lower cost per unit), rendering the previous results a little ambiguous. In the six functional areas for which such breakdowns are possible,Footnote 10 we find no evidence of any change—either positive or negative—in the efficiency of provision after amalgamation.Footnote 11 As for the amount supplied, this is significantly higher for labor market activities and roads, but it is significantly lower for elder care. In the case of roads, this reflects a greater transfer of regional roads to the newly merged municipalities than to the control group municipalities, and not some municipal decision. It is hard to think of any general logic that would explain this pattern. For children with special needs, we observe an interesting change: There is some tendency for amalgamated municipalities to supply more units (that is, to forcibly remove more children) after the reform. Since we control for socioeconomic expenditure needs, this is unlikely to reflect disproportionate changes in the composition of citizens in amalgamated and nonamalgamated municipalities. 
This could be produced by a tendency for smaller units (i.e., later-amalgamated municipalities before the reform) to hesitate to forcibly remove children because the major, long-term expense of this intervention can have serious budgetary consequences for a small municipality.Footnote 12 This is offset by a statistically insignificant tendency for unit costs to be smaller, resulting in the net result that expenditure does not change. In sum, increased jurisdiction size seems to have had mixed effects, if any, on spending levels, and no discernible effect on efficiency.
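The decomposition behind this robustness check is an identity: spending per potential user equals the quantity supplied per potential user times the cost per unit. The sketch below illustrates it with hypothetical numbers (not figures from Appendix Table A2) chosen to mimic the pattern described for children with special needs, where more units at a lower unit cost leave spending unchanged. The function and variable names are illustrative.

```python
def decompose(expenditure, units, potential_users):
    """Split spending per potential user into quantity and unit cost.

    spending per user = (units / potential_users) * (expenditure / units)
    """
    quantity = units / potential_users   # units supplied per potential user
    unit_cost = expenditure / units      # cost of each unit of output
    return quantity, unit_cost, quantity * unit_cost


# Hypothetical before/after figures for one municipality.
before = decompose(expenditure=1_000_000, units=50, potential_users=1_000)
after = decompose(expenditure=1_000_000, units=55, potential_users=1_000)

print("before:", before)
print("after: ", after)
```

Here quantity rises and unit cost falls, but spending per potential user stays at about 1,000 in both periods, which is why a flat spending series alone cannot distinguish changes in supply from changes in efficiency.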
Second, in Appendix Table A3 in the online supplementary material, we rerun the analysis for subgroups of municipalities of different (prereform) sizes. Although most studies find that the evidence on economies of scale in local government is inconclusive, some find a tendency for very small municipalities to be inefficient (e.g., Bodkin and Conklin 1971; Breunig and Rocaboy 2008; Solé-Ollé and Bosch 2005). We therefore investigate whether small municipalities gain more from amalgamation than somewhat larger ones. Appendix Table A3 reports results rerunning Model 9 of Table 5 for just those amalgamated municipalities whose prereform size averaged, respectively, less than 10,000 citizens, less than 12,000 citizens, and less than 15,000 citizens. In each case, the results were not systematically different from those of our main analysis (for amalgamated municipalities with prereform average size of up to 20,000 citizens).
Third, in Appendix Table A4 in the online supplementary material, we report results for two groups of municipalities, based on the similarity of their prereform spending levels. The first group consists of pairs of amalgamating municipalities that had relatively similar spending levels, while the second contains pairs with more different prereform spending levels. The aim is to see if the results could be driven by a tendency for municipalities with similar spending to merge. For pairs of municipalities with very different spending levels, one might imagine that spending in the low-spending municipality would converge upward to that of its high-spending counterpart. However, we find that results are very similar in the two groups.
DISCUSSION AND CONCLUSION
Since the 1950s, a wave of municipal amalgamations, motivated largely by a belief in readily attainable economies of scale, has expanded the jurisdictions of local governments across the developed world. Exploiting the exogenous imposition of a reform to amalgamate all Danish municipalities with populations under 20,000 inhabitants, and using a difference-in-differences design to compare these merged municipalities with other relatively large ones untouched by the reform, we provide stronger evidence than previously available about the effects of jurisdiction size on spending.
We show that increasing local governments’ jurisdiction size had no systematic consequences for spending. In one or two functional areas, amalgamation led to lower spending; in one, it led to higher spending; and in most areas, spending was unaffected. From the local taxpayers’ perspective, total spending per capita is probably the most salient variable. But spending per capita can also be usefully decomposed into two component parts: the number of units supplied (per capita) and the cost per unit. Although, like the rest of the literature on this topic, we lack compelling, across-the-board indicators of service quality, cost per unit can serve as a reasonable proxy of efficiency. In none of the service categories for which we could estimate cost per unit did larger jurisdiction size result in either significantly higher or lower efficiency, measured in this way.
Our design does not allow us to see exactly why this is so. The lack of an effect certainly does not mean that fixed costs are irrelevant to production in the eight policy areas studied or that no economies of scale exist. On the contrary, previous literature suggests that fixed costs can be considerable (Boyne 1995; Hirsch 1959; Sawyer 1991). A more plausible interpretation is that the relevant kind of fixed costs are difficult to reduce by municipal amalgamation. Some of the most expensive public services are produced at units within local government jurisdictions such as schools, kindergartens, and nursing homes. Increasing the scale of local governments does not automatically increase the scale of such service providers (Boyne 1995; Sawyer 1991). As in private production, firm size does not equate to plant size. Besides, multipurpose governments can almost never be optimally sized for all the services they provide, since different services have different production functions and externalities (Olson 1986; Tullock 1969). Any systematic effect in one area may be offset by countervailing effects in another (Treisman 2007). These empirical findings are consistent with the weakness of the theoretical rationale for consistent scale effects.
We have abstracted here from the direct costs of amalgamation reforms. Various evidence suggests these can be large, not just because of the transition costs, but also, and probably more importantly, because municipalities about to merge often indulge in a last-minute flurry of spending (Blom-Hansen 2010; Hansen 2014; Hinnerich 2009; Jonsson 1983; Jordahl and Liang 2010). If mergers have no general positive effects, the costs of implementing them should give pause to reformers. We conclude that, if Denmark's experience is typical, the global amalgamation wave will probably not result in real savings. This has policy implications. Prospective reformers of the architecture of government should not build plans to consolidate local government upon an expectation that larger size will lead to cost reductions.
This result may also have implications for how the question of optimal size should be investigated empirically. If jurisdiction size has no unequivocal effect on costs for multipurpose units, it makes little sense to look for a unique, context-free answer. The optimal scale for a political entity depends on what services it provides. Consider, for example, Australia, where local government is only “engaged in the most minimal property-oriented services (primarily ‘roads and rubbish’)” (Boadway and Shah 2009, 276). It may well be that the economically optimal size, in such a case, is small, perhaps 5,000 inhabitants (the Australian municipalities are, in fact, larger than that). Or imagine another country in which local governments are responsible for elementary schools, elderly care, and child care. How large municipalities are is not very relevant to the costs of providing these goods, since what matters most is the size of schools, retirement homes, and daycare centers. Of course, this does not mean that one should ignore scale effects. Rather, it suggests the need to direct attention to questions that are likely to have answers, such as the optimal size of a particular service at the plant level. The accumulation of knowledge on such questions promises both academic and policy payoffs.
Drawing lessons from one country's experience requires care. The quasi-experimental nature of the Danish reform offers unusual opportunities to identify causal relationships, but the results cannot be generalized without caution. First, the world of municipalities is diverse. Some countries (for example France, Austria, and Switzerland) have very small municipalities, well below the smallest included in the data analyzed here. Although we expect that a similar logic applies to them too, we cannot rule out that some municipalities are so small that amalgamation would in fact produce economies of scale across the board. Since the variance in the pre- and postreform size of Danish municipalities is limited—with only a few below 5,000 or above 100,000 citizens—it will require further research to see whether the results extend to systems with much smaller or larger units. Second, Danish municipalities are—as in most countries—multipurpose service providers. However, in some countries—especially the USA—single-purpose entities are also important. In such cases, the difficulty of aggregating optimal scales for multiple services disappears, although one is still left with the disconnect between firm and plant level costs (e.g., those of the school and those of the school board).
Further research will also be needed to pin down why economies of scale failed to materialize, in this case and in others. If one key factor is—as we conjectured—the disconnect between firm size and plant size effects, then we might expect to see consistent divergences in the effect of amalgamations on plant level costs (for instance, of schools and hospitals) and firm level costs (for instance, of administration in city hall). These will not necessarily correlate, and, of course, enlarging municipal jurisdictions will not make the schools and hospitals within them either bigger or smaller. At the same time, analyses of this question must take seriously the endogenous way in which local government jurisdictions evolve. If future, well-designed studies of additional countries also fail to find clear evidence for scale effects, this will deepen doubts about the wisdom of the global movement for municipal amalgamation.
SUPPLEMENTARY MATERIAL
To view supplementary material for this article, please visit http://dx.doi.org/10.1017/S0003055416000320.