Introduction
In recent years, there has been a substantial increase in the application of spatial econometrics to political science problems. The strength of these models comes from their ability to demonstrate how outcomes in observations are linked, either through spillovers, learning, externalities, emulation, coordination, competition, or any number of diffusion processes. However, with these models comes a cost in the form of increased complexity of interpretation. Interpreting the estimated causal effects from spatial econometric models requires carefully incorporating the impact of the covariates, the strength of the spatial dependence, and the spatial profile of each observation.
The spatial autoregressive (SAR) model is the most common spatial model used in political science. This model allows scholars to examine outcomes that are theorized to be co-dependent across space (or unit boundaries). The most common practice—what we call the coefficient interpretation approach—is to interpret the sign of the spatial dependence and then provide only a cursory interpretation of the explanatory variables with little focus on quantities of interest.Footnote 1 This approach is problematic because the coefficients misrepresent the actual direct effects, thereby making it impossible to determine the total effects (TEs) of a covariate solely from examining the coefficients. Since every observation likely has a different configuration of spatial relationships, the SAR coefficients cloud in-sample variation in effect sizes.Footnote 2 Our survey of the literature suggests that these inferential errors are likely widespread, since the majority of scholars interpret the coefficients from SAR models as if they are from a non-spatial model.
We extend a general approach to interpreting spatial econometric models that was developed in the regional science literature (LeSage and Pace 2009). The general approach uses information on the effects of covariates, the strength of dependence, and the connections among observations to calculate a partial derivatives matrix. This matrix contains the TEs of a change in a covariate on the observation itself (i.e., direct effects) and the other observations (i.e., indirect effects). We also introduce summary measures based on this matrix that provide the average total, direct and indirect effects. These values can be partitioned into higher orders of connections (or neighbors) such that it can then be determined to what extent the effect of a change in a covariate for one observation spills over to observations beyond its neighbors.
The general approach offers two significant advantages over the coefficient interpretation approach. First, the partial derivatives matrix improves model interpretation overall by increasing the accuracy and breadth of the inferences derived from the models. Second, the approach offers a way of comparing effect sizes across different specifications of spatial models. Adopting this approach produces clearer inferences, as scholars can directly compare how assumptions that underpin different specifications result in different effect sizes and conclusions about the data-generating process. We then demonstrate these advantages by estimating a selection of simple spatial models of democratic diffusion, and by illustrating a variety of techniques for depicting meaningful quantities of interest.
Spatial econometric models
The growth in the use of spatial econometric models in political science is partly a consequence of improvements in the techniques used to estimate and interpret increasingly complex specifications (e.g., Beck et al., 2006; Franzese and Hays, 2006; Ward and Gleditsch, 2008; Darmofal, 2015). Though the theories used to justify spatial models are varied (see e.g., Shipan and Volden, 2008, for policy diffusion), these models typically focus on three types of spatial processes (Cook et al., 2015): clustering in the outcomes (i.e., when y_i influences y_j and vice versa), clustering in the unobservables (i.e., when ε_i is correlated with ε_j), and clustering in the observables (i.e., when x_i influences y_j). Most political science research has focused on two particular models, the SAR and the spatial-X (SLX) models.
The SAR model features clustering in the outcomes. In this specification, the outcomes are correlated (to the degree determined by the global spatial autocorrelation parameter, ρ) based on the connections specified in the N × N weights matrix, W, with each element w_ij representing the degree of connectivity between observations i and j. For a given cross-sectional observation of interest, y_i, the SAR model takes the form of:
This can also be summarized in matrix form as:
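For reference, a SAR specification consistent with the definitions above can be written in the standard notation of the spatial econometrics literature. This is a sketch under stated assumptions: treating x_i as a row vector of covariates and folding any intercept into X are choices made here for illustration.

```latex
\begin{align}
y_i &= \rho \sum_{j=1}^{N} w_{ij}\, y_j + \mathbf{x}_i \boldsymbol{\beta} + \varepsilon_i \\
\mathbf{y} &= \rho \mathbf{W}\mathbf{y} + \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon}
\end{align}
```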
By construction, a shock in the outcome of one observation i affects the outcome in neighboring observations j, which in turn influences j's neighbors, including i (these are known as feedback effects). In the case of the SAR model, each element β of the coefficient vector β represents the direct effect of x_i on y_i (without considering feedback effects that arise at higher orders of neighbors).
The SLX model features clustering in the observables. This model is specified for a given cross-sectional observation, y_i, as:
and in matrix form as:
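An SLX specification consistent with the surrounding description can be written analogously. As with the SAR sketch, the row-vector treatment of x_i and the absence of a separate intercept term are assumptions made for illustration.

```latex
\begin{align}
y_i &= \mathbf{x}_i \boldsymbol{\beta} + \sum_{j=1}^{N} w_{ij}\, \mathbf{x}_j \boldsymbol{\theta} + \varepsilon_i \\
\mathbf{y} &= \mathbf{X}\boldsymbol{\beta} + \mathbf{W}\mathbf{X}\boldsymbol{\theta} + \boldsymbol{\varepsilon}
\end{align}
```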
In this case, changing a covariate in one observation (x_i) directly influences the outcomes in neighboring observations (y_j). In contrast to the process of clustering in the outcomes, the outcomes are not endogenous, as y_i cannot influence y_j. As a result, the estimation is much simpler and can be done via ordinary least squares instead of inherently more complicated techniques such as maximum likelihood or generalized two-stage least squares (as is the case with SAR). Further, SLX variables can be incorporated into settings with categorical dependent variables with ordinal logit (Fortunato et al., 2018), count models, and more. The primary difference between Equations 2 and 4 is the absence of the spatial autocorrelation parameter ρ, and the addition of a θ depicting the effect of x_i on y_¬i. The SLX is a more flexible specification, as the weights matrices can vary across covariates (including no weights matrix at all) with each one having a different estimate of the indirect effect (i.e., θ).
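The point that an SLX model can be estimated by ordinary least squares can be illustrated with simulated data. The sketch below is not drawn from the article: the ring-shaped weights matrix, the sample size, and the "true" values β = 0.5 and θ = 0.3 are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 400

# Hypothetical ring lattice: each unit neighbors the units on either side.
W = np.zeros((n, n))
idx = np.arange(n)
W[idx, (idx - 1) % n] = 1.0
W[idx, (idx + 1) % n] = 1.0

# Simulated data with assumed true values beta = 0.5 and theta = 0.3.
x = rng.normal(size=n)
y = 1.0 + 0.5 * x + 0.3 * (W @ x) + rng.normal(scale=0.5, size=n)

# OLS on [1, x, Wx]: no spatial lag of y appears on the right-hand side,
# so no special estimator is needed.
design = np.column_stack([np.ones(n), x, W @ x])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
intercept_hat, beta_hat, theta_hat = coef
print(f"beta_hat={beta_hat:.3f}, theta_hat={theta_hat:.3f}")
```

The spatially lagged covariate Wx enters the design matrix like any other regressor, which is what makes the SLX easy to carry over to other estimators.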
While there are a variety of models that include more than one spatial parameter (see Halleck Vega and Elhorst (2015) for a review), we focus on the SAR and SLX models for three reasons. First, as we outline below, these models are the most common in political science. Second, it is easy to extend the insights of our general approach to more complicated models. Third, other spatial econometric models (such as the spatial error model, or SEM) assume that the errors (though spatially correlated) are mean 0 in expectation, which implies that the effects of the covariates do not differ across space. For those models, the calculation of quantities of interest differs only slightly from a non-spatial OLS model.
The widely-used coefficient interpretation approach to spatial econometric models
We surveyed the use of spatial econometric models in political science journals to get a sense of the various strategies scholars use to derive inferences from spatial econometric models. We identified 155 published works (as of May 2015) on Web of Science that cited one or both of two seminal spatial econometrics articles aimed at a political science audience: Beck et al. (2006) and Franzese and Hays (2007).Footnote 3 Of those works, 94 estimated some version of a spatial econometric model. That is, 60 percent of these papers reported results from a SAR model as their main test of their spatial theory while 33 percent reported results from an SLX model.
Focusing on interpretations of SAR models, we identified two troubling patterns concerning interpretation of quantities of interest. First, our survey indicates that researchers have mostly ignored the simultaneous, endogenous nature of the SAR specification when they have interpreted the results from their models. This has included interpreting estimates of β as though they are regular OLS coefficients instead of direct effects, failure to acknowledge higher-order effects, and failure (with a few notable exceptions) to interpret the spatial effects beyond an assessment of the direction and statistical significance of ρ. More specifically, only 42 percent of papers with SAR models interpret the coefficients beyond saying that they represent “pre-spatial” effects and only 15 percent provide graphical depictions of the diffusion process.
Second, our survey shows that scholars employing the SAR model rarely compute the proper post-spatial marginal effects and associated estimates of uncertainty. This means that they are interpreting the results from their SAR models incorrectly. While interpreting the ρ gives a sense of the estimated direction of spatial dependence, it is impossible to characterize how the effects of x on y operate through the spatial multiplier without generating the appropriate quantities of interest (with uncertainty). Without the quantities of interest, we are left with a statistically significant spatial autocorrelation coefficient but no understanding as to whether the spatial effects are substantively significant. None of the published works we surveyed divided the effects of covariates into direct and indirect effects, or showed the partition of those effects at higher orders (including feedback effects). In short, no studies in our survey have used the partial derivatives matrix to explore the catalog of covariate effects.
The coefficient interpretation approach to deriving inferences from spatial econometric models in political science is to interpret the coefficients themselves, in isolation. Typically this involves estimating a spatial model, interpreting the coefficient of interest by noting its statistical significance and stating whether the ρ value is consistent with expectations of positive or negative spatial dependence. We believe that the practice of interpreting the coefficients in a SAR model comes from scholars' experiences with non-spatial OLS models. In non-spatial models, counterfactual shocks to the covariates—the effects of a change in x on y—are identical and consistent across observations. Since there is no spatial parameter (such as ρ) estimated in non-spatial models, the effects are entirely homogeneous and there are no indirect effects.
We argue that this practice of interpreting the coefficients in spatial econometric models cannot provide a complete and accurate picture of the underlying data-generating process. To understand why the coefficient interpretation approach is inappropriate, consider what the coefficients in a SAR model actually mean. The coefficient that receives the most attention in these spatial models is ρ, the global spatial autocorrelation parameter. The coefficients for the covariates, or βs, reflect the zero-order effect of a change in x_i on y_i. This is the zero-order direct effect, and it is constant across observations, regardless of the spatial profile of the observations.
As we will demonstrate, the coefficients in a SAR model cannot provide accurate inferences about the effects of changes to the covariates because the effect size depends on the coefficients themselves (i.e., β), the strength of the spatial dependence (ρ), and each observation's spatial profile (as specified in W). The result is that scholars often make sweeping inferences about the effects of covariates on observations without actually calculating the effects for those observations. This would not be problematic if the effect size were identical across observations.Footnote 4 However, since the effect size depends on each observation's unique configuration of neighbors, the TE of x varies across observations. The coefficients (β and ρ) cannot convey x's total effect and they neglect the considerable variation in effect sizes across observations.
The general approach to interpreting spatial econometric models
In this section we introduce the general approach to interpreting spatial econometric models. This approach builds on the works discussed above and provides an accessible roadmap for interpreting spatial effects. In comparison to the most common practices in political science, the general approach provides more informative inferences that reflect the spatial processes at work.
Since the SAR model (Equation 2) is endogenous, it is more instructive to examine its reduced form equation:
Consider the simple example of W as a 3 × 3 contiguity matrix where observations 1 and 2 and 2 and 3 are connected but observations 1 and 3 are not (as shown by the 0 values in the lower-left and upper-right elements):
If we want to infer the effect of a single x on y, then one approach is to examine the partial derivatives matrix (LeSage and Pace, 2009):
where the resulting N × N matrix contains both the effects of x_i on y_i, or direct impacts (along the main diagonal), and the effects of x_i on y_j, or indirect impacts (the off-diagonal elements).
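A minimal numerical sketch of this calculation, using the 3 × 3 contiguity matrix described above and assuming the partial derivatives matrix takes the standard SAR form (I_N − ρW)^{-1}β. The values β = 0.5 and ρ = 0.1 are hypothetical and chosen only for illustration.

```python
import numpy as np

# 3 x 3 contiguity matrix from the text: 1-2 and 2-3 are neighbors, 1-3 are not.
W = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])

beta, rho = 0.5, 0.1  # hypothetical values, for illustration only

# Partial derivatives matrix: (I - rho * W)^{-1} * beta.
effects = np.linalg.inv(np.eye(3) - rho * W) * beta

direct = np.diag(effects)              # main diagonal: direct impacts
indirect = effects - np.diag(direct)   # off-diagonal: indirect impacts
print(np.round(effects, 4))
```

Even though observations 1 and 3 are not neighbors, the corresponding off-diagonal entries are non-zero, because the effect travels through observation 2.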
It should be noted that the presence of the spatial multiplier means that changes to x_i for one observation influence not only neighbors but all observations with at least one neighbor (i.e., non-isolate observations) through higher-order effects. Some of these effects feed back and influence the originator of the impulse, observation i. This is illustrated by looking at the effects of β through the infinite series expansion of the spatial multiplier (I_N − ρW)^{-1}:
By design, the SAR model incorporates higher-order spatial effects. The partitioning of these effects (LeSage and Pace 2009, 40–41) becomes clear when we consider each component of the infinite series expansion (when multiplied by β) as representing the n-order effects (both direct and indirect). Table 1 presents the partitioned effects for the SAR model.
For example, the 0-order effects arise from I_N β and only feature direct effects (due to the diagonal of 1s in the identity matrix):
The 1-order effects arise from ρWβ, and contain only indirect effects (since an observation is not a neighbor to itself, there are only 0s along the diagonal of W, by construction):
Starting at the second order (as determined by W^2), there is the possibility of feedback effects, as the effects go from observation i to j and back to observation i (as shown along the main diagonal).
In this matrix, the non-zero values along the main diagonal represent those effects that feed back to the originator of the impulse. There are feedback effects on all three observations which are twice as strong in the case of observation 2 because of its connectivity with more observations (1 and 3). Higher-order effects beyond the second order are calculated similarly with the formulas in Table 1.
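The order-by-order partition can be sketched numerically for the same 3 × 3 contiguity matrix, assuming each order q contributes ρ^q W^q β. The values of β and ρ are again hypothetical.

```python
import numpy as np

W = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
beta, rho = 0.5, 0.1  # hypothetical values

def order_effects(W, rho, beta, order):
    """Effects contributed at a single order: rho^q * W^q * beta."""
    return (rho ** order) * np.linalg.matrix_power(W, order) * beta

for q in range(4):
    E = order_effects(W, rho, beta, q)
    print(f"order {q}: diagonal = {np.round(np.diag(E), 4)}")

# Summing many orders approximates the full partial derivatives matrix.
approx = sum(order_effects(W, rho, beta, q) for q in range(50))
exact = np.linalg.inv(np.eye(3) - rho * W) * beta
```

At the second order the diagonal is (0.005, 0.01, 0.005): feedback reaches all three observations and is twice as large for observation 2, which has two neighbors.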
The partitioning of the higher-order effects in this manner can illuminate interesting features of the spatial process. First, the indirect effect of changes in x_i on higher-order neighbors can be calculated so that one can observe the pattern with which those effects decline. Second, this allows for an assessment of the relative magnitude of direct to indirect effects at each order. Finally, partitioning the effects reveals how much of the TEs result from direct and feedback effects.
If there is a strong theoretical prior, it is appropriate to replace the higher-order term (i.e., W^2) with a second-order contiguity matrix (i.e., neighbors of neighbors). For example, in the following second-order contiguity matrix there are only zero values along the diagonal, which precludes the possibility of feedback effects. In this case, any higher-order effects that arise come solely from (true second-order) neighbors of neighbors, with the restriction that effects cannot feed back to the originator. In the application below we provide an example of this type of W.
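One way to build such a matrix from a first-order contiguity matrix is sketched below. The construction rule (pairs reachable in two steps that are neither first-order neighbors nor the unit itself) is an assumption about what "true second-order" neighbors means here.

```python
import numpy as np

# First-order contiguity for three observations: 1-2 and 2-3 are neighbors.
W = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])

# (W @ W)[i, j] counts the two-step routes from i to j.
two_step = W @ W

# Keep pairs linked in two steps that are not already first-order neighbors,
# then zero the diagonal so that no unit is its own second-order neighbor.
second_order = ((two_step > 0) & (W == 0)).astype(int)
np.fill_diagonal(second_order, 0)
print(second_order)
```

Here only observations 1 and 3 are second-order neighbors, and the zero diagonal rules out feedback, unlike W^2, whose diagonal is (1, 2, 1).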
The partial derivatives matrix (shown in Equation 7) produces an N × N matrix containing the direct and indirect effects for each observation. Since we calculate a different partial derivatives matrix for each covariate in the SAR model, the amount of information can quickly become overwhelming. One strategy for dealing with this proliferation of effects is to focus the inferences on a specific subset of observations and demonstrate the spatial processes for those observations. Another strategy is to calculate summary statistics of the effects, or average total, direct and indirect effects (LeSage and Pace, 2009). The average direct effect is the average change in y_i given a 1-unit increase in x_i, and is calculated with the following:
The average indirect effect is the average change in y_¬i given a 1-unit increase in x_i, and is calculated with the following:
Essentially, this sums all elements of the partial derivatives matrix and then subtracts the direct effects (the trace). The average TE is the sum of these two summary statistics. These summary statistics are informative because they reveal the actual effect of a covariate, and can be used to make inferences about the percentage of the TE that arises due to spatial dependence.
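These summary measures can be written compactly as follows. The notation is an assumption: S denotes the partial derivatives matrix for a single covariate, ι is an N × 1 vector of ones, and the labels ADE, AIE and ATE are shorthand introduced here for the average direct, indirect and total effects.

```latex
\begin{align}
\mathbf{S} &= (\mathbf{I}_N - \rho \mathbf{W})^{-1} \beta \\
\text{ADE} &= \frac{1}{N}\,\operatorname{tr}(\mathbf{S}) \\
\text{AIE} &= \frac{1}{N}\left(\boldsymbol{\iota}^{\prime}\mathbf{S}\,\boldsymbol{\iota} - \operatorname{tr}(\mathbf{S})\right) \\
\text{ATE} &= \text{ADE} + \text{AIE} = \frac{1}{N}\,\boldsymbol{\iota}^{\prime}\mathbf{S}\,\boldsymbol{\iota}
\end{align}
```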
Another possibility is to explore how observations respond to a shock in the outcome of a neighboring observation. Ward and Gleditsch (2008) demonstrate how shocks in the outcome that are not attributable to the systematic component in observation i influence j.Footnote 5 We use the following formula:
If Δy is N × 1, the resulting matrix is also N × 1, and rows are the N responses to the change depicted in Δy. If Δy is a scalar, the resulting matrix is N × N, and rows are the N responses to the change depicted in Δy for the N observations (columns). For example, assume that Δy is an N × 1 column vector of 0s for all observations except for the 10th observation, which has a value of 1. The resulting N × 1 column vector shows how a 1-unit increase in y_10 influences all N observations. If Δy is a scalar with a value of 1, the resulting N × N matrix contains the previously mentioned information, for all N observations. In other words, the second row, fourth column element of the N × N matrix depicts how y_2 changes in response to a 1-unit increase in y_4. These types of effects do not occur in the SLX model because other observations can only be influenced by their neighbors through covariates (not the outcome).
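The vector and scalar cases can be sketched as follows, assuming the response to an outcome shock takes the form (I_N − ρW)^{-1}Δy. That functional form, the four-observation chain, and ρ = 0.1 are assumptions made for illustration.

```python
import numpy as np

n = 4
# Hypothetical chain of four observations (1-2, 2-3, 3-4).
W = np.array([[0.0, 1.0, 0.0, 0.0],
              [1.0, 0.0, 1.0, 0.0],
              [0.0, 1.0, 0.0, 1.0],
              [0.0, 0.0, 1.0, 0.0]])
rho = 0.1
multiplier = np.linalg.inv(np.eye(n) - rho * W)

# Vector case: a 1-unit shock to the outcome of observation 2 only.
delta_y = np.zeros(n)
delta_y[1] = 1.0
response_vec = multiplier @ delta_y   # N responses to that single shock

# Scalar case: a 1-unit shock applied to each observation in turn.
response_all = multiplier * 1.0       # N x N; column j holds responses to a shock in y_j
print(np.round(response_all, 4))
```

The vector result is simply the second column of the N × N result, so the scalar case collects every single-observation shock in one matrix.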
Now consider the second most popular specification for spatial econometric models, the SLX (shown in Equation 16). We calculate the effect of x on y with the following:
In contrast to the infinite series expansion found in the SAR spatial multiplier, the indirect effect (θ) is only represented at the first order because the diagonal elements of W are 0 by construction (since an observation is not a neighbor to itself). This has important consequences for the overall impact of x on y, since there are no indirect effects beyond the first order. In other words, there is an impact of x_i on i's first-order neighbors (j), but that is where the effect ends; the effect does not continue to second-order neighbors (k), and there is no feedback to observation i.
Fortunately, by altering the model specification in some minor ways, scholars can incorporate higher-order effects into their SLX models of spatial processes (Gibbons and Overman, 2012). For instance, if there is a theoretical justification for expecting higher-order effects up to the third order, then one would estimate y = Xβ + θ_1 WX + θ_2 W^2 X + θ_3 W^3 X + ε with the following formula for the TEs for a single independent variable, x:
with θ_1 representing the first-order indirect effect of x, θ_2 representing the second-order indirect effect of x, and so on. Keep in mind that the restrictions dealing with the order of effects can be tested, so one strategy is to start with an n-order specification and then use hypothesis tests to pare down the model. One word of caution is that if there are few neighboring observations in the weights matrix, then higher-order representations of the W can induce multicollinearity and make it harder to reject the null hypotheses. Dealing with this problem is not unlike that of multicollinearity in a traditional OLS framework. If possible, we recommend increasing the sample size. Given that we are often dealing with limited data, however, other potential options include carefully specifying the model based on theoretical expectations. If, for example, one expects that the higher-order parameters equal zero (i.e., θ_n = 0), then they can simply be left out of the specification altogether. In either case it is important to know this problem can arise and to proceed with caution in the presence of correlated predictors, whether explicitly spatial parameters or not.
We present the partitioned effects for the third-order spatial-X specification (Equation 17) in Table 2. The partial derivatives matrix at each order of W provides the direct, feedback and indirect effects (if available). The direct effects (the effect of x_i on y_i) only occur at order 0 (represented by β). Indirect effects occur at the first order and beyond (found along the off-diagonal), eventually generating nth-order feedback effects (found along the main diagonal of the partial derivatives matrix). We should also note that in this example, by construction, there are no fourth-order effects (see Equation 17).
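A sketch of the third-order SLX partition, assuming the effects at order q ≥ 1 are θ_q W^q and the order-0 effects are I_N β. The four-observation chain and all coefficient values are hypothetical.

```python
import numpy as np

# Hypothetical chain of four observations (1-2, 2-3, 3-4).
W = np.array([[0.0, 1.0, 0.0, 0.0],
              [1.0, 0.0, 1.0, 0.0],
              [0.0, 1.0, 0.0, 1.0],
              [0.0, 0.0, 1.0, 0.0]])

beta = 0.5                    # hypothetical direct coefficient
thetas = [0.20, 0.10, 0.05]   # hypothetical theta_1, theta_2, theta_3

# Effects by order: order 0 is I * beta, order q >= 1 is theta_q * W^q.
by_order = [np.eye(4) * beta]
for q, theta in enumerate(thetas, start=1):
    by_order.append(theta * np.linalg.matrix_power(W, q))

total = sum(by_order)

for q, E in enumerate(by_order):
    on_diag = np.diag(E)               # direct (order 0) or feedback (order >= 1)
    off_diag = E - np.diag(on_diag)    # indirect effects
    print(q, np.round(on_diag, 3), np.round(off_diag.sum(axis=1), 3))
```

In this chain, feedback appears only at the second order, and observation 1 affects observation 4 only through the third-order term. Dropping θ_2 and θ_3 recovers the first-order SLX, in which effects stop at immediate neighbors.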
It is important to note that these quantities of interest are all based on the estimates from the models, and are themselves uncertain (King et al., 2000). Since each of these quantities would require different formulas for the analytically derived standard errors, an easier solution is to use simulation-based methods. For example, this would entail drawing 1000 simulations from the multivariate normal distribution based on the parameters and variance-covariance matrix from the estimated model. We then generate quantities of interest based on the estimates of the explanatory variables and spatial dependence parameters, and use the percentile method to generate confidence intervals (Carsey and Harden, 2014).
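The simulation procedure can be sketched as follows for one quantity of interest, the average direct effect in a SAR model. The point estimates, the variance-covariance matrix, and the weights matrix are hypothetical placeholders, not estimates from any model in the article.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical chain of four observations (1-2, 2-3, 3-4).
W = np.array([[0.0, 1.0, 0.0, 0.0],
              [1.0, 0.0, 1.0, 0.0],
              [0.0, 1.0, 0.0, 1.0],
              [0.0, 0.0, 1.0, 0.0]])

# Hypothetical point estimates (beta, rho) and variance-covariance matrix.
estimates = np.array([0.5, 0.1])
vcov = np.array([[0.0100, 0.0000],
                 [0.0000, 0.0004]])

def average_direct_effect(beta, rho, W):
    """Mean of the diagonal of (I - rho * W)^{-1} * beta."""
    n = W.shape[0]
    S = np.linalg.inv(np.eye(n) - rho * W) * beta
    return np.trace(S) / n

# Draw 1000 parameter vectors and recompute the quantity for each draw.
draws = rng.multivariate_normal(estimates, vcov, size=1000)
sims = np.array([average_direct_effect(b, r, W) for b, r in draws])

# Percentile method: the 5th and 95th percentiles bound a 90% interval.
lower, upper = np.percentile(sims, [5, 95])
point = average_direct_effect(*estimates, W)
print(f"{point:.3f} [{lower:.3f}, {upper:.3f}]")
```

The same loop works for any other quantity (indirect effects, a single cell of the matrix, effects at a given order) by swapping the function applied to each draw.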
The partial derivatives matrix therefore provides a simple way to compare the effects across spatial econometric model specifications. Assume, for instance, that we want to determine the effect of x_i on y_i (or the direct effect). In an SLX, this is β. As outlined above, for the SAR model the β only depicts the zero-order direct effect, and it ignores the feedback that occurs through other observations at higher orders. Likewise, determining the indirect effect of x_i on y_j from the coefficients is impossible in an SLX (without calculating θW) and SAR (without calculating the partial derivatives matrix).
Advantages of the general approach
In addition to the advantage of comparing effect sizes across multiple spatial model specifications, the general approach allows for a more accurate interpretation of the effects than the coefficient interpretation approach. In this section we provide mathematical demonstrations of how scholars use the two approaches to derive inferences about effect sizes. The coefficient interpretation approach produces two inferential errors in the process of interpreting SAR models: first, it is impossible to determine the TEs, and second, the β misstates the true direct effect.
To explore the advantages of the general approach, consider the following simple illustration. First we set β = 0.5 and ρ = 0.1, and create a 4 × 4 weights matrix depicting simple contiguity (W_1). With the use of the general approach and Equation 7, we calculate the following partial derivatives matrix:
The largest weakness of the coefficient interpretation approach is that it is impossible to determine the TE of x on y (depicted in the column vector on the right side of Equation 18) because we cannot assess the indirect effect (the off-diagonal elements of the partial derivatives matrix), or its accompanying feedback effect (the diagonal elements of the partial derivatives matrix) without the weights matrix.
On average, the only thing that can be inferred from the coefficients is the direct, zero-order effect of x_i on y_i, represented by the β (in this illustration, β = 0.5). Even this, though, actually misstates the true effect of x_i on y_i because it ignores the feedback effects. The feedback effect, or the effect of i on itself through its neighbors, is the difference between the coefficient (β = 0.5) and the direct effect (the diagonal of the partial derivatives matrix). The feedback effects make up a small portion of the overall effects, 0.005 for the first and fourth observations, and 0.01 for the second and third. If there is positive spatial dependence (i.e., ρ > 0), the β understates the direct effect, making it closer to zero than it should be; if there is negative spatial dependence (i.e., ρ < 0), the β overstates the true effect.Footnote 6 In our example we can summarize the impact of x by noting the average TE (0.590), the average direct effect (0.508), and the average indirect effect (0.082).
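The figures in this illustration can be checked numerically. The sketch assumes that W_1 is an unstandardized chain in which observations 1-2, 2-3 and 3-4 are contiguous; that layout is an assumption, chosen because it is consistent with the feedback and summary values reported above.

```python
import numpy as np

beta, rho = 0.5, 0.1

# Assumed form of W_1: a chain of four contiguous observations.
W1 = np.array([[0.0, 1.0, 0.0, 0.0],
               [1.0, 0.0, 1.0, 0.0],
               [0.0, 1.0, 0.0, 1.0],
               [0.0, 0.0, 1.0, 0.0]])

S = np.linalg.inv(np.eye(4) - rho * W1) * beta   # partial derivatives matrix

total_by_obs = S.sum(axis=1)           # total effect for each observation
avg_total = total_by_obs.mean()        # about 0.590
avg_direct = np.trace(S) / 4           # about 0.508
avg_indirect = avg_total - avg_direct  # about 0.082
feedback = np.diag(S) - beta           # about 0.005, 0.01, 0.01, 0.005

print(np.round([avg_total, avg_direct, avg_indirect], 3))
print(np.round(feedback, 3))
```

Under this assumed W_1, the second and third observations have the largest total effects because each has two neighbors.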
The inferences from the coefficient interpretation approach grow even more disconnected from those warranted based on the spatial model when examining the effects of individual observations. The coefficients themselves provide no sense of how x influences each observation via indirect effects, so we are unable to see that the second and third observations experience the largest TEs of x (column vector in Equation 18). The consequence is that interpreting the coefficients ignores the often considerable variation in effect size across observations. If the interest is in the effects of the covariates under circumstances observed in the sample, then examining the individual elements of the partial derivatives matrix is most appropriate.
Next, consider a different weights matrix that measures the varying degrees of connectedness between observations. It depicts relative distance on a single dimension from 0 to 10, where the W is the inverse absolute distance (W_2), or 1/abs(p_a − p_b).Footnote 7 The four observations are located at 0.5, 4.5, 5.5, and 7. Note that there are three observations that are in close proximity to each other (observations 2, 3, and 4), and the remaining observation (observation 1) is more isolated.
On average, 13.6 percent of the overall effect of x on y can be attributed to spatial dependence.Footnote 8 The partial derivatives matrix shows that the more isolated observation has much smaller indirect effects and TEs because the spatial processes are not as powerful given the smaller values of the elements of W_2 for that case (meaning observations that are farther away).
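The inverse-distance example can be sketched as follows, keeping β = 0.5 and ρ = 0.1 from the earlier illustration. Measuring the share attributable to spatial dependence as (average TE − β) / average TE is an assumption about how that percentage is defined.

```python
import numpy as np

beta, rho = 0.5, 0.1
positions = np.array([0.5, 4.5, 5.5, 7.0])

# Inverse absolute distance between every pair of observations.
dist = np.abs(positions[:, None] - positions[None, :])
np.fill_diagonal(dist, np.inf)   # 1 / inf = 0, so the diagonal of W stays 0
W2 = 1.0 / dist

S = np.linalg.inv(np.eye(4) - rho * W2) * beta
total_by_obs = S.sum(axis=1)

# Assumed definition: the part of the average total effect beyond beta.
spatial_share = (total_by_obs.mean() - beta) / total_by_obs.mean()
print(np.round(total_by_obs, 3), round(float(spatial_share), 3))
```

Under these assumptions the share comes out at roughly 13.6 percent, and the first observation, which is farthest from the others, has the smallest total effect.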
A common strategy is to row-standardize the weights matrix by dividing each element by the row total. This example shows that row-standardizing the weights matrix, often done by default without much theoretical consideration,Footnote 9 dramatically changes the inferences that are made about the in-sample variation in effect sizes. The row-standardized version (not shown) forces the total effects to be the same for every observation; each observation is equally influenced by other observations, whether they are closely clustered or relatively isolated. For example, a row-standardized geographic weights matrix would force New Zealand to have the same total spatial effect as Belgium. For a variety of substantive research questions, this assumption might not make sense.
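The consequence of row-standardization can be seen by comparing total effects under the raw and row-standardized versions of the inverse-distance matrix. This is a sketch with the same illustrative values (β = 0.5, ρ = 0.1); the helper name `total_effects` is introduced here for illustration.

```python
import numpy as np

beta, rho = 0.5, 0.1
positions = np.array([0.5, 4.5, 5.5, 7.0])

dist = np.abs(positions[:, None] - positions[None, :])
np.fill_diagonal(dist, np.inf)                    # keeps the diagonal of W at 0
W_raw = 1.0 / dist
W_rs = W_raw / W_raw.sum(axis=1, keepdims=True)   # each row now sums to 1

def total_effects(W):
    """Row sums of the partial derivatives matrix (I - rho * W)^{-1} * beta."""
    S = np.linalg.inv(np.eye(len(W)) - rho * W) * beta
    return S.sum(axis=1)

print(np.round(total_effects(W_raw), 3))   # varies with each observation's position
print(np.round(total_effects(W_rs), 3))    # identical for every observation
```

With rows that sum to one, every total effect equals β / (1 − ρ), so the isolated first observation is treated exactly like the clustered ones.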
It is also illuminating to examine theoretically interesting elements of the partial derivatives matrix. This strategy involves calculating effects for particular regions or time periods so that it can be determined which observations influence other observations and to what extent. We suggest using this as a complement to overall summary statistics representing the average because there are some situations in which average statistics will lead to inaccurate inferences, most notably the case of isolates (sometimes called “islands”). As explained above, isolate observations have no neighbors in the W. For example, consider W_3 which is identical to W_1 except that observation #1 is now an isolate.
The summary statistics—the average total (0.559), direct (0.505), and indirect effects (0.054)—are completely unrepresentative of the isolate. Isolates are not influenced by other observations (i.e., 0s in the first row) and do not influence other observations (i.e., 0s in the first column), so the effect of x on y for such cases comes from only the direct effect.Footnote 10 Having an isolate also drags the average total, direct, and indirect effects closer to zero (in the case of ρ > 0), making those summary statistics less accurate for the non-isolates as well. Thus, when there are isolates in the weights matrix (or any other observations with extreme values of the W relative to the other observations), it is important to view the actual components of the partial derivatives matrix in combination with the summary statistics.
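The isolate case can be checked in the same way. As before, the chain layout of the contiguity matrix is an assumption; here the link between observations 1 and 2 is removed so that observation 1 has no neighbors.

```python
import numpy as np

beta, rho = 0.5, 0.1

# Assumed W_3: observation 1 is an isolate; 2-3 and 3-4 remain contiguous.
W3 = np.array([[0.0, 0.0, 0.0, 0.0],
               [0.0, 0.0, 1.0, 0.0],
               [0.0, 1.0, 0.0, 1.0],
               [0.0, 0.0, 1.0, 0.0]])

S = np.linalg.inv(np.eye(4) - rho * W3) * beta

avg_total = S.sum(axis=1).mean()       # about 0.559
avg_direct = np.trace(S) / 4           # about 0.505
avg_indirect = avg_total - avg_direct  # about 0.054

print(np.round(S[0], 3))   # isolate's row: beta on the diagonal, 0 elsewhere
print(np.round([avg_total, avg_direct, avg_indirect], 3))
```

The isolate's total effect is exactly β, below every non-isolate's total effect, which is why the averages describe neither group well.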
An applied example of spatial interpretation
In this section we illustrate how the coefficient interpretation approach and the general approach apply to spatial models of the diffusion of democracy. It has been argued that actors in one country observe successful political actions in other countries and emulate their strategies, with each successful case easing and hastening transitions in other countries (Huntington, 1991). Most often, these processes are theorized to occur through geographical proximity (Brinks and Coppedge, 2006; Gleditsch and Ward, 2006), but they have also been theorized to occur through other mechanisms (e.g., communications technology; see Pierskalla and Hollenbach (2013) and Garcia and Wimpy (2016) for recent examples).Footnote 11
Building on these examples, we estimate a SAR model of Freedom House's 14-point democracy score (reversed so that a value of 14 represents the most political rights and civil liberties) on logged GDP per capita (from the Penn World Table). In an effort to make the illustration of these interpretive methods as simple as possible, we estimate our model on a single year (1994), assume that there are no temporal autoregressive effects (i.e., no need for a lagged dependent variable), and assume that wealth only has concurrent effects (i.e., no lagged values). Furthermore, we specify that the pattern of spatial diffusion is primarily geographic and is one that occurs through contiguous neighbors (either land contiguity or <150 miles over water). To control for common shocks that influence all countries within a region similarly, we include region fixed effects. In short, this model is quite oversimplified and therefore should not be used as the basis for inferences about democratic diffusion.
We present four spatial models of the impact of wealth on democracy: one SAR and three SLX. The SAR model follows the simple specification, where x represents logged GDP per capita, W is a row-standardized geographic contiguity weights matrix, and ρ is the spatial autocorrelation coefficient. The three SLX models vary in two meaningful ways: whether higher-order effects are possible (Models 3 and 4), and whether feedback effects are possible (Model 3).
Table 3 shows the results from the four models. As we expected, democracy exhibits positive spatial autocorrelation (i.e., ρ > 0) and wealth has a positive effect on democracy (i.e., β > 0). However, in the discussion above, we identified a number of reasons why interpreting the SAR coefficients does not provide a satisfactory way of making comparable inferences about the relationships between x and y or the diffusion process. Moreover, it is difficult to compare the coefficients of logged GDP per capita across the SAR and SLX models. Put simply, each observation has a different spatial profile and the various model specifications partition the spatial diffusion process in different ways.
Note to Table 3: * p-value < 0.1. All models include regional fixed effects.
We first explore these effects in tabular format (Table 4). Since each observation has a different set of neighbors, the effects of an explanatory variable will vary based on each observation's spatial profile (potentially producing N sets of effects). A simple, yet generalizable manner of making these effects comparable is to calculate the mean across all observations; each row of Table 4 shows the mean total, direct and indirect effects at each order of “interconnectedness” (in this case, geographic contiguity). The bottom row shows the mean sum of the total, direct and indirect effects.
Note to Table 4: The 90% confidence intervals in parentheses were calculated with the percentile method based on 1000 simulations.
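The note does not describe how the simulations were drawn. One common approach, assumed here, is parametric simulation: draw (ρ, β) from a multivariate normal distribution centered on the estimates, recompute the effect for each draw, and take the 5th and 95th percentiles. The weights matrix, point estimates, and covariance matrix below are hypothetical placeholders, not values from Table 3.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical row-standardized W (a chain of four units: 0-1-2-3).
A = np.zeros((4, 4))
for i in range(3):
    A[i, i + 1] = A[i + 1, i] = 1.0
W = A / A.sum(axis=1, keepdims=True)
n = W.shape[0]

est = np.array([0.2, 1.0])        # (rho, beta): illustrative point estimates
vcov = np.array([[0.010, 0.000],  # illustrative covariance of the estimates
                 [0.000, 0.040]])

def mean_indirect(rho, beta):
    """Average per-unit sum of off-diagonal elements of (I - rho W)^{-1} beta."""
    S = np.linalg.inv(np.eye(n) - rho * W) * beta
    return (S.sum() - np.trace(S)) / n

draws = rng.multivariate_normal(est, vcov, size=1000)
# Keep draws with |rho| < 1 so that the series expansion converges.
draws = draws[np.abs(draws[:, 0]) < 1]
sims = np.array([mean_indirect(r, b) for r, b in draws])

lower, upper = np.percentile(sims, [5, 95])  # 90% percentile interval
point = mean_indirect(*est)
print(f"mean indirect effect: {point:.3f} [{lower:.3f}, {upper:.3f}]")
```

The same loop can be reused for the direct and total effects, or for the effects at a single order, by swapping the function applied to each draw.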
Partitioning the effects in this manner supports a number of inferences. First, direct effects account for the vast majority (almost 82 percent) of the TEs of logged GDP per capita on democracy (shown in the final row of Table 4). An increase in wealth in country i has the most impact on country i, though there is still a substantively important average indirect effect (0.26). Note that adding up across the four orders approximates, but does not equal, the mean sum of the TEs at the bottom of the table. Recall that the TEs are the result of the infinite series expansion, so adding up across the displayed orders ignores the effects at orders beyond those displayed, which are quite small. Second, the diffusion effects dissipate quite quickly as the order of neighbors increases. In this case, almost 80 percent (0.207/0.26) of the indirect effects occur among first-order neighbors of observation i. Third, we can identify the share of the direct effects that arises from feedback through neighbors of neighbors; in this case, only 1.05 percent of the average direct effect is attributed to feedback to the originator of the shock, country i (i.e., 0.012/1.144).
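The partition described above can be sketched in a few lines. This is a minimal illustration on a hypothetical five-unit chain with made-up values of ρ and β, not a reproduction of Table 4; the function name `mean_effects` is ours.

```python
import numpy as np

# Hypothetical contiguity structure (a chain 0-1-2-3-4); not the paper's data.
A = np.zeros((5, 5))
for i in range(4):
    A[i, i + 1] = A[i + 1, i] = 1.0
W = A / A.sum(axis=1, keepdims=True)  # row-standardize: each row sums to 1

rho, beta = 0.2, 1.0  # illustrative values, not the estimates in Table 3
n = W.shape[0]

def mean_effects(S):
    """Mean direct (diagonal), indirect (off-diagonal), and total effects of S."""
    direct = np.trace(S) / n
    total = S.sum() / n
    return direct, total - direct, total

# Order-by-order partition: the q-th term of the series is rho^q W^q beta.
by_order = [
    mean_effects(rho**q * np.linalg.matrix_power(W, q) * beta) for q in range(4)
]

# Overall effects from the full infinite series, (I - rho W)^{-1} beta.
overall = mean_effects(np.linalg.inv(np.eye(n) - rho * W) * beta)

for q, (d, ind, t) in enumerate(by_order):
    print(f"order {q}: direct={d:.4f} indirect={ind:.4f} total={t:.4f}")
print("overall: direct={:.4f} indirect={:.4f} total={:.4f}".format(*overall))
```

In this toy setup the four displayed orders sum to a total of 1.248, slightly below the overall total of 1.25, which mirrors the point that the displayed orders approximate but do not equal the full effect. The second-order direct effect is positive because a shock can travel to a neighbor and back.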
Graphical methods are also useful for portraying the relationships across the connectivity criterion (in this case, geography). In Figure 1, we illustrate the effects of a 1-unit increase in logged GDP per capita in Bolivia in 1994 on its neighbors' democracy scores (SAR Model 1). This reveals both the direct (0-order) and indirect (first-order and higher) effects. Darker shades indicate larger, more positive effects. The far-left map shows the 0-order effect, or the direct effect (i.e., the impact on Bolivia), which is identical to the coefficient for logged GDP per capita in Model 1. These values are found along the diagonal of the Iβ matrix, and are the same regardless of the originating country (see Equation 9). The second map reveals how Bolivia's wealth expansion influences its first-order neighbors: Peru, Brazil, Paraguay, Argentina, and Chile. Of those first-order neighbors, Bolivia has a larger impact on Paraguay and Chile because they have fewer total neighbors than the other states. Put differently, since all states are equally influenced in total by their neighbors (because of row-standardization), and Paraguay and Chile have fewer neighbors, Bolivia has a larger influence on them. It is also important to note that Bolivia has a value of 0 (along with non-neighbors) because these first-order effects are found in the ρWβ matrix, which has 0 values along the diagonal (see Equation 10).
The next map reveals Bolivia's influence on its second-order neighbors. Since every country in South America is at least a second-order neighbor of Bolivia (i.e., a neighbor of a neighbor of Bolivia), every country's democracy score responds to Bolivia's wealth expansion. It is also the case that Bolivia experiences feedback: its influence on its neighbors' democracy scores feeds back to further influence its own democracy level. The higher-order effects are quite small, which is a result of the ρ values being raised to higher powers.
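The logic behind the three maps can be sketched by reading one column of each term of the series. The border list below is a simplified land-border assumption for 12 South American states (the paper's matrix also counts short water crossings and covers more countries), and ρ and β are illustrative values rather than the Model 1 estimates.

```python
import numpy as np

# Simplified land borders (assumption for illustration only).
borders = {
    "Argentina": ["Bolivia", "Brazil", "Chile", "Paraguay", "Uruguay"],
    "Bolivia": ["Argentina", "Brazil", "Chile", "Paraguay", "Peru"],
    "Brazil": ["Argentina", "Bolivia", "Colombia", "Guyana", "Paraguay",
               "Peru", "Suriname", "Uruguay", "Venezuela"],
    "Chile": ["Argentina", "Bolivia", "Peru"],
    "Colombia": ["Brazil", "Ecuador", "Peru", "Venezuela"],
    "Ecuador": ["Colombia", "Peru"],
    "Guyana": ["Brazil", "Suriname", "Venezuela"],
    "Paraguay": ["Argentina", "Bolivia", "Brazil"],
    "Peru": ["Bolivia", "Brazil", "Chile", "Colombia", "Ecuador"],
    "Suriname": ["Brazil", "Guyana"],
    "Uruguay": ["Argentina", "Brazil"],
    "Venezuela": ["Brazil", "Colombia", "Guyana"],
}

names = sorted(borders)
idx = {c: k for k, c in enumerate(names)}
n = len(names)

W = np.zeros((n, n))
for c, nbrs in borders.items():
    for nb in nbrs:
        W[idx[c], idx[nb]] = 1.0 / len(nbrs)  # row-standardized weight

rho, beta = 0.2, 1.0  # illustrative values, not the paper's estimates
j = idx["Bolivia"]

# Column j of rho^q W^q beta: every country's response, at order q,
# to a 1-unit increase in x in Bolivia.
effects = {
    q: (rho**q * np.linalg.matrix_power(W, q) * beta)[:, j] for q in range(3)
}

for q in range(3):
    print(f"order {q}:")
    for c in names:
        print(f"  {c:<10} {effects[q][idx[c]]:.4f}")
```

In this sketch the first-order effect on a neighbor equals ρβ divided by that neighbor's number of neighbors, so Paraguay and Chile (three neighbors each) respond more than Brazil (nine neighbors), and Bolivia's own first-order entry is 0.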
The last three columns of Table 3 present three versions of the SLX model. In Model 2, the presence of a single first-order W term means that i only influences its neighbors (first-order indirect effects). Based on the general approach, it is easy to see both the direct (β) and indirect (θ1) effects of logged GDP per capita. As expected, increasing observation i's logged GDP per capita by 1 unit increases its Freedom House score by 0.93 points; a similar increase in all of observation i's neighbors results in a shift of 0.82, on average, in the Freedom House score for observation i. One clear benefit of an SLX model is its ease of interpretation.
On the other hand, Model 2's specification does not allow for the higher-order effects that SAR models offer. In the second and third SLX models (Models 3 and 4) we change the model specification to allow for first-, second-, and third-order effects, though the two models differ in their patterns in meaningful ways. The second SLX model (Model 3) uses the squared and cubed versions of W.
The third SLX model (Model 4) uses second- and third-order contiguity matrices.
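The model equations are not shown in this text. As a sketch only, the three SLX specifications described above could be written as follows. The superscript notation W^{(2)} and W^{(3)} for the second- and third-order contiguity matrices is our assumption, as are the error term ε and the omission of the region fixed effects:

```latex
\begin{aligned}
\text{Model 2:}\quad & y = x\beta + Wx\,\theta_1 + \varepsilon \\
\text{Model 3:}\quad & y = x\beta + Wx\,\theta_1 + W^{2}x\,\theta_2 + W^{3}x\,\theta_3 + \varepsilon \\
\text{Model 4:}\quad & y = x\beta + Wx\,\theta_1 + W^{(2)}x\,\theta_2 + W^{(3)}x\,\theta_3 + \varepsilon
\end{aligned}
```

Here W^{2} and W^{3} are matrix powers of W, whereas W^{(2)} and W^{(3)} connect only units that are exactly two or three steps apart.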
The difference between these two sets of matrices is demonstrated by Equations 11 and 12, and this difference leads to distinct patterns of diffusion. In the first case (Model 3), the effects of logged GDP per capita are allowed to feed back to influence observation i through second- and third-order effects; in the second case (Model 4), the effects are restricted to only those observations that are second- and third-order neighbors. Though the considerable multicollinearity in this model (e.g., the variance inflation factor exceeds 80; note our discussion of this above) inflates the standard errors, we can see that there are positive patterns of diffusion in the first- and second-order neighbors, but a large retrenchment at the third order. Model 4 offers more precise higher-order contiguity matrices, which results in smaller indirect effects. To assist in the interpretation of the quantities of interest and to aid in comparison of effects across model specifications, in Table 5 we provide the average total, indirect and direct effects for the three SLX models.
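The distinction between powers of W and higher-order contiguity matrices can be seen on a small example. The sketch below uses a hypothetical five-unit chain and defines second-order contiguity as "reachable in two steps, but neither the unit itself nor a first-order neighbor"; that definition is an assumption on our part, not necessarily the construction used for Model 4.

```python
import numpy as np

# Hypothetical chain of five units (0-1-2-3-4); not the paper's data.
A = np.zeros((5, 5))
for i in range(4):
    A[i, i + 1] = A[i + 1, i] = 1.0

def row_standardize(M):
    """Divide each row by its sum, leaving all-zero rows as zeros."""
    s = M.sum(axis=1, keepdims=True)
    return np.divide(M, s, out=np.zeros_like(M), where=s > 0)

W = row_standardize(A)

# Matrix power: two-step paths, including i -> neighbor -> i (feedback).
W_squared = W @ W

# Second-order contiguity: two steps away, excluding the unit itself
# and its first-order neighbors.
reach2 = (A @ A) > 0
A2 = (reach2 & (A == 0) & ~np.eye(5, dtype=bool)).astype(float)
W2 = row_standardize(A2)

print("diagonal of W squared:       ", np.round(np.diag(W_squared), 3))
print("diagonal of 2nd-order matrix:", np.diag(W2))
```

The diagonal of the squared matrix is positive, so a coefficient on it mixes feedback to the originating unit with spillover to others; the diagonal of the second-order contiguity matrix is zero, so its coefficient captures spillover only.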
Note to Table 5: 90% confidence intervals in parentheses.
At first glance, the coefficient interpretation approach might seem useful, given that the TEs at each order (as well as their sum represented by the overall TEs) are represented by the coefficients in Table 3. Beyond that, however, the coefficients become less intuitive. In fact, the coefficients for Model 3 in Table 3 cloud important patterns in the spatial diffusion process because they are unable to partition the direct and indirect effects at higher orders. Simply interpreting the coefficients (θ2 and θ3) in Model 3 overstates the degree to which wealth shocks in country i influence other countries because it understates the role of feedback effects (i.e., those in the “Direct” column at higher orders). This is not a problem at the lower orders, 0 and 1, since those effects can only be direct and indirect, respectively, or in Model 4, by construction.
The general approach also offers direct comparison of effects across SAR and SLX models in a manner that the coefficients in Table 3 cannot. The first inference derived from the model comparison is that diffusion plays a larger role in the TEs in the SLX models than in the SAR model; while indirect effects are about 18 percent of the TEs in the SAR model, in the SLX models these shares are much higher (47, 30, and 43 percent for Models 2–4, respectively). The implication is that a shock in wealth for one country produces greater spillover into other countries, but less direct effect in the originating country, in the SLX models compared to the SAR model.
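Two of these shares can be checked from figures reported earlier in this section, assuming that the average direct and indirect effects for Model 2 equal its coefficients (0.93 and 0.82) and using the SAR averages of 1.144 (direct) and 0.26 (indirect):

```latex
\text{SAR (Model 1):}\quad \frac{0.26}{1.144 + 0.26} = \frac{0.26}{1.404} \approx 0.185,
\qquad
\text{SLX (Model 2):}\quad \frac{0.82}{0.93 + 0.82} = \frac{0.82}{1.75} \approx 0.469 .
```

The shares for Models 3 and 4 depend on the values in Table 5 and cannot be recomputed from the text alone.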
The SAR model offers an advantage over the SLX model in terms of parameter parsimony; it estimates a single parameter that encompasses the entire pattern of spatial diffusion. The downside of this parsimonious model specification is that it forces the higher-order diffusion processes to follow an exact geometrically-defined pattern. For example, with ρ > 0 the diffusion process must have the same sign at all orders, and the process declines at an exponential rate (due to the ρ value being raised to increasing exponents in the series expansion). The SLX model relaxes this assumption, and as a result we can observe a large retrenchment at the third order of the spatial process. This is a crucial component of the diffusion process that cannot be inferred from the SAR model.
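The geometric pattern can be made explicit. Assuming W is row-standardized with no isolated units, so that Wι = ι for the N × 1 vector of ones ι, the average total effect at order q and its sum over all orders are:

```latex
\frac{1}{N}\,\iota^{\top}\!\left(\rho^{q} W^{q}\beta\right)\iota = \rho^{q}\beta,
\qquad
\sum_{q=0}^{\infty} \rho^{q}\beta = \frac{\beta}{1-\rho}, \quad |\rho| < 1 .
```

Each order's average total effect is therefore ρ times that of the previous order, which is why a SAR model cannot produce a positive effect at one order and a negative effect at the next when ρ > 0.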
In Figure 2 we demonstrate another graphical method for depicting diffusion across states that shifts the focus away from the connectivity criterion (geography) and provides more detail about the exact size of the effects. By doing so, the size of the diffusion process can be compared across states. The top half of Figure 2 shows the TEs of the wealth increase for Bolivia across four orders (SLX Model 3); the zero order represents the direct effect and orders two and three reflect the feedback from influencing its own neighbors. We can then compare the size of those effects to the remaining South American countries (shown in the bottom half). The size of the effects depends on two factors. First, the magnitude of the increases is determined by the total number of neighbors that the country has; because Chile and Paraguay have fewer neighbors, Bolivia's increase has a larger impact on them. Second, the particular pattern depends on whether the countries are first-, second-, or third-order neighbors with Bolivia (the only zero-order neighbor is Bolivia). The relatively small number of countries on the continent means that all countries are second- and third-order neighbors with Bolivia, so the main distinction is whether they are first-order neighbors. If not, they have a value of 0 for the first order.
Conclusion
In this paper we have introduced a general approach to interpreting spatial models of political phenomena. We offer the following three suggestions when interpreting spatial econometric models. First, we recommend that researchers go beyond a simple interpretation of the estimated coefficients. Overstating the importance of a global average (e.g., the ρ coefficient from a SAR model) does little to elucidate the nuance at work in most spatial processes. We further emphasize the importance of thinking critically about the interpretation of global coefficients and the limits of what they tell us.
Second, we suggest the use of the partial derivatives approach to identify the spatial components of the covariates. We have proposed measures to summarize the average total, direct and indirect effects. For some observations, such as those with unique spatial profiles, these summary measures may not be representative. Since each observation has its own spatial profile (and its own direct and indirect effects), it is often useful to move beyond statistics that summarize the entire sample and instead focus on meaningful selections of observations. We have also proposed that graphical methods are a powerful tool for presenting spatial effects, especially across a range of different specifications. Just as space can be more than geography (Beck et al., 2006), spatial graphics can be more than maps (see Figure 2) and this approach has many applications.
Third, we recommend that researchers identify and estimate meaningful quantities for observations of interest. As with other interpretive methods (e.g., King et al., 2000), the general interpretation approach starts with the assumption that the model is properly specified. Therefore, with our suggestions come the standard spatial precautions, including being wary of Galton's problem (causality), local versus global theoretical expectations, model selection, and specification of spatial connectivity via the weights matrix.
There is certainly more work to be done with the application of spatial models to political science. Our argument in this paper has been that proper interpretation represents a major barrier to fully realizing the potential of these models for the applied researcher. With this in mind, we make an additional call for better replication practices in the application of spatial modeling in political science. Very few of the papers we identified in our survey included stand-alone versions of the spatial weights matrix. In some cases it was pre-multiplied to create spatial lags while in others it was not provided at all—effectively making it impossible to reproduce the results much less to try alternative specifications and robustness checks. The weights matrix is an integral part of the data used in spatial applications and should be treated with the same respect and transparency as any other aspect of the science. The replication revolution in major political science journals would be further advanced by requiring this additional information when scholars employ spatial applications.
Supplementary Material
The supplementary material for this article can be found at https://doi.org/10.1017/psrm.2019.9
Author ORCIDs
Cameron Wimpy, 0000-0002-2049-5229.
Acknowledgements
An earlier version of this paper was presented at the conference on “Modeling Politics in Time and Space” held at Texas A&M University on April 27–28, 2017. The authors thank conference participants for their helpful feedback on the paper at that conference.