
Measuring Constituency Ideology Using Bayesian Universal Kriging

Published online by Cambridge University Press:  16 April 2021

Jeff Gill*

Affiliation: American University, Washington, DC, USA

*Jeff Gill, Department of Government, American University, Washington, DC 20016, USA. Email: jgill@american.edu

Abstract

In this article, we develop and make available measures of public ideology in 2010 for the 50 American states, 435 congressional districts, and state legislative districts. We do this using the geospatial statistical technique of Bayesian universal kriging, which uses the locations of survey respondents, as well as population covariate values, to predict ideology for simulated citizens in districts across the country. In doing this, we improve on past research that uses the kriging technique for forecasting public opinion by incorporating Alaska and Hawaii, making the important distinction between ZIP codes and ZIP Code Tabulation Areas, and introducing more precise data from the 2010 Census. We show that our estimates of ideology at the state, congressional district, and state legislative district levels appropriately predict the ideology of legislators elected from these districts, serving as an external validity check.

Original Article

© The Author(s) 2020

In the study of state politics, a constant struggle when studying representation in the American republic is finding reliable measures of public sentiment for the constituencies elected officials serve. To see the degree to which voters shape or constrain legislators’ actions, a sense of where the voters stand is critical. However, it is hard to find public opinion surveys that are taken at regular intervals, include respondents from all districts of interest, and have samples large enough that constituency-based subsets of respondents yield meaningful district-based measures of opinion. For example, if we want to consider how state-level public ideology affects U.S. senators’ behavior in roll call votes, there are scarce options for surveys that cover all 50 states, include a large sample size in each state, and are observed at regular intervals.Footnote 1 This scarcity led Erikson, Wright, and McIver (1993) to pool several CBS/New York Times polls over time to create a static measure of state ideology, thus sacrificing temporal change to obtain respectable state-level sample sizes. The problem is exacerbated in studies of U.S. House members, which require coverage in 435 smaller districts, and it becomes an order of magnitude harder in the study of state legislators, requiring coverage in 1,972 upper chamber districts and 5,411 lower chamber districts. Thus, measuring constituency-level public opinion remains a running challenge, particularly in smaller districts.

There are several primary strategies for dealing with the difficulty of measuring public ideology. The first is to simply subset the survey data by the unit of geographic distinction, which has the advantages of being simple and relying on direct observations of individuals. The main problem with this approach, however, is that the sample size can become quite small even at the state level, smaller still for state legislative districts, and smaller yet when studying demographic subgroups. In addition, many surveys such as the American National Election Study stratify on region, so subsamples will not be representative at the state level or below. A second approach is to use election returns as a proxy for ideology in a district (Ansolabehere, Snyder, and Stewart 2001; Berry et al. 1998; Erikson and Wright 1980). This approach either uses presidential returns (with the logic being that because the candidates’ ideologies are constant nationwide, the vote share will change only in response to the median voter) or uses votes in congressional races (scaling vote shares with measures of the ideology of both incumbents and challengers). While vote-based measures use abundant data that are simple to gather, vote choice is conceptually distinct from ideology. Besides general ideology, votes might be based on regional appeals, personality traits, or economic well-being, thereby inducing added measurement error. Also, vote choice alone may be a misleading measure in that it does not account for the relative dispersion of ideological positions in a district (Kernell 2009). A third possibility is to use poststratification, which fits a training model based on survey data and then uses that model to forecast public opinion based on known population data. Several scholars have used weighting and forecast-based measures of public opinion over the years (Jackson 1989, 2008; Pool, Abelson, and Popkin 1965; Weber et al. 1972; Weber and Shaffer 1972). The most recent technology is multilevel regression with poststratification (MRP), which estimates constituency-specific random effects from the survey data (Gelman and Little 1997; Lax and Phillips 2009; Park, Gelman, and Bafumi 2004, 2006). Tausanovitch and Warshaw (2013) even estimate ideology using MRP in small areas, including state legislative districts and cities, and their measures perform well. The MRP idea of incorporating a constituency-specific random effect is reasonable because of all of the unobserved factors that can shape public sentiment in an area. However, it can be improved upon because the geographic variation in those unobserved factors may be finer than defined borders dictate.

A fourth option, which we build on, is the universal kriging approach developed by Monogan and Gill (2016). Universal kriging follows a similar logic to MRP but uses covariate values measured at the most precise geographical level possible and a smoothed residual structure over geographic space to improve forecasts. Kriging differs from the approach of Selb and Munzert (2011), who retain the MRP framework but allow the random effects of bordering constituencies to correlate.Footnote 2 By contrast, the smoothed kriging structure does not break abruptly at border definitions, and it is simpler to make forecasts from it even in constituencies without observed survey respondents. While the previous work shows that kriging produces externally valid measures of public sentiment, this study improves on that method in several ways. First, previous work ignored Alaska and Hawaii as discontiguous states. Here, we propose a solution of relocating these states next to their ideological neighbors in the contiguous 48 states to obtain measures of ideology in all 50 states. Second, the previous work erroneously located survey respondents with ZIP Code Tabulation Areas (ZCTAs) when the survey recorded respondents’ ZIP codes. These are not equivalent, so we address this problem here. Third, we improve upon prior work by using newer and much more precise data from the 2010 Census. The 2010 Census reports data at the census block level, allowing us to draw simulated citizens closely in line with population density. Covariates also are now sampled from the most precise possible level—often the block level itself. Hence, our estimates should be more accurate in smaller constituencies. Fourth, we apply this method not only at the state level but also in congressional and state legislative districts. Consequently, a product of our work is that we now release for public use measures of public ideology in 2010 for the 50 states, 435 congressional districts, districts for upper chambers of state legislatures, and districts for state legislative lower chambers. We show that our measures perform well on several validity checks, as they not only correlate with other measures of district ideology but also serve as good predictors of elected legislators’ behavior. Altogether, this work represents a marked advance in the universal kriging technique.

We proceed first by reviewing the model itself, why it is substantively sound, and the data we use in this application. Second, we describe in detail the new advances that we make in how universal kriging can be applied to measuring public opinion. Third, we describe the results from our estimated model using 2008 Cooperative Congressional Election Study (CCES) data. Fourth, we present our forecasted measures of ideology and several validity checks. Fifth, we offer an example of how measures like ours can be used in an applied setting to analyze the behavior of state legislators. Finally, we describe the implications of our study and room for future work.

Point-to-Block Realignment with Universal Kriging

Our method for translating a public opinion survey into measures of constituency-level ideology follows the logic of point-to-block realignment (Banerjee, Carlin, and Gelfand 2015, chap. 7; Monogan and Gill 2016). The intuition behind this technique is to estimate a model of observations that are located at points in space (such as latitude and longitude), make several predictions from this model at a wider range of points in space using known covariate values, and then use the predictions falling within a block (or border-referenced area in space) to produce a block-level forecast. In our case, we locate survey respondents in geographic space using known information about their address (treating them as points in space), use population Census data at various geographic locales to make predictions throughout the United States, and then average all predictions falling within an electoral district to determine the average ideology of that constituency. Hence, the 50 states, 435 congressional districts, or state legislative districts form our block, or areal, units of interest in this point-to-block realignment.

Meanwhile, our middle step of using population Census data to make forecasts of several simulated citizens in districts across the nation follows a similar logic to weighting, regression, or MRP techniques, except that our forecasts include a spatial error term that borrows strength from nearby observed survey respondents. To do this, the model we estimate over our training data must be a kriging model that allows for covariance among geographically proximate respondents. Kriging has had some uses in political science, both in predicting potential campaign contributions at residences (Tam Cho and Gimpel 2007) and in predicting wind direction at major pollution sites (Monogan, Konisky, and Woods 2017). Amos, McDonald, and Watkins (2017) also apply kriging to address problems of cross-cutting boundaries between voting precincts and census blocks. The two general types of kriging are ordinary kriging, which relies purely on a spatial error process to make predictions, and universal kriging, which also allows spatial trend terms and even location-specific covariates to shape the prediction. We follow the universal kriging approach advanced by Monogan and Gill (2016), which uses a linear prediction based on demographic predictors and a polynomial trend term, plus the spatial error prediction. The nice feature of this is that our spatial error term forms a density blanket such that we can make predictions for any constituency spanned by our respondents’ locations, even if there were no observed survey respondents within the district of interest.

Figure 1 illustrates the steps of point-to-block realignment. Step (1) is to fit a preliminary linear model with ordinary least squares (OLS). Here, we try several specifications of the OLS model to gauge the proper functional form for a trend term in longitude and latitude (or eastings and northings) based on model fit. In Step (2), we examine the OLS residuals from the best-fitting model in the prior step and determine the best-fitting functional form of the spatial error process. That is, given the covariates and our chosen geographic trend term, how do our errors spatially correlate and what function best summarizes that correlation structure? Possible error process models for the residuals include (among others) the exponential, Gaussian, spherical, wave, or Matérn processes (Banerjee, Carlin, and Gelfand 2015, 25–30). Step (3.A) is to estimate the Bayesian model with survey bootstraps. This model treats the conditional mean of ideology as a function of individual covariates and the geographic trend term, and it simultaneously estimates the parametric error structure decided on in the previous step. Due to computational limitations, the model is estimated for subsets of the survey data, each a random draw from the larger survey sample. For each run of the model over a bootstrap sample, we compute the mean of the posterior distribution of the parameters. After estimating the model, we can proceed in two ways. In Step (3.B), we can summarize our model’s results by reporting descriptive statistics from the pooled parameter means across the bootstrap runs. Meanwhile, in Step (4.A), we can begin forecasting with bootstrapped Census data. With the forecasting, we take a set of bootstrapped results from our training model, and we forecast over a random sample of Census data. We draw a fresh sample of the Census data for each bootstrapped sample. The first half of our samples are drawn in proportion to population density, while the last half include at least one draw from each of the nation’s 23,764 census tracts. In this way, we forecast for a wide range of individuals in our kriged census sample at locations spread throughout the nation, with highly populated areas getting the most attention. In Step (4.B), our final step, we summarize our forecasts by district: we pool all of our kriged census individuals from the bootstrapped forecasts and organize the larger pool into the districts of interest in which these simulated citizens reside. Once organized by constituency, we compute descriptive statistics of these kriged forecasts by district. This provides us with district-level forecasts of the mean and variance of ideology—be the district a state, congressional district, state legislative district, or something else.

Figure 1. Flowchart showing the steps of point-to-block realignment.

Note. OLS = ordinary least squares.
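To make these steps concrete, here is a minimal R sketch of the bootstrap-and-forecast loop in Figure 1. The objects cces and census and the helper functions fit_spatial_model() and krige_forecast() are hypothetical placeholders for the estimation and prediction steps detailed in the sections that follow, and the subsample sizes are illustrative.

```r
## Sketch of the BRSS workflow in Figure 1 (hypothetical object and function names).
## 'cces' holds survey respondents with eastings/northings and covariates;
## 'census' holds candidate census locations for forecasting.
set.seed(1234)
n_boot <- 100                                  # number of bootstrap runs

posterior_means <- vector("list", n_boot)      # Step (3.B): pooled parameter means
forecasts       <- vector("list", n_boot)      # Step (4.A): kriged forecasts

for (b in seq_len(n_boot)) {
  ## Step (3.A): estimate the spatial model on a random 5% subsample of the survey
  svy_b <- cces[sample(nrow(cces), size = round(0.05 * nrow(cces))), ]
  fit_b <- fit_spatial_model(svy_b)            # placeholder for the Bayesian kriging fit
  posterior_means[[b]] <- fit_b$post_means

  ## Step (4.A): forecast over a fresh random sample of census locations
  cen_b <- census[sample(nrow(census), size = 5000), ]
  forecasts[[b]] <- krige_forecast(fit_b, cen_b)
}

## Step (4.B): pool the forecasts and summarize by district
pooled <- do.call(rbind, forecasts)
district_means <- tapply(pooled$ideology_hat, pooled$district, mean)
```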

Specifying the Model

The method of point-to-block realignment assumes that the observed point-level (person) data and the extrapolated block-level averages (districts) have a joint Gaussian distribution. We start by specifying how the training side of the model works with the survey data. Now define $ \mathbf{s} $ as a set of $ n $ observed sites $ \{{\mathbf{s}}_1,{\mathbf{s}}_2,\dots, {\mathbf{s}}_n\} $, where each $ {\mathbf{s}}_i $ represents the location of a survey respondent in space—either in latitude and longitude, or in northings and eastings (as we use in this application).Footnote 3 Here, $ \mathbf{Y}(\mathbf{s}) $ is an associated collection of outcomes $ \mathbf{Y}(\mathbf{s})=\{Y({\mathbf{s}}_1),Y({\mathbf{s}}_2),\dots, Y({\mathbf{s}}_n)\} $, the survey response of interest for the survey-taker at each site. $ {\mathbf{X}}^{*}(\mathbf{s})=\{{\mathbf{x}}^{*}({\mathbf{s}}_1),{\mathbf{x}}^{*}({\mathbf{s}}_2),\dots, {\mathbf{x}}^{*}({\mathbf{s}}_n)\} $ is a collection of covariates for each survey respondent observed at his or her respective point in space. We specify a linear model as follows:

(1)$$ \mathbf{Y}(\mathbf{s})=\mu (\mathbf{s})+\omega (\mathbf{s})+\epsilon (\mathbf{s}), $$

where $ \mu (\mathbf{s})=\mathbf{X}(\mathbf{s})\beta $ is the mean structure based on a linear additive component (like a standard regression model), $ \omega (\mathbf{s}) $ are realizations from a mean-zero stationary spatial process that captures spatial association (closer points are more informative than distant points), and $ \epsilon (\mathbf{s}) $ is a regular uncorrelated disturbance term.

An important feature of equation (1) is that the variance is split into two disturbance terms: one that captures spatial association, and the other that is a traditional independent and identically distributed error term with homoscedastic variance. We thereby use the following distributional assumptions for these two terms: $ \omega (\mathbf{s})\sim \mathcal{N}(0,{\sigma}^2\mathbf{H}(\phi )) $ and $ \epsilon (\mathbf{s})\sim \mathcal{N}(0,{\tau}^2\mathbf{I}) $. Several of the parameters of these variance components have a substantive interpretation. From the idiosyncratic error term, $ \epsilon $, we call the variance term $ {\tau}^2 $ the nugget. This is the amount of error variance in the outcome that is independent from spatial separation. We can think of this as the variance in the error when the geographic separation between observations is negligible. Turning to the spatial $ \omega $ error terms, $ {\sigma}^2 $ is called the partial sill. The partial sill reflects the variance that can be driven by geographic distance between two observations, with the assumption being that more distant observations have a higher variance. The partial sill equals the maximum amount of variance among observations due strictly to geographic separation. In fact, the nugget plus the partial sill equals the sill, which is the maximum total variance possible among distant observations. Finally, the other parameter feeding into the spatial $ \omega $ error terms is the range term, $ R=1/\phi $. ($ \phi $ itself is called the decay term.) The larger the range term is, the farther the distance between observations before the variance among those observations equals the sill. In other words, the range term helps gauge the distance at which error variance is maximized.

The last piece of specifying $ \omega (\mathbf{s}) $ is that we must specify the function $ \mathbf{H}(\phi ) $. This is a parametric spatial correlation function that typically only requires us to estimate the decay parameter $ \phi $. We typically assume an isotropic model, which means that the level of spatial correlation does not depend on direction but only on the distance between the observations $ {d}_{ij}=\parallel {\mathbf{s}}_i-{\mathbf{s}}_j\parallel $. In this case, we must choose a parametric model—the exponential, spherical, wave, and Gaussian are a few common options—that captures the patterns of residual association in our data. Each of these parametric models specifies both a spatial correlation function (stating simply how much observations should correlate given their distance apart) and a semivariogram function (specifying how much observations should vary given their distance apart). The two fit naturally together with a high correlation implying a low variance and vice versa. Once we determine the best-fitting parametric correlation function, the product of the correlation function $ \mathbf{H} $ with the partial sill $ {\sigma}^2 $ builds the spatial covariance structure into the joint distribution of the $ \omega (\mathbf{s}) $ disturbance terms.
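The pieces introduced so far can be assembled in a few lines of R. The sketch below builds the Gaussian correlation matrix and the total covariance $ \Sigma ={\sigma}^2\mathbf{H}(\phi )+{\tau}^2\mathbf{I} $ for a set of fake site locations and draws one realization of equation (1) with a constant mean; all locations and parameter values are illustrative rather than estimates from our model.

```r
## Building the spatial covariance in equation (1): Sigma = sigma^2 * H(phi) + tau^2 * I,
## with an isotropic Gaussian correlation H(phi)_ij = exp(-phi^2 * d_ij^2).
gaussian_corr <- function(coords, phi) {
  d <- as.matrix(dist(coords))        # pairwise distances d_ij = ||s_i - s_j||
  exp(-(phi^2) * d^2)                 # Gaussian correlation function
}

set.seed(99)
n      <- 200
coords <- cbind(east = runif(n, 0, 5), north = runif(n, 0, 3))  # fake site locations
sigma2 <- 10      # partial sill (illustrative)
tau2   <- 70      # nugget (illustrative)
phi    <- 1 / 2   # decay term, so the range is 1/phi = 2

H     <- gaussian_corr(coords, phi)
Sigma <- sigma2 * H + tau2 * diag(n)  # total covariance; sill = sigma2 + tau2

## One realization of Y(s) = mu(s) + omega(s) + epsilon(s) with a constant mean
mu <- rep(50, n)
Y  <- as.numeric(mu + t(chol(Sigma)) %*% rnorm(n))
```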

When determining the exact parametric specification of $ \mathbf{H} $, we normally focus on the related semivariogram to determine the right parametric structure. We choose the right model through an empirically driven process, wherein the empirical semivariogram is calculated from the residuals of an initial model. The formula for the empirical semivariogram is (Cressie 1993, 69):

(2)$$ \widehat{\gamma}(d)=\frac{1}{2|N(d)|}\sum_{(i,j)\in N(d)}|z({\mathbf{s}}_i)-z({\mathbf{s}}_j){|}^2, $$

where $ z({\mathbf{s}}_i) $ is the residual term for the respondent located at site $ {\mathbf{s}}_i $ from an initial linear model, $ d $ is an approximate distance of interest (possible distance values are usually coarsened into bins), $ N(d) $ is the set of all pairs of observations such that $ \parallel {\mathbf{s}}_i-{\mathbf{s}}_j\parallel \approx d $, and $ |N(d)| $ is the number of pairs in the set that are separated by distance $ d $. The semivariogram equals two quantities: the variance of all observations separated by distance $ d $ when pooled together, as well as half the variance of the differences $ (z({\mathbf{s}}_i)-z({\mathbf{s}}_j)) $ between observations separated by distance $ d $. Using the empirical semivariogram, we then determine which parametric model is most appropriate for our data, choosing from the exponential, spherical, wave, Gaussian, or some other parametric semivariogram. Once we have that, we know the related spatial correlation function (Banerjee, Carlin, and Gelfand 2015, 28–29). In our case here, the best-fitting model is the Gaussian semivariogram, which implies that our spatial correlation function should be $ H{(\phi )}_{ij}=\mathrm{exp}(-{\phi}^2{d}_{ij}^2) $.
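Equation (2) is straightforward to compute directly. The following base R sketch bins pairwise distances and averages the squared residual differences within each bin, using the simulated coordinates from the previous sketch and residuals from an initial OLS fit as illustrative inputs.

```r
## A direct implementation of the empirical semivariogram in equation (2).
empirical_semivariogram <- function(coords, z, n_bins = 15, max_dist = NULL) {
  d <- as.matrix(dist(coords))
  if (is.null(max_dist)) max_dist <- max(d) / 2
  lower <- lower.tri(d)                          # use each pair (i, j) once
  dists <- d[lower]
  sqdif <- outer(z, z, "-")[lower]^2             # (z(s_i) - z(s_j))^2
  keep  <- dists > 0 & dists <= max_dist
  bins  <- cut(dists[keep], breaks = seq(0, max_dist, length.out = n_bins + 1))
  data.frame(
    dist  = tapply(dists[keep], bins, mean),                     # average distance in bin
    gamma = tapply(sqdif[keep], bins, function(x) mean(x) / 2)   # semivariance estimate
  )
}

## Example with the simulated data from the previous sketch:
ols <- lm(Y ~ coords[, "east"] + coords[, "north"])
vg  <- empirical_semivariogram(coords, resid(ols))
```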

With all of these elements in place for modeling the responses of survey takers, we now step back and think about where this training model fits relative to our forecasting process of state, congressional district, and state legislative district ideology. As our model assumes the observed point-level data and the extrapolated block-level averages have a joint Gaussian distribution, we get

$$ f\left(\begin{pmatrix}{\mathbf{Y}}_{\mathbf{s}}\\ {\mathbf{Y}}_{\mathbf{B}}\end{pmatrix}\bigg|\beta, {\sigma}^2,\phi \right)=\mathcal{N}\left(\begin{pmatrix}{\mu}_{\mathbf{s}}(\beta )\\ {\mu}_{\mathbf{B}}(\beta )\end{pmatrix},\ {\sigma}^2\begin{pmatrix}{\mathbf{H}}_{\mathbf{s}}(\phi )& {\mathbf{H}}_{\mathbf{s},\mathbf{B}}(\phi )\\ {\mathbf{H}}_{\mathbf{s},\mathbf{B}}^T(\phi )& {\mathbf{H}}_{\mathbf{B}}(\phi )\end{pmatrix}\right), $$

where $ {\mathbf{Y}}_{\mathbf{s}} $ represents the vector of ideology among individual citizens, $ {\mathbf{Y}}_{\mathbf{B}} $ represents the vector of ideology in all block-referenced constituencies of interest, and $ \mathbf{H}(\phi ) $ defines the correlation matrix of observations as before. Note that this presents the simplified case where there is no nugget effect $ ({\tau}^2) $, but the result still holds if the variance–covariance terms do include a nugget. By standard normal theory (e.g., Ravishanker and Dey 2002), the conditional distribution of our extrapolated block averages is:

(3)$$ \begin{array}{l}{\mathbf{Y}}_{\mathbf{B}}|{\mathbf{Y}}_{\mathbf{s}},\beta, {\sigma}^2,\phi \sim \mathcal{N}({\mu}_{\mathbf{B}}(\beta )+{\mathbf{H}}_{\mathbf{s},\mathbf{B}}^T(\phi ){\mathbf{H}}_{\mathbf{s}}^{-1}(\phi )({\mathbf{Y}}_{\mathbf{s}}-{\mu}_{\mathbf{s}}(\beta )),\\ {}\kern0.50em {\sigma}^2[{\mathbf{H}}_{\mathbf{B}}(\phi )-{\mathbf{H}}_{\mathbf{s},\mathbf{B}}^T(\phi ){\mathbf{H}}_{\mathbf{s}}^{-1}(\phi ){\mathbf{H}}_{\mathbf{s},\mathbf{B}}(\phi )]).\end{array} $$

These quantities can be estimated with Monte Carlo integration:

$$ {({\widehat{\mu}}_{\mathbf{B}}(\beta ))}_k={L}_k^{-1}\sum_{\mathrm{\ell}}\mu ({\mathbf{s}}_{k\mathrm{\ell}};\beta ) $$

$$ {({\widehat{H}}_{\mathbf{B}}(\phi ))}_{k{k}^{\prime }}={L}_k^{-1}{L}_{k^{\prime}}^{-1}\sum_{\mathrm{\ell}}\sum_{{\mathrm{\ell}}^{\prime }}\rho ({\mathbf{s}}_{k\mathrm{\ell}}-{\mathbf{s}}_{k^{\prime }{\mathrm{\ell}}^{\prime }};\phi ) $$

$$ {({\widehat{H}}_{\mathbf{s},\mathbf{B}}(\phi ))}_{ik}={L}_k^{-1}\sum_{\mathrm{\ell}}\rho ({\mathbf{s}}_i-{\mathbf{s}}_{k\mathrm{\ell}};\phi ) $$

We conduct this Monte Carlo integration using the technique of Bootstrapped Random Spatial Sampling (BRSS) developed by Monogan and Gill (2016). Doing so allows us to forecast the average ideology with:

(4)$$ {\widehat{Y}}_{\mathbf{B}}={\widehat{\mu}}_{\mathbf{B}}(\beta )+{\widehat{H}}_{\mathbf{s},\mathbf{B}}^T(\phi ){\widehat{H}}_{\mathbf{s}}^{-1}(\phi )({\mathbf{Y}}_{\mathbf{s}}-{\widehat{\mu}}_{\mathbf{s}}(\beta )). $$

We account for the spatial element by forecasting $ \widehat{Y}({\mathbf{s}}_{k\mathrm{\ell}};\beta, {\sigma}^2,{\tau}^2,\phi ) $ and using this quantity in our Monte Carlo integration. With the nugget effect, from equation (1) we also get $ {\mathbf{Y}}_{\mathbf{s}}\sim \mathcal{N}(\mu, \Sigma ) $, where we still require $ \Sigma ={\sigma}^2\mathbf{H}(\phi )+{\tau}^2\mathbf{I} $, $ H{(\phi )}_{ij}=\rho (\phi, {d}_{ij}) $, and $ {d}_{ij}=||{\mathbf{s}}_i-{\mathbf{s}}_j|| $. Again, for this application, the Gaussian semivariogram function was the best fitting, meaning that our correlation function is $ H{(\phi )}_{ij}=\mathrm{exp}(-{\phi}^2{d}_{ij}^2) $.
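The sketch below illustrates equation (4) together with the Monte Carlo approximations above: block-level quantities are averaged over sampled locations inside each district, and the nugget is folded into the point-level correlation matrix by adding $ {\tau}^2/{\sigma}^2 $ to its diagonal. The inputs (design matrices, coordinate matrices, and lists of sampled block locations) are hypothetical placeholders, and the parameter values would come from the training model.

```r
## Block-level kriging forecast, equation (4), with Monte Carlo block approximations.
rho <- function(d, phi) exp(-(phi^2) * d^2)      # Gaussian correlation function

cross_dist <- function(a, b) {                   # pairwise distances between two point sets
  sq <- outer(rowSums(a^2), rowSums(b^2), "+") - 2 * a %*% t(b)
  sqrt(pmax(sq, 0))
}

block_forecast <- function(Y_s, X_s, coords_s, block_pts, block_X,
                           beta, sigma2, tau2, phi) {
  ## point-level correlation with the nugget folded in: H_s + (tau^2/sigma^2) I
  H_s   <- rho(as.matrix(dist(coords_s)), phi) + (tau2 / sigma2) * diag(nrow(coords_s))
  H_inv <- solve(H_s)
  resid <- Y_s - X_s %*% beta                    # Y_s - mu_s(beta)
  sapply(seq_along(block_pts), function(k) {
    mu_B_k <- mean(block_X[[k]] %*% beta)        # Monte Carlo estimate of mu_B(beta)
    H_sB_k <- rowMeans(rho(cross_dist(coords_s, block_pts[[k]]), phi))  # H_{s,B} column
    as.numeric(mu_B_k + t(H_sB_k) %*% H_inv %*% resid)                  # equation (4)
  })
}
```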

Why Proximity Matters for Public Opinion

Tobler’s First Law of Geography states, “Everything is related to everything else, but near things are more related than distant things” (Tobler 1970, 236). Here, we assume that this law holds for individuals’ opinions and ideology also, with more physically proximate Americans holding a more similar political outlook. While there are physical barriers, such as highways and rivers, that separate populations and therefore can change ideology dramatically, our kriging approach connects these smoothly with no sudden shift.

Proximal influence in politics is supported by Gimpel and Schuknecht (2003, 2–4), who describe two different approaches to understanding regionalism in this way. First, Gimpel and Schuknecht describe a compositional approach, which asserts that political behavior is similar within a region due to economic interests, racial origin, ethnic ancestry, religion, social structure, and other related factors (Fischer 1989; Garreau 1981; Gastil 1975; Lieske 1993). Therefore, if all of these factors could be included in an empirical model, then the variability across geographic units in the model would be small. Clearly, though, it is often impossible to measure every relevant demographic and socioeconomic variable, or even to identify every critical variable for inclusion. As some relevant inputs may be overlooked or unmeasured, we assume that neighboring individuals will have a relatively similar political outlook, even holding included covariates constant. Second, Gimpel and Schuknecht describe a contextual approach, which holds that citizens’ political attitudes and behaviors are influenced by political socialization and by interactions with other citizens in their social network, an idea supported by a large literature in political science (DeLeon and Naff 2004; Djupe and Sokhey 2011; Huckfeldt and Sprague 1995; Putnam 1966, 1993). For instance, “the first place to look for political networks is within the immediate physical proximity of each individual” (Sinclair 2012, 26). This means that under the contextual approach as well, we expect geographically proximate individuals to have relatively similar opinions, even in a general setting.

Furthermore, Erikson, Wright, and McIver (1993, 48) propose that “the unique political cultures of individual states exert an important influence on political attitudes.” This idea goes as far back as Elazar (1966), who proposes that U.S. states can be categorized based on an individualist, moralist, or traditionalist view of government’s role. Erikson, Wright, and McIver (1993, 56–68) also demonstrate that a higher proportion of variance in ideology and partisanship is predicted by state-level dummies than by demographic information, although state-level residuals will pick up some of the effects of unmeasured individual-level variables. There also is evidence that this holds in urban areas, where political culture shapes the impact of identity on public opinion and political participation, even in cities with heterogeneous neighborhoods (DeLeon and Naff 2004, 703). Our method of kriging increases the ability to accurately model the effects of political culture, omitted predictors, and social context by including weighted neighbors’ residuals in forecasts of public opinion and ideology. For example, Western Kentucky and Southeast Illinois are likely to be populated by similar people, both culturally and in demographic terms.

Data: 2008 CCES and 2010 Census

In this study, we use the 2008 CCES as training data to estimate our model of individual ideology as a function of demographics. The 2008 CCES offers 21,849 observations spread across the American states and congressional districts. This survey asks respondents to place themselves ideologically on a scale from 0 to 100, with 0 representing the most liberal and 100 the most conservative. Our training model predicts responses as a function of age, education, race, sex, income, religion, urban–rural status, homeownership, employment status, and a geographic trend term. CCES respondents were located geographically by ZIP code. Our procedure for locating these respondents is described in greater detail in the next section.

After estimating the training model over the CCES, we used 2010 Census data to forecast the ideology of 724,814 simulated citizens throughout the continental United States (Minnesota Population Center, 2011). We simulated by census block, the most precise geospatial unit the Census Bureau keeps track of: the first 701,050 simulated citizens were drawn purely in proportion to the population of the block, and the last 23,764 were drawn with one observation per census tract (proportionally by block within tract). We used Census Bureau maps of census blocks to place simulated citizens in latitude and longitude (or more exactly in eastings and northings), and we drew predictor values based on the variables’ local distribution for that block.

The 11 million census blocks perfectly tessellate all higher level geospatial indicators of which the Census Bureau keeps track, so there are no gaps and no overlaps of areal units. Hierarchically, blocks are nested within block groups, block groups are nested within tracts, and tracts are nested within counties. When simulating covariate values for a kriged point, if a predictor was not reported at the block level, we draw from the most precise level for a given location. More specifically, we simulate age, race, sex, and homeownership based on block-level data. We simulate education and income based on block group-level data. We simulate employment status based on tract-level data. We simulate religion and urban–rural status with county-level data. By using the 2010 census block data, all of these predictors are simulated with greater geographic precision and with more up-to-date data than in Monogan and Gill (2016).

Innovations in Kriging for Measuring Public Opinion

Besides using more recent and more precise data, we offer two more methodological advances for the technique of kriging to forecast public opinion. Specifically, past work did not include estimates of opinion in Alaska and Hawaii because they are not contiguous with the rest of the United States. In this article, we offer a solution to this problem and create new estimates for these states and their component districts. In addition, past work erroneously used census ZCTAs to locate survey respondents in space when the surveys recorded respondents’ ZIP codes. We discuss why that is a problem and introduce new data that resolve this issue, thereby creating better forecasts. We discuss each innovation in turn.

Moving Noncontiguous States

Any method of measuring the ideology of public constituencies ought to be comprehensive in covering all 50 states as well as all state legislative and congressional districts falling within each. A major challenge of using spatial data analysis to measure public opinion in the United States is that it is difficult to measure opinion in the noncontiguous states of Alaska and Hawaii. In fact, measuring public opinion in these two states is often difficult anyway on account of having few, if any, survey respondents in many national polls. Kriging, like many methods of spatial analysis, requires observations to have neighbors. If we attempted to train a kriging model using Alaska’s and Hawaii’s data as they are located on a map, then we would have extreme geographic outliers that could distort the estimation of our spatial error process model. This, in turn, would diminish our ability to make accurate predictions of opinion as we turned to forecasts because the partial sill would treat ocean distance as regular geographic space and create a smoothed spatial surface over broad swaths of the Pacific Ocean and Canada. When the goal is to understand and predict the opinions of American citizens, this is not a sound substantive approach.

There is no ideal solution for how to deal with the locational and data sparsity issues of Alaska and Hawaii. Every alternative has shortcomings. Omitting them is perhaps the worst option: 4 of the 100 U.S. senators and 3 of the 435 members of the House of Representatives come from these states. To omit them would exclude a substantial portion of America’s represented population (slightly over 2.1 million people total). Meanwhile, leaving their locations as is poses problems of extreme leverage (in a training model) or out-of-area forecasting (when predicting ideology of districts). We argue that it is substantively important to include all 50 states and that we can do so in a theoretically informed way by searching for the ideological neighbors of these states, in the absence of geographical neighbors. Certainly, the discontinuity of the United States presents a unique boundary value problem in cases like this. In terms of local cultural and economic nuances, there are likely to be similarities between Alaska and western Canada as well as between Hawaii and other Pacific islands.Footnote 4 However, our outcome variable of ideology is defined here in the American context as American politicians and journalists would discuss it, so the concept does not easily traverse international borders. Given that the outcome variable is defined within the U.S. context, the ideological neighbor approach allows us to construct predictions based on similar domestic ideological contexts.

To address this, we proceed in two ways. First, when estimating the model itself, we omit Alaska and Hawaii from the training data. Their extreme outlier values could unduly affect the spatial variance components, so only the continental 48 states were included in the training data. Second, when forecasting ideology in these two states, we relocate Alaska and Hawaii to sit next to the west coast of the United States. Doing so greatly narrows the out-of-sample space that falls within the convex hull of our forecasting space. That is, the areas that are part of Canada, Mexico, or the Pacific Ocean that are included within the space of our smoothed kriging surface are shrunken dramatically relative to a model that uses these two states at their actual geographic location.

For the sake of forecasts, we locate Alaska and Hawaii near their ideological neighbors, or locales on the west coast as similar as possible ideologically. To find these states’ ideological neighbors, we estimated a training model on the continental 48 states and then chose the locations off the west coast that minimized predictive error for the two discontiguous states when using that model to predict Alaska’s and Hawaii’s observations in survey data. (Our full process is described in more depth in Online Appendix A.) Figure 2 shows the result of our procedure, illustrating how we relocated Alaska and Hawaii for the sake of the forecasting data. Specifically, that map shows a dot at the location of each census block’s centroid (a census block serving as our primary unit for sampling forecasting observations in kriging). The census blocks for the continental 48 states are in black at their original locations in eastings and northings. The census blocks for Hawaii (in blue) and for Alaska (in red) are at their new ideological neighbor locations in eastings and northings.

Figure 2. Map of census block centroids when Alaska and Hawaii are placed near their ideological neighbors from kriging forecasts.

Substantively in Figure 2, Hawaii has been relocated so that Honolulu is an ideological neighbor of San Francisco. Alaska has been repositioned so that Juneau is just south of San Diego. These positions, again, are the positions that minimize forecasting errors in the two discontiguous states, as detailed in Online Appendix A. This allows each state to have a west coast neighbor that is ideologically similar without producing any overlap between either state and the continental states. To preserve area and point-to-point distances, the original locations of census blocks (with Alaska and Hawaii at their actual locations) were reprojected from longitude and latitude into eastings and northings first. After this reprojection, Alaska and Hawaii were relocated to the positions shown on the map. This solution of finding ideological neighbors is far superior to the common practice of simply dropping these two states from measurement models: doing so would mean ignoring over 2 million U.S. residents (737,438 + 1,420,491 = 2,157,929 as of 2018). It is possible to model these two states separately, but that imposes the assumption that there is no influence back and forth between these two states and the contiguous 48 states. Our solution is a compromise between these two extremes that allows inclusion without deteriorating the quality of the total model.
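A relocation of this kind amounts to a simple translation of the affected block centroids in eastings and northings. The sketch below illustrates the bookkeeping with a hypothetical blocks data frame; the offsets shown are placeholders, not the values selected in Online Appendix A.

```r
## Shift a discontiguous state's block centroids to its ideological-neighbor position.
## 'blocks' is a hypothetical data frame with columns state, east, and north (in meters).
relocate_state <- function(blocks, state_abbr, shift_east, shift_north) {
  idx <- blocks$state == state_abbr
  blocks$east[idx]  <- blocks$east[idx]  + shift_east
  blocks$north[idx] <- blocks$north[idx] + shift_north
  blocks
}

## Example usage with placeholder offsets:
## blocks <- relocate_state(blocks, "AK", shift_east = 4.0e6, shift_north = -2.5e6)
## blocks <- relocate_state(blocks, "HI", shift_east = 3.5e6, shift_north =  1.0e6)
```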

ZIP Codes versus ZCTAs

Prior kriging work placed survey respondents on a map using ZCTAs, as computed by the census, when respondents’ geographic identifier was ZIP code. Mechanically, if a respondent is known to reside in a geographic area, he or she has to be placed at a specific coordinate using eastings and northings. This has been done by starting at the centroid of the areal unit and jittering within the radius of the unit’s area. The problem of doing this with ZCTAs when ZIP code is the true geographic identifier is that the area of ZCTAs does not exactly overlap with the areas covered by ZIP codes themselves (Beyer, Schultz, and Rushton 2008; Grubesic 2008; Grubesic and Matisziw 2006). Hence, respondents could be placed at a position on the map that puts them in the wrong ZIP code, adding unnecessary measurement error to the model.

Figure 3 illustrates the problem. This figure draws the real map of the 30601 ZIP code in Georgia using data obtained from TomTom as well as the 30601 ZCTA using data obtained from the Census Bureau. The solid blue line shows the ZIP code boundary, and the dashed red line shows the ZCTA boundary. As can be seen, if we knew a resident lived in the 30601 ZIP code but placed them at a location in the ZCTA, we could make several mistakes. First, there are observable points in the ZCTA that are outside of the ZIP code. In the east (right) and the north (top) in particular, there are several places in the ZCTA that stray well outside of the ZIP code. If we placed a survey respondent who identified 30601 as his or her ZIP code in one of these portions of the ZCTA, we would have placed him or her in the wrong ZIP code. A second problem is that the ZCTA does not cover all of the ZIP code. In the southeast (bottom-right) in particular, there is a large block of land where residents of the 30601 ZIP code could live. If we proceed to locate these individuals using the ZCTA, then we have no chance of putting them at the correct location on the map.

Figure 3. Map illustrating nonoverlap of an example ZIP code compared with corresponding ZCTA.

Note. ZCTA = ZIP Code Tabulation Area.

This problem emerges because of the nature of ZIP codes and the Census Bureau’s best effort to meet investigators’ need for ZIP code-referenced demographic data. ZIP codes themselves are not areal units with defined borders. Rather, ZIP codes are routes defined by the U.S. Postal Service prescribing how to deliver mail efficiently. Hence, there is no official map of where one ZIP code ends and the next begins.Footnote 5 To create demographic and geographic data by ZIP code (because it is a common locator recorded for Americans), the Census Bureau created ZCTAs for the 2000 Census—mindful to warn users that ZIP codes crosscut even census blocks, the smallest geographic unit the Bureau records. As a best alternative, the Census Bureau records the ZIP code that a majority of addresses in a census block use. A ZCTA then is formed as the combination of all census blocks with the same majority ZIP code. This is certainly an important tool that the Census Bureau provides, and in cases in which demographics need to be measured by ZIP code, it is the best alternative available. However, residents who have a ZIP code that is held by a minority of addresses in their census block will be placed in a ZCTA that differs from their ZIP code. Furthermore, Grubesic and Matisziw (2006, Table 2) show for the state of New York that ZCTAs differ substantially from ZIP codes in terms of the number of cases as well as the mean and standard deviation of the area they cover. With measurement error like this, it is much safer to use ZIP codes themselves when possible.Footnote 6

In our case, we only need ZIP code locations to locate respondents of the CCES training data. Hence, we turn to a new alternative that locates ZIP codes themselves in space. Specifically, we use a 2014 dataset that draws from TomTom navigation services. This map defines ZIP code boundaries based on actual addresses, drawing a border around the complete set of addresses with a particular ZIP code. We therefore were able to compute the centroid and radius of actual ZIP codes and then link this information to the CCES to place survey respondents in space. This allowed us to estimate a model over our CCES training data that accounts for spatial correlation among nearby respondents.
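The placement step itself is simple once each ZIP code’s centroid and radius are known. The sketch below jitters a respondent uniformly within a disc around the centroid; zip_lookup is a hypothetical table built from the TomTom boundaries.

```r
## Place a respondent within his or her ZIP code: start at the ZIP polygon's centroid
## and jitter uniformly within the polygon's approximate radius.
## 'zip_lookup' is a hypothetical table with columns zip, east, north, and radius.
place_respondent <- function(zip, zip_lookup) {
  row   <- zip_lookup[zip_lookup$zip == zip, ]
  theta <- runif(1, 0, 2 * pi)                   # random direction
  r     <- row$radius * sqrt(runif(1))           # sqrt() gives a uniform draw over the disc
  c(east  = row$east  + r * cos(theta),
    north = row$north + r * sin(theta))
}
```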

Importantly, we only use ZIP codes at the training stage of estimating the model. When forecasting (or kriging) ideology, we use extremely precise census block data from the 2010 Census. At the forecasting stage, we can sample from the population using any geographic unit we wish, as long as we know both the location of the unit and the distribution of demographic predictors within that unit. A census block is the smallest possible geographic unit we can sample from, with 11 million of them defined in 2010. By forecasting using census block data, we can make predictions in places that are often as small as a city block, using U.S. Census records from that small area to sample demographic predictors. This maximizes predictive accuracy and avoids the ZIP code question altogether at the predictive stage of the model. Between the TomTom ZIP code data for the training stage and the U.S. Census Bureau’s block data for the forecasting stage, we maximize the accuracy in estimation and prediction.

Training Model with CCES Data

With Alaska and Hawaii moved to sit next to the west coast of the continental United States for forecasting purposes and the survey respondents placed geographically by their actual ZIP code, we turn to the estimation of our training model. As described before, we are estimating a model over the 2008 CCES (excluding Alaska and Hawaii), which we will then use to make forecasts with 2010 Census population data throughout electoral constituencies in all 50 states. The full specification of our Bayesian model for the training data is as follows:

$$ {\mathbf{Y}}_{\mathbf{s}}\sim \mathcal{N}({\mathbf{X}}_{\mathbf{s}}\beta, \Sigma ), $$

$$ \Sigma ={\sigma}^2\mathbf{H}(\phi )+{\tau}^2\mathbf{I}, $$

$$ H{(\phi )}_{ij}=\mathrm{exp}(-{\phi}^2{d}_{ij}^2), $$

$$ \pi (\beta )\sim \mathrm{flat}, $$

$$ \pi ({\tau}^2/{\sigma}^2)\sim \mathrm{Unif}(6,8), $$

$$ \pi ({\sigma}^2)\sim 1/{\sigma}^2, $$

(5)$$ \pi (1/\phi )\sim \mathrm{Unif}(0,12{,}000). $$

Here, $ \mathbf{Y} $ refers to the individual’s self-reported ideology on a 0 to 100 scale, $ \mathbf{s} $ refers to the individual’s location in eastings and northings, $ \mathbf{X} $ refers to a vector of individual-level demographic predictors of ideology, $ \beta $ is the vector of regression coefficients, Σ is the covariance matrix of $ \mathbf{Y} $ given the predictors, $ {\sigma}^2 $ is the partial sill term, $ {\tau}^2 $ is the nugget effect, $ \mathbf{H} $ is the correlation matrix of observations, $ \phi $ is the decay term, and $ {d}_{ij} $ is the geographic distance between observations $ i $ and $ j $. Of note, the third line of the specification shows that each cell of the correlation matrix is defined by a Gaussian correlation function: this means that the correlation between observation $ i $ and observation $ j $ depends solely on the distance $ ({d}_{ij}) $ between them as prescribed by the correlation function.Footnote 7 Each coefficient $ (\beta ) $ has a flat prior, the ratio of the nugget to the partial sill has a uniform prior from 6 to 8 (based on our observation that the nugget variance is about 7 times the partial sill variance), the partial sill itself has a (conservative) reciprocal prior, and the range term $ (1/\phi ) $ has a uniform prior from 0 to 12,000 kilometers.
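As noted below, we rely on the geoR package for estimation. The following is a minimal sketch of how one bootstrap run of this specification could be set up with geoR’s krige.bayes(). The data frame svy_b is a hypothetical CCES subsample, the demographic covariates are omitted from the trend for brevity, and the argument names reflect our reading of the geoR interface, so they should be checked against the installed version.

```r
library(geoR)

## One bootstrap run of the training model in equation (5) (sketch; see caveats above).
## svy_b: hypothetical 5% CCES subsample with eastings/northings in columns 1-2 and
## self-reported ideology in column 3, with coordinates in the same units as the priors.
gd <- as.geodata(svy_b, coords.col = 1:2, data.col = 3)

fit_b <- krige.bayes(
  gd,
  model = model.control(cov.model = "gaussian",  # Gaussian correlation, as in equation (5)
                        trend.d = "2nd"),        # quadratic trend in eastings/northings;
                                                 # demographic covariates would enter the
                                                 # trend as well (omitted for brevity)
  prior = prior.control(beta.prior         = "flat",
                        sigmasq.prior      = "reciprocal",
                        ## geoR parameterizes the Gaussian model as exp(-(h/phi)^2), so
                        ## its phi plays the role of the range term 1/phi in our notation
                        phi.prior          = "uniform",
                        phi.discrete       = seq(100, 12000, by = 100),
                        tausq.rel.prior    = "uniform",
                        tausq.rel.discrete = seq(6, 8, by = 0.1)),
  output = output.control(n.posterior = 1000)
)

## Posterior draws of the regression and variance parameters are stored in fit_b$posterior.
```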

To fulfill this step of actually estimating the spatial model with the CCES data, recall from Figure 1 that in Step (3.A) we use a bootstrap to obtain estimates, given that the data are too big to be included in the model all at once. In addition, for each bootstrap iteration, we estimate the model using the five-step algorithm from Diggle and Ribeiro (2007) as described in Monogan and Gill (2016, 110–11). First, we draw several values from a discrete version of the uniform priors for $ {\tau}^2/{\sigma}^2 $ and $ 1/\phi $. Second, we estimate the conditional posterior distribution, $ p(\frac{\tau^2}{\sigma^2},\frac{1}{\phi}|\mathbf{Y}) $, by placing our draws from the discrete prior into the following formula:

(6)$$ p(\frac{\tau^2}{\sigma^2},\frac{1}{\phi}|\mathbf{Y})\propto \pi (\frac{\tau^2}{\sigma^2})\pi (\frac{1}{\phi})|{\mathbf{V}}_{\overset{\sim }{\beta}}{|}^{\frac{1}{2}}{|\mathbf{H}(\phi )+(\frac{\tau^2}{\sigma^2})\mathbf{I}|}^{-\frac{1}{2}}{({\widehat{\sigma}}^2)}^{-\frac{n}{2}}, $$

where $ {\mathbf{V}}_{\overset{\sim }{\beta}} $ is the correlation matrix of the regression coefficients estimated with generalized least squares (GLS) using the current draw of $ 1/\phi $, $ n $ is the sample size, and $ {\widehat{\sigma}}^2 $ is an estimate of the partial sill based on residuals drawn from the GLS coefficient estimates.Footnote 8 All other terms are defined as before. Third, we draw a single set of sample posterior values for $ {\tau}^2/{\sigma}^2 $ and $ 1/\phi $ from equation (6). Fourth, we attach the set of sampled values to $ p(\beta, {\sigma}^2|\frac{\tau^2}{\sigma^2},\frac{1}{\phi},\mathbf{Y}) $ and compute the corresponding conditional posterior distributions as:

$$ {\sigma}^2|\mathbf{Y},{\frac{\tau^2}{\sigma^2}}^{*},{\frac{1}{\phi}}^{*}\sim {\chi}_{ScI}^2(n,{\widehat{\sigma}}^2), $$

(7)$$ \beta |\mathbf{Y},{\sigma}^2,{\frac{\tau^2}{\sigma^2}}^{*},{\frac{1}{\phi}}^{*}\sim \mathcal{N}(\overset{\sim }{\beta},{\sigma}^2{V}_{\overset{\sim }{\beta}}). $$

The terms in these equations are again drawn from the GLS estimates from the initial draw of $ \phi $. After taking a draw from the scaled inverse $ {\chi}^2 $ distribution for the partial sill $ {\sigma}^2 $, this term is linked with the draws of the relative nugget and range terms when drawing the regression coefficients from a normal distribution. By repeating the third and fourth steps, we build a Monte Carlo sample large enough to reflect the joint posterior for the full parameter set $ (\frac{\tau^2}{\sigma^2},\frac{1}{\phi},{\sigma}^2,\beta ) $. Note that this is not a Markov chain Monte Carlo (MCMC) procedure, but rather a Monte Carlo-Feasible Generalized Least Squares (MC-FGLS) hybrid estimator. We use the MC-FGLS procedure primarily because its sampling strategy is computationally simpler and because existing software in the geoR package already implements this procedure.
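For readers who want to see the logic of equations (6) and (7) laid out explicitly, the following base R sketch evaluates the conditional posterior of $ ({\tau}^2/{\sigma}^2,1/\phi ) $ on a discrete grid and then draws $ {\sigma}^2 $ and $ \beta $ from their conditional posteriors. It is an illustration of the algorithm with hypothetical inputs (Y, X, coords), not our production code; in practice we rely on geoR.

```r
## A base-R sketch of the sampler in equations (6) and (7) for one bootstrap subsample.
## Y (outcome vector), X (design matrix), and coords (eastings/northings) are hypothetical
## inputs; the discrete grids mirror the uniform priors in equation (5).
mcfgls_draw <- function(Y, X, coords,
                        nug_rel_grid = seq(6, 8, by = 0.25),
                        range_grid   = seq(500, 12000, by = 500),
                        n_draws      = 100) {
  n <- length(Y); p <- ncol(X)
  d <- as.matrix(dist(coords))
  grid <- expand.grid(nug_rel = nug_rel_grid, range = range_grid)

  ## Step 2: log conditional posterior of (tau^2/sigma^2, 1/phi) at each grid point, eq. (6)
  gls <- lapply(seq_len(nrow(grid)), function(g) {
    phi <- 1 / grid$range[g]
    V   <- exp(-(phi^2) * d^2) + grid$nug_rel[g] * diag(n)   # H(phi) + (tau^2/sigma^2) I
    Vi  <- solve(V)
    Vb  <- solve(t(X) %*% Vi %*% X)                          # V_beta-tilde
    b   <- Vb %*% t(X) %*% Vi %*% Y                          # GLS coefficients
    s2  <- as.numeric(t(Y - X %*% b) %*% Vi %*% (Y - X %*% b)) / n
    lp  <- 0.5 * determinant(Vb)$modulus - 0.5 * determinant(V)$modulus - (n / 2) * log(s2)
    list(b = b, Vb = Vb, s2 = s2, logpost = as.numeric(lp))
  })
  lp <- sapply(gls, function(g) g$logpost)
  w  <- exp(lp - max(lp))                                    # unnormalized posterior weights

  ## Steps 3-5: sample grid cells, then draw sigma^2 and beta from equation (7)
  draws <- replicate(n_draws, {
    g    <- sample(length(gls), 1, prob = w)
    s2   <- n * gls[[g]]$s2 / rchisq(1, df = n)              # scaled inverse chi-square draw
    beta <- as.numeric(gls[[g]]$b + t(chol(s2 * gls[[g]]$Vb)) %*% rnorm(p))
    c(sigma2 = s2, nug_rel = grid$nug_rel[g], range = grid$range[g], beta = beta)
  })
  t(draws)
}
```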

We report the results of our model in Table 1. For each parameter in the table, the first numeric column reports the mean of the marginal posterior distribution for the parameter, which serves as a point estimate of our term. The second numeric column reports the standard deviation of the marginal posterior distribution for the parameter, which serves as a standard error. The last two columns report the 90% credible interval, meaning there is a 90% probability that the parameters fall within that range. The first 23 rows report summary statistics for the regression coefficients included in the model. Our goal with this model is to maximize predictive ability, so we include any predictor that is both known to predict ideology and for which population data are observed. As the table shows, these predictors include age, education, race, sex, income, religion, rural versus urban, homeownership, and employment status. We also model trends in geographic space by including the respondents’ coordinates in eastings and northings in the model—in linear, interactive, and quadratic forms. The last three rows of the table summarize the marginal posteriors for the three terms that characterize the spatial error process: the partial sill $ ({\sigma}^2) $, range $ (1/\phi ) $, and nugget $ ({\tau}^2) $ terms.

Table 1. Bayesian Spatial Model of Self-Reported Ideology Using BRSS.

Note. N = 21,764. Data from 2008 CCES, excluding Alaska and Hawaii. Results based on 100 subsamples of 5% of the original data; 1,000 iterations were run for each subsample, with results based on the average for each sample. Computed with the geoR 1.7-5.2.1 library in R 3.4.4. Eastings and northings rescaled to megameters (Mm) in this table. BRSS = Bootstrapped Random Spatial Sampling; CI = credible interval; CCES = Cooperative Congressional Election Study.

Figure 4 offers another illustration of how the spatial error process works given these parameters. The horizontal axis of this plot represents the approximate distance between two survey respondents’ locations. The vertical axis represents the semivariance of observations separated by this distance—referring to either half the variance of the difference between observed responses separated by that distance or the whole variance of undifferenced responses separated by that distance when pooled together. The open black circles along the top show the empirical semivariance of raw survey responses from the 2008 CCES. The blue crosses show the empirical semivariance of residuals from an initial model estimated with OLS that used the same predictors reported in Table 1.

Figure 4. BRSS estimated semivariogram.

Note. BRSS = Bootstrapped Random Spatial Sampling.

Finally, the red line in Figure 4 shows the functional form of the Gaussian semivariance function estimated in our full Bayesian model. This line is computed by assuming that the nugget, range, and partial sill are at their mean values from the posterior distribution. As is typically the case, the semivariance starts lower at more proximate values and rises as distance increases. A low semivariance means that the correlation between observations is high, and similarly a high semivariance implies a low correlation between observations. Our result therefore means that in our forecasting model, the responses of nearby survey respondents will get greater weight in predicting ideology at a particular location than the responses of more distant respondents.

Forecasts of Public Ideology

With a training model of ideology in hand, we now turn to using this model to make forecasts of public opinion throughout electoral constituencies following the point-to-block realignment strategy described earlier. To implement this plan, we proceed in four steps. First, we kriged 724,814 simulated citizens. 701,050 of these citizens were located in proportion to the population distribution in 11 million census blocks in 2010, while the last 23,764 were included so that each census tract would have at least one simulated citizen in it. This strategy has the advantage of placing citizens in locations reflective of the true population density, which later on will make it easier to cover legislative districts that are compact in size, without completely overlooking sparsely populated areas. For each draw, we started at the centroid of the census block and jittered from the block’s midpoint to the extent of the block’s radius. This allowed us to place each simulated citizen in eastings and northings. Again, Alaska’s and Hawaii’s census blocks were relocated to sit off of the continental west coast.
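The sketch below illustrates this first step: census blocks are sampled in proportion to population, and each simulated citizen is jittered from the block centroid within the block’s radius. The blocks data frame (with columns pop, east, north, and radius) is a hypothetical placeholder.

```r
## Draw simulated citizens for the forecasting stage.
draw_citizens <- function(blocks, n_draws) {
  idx   <- sample(nrow(blocks), n_draws, replace = TRUE, prob = blocks$pop)  # by population
  theta <- runif(n_draws, 0, 2 * pi)
  r     <- blocks$radius[idx] * sqrt(runif(n_draws))   # uniform jitter within the block radius
  data.frame(block = idx,
             east  = blocks$east[idx]  + r * cos(theta),
             north = blocks$north[idx] + r * sin(theta))
}
```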

Second, once we kriged a simulated citizen, we assigned this citizen covariate values consistent with population data for the location. For each census block, we know the block’s distribution of age, sex, race, and homeownership, so we draw covariate values for the simulated citizen in proportion to the local distribution. For other covariates, we have to go to a higher level of aggregation, but we always use the most local possible distribution to simulate covariate values. For instance, we simulate education and income based on block group-level data, and we simulate employment status based on tract-level data. We also simulate religion and urban–rural status with county-level data, using government data besides the 2010 Census (Grammich et al. 2012; United States Department of Agriculture 2013).
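The covariate assignment can be sketched as a lookup that always draws from the most local level available. The tables and category shares below (block, block group, tract, and county) are hypothetical placeholders for the Census-based distributions described above.

```r
## Assign covariates to one simulated citizen from the most local level available.
## 'blk' is one row of a hypothetical block table with category shares and parent IDs;
## the other tables are hypothetical block group, tract, and county lookups.
draw_covariates <- function(blk, block_group_tab, tract_tab, county_tab) {
  bg <- block_group_tab[block_group_tab$id == blk$block_group, ]
  data.frame(
    age    = sample(c("18-34", "35-64", "65+"), 1,
                    prob = unlist(blk[c("p_age_18_34", "p_age_35_64", "p_age_65up")])),   # block
    owner  = rbinom(1, 1, blk$p_owner),                                                   # block
    educ   = sample(c("hs_or_less", "some_college", "ba_plus"), 1,
                    prob = unlist(bg[c("p_hs_or_less", "p_some_college", "p_ba_plus")])), # block group
    employ = rbinom(1, 1, tract_tab$p_employed[tract_tab$id == blk$tract]),               # tract
    rural  = rbinom(1, 1, county_tab$p_rural[county_tab$id == blk$county])                # county
  )
}
```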

Third, we forecasted ideology for each simulated citizen using the model estimated over the training data. This meant placing all simulated covariate values into the mean model. In addition, we use the spatial variance process model to predict a spatial error term for each simulated citizen as a weighted combination of the training model residuals, with more proximate training observations getting a higher weight. Fourth, we gathered all simulated citizens falling within a constituency and used the average of their forecasts to compute a district average ideology score. This allowed us to make forecasts for states, congressional districts, upper chambers of state legislatures, and lower chambers of state legislatures.
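The last two steps can be sketched by reusing the correlation helpers from the earlier block-forecast sketch: the spatial error term weights the training residuals by their Gaussian correlation with each simulated citizen’s location, and the forecasts are then summarized by district. The objects sim, train, X_sim, and X_train are hypothetical, and the parameter values would come from the training model.

```r
## Per-citizen forecasts (mean model plus weighted training residuals), then district summaries.
W <- rho(cross_dist(as.matrix(sim[, c("east", "north")]),
                    as.matrix(train[, c("east", "north")])), phi)   # citizen-by-respondent weights
K <- solve(rho(as.matrix(dist(train[, c("east", "north")])), phi) +
             (tau2 / sigma2) * diag(nrow(train)))
sim$ideology_hat <- as.numeric(X_sim %*% beta +
                               W %*% K %*% (train$ideology - X_train %*% beta))

## District-level measures: mean and variance of the kriged forecasts in each district
district_mean <- tapply(sim$ideology_hat, sim$district, mean)
district_var  <- tapply(sim$ideology_hat, sim$district, var)
```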

Measures of Ideology and Validity Checks

Figure 5 presents our estimates of ideology in all 50 states. In both panels of the figure, the horizontal axis represents our estimates for the average state ideology, with higher values meaning more conservative. In Figure 5(a), the vertical axis represents the percentage of the two-party vote that Obama won in 2012. Each state is represented by its two-letter postal code, and the line represents the best fit from a regression that models Obama’s vote share as a function of our kriged ideology scores. As the scatterplot and best fit line both show, there is a close relationship between our measures of kriged ideology and presidential vote share, which serves as an external validation of our scores.

Figure 5. Scatterplots of 2012 presidential vote by state and 2011 U.S. Senators’ ideology, each against kriged measure of 2010 state public ideology: (a) Obama vote share 2012 and (b) state ideology 2011.

In addition, Figure 5(b) illustrates how well our measures of public ideology predict the ideology of U.S. senators elected from these respective states. In Figure 5(b), the horizontal axis again is our kriged measure of public ideology. The vertical axis is the first dimension score of DW-NOMINATE, which is frequently used as a measure of member ideology (McCarty, Poole, and Rosenthal 1997; Poole and Rosenthal 1997). Higher values of NOMINATE are generally interpreted as more conservative, while lower values are more liberal. Republicans are represented by a red “R” and Democrats by a blue “D.” The line represents a regression predicting each senator’s NOMINATE score with public opinion ideology. As can be seen, more conservative states are more likely to elect conservative members and more likely to elect Republicans. The scatterplot also shows that within-party variance conforms to expectations: moderate Republicans are elected from more liberal states, and moderate Democrats are elected from more conservative states. Notably, these measures of state ideology are similar to those reported by Monogan and Gill (2016): for the continental 48 states, the measures correlate at $ r=.9893 $ $ (SE=0.0211) $. These measures, however, have the advantage of offering scores for Alaska and Hawaii and using more precise input measures.

Figure 6 turns to the 435 districts of the U.S. House of Representatives and displays our measure of public ideology by district. The horizontal axis displays our measure of public opinion ideology by district, and the vertical axis represents each House member’s first dimension DW-NOMINATE score. Again, every Democrat is represented with a blue “D” and every Republican with a red “R.” The line shows the results of a regression of elected members’ ideology on district ideology. Even at this finer level of geographic resolution, a constituency’s ideology predicts both whether its voters elect a Republican or a Democrat and how conservative or liberal the member will be. Again, moderate members of each party tend to be drawn from districts that normally would not elect a member of their respective party. Hence, for both chambers of Congress, we see a relationship between voters’ ideology and the ideology of their members. The fact that this well-established electoral connection continues to be supported by our data further validates our kriging approach.

Figure 6. Scatterplot of 2011 U.S. House of Representatives members’ ideology against kriged measure of 2010 congressional district public ideology.
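The validity checks in Figures 5 and 6 reduce to bivariate regressions of elected members’ scores on the kriged district measure. A hedged R sketch of that check follows; the data frame is simulated for illustration, so the coefficients it produces are not those underlying the figures.

# Hypothetical district-level data: kriged ideology, member party, and member
# DW-NOMINATE score; replace with the released estimates and roll call scores
set.seed(4)
dat <- data.frame(krige = rnorm(435),
                  party = sample(c("D", "R"), 435, replace = TRUE))
dat$nominate <- with(dat, -0.3 + 0.5 * krige + 0.6 * (party == "R") + rnorm(435, sd = 0.1))

summary(lm(nominate ~ krige, data = dat))$coefficients                # pooled check, as in Figure 6
by(dat, dat$party, function(d) coef(lm(nominate ~ krige, data = d)))  # within-party check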

Finally, we applied our kriging technique to constituencies as small as state legislative districts. Figure 7 illustrates our measures of constituency ideology in both lower and upper chambers of the state legislatures. In both panels, the horizontal axis captures public ideology with our kriging measure, whereas the vertical axis measures state legislators’ ideology with the common space measure developed by Shor and McCarty (2011). In both panels, red dots represent Republican legislators, and blue dots represent Democratic legislators. Districts and legislators for lower chambers are presented in Figure 7(a), whereas upper chambers are presented in Figure 7(b). In each panel, the regression line shows a positive association between constituency conservatism and legislator conservatism. Even at this finest level, where many constituencies are no larger than a neighborhood, our measures of public opinion correspond to the electoral connection we would expect for state legislators. In addition, for the lower chamber districts, our measure correlates with the measures reported by Tausanovitch and Warshaw (2013) at $r = .6117$ ($SE = 0.0119$). Thus, the two measures appear to be capturing the same concept of constituency ideology. Overall, for many sizes of electoral constituencies, our measures of public ideology pass the external validity checks we consider.

Figure 7. Scatterplots of 2011 state legislators’ ideology in lower and upper chambers, each against kriged measure of 2010 state legislative district public ideology: (a) Lower chambers and (b) upper chambers.

Illustration: A Roll Call Vote in the Florida Legislature

Quality measures of district public opinion are essential for the sake of evaluating representation in the United States. As an illustration of how constituency public ideology measures can be useful, we turn to a vote in the Florida House of Representatives in 2011. The bill, H. 7161, extended an expiring law on licenses for concealed weapons indefinitely. Under this bill, which was signed into law on June 2, 2011, by Governor Rick Scott, personal information from applications for concealed weapons licenses is exempt from public records disclosures. When the House of Representatives voted on this bill, it passed 99-12. Support was unanimous among the 74 Republicans. Democrats split 25-12 in favor of the bill.

Most state legislative bills are either uncontroversial and widely supported or split along party lines. With H. 7161, however, there is a notable within-party split among Democrats on an important gun control question. Do features of the representatives’ constituencies explain this split? Is it the case that Democrats from liberal districts in Florida felt the most compelled to vote against a more permissive gun law? With our measure of district ideology, we can evaluate whether that is true, and we can compare our findings with those that would result from other measures of district ideology.

Figure 8 displays the 37 Democrats who voted on H. 7161. In each panel, the horizontal axis represents the ideology of the constituency, with values to the left meaning more liberal and values to the right meaning more conservative. The vertical axis represents Shor and McCarty’s (2011) measure of state legislative ideology, with lower values meaning more liberal and higher values meaning more conservative. A Democrat who voted Yea on H. 7161 is represented with a black plus sign; a Democrat who voted Nay is represented with a red circle. The gray lines represent the means among these 37 Democrats of constituency ideology and legislator ideology, respectively.

Figure 8. Scatterplots of Democrats in the Florida House of Representatives depicting their votes on H. 7161 against member ideology and several measures of constituency ideology: (a) Kriging, (b) MRP, and (c) presidential vote.

Note. MRP = multilevel regression with poststratification.

The first panel of Figure 8 places our kriging-based measure of state House district ideology on the horizontal axis. As can be seen, 8 of the 12 members who voted against this bill scored below average on our measure of district ideology. Hence, most of those who voted against this bill hailed from particularly liberal districts, as would make sense. Turning to the vertical axis, 8 of the 12 who voted against the bill also scored below the Democrats’ average on Shor and McCarty’s ideology scores. Given that those ideology scores are based on roll calls, the legislator ideology measure ought, on average, to do quite well in discerning how votes will break down. Hence, predictions from Shor and McCarty’s measure provide a good gauge of what constitutes good predictive ability on this vote (see footnote 9). Our measure of district ideology thus offers a clear representation story for why there is deviation within the Democratic caucus.

How does our measure compare with two alternatives? Panels (b) and (c) of Figure 8 substitute Tausanovitch and Warshaw’s (2013) MRP measure of district ideology and McCain’s share of the presidential vote in the district in 2008, respectively. Again, these two panels show that 8 of 12 Nay voters are more liberal than the average Florida Democrat on Shor and McCarty’s (2011) measure of legislator ideology. In Panel (b), however, only 5 of 12 Nay voters appear to come from more liberal than average districts on the MRP measure. In Panel (c), only 6 of 12 Nay voters came from districts with particularly low support for McCain in 2008. Hence, the conclusion about representational quality varies with the choice of district ideology measure. In this case, the kriging-based measure offers the strongest evidence that legislators are responsive to their constituents’ outlook. Based on our argument in this article, we believe the kriging-based measure is the most theoretically informed of the three, and hence we are most confident in that conclusion for this application.
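The comparison across the three panels amounts to counting how many Nay voters fall below the caucus mean on each district-ideology measure. A small R sketch of that tabulation is shown below; the data frame stands in for the 37 Democrats and its values are invented, so only the bookkeeping, not the counts, should be taken from it.

# Hypothetical records for the 37 Democrats: vote on H. 7161 and three district
# ideology measures (kriging, MRP, and 2008 McCain vote share)
set.seed(5)
fl <- data.frame(vote   = sample(c("Yea", "Nay"), 37, replace = TRUE, prob = c(25, 12)),
                 krige  = rnorm(37),
                 mrp    = rnorm(37),
                 mccain = runif(37, 0.2, 0.6))

# For each measure, count Nay voters from more-liberal-than-average districts
sapply(fl[, c("krige", "mrp", "mccain")],
       function(x) sum(fl$vote == "Nay" & x < mean(x)))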

Implications for the Applied Researcher

In this article, we have described and implemented the method of Bayesian universal kriging as a way of using survey responses to forecast public opinion in electoral constituencies. Using the 2008 CCES and the 2010 Census, we have created measures of public opinion for the year 2010 at several levels. In doing so, we have improved on past work with this method by correcting a problem of misalignment between ZIP codes and ZIP Code Tabulation Areas (ZCTAs), and we have found a means of incorporating Alaska and Hawaii into this type of measure. Using presidential vote share and measures of legislative ideology, we verify that this measure behaves as it should relative to other established measures in American politics.

Our resulting measures are now freely available for any researcher to use in his or her own analysis. These measures can be downloaded from Dataverse (https://doi.org/10.15139/S3/7NNASB) or obtained by installing our krige package, which is available on the Comprehensive R Archive Network. These new measures serve the practical researcher in several ways. First, by releasing a measure for 2010 based on the most modern and precise Census data, our measures are more recent than many alternatives, even those measured at the state level. Second, our measures capture ideology at multiple levels, providing a means of capturing public sentiment not only for the 50 states but also for congressional and state legislative districts.
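Access from R is straightforward. The sketch below installs the package from CRAN and lists its bundled data objects; we do not hard-code the names of the estimate data sets here, so consult the package documentation or the Dataverse archive for the exact object and file names.

# Install and load the krige package from CRAN
install.packages("krige")
library(krige)

# List the data sets shipped with the package; the district- and state-level
# ideology estimates are also archived at https://doi.org/10.15139/S3/7NNASB
data(package = "krige")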

The approach of point-to-block realignment with universal kriging has the potential to fill public opinion measurement needs in many ways. To start, the realignment of kriged points into constituencies need not be to existing legislative districts. A natural extension would be to allow users to draw hypothetical districts and extract public opinion in the proposed new district, which would have applications for state legislative and congressional redistricting. Another extension would be to allow ordinal responses from the survey respondents, such as when a public opinion question is asked on a 3-, 5-, or 7-point scale. Doing this would open up the possibility of forecasting ideology at the four levels we consider in more years (when only limited versions of ideology questions are available), and it would also allow for the creation of issue-specific public opinion measures based on questions of this type. Finally, the modeled outcome need not be ideology: any surveyed attitude with geocoded responses can be modeled this way, and our approach can even be applied to epidemiological outcomes. For now, we have produced measures that researchers can use for state-level, congressional-level, and state legislative-level research. However, we believe there is an even more promising research agenda with Bayesian kriging that will enable even better measures over time, space, and issue area.
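As a sketch of how the hypothetical-district extension could work, the fragment below assigns already-forecast simulated citizens to a user-drawn polygon with point.in.polygon() from the sp package and averages their forecasts; the polygon coordinates and the forecast values are invented for illustration.

library(sp)  # provides point.in.polygon()

# Hypothetical simulated citizens with kriged ideology forecasts
set.seed(6)
sim <- data.frame(e = runif(1000, 0, 100), n = runif(1000, 0, 100),
                  forecast = rnorm(1000))

# A user-drawn district expressed as a closed polygon (coordinates invented)
poly_e <- c(20, 80, 80, 20, 20)
poly_n <- c(20, 20, 60, 60, 20)

inside <- point.in.polygon(sim$e, sim$n, poly_e, poly_n) > 0
mean(sim$forecast[inside])  # ideology estimate for the proposed district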

Authors’ Note

Previous versions of this work have been presented at the Annual Summer Meeting of the Society for Political Methodology, July 2016, Houston; the Sixth Asian Political Methodology Meeting, January 2019, Kyoto; the Department of Political Science at the University of Mississippi, February 2019; the Department of Government at Dartmouth College, April 2019; and the Nineteenth Annual State Politics and Policy Conference, May 2019, College Park, Maryland. Complete replication information and our estimates of ideology in 2010 are available at Dataverse (https://doi.org/10.15139/S3/7NNASB). Our estimates of state, congressional district, and state legislative ideology, as well as functions to estimate kriging models described in this article, are available through the krige package, available on the Comprehensive R Archive Network.

Acknowledgments

For helpful assistance, we thank Stephen Jessee, Jason Byers, Simon Heuberger, Min Hee Seo, Yunkyu Sohn, Susan Allen, Conor Dowling, Scott Ainsworth, Joshua Dyck, and Chris Warshaw.

Declaration of Conflicting Interests

The author declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author disclosed receipt of the following financial support for research, authorship, and/or publication of this article: This study is based upon work supported by the National Science Foundation under Grant Numbers SES-1630265 and SES-1630263.

Supplemental Material

Supplemental material for this article is available online at JeffGill.org.

Author Biography

At American University, Jeff Gill is distinguished professor in the Department of Government and the Department of Mathematics & Statistics, as well as a member of the Center for Behavioral Neuroscience and founding director of the Center for Data Science. He has done extensive work in the development of Bayesian hierarchical models, nonparametric Bayesian models, and elicited prior development from expert interviews, as well as in fundamental issues in statistical inference. Current applied work includes blood and circulation physiology (including how our bodies change these dynamics in times of stress such as injury), long-term mental health outcomes from children’s exposure to war, pediatric head trauma, analysis of terrorism data, survey research methodologies, and spatial analysis of social and biomedical conditions.

Footnotes

1. An important, recent exception to this is the annual Cooperative Congressional Election Study (CCES), which is an Internet-based survey that covers every congressional district in the nation (Ansolabehere 2011).

2. Hanretty, Lauderdale, and Vivyan (2018) show that a CAR-based (conditional autoregressive) random effect, like the one Selb and Munzert (2011) use, does improve the performance of predictions, but not as much as including constituency-level predictors in the model.

3. Northings and eastings are an alternative to latitude and longitude advocated by the U.S. National Imagery and Mapping Agency and used by most militaries. They are defined by the Universal Transverse Mercator (UTM) system, which establishes 60 curved vertical “strips” around the globe, each spanning 6 degrees of longitude starting at 180 degrees. Within this system, grid points are offsets in meters, where northing is the distance from the equator and easting is the eastward distance within the zone, measured from the zone’s central meridian with a constant (false easting) added to keep values positive. The southern hemisphere is made positive in northings by adding a constant. There are a variety of possible projections and reference points, and we define ours later in the article.

4. Notably on Alaska, though, the Boundary Peaks of the Alaska–British Columbia/Yukon border are a formidable barrier. Furthermore, driving a car from Alaska’s capital to the nearest major Canadian city, Vancouver, is a 1,640-mile journey that goes partly northward before turning south and is not possible during certain times of the year.

5. It should be noted, though, that ZIP codes do follow a largely hierarchical structure by digit. One possible application with ZIP code data would be to estimate a hierarchical model using ZIP codes to gauge regions and subregions. In our case, however, the data included in our forecasts are based not on ZIP codes but on precise census block data. Hence, the hierarchical approach would not be productive in this case.

6. We should note that Monogan and Gill (2016) use ZIP Code Tabulation Areas (ZCTAs) as an approximation in their measures. We correlated our measures for the continental 48 states with their measures and found $r = .9893$ ($SE = 0.0211$). Thus, the results in a case like this appear to be robust to this issue. Again, however, ZIP code polygons themselves would not introduce this measurement error, so they are the best choice when possible.

7. We chose the Gaussian correlation function by estimating an initial linear model with ordinary least squares (OLS) using the predictor specification that we use in the final model. With the residuals of this model, we obtained the empirical semivariogram and chose the best-fitting parametric semivariogram of several commonly used functions. The Gaussian function had the lowest Bayesian information criterion (BIC) score at 297,881. This was a better score than the wave function (297,896), the exponential function (306,309), or the Matérn with $ \kappa =1 $ (298,268).
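The selection described in this footnote can be illustrated in base R: compute an empirical semivariogram from the OLS residuals, fit candidate parametric forms by nonlinear least squares, and compare information criteria. The sketch below generates a stand-in empirical semivariogram from a Gaussian form (with nugget $\tau^2$, partial sill $\sigma^2$, and range $\phi$) rather than from our residuals, so it shows the workflow, not the reported BIC values.

# Stand-in empirical semivariogram: binned distances and semivariance values
# generated from a Gaussian form plus noise (in practice these come from the
# residuals of the initial OLS fit)
set.seed(7)
emp <- data.frame(dist = seq(2.5, 57.5, by = 5))
emp$gamma <- 0.2 + 0.8 * (1 - exp(-(emp$dist / 20)^2)) + rnorm(nrow(emp), sd = 0.02)

# Fit Gaussian and exponential semivariograms and compare fit
fit_gau <- nls(gamma ~ tau2 + sigma2 * (1 - exp(-(dist / phi)^2)), data = emp,
               start = list(tau2 = 0.2, sigma2 = 0.8, phi = 20))
fit_exp <- nls(gamma ~ tau2 + sigma2 * (1 - exp(-dist / phi)), data = emp,
               start = list(tau2 = 0.2, sigma2 = 0.8, phi = 20))
BIC(fit_gau, fit_exp)  # lower is better, as in the comparison reported above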

8. Specifically, $\tilde{\beta} = (\mathbf{X}'\mathbf{H}(\phi)^{-1}\mathbf{X})^{-1}\mathbf{X}'\mathbf{H}(\phi)^{-1}\mathbf{Y}$. Hence, $V_{\tilde{\beta}} = (\mathbf{X}'\mathbf{H}(\phi)^{-1}\mathbf{X})^{-1}$. In addition, $\hat{\sigma}^2 = \frac{1}{n}(\mathbf{Y}-\mathbf{X}\tilde{\beta})'\mathbf{H}(\phi)^{-1}(\mathbf{Y}-\mathbf{X}\tilde{\beta})$.
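As a worked illustration of these quantities, the base-R sketch below computes them for small simulated inputs; here $\mathbf{H}(\phi)$ is built from an illustrative Gaussian correlation with a small nugget, and none of the values correspond to our fitted model.

# Simulated inputs: design matrix X, outcome Y, and correlation matrix H(phi)
set.seed(8)
n  <- 50
X  <- cbind(1, rnorm(n))
Y  <- rnorm(n)
D  <- as.matrix(dist(cbind(runif(n), runif(n))))   # pairwise distances
H  <- exp(-(D / 0.3)^2) + diag(0.1, n)             # illustrative Gaussian correlation + nugget
Hi <- solve(H)

beta_tilde <- solve(t(X) %*% Hi %*% X, t(X) %*% Hi %*% Y)  # generalized least squares
V_beta     <- solve(t(X) %*% Hi %*% X)                     # as in the footnote
e          <- Y - X %*% beta_tilde
sigma2_hat <- as.numeric(t(e) %*% Hi %*% e) / n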

9. Intriguingly, only one legislator who was above average on both measures voted against the bill: Jim Waldman, who represented District 95 in Broward County.

References

Amos, Brian, McDonald, Michael P., Watkins, Russell. 2017. “When Boundaries Collide: Constructing a National Database of Demographic and Voting Statistics.” Public Opinion Quarterly 81 (Suppl. 1): 385–400.
Ansolabehere, Stephen D. 2011. “CCES, Common Content, 2008.” Ver. 4. http://hdl.handle.net/1902.1/14003 (accessed June 1, 2020).
Ansolabehere, Stephen D., Snyder, James M., Stewart, Charles. 2001. “Candidate Positioning in U.S. House Elections.” American Journal of Political Science 45:136–59.
Banerjee, Sudipto, Carlin, Bradley P., Gelfand, Alan E. 2015. Hierarchical Modeling and Analysis for Spatial Data. 2nd ed. New York: Chapman & Hall/CRC.
Berry, William D., Ringquist, Evan J., Fording, Richard C., Hanson, Russell L. 1998. “Measuring Citizen and Government Ideology in the American States, 1960-93.” American Journal of Political Science 42:327–48.
Beyer, Kirsten M. M., Schultz, Alan F., Rushton, Gerard. 2008. “Using ZIP Codes as Geocodes in Cancer Research.” In Geocoding Health Data: The Use of Geographic Codes in Cancer Prevention and Control, Research, and Practice, eds. Rushton, Gerard, Armstrong, Marc P., Gittler, Josephine, Greene, Barry R., Pavlik, Claire E., West, Michele M., Zimmerman, Dale L. New York: CRC Press, 37–68.
Cressie, Noel A. C. 1993. Statistics for Spatial Data. Rev. ed. New York: John Wiley.
DeLeon, Richard E., Naff, Katherine C. 2004. “Identity Politics and Local Political Culture: Some Comparative Results from the Social Capital Benchmark Survey.” Urban Affairs Review 39 (6): 689–719.
Diggle, Peter J., Ribeiro, Paulo J. Jr. 2007. Model-Based Geostatistics. New York: Springer.
Djupe, Paul A., Sokhey, Anand E. 2011. “Interpersonal Networks and Democratic Politics.” PS: Political Science & Politics 44 (1): 55–59.
Elazar, Daniel J. 1966. American Federalism: A View from the States. New York: Thomas Y. Crowell.
Erikson, Robert S., Wright, Gerald C. 1980. “Policy Representation of Constituency Interests.” Political Behavior 2:91–106.
Erikson, Robert S., Wright, Gerald C., McIver, John P. 1993. Statehouse Democracy: Public Opinion and Policy in the American States. New York: Cambridge University Press.
Fischer, David Hackett. 1989. Albion’s Seed: Four British Folkways in America. New York: Oxford University Press.
Garreau, Joel. 1981. The Nine Nations of North America. Boston: Houghton Mifflin.
Gastil, Raymond D. 1975. Cultural Regions of the United States. Seattle: University of Washington Press.
Gelman, Andrew, Little, Thomas C. 1997. “Poststratification into Many Categories Using Hierarchical Logistic Regression.” Survey Methodology 23:127–35.
Gimpel, James G., Schuknecht, Jason E. 2003. Patchwork Nation: Sectionalism and Political Change in American Politics. Ann Arbor: University of Michigan Press.
Grammich, Clifford, Hadaway, Kirk, Houseal, Richard, Jones, Dale E., Krindatch, Alexei, Stanley, Richie, Taylor, Richard H. 2012. 2010 U.S. Religion Census: Religious Congregations & Membership Study. Association of Statisticians of American Religious Bodies. https://www.amazon.com/2010-U-S-Religion-Census-Congregations/dp/0615623441 (accessed June 1, 2020).
Grubesic, Tony H. 2008. “Zip Codes and Spatial Analysis: Problems and Prospects.” Socio-Economic Planning Sciences 42 (2): 129–49.
Grubesic, Tony H., Matisziw, Timothy C. 2006. “On the Use of ZIP Codes and ZIP Code Tabulation Areas (ZCTAs) for the Spatial Analysis of Epidemiological Data.” International Journal of Health Geographics 5:58.
Hanretty, Chris, Lauderdale, Benjamin E., Vivyan, Nick. 2018. “Comparing Strategies for Estimating Constituency Opinion from National Survey Samples.” Political Science Research and Methods 6 (3): 571–91.
Huckfeldt, Robert, Sprague, John T. 1995. Citizens, Politics, and Social Communication: Influence in an Election Campaign. New York: Cambridge University Press.
Jackson, John E. 1989. “An Errors in Variables Approach to Estimating Models with Small Area Data.” Political Analysis 1:157–80.
Jackson, John E. 2008. “Endogeneity and Structural Equation Estimation in Political Science.” In The Oxford Handbook of Political Methodology, eds. Box-Steffensmeier, Janet M., Brady, Henry E., Collier, David. New York: Oxford University Press, 404–31.
Kernell, Georgia. 2009. “Giving Order to Districts: Estimating Voter Distributions with National Election Returns.” Political Analysis 17:215–35.
Lax, Jeffrey R., Phillips, Justin H. 2009. “How Should We Estimate Opinion in the States?” American Journal of Political Science 53:107–21.
Lieske, Joel. 1993. “Regional Subcultures of the United States.” Journal of Politics 55 (4): 888–913.
McCarty, Nolan M., Poole, Keith T., Rosenthal, Howard. 1997. Income Redistribution and the Realignment of American Politics. American Enterprise Institute Studies on Understanding Economic Inequality. Washington, DC: AEI Press.
Minnesota Population Center. 2011. National Historic Geographic Information System: Version 2.0. Minneapolis: University of Minnesota.
Monogan, James E. III, Gill, Jeff. 2016. “Measuring State and District Ideology with Spatial Realignment.” Political Science Research and Methods 4 (1): 97–121.
Monogan, James E. III, Konisky, David M., Woods, Neal D. 2017. “Gone with the Wind: Federalism and the Strategic Location of Air Polluters.” American Journal of Political Science 61 (2): 257–70.
Park, David K., Gelman, Andrew, Bafumi, Joseph. 2004. “Bayesian Multilevel Estimation with Poststratification: State-Level Estimates from National Polls.” Political Analysis 12:375–85.
Park, David K., Gelman, Andrew, Bafumi, Joseph. 2006. “State-Level Opinions from National Surveys: Poststratification Using Multilevel Logistic Regression.” In Public Opinion in State Politics, ed. Cohen, Jeffrey E. Stanford: Stanford University Press, 209–28.
Pool, Ithiel de Sola, Abelson, Robert P., Popkin, Samuel L. 1965. Candidates, Issues, and Strategies. Cambridge: MIT Press.
Poole, Keith T., Rosenthal, Howard. 1997. Congress: A Political-Economic History of Roll Call Voting. New York: Oxford University Press.
Putnam, Robert D. 1966. “Political Attitudes and the Local Community.” American Political Science Review 60 (3): 640–54.
Putnam, Robert D. 1993. Making Democracy Work: Civic Traditions in Modern Italy. Princeton: Princeton University Press.
Ravishanker, Nalini, Dey, Dipak K. 2002. A First Course in Linear Model Theory. Boca Raton: Chapman & Hall/CRC.
Selb, Peter, Munzert, Simon. 2011. “Estimating Constituency Preferences from Sparse Survey Data Using Auxiliary Geographic Information.” Political Analysis 19 (4): 455–70.
Shor, Boris, McCarty, Nolan. 2011. “The Ideological Mapping of American Legislatures.” American Political Science Review 105 (3): 530–51.
Sinclair, Betsy. 2012. The Social Citizen: Peer Networks and Political Behavior. Chicago: University of Chicago Press.
Tam Cho, Wendy K., Gimpel, James G. 2007. “Prospecting for (Campaign) Gold.” American Journal of Political Science 51 (2): 255–68.
Tausanovitch, Chris, Warshaw, Christopher. 2013. “Measuring Constituent Policy Preferences in Congress, State Legislatures, and Cities.” Journal of Politics 75 (2): 330–42.
Tobler, Waldo R. 1970. “A Computer Movie Simulating Urban Growth in the Detroit Region.” Economic Geography 46 (2): 234–40.
United States Department of Agriculture. 2013. “2013 Rural-Urban Continuum Codes.” http://www.ers.usda.gov/data-products/rural-urban-continuum-codes.aspx (accessed June 1, 2020).
Weber, Ronald E., Hopkins, Anne H., Mezey, Michael L., Munger, Frank. 1972. “Computer Simulation of State Electorates.” Public Opinion Quarterly 36:549–65.
Weber, Ronald E., Shaffer, William R. 1972. “Public Opinion and American State Policy-Making.” Midwest Journal of Political Science 16:683–99.