There is no shortage of interesting issues surrounding election forecasting. From a long list of fascinating matters worthy of rumination, I have selected three. The first concerns the consequences of shrinking presidential vote margins. The second deals with the incumbency advantage and its implications for model estimation and democratic accountability. The third addresses often neglected criteria for assessing the credibility of forecasting models.
ELECTION MARGINS
Since the late 1980s, presidential elections have been particularly close. From the late 1940s to the early 1980s, a period covering many elections used in estimating most forecasting equations, the two-party popular presidential vote varied from a high of about 62% to a low of 38%, a 24-point range. In the seven presidential elections since Reagan’s 1984 landslide, however, no candidate has received as much as 55% of the two-party popular vote, a 10-point spread. The standard deviation of the two-party vote in the 10 elections from 1948 to 1984 was 6.8 percentage points. In the post-1984 period, it has been only 3.1 percentage points.
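As a rough illustration of the comparison, the short sketch below computes the range and standard deviation of a party's two-party vote share in each era. The vote shares listed are placeholders, not the actual historical series; substituting the real 1948-1984 and 1988-2012 values would reproduce figures like the 6.8- and 3.1-point standard deviations cited above.

```python
# Illustrative sketch only: the vote shares below are placeholders, not the
# historical series. Replace them with the actual two-party percentages for
# 1948-1984 (10 elections) and 1988-2012 (7 elections).
import statistics

def dispersion(shares):
    """Return the range and standard deviation of a list of two-party vote shares."""
    return max(shares) - min(shares), statistics.pstdev(shares)

wide_era = [38.0, 44.7, 49.9, 50.1, 53.5, 55.1, 57.8, 59.2, 61.8, 52.0]   # placeholders
narrow_era = [46.5, 49.7, 50.3, 51.2, 52.0, 53.4, 53.7]                   # placeholders

for label, era in (("1948-1984", wide_era), ("post-1984", narrow_era)):
    rng, sd = dispersion(era)
    print(f"{label}: range = {rng:.1f} points, standard deviation = {sd:.1f} points")
```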
Assume that the more limited range of outcomes in the last seven elections is not a fluke but a consequence of changed underlying political realities: greater polarization of the public and parties and a more competitively balanced, realigned party system ("sorted" for the squeamish) (Abramowitz 2010; Abramowitz and Saunders 1998, 2008; Campbell 2006, 2010). If so, the hyper-competitiveness of post-1984 presidential elections has consequences for the general accuracy of the models as well as for their specification and estimation. On the one hand, one would expect narrower vote margins to increase forecast accuracy. Forecasts confined within the new 55-45 limits should be more accurate than those spread across the old 62-38 limits. This narrower band of plausible forecasts may also raise expectations of accuracy.
On the other hand, applying models estimated using mostly wide-range elections to post-1984 narrow-range elections presents a problem: a unit change in a predictor is likely to have had a larger impact on the vote in earlier wide-range elections and, thus, a larger coefficient than would be applicable in more recent vote-constrained elections. The normal fix, a host of interaction terms, is not feasible with so few elections.
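A minimal simulated sketch of this estimation problem follows. The numbers are invented, with a generic standardized "econ" variable standing in for any economic fundamental, but they show what the in-principle fix, an era interaction, would look like and why it is so weakly identified with only 17 elections.

```python
# Simulated sketch (invented data): a predictor whose true effect is larger in the
# wide-range era (1948-1984) than in the compressed post-1984 era, estimated with
# and without an era interaction.
import numpy as np

rng = np.random.default_rng(0)
n_pre, n_post = 10, 7                                  # elections per era
econ = rng.normal(0.0, 1.0, n_pre + n_post)            # standardized economic predictor
post = np.r_[np.zeros(n_pre), np.ones(n_post)]         # 1 = post-1984 election

# Assumed "truth": the predictor moves the vote only half as much in the compressed era.
vote = 52 + 3.0 * econ - 1.5 * post * econ + rng.normal(0.0, 1.5, n_pre + n_post)

def ols(X, y):
    """Ordinary least squares: coefficients and their standard errors."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ b
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    return b, np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))

# Pooled model: one slope forced to serve both eras.
Xp = np.column_stack([np.ones_like(econ), econ])
bp, sep = ols(Xp, vote)
print(f"pooled slope: {bp[1]:.2f} (se {sep[1]:.2f})")

# Interacted model: era-specific slopes, but 4 parameters from only 17 observations.
Xi = np.column_stack([np.ones_like(econ), econ, post, post * econ])
bi, sei = ols(Xi, vote)
print(f"shift in slope for post-1984 era: {bi[3]:.2f} (se {sei[3]:.2f})")
```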
This problem is likely to be greater for some forecasting models than others. To the extent that a forecast is based on opinion predictors, either preference polls or approval ratings, polarization probably has already been taken into account. Preference polls and approval ratings should be affected by the same polarization that affects the vote. This cannot be said, however, for other predictors such as the economy and incumbency. With each additional election from the polarized party era included in updated estimations, the impact of the vote margin change should lessen. But for now, to the extent that models do not already take this change into account or adapt to do so, we might expect elections to be a bit closer than predicted.
An unfortunate irony is that this problem has probably blurred the differences between strong and weak models by strengthening the apparent performance of weak models and weakening that of strong models. Consider a rubbish model, essentially noise. It may look good by goodness-of-fit statistics, but only by luck. In use, the predictions of bad models tend toward the mean vote (52.0% since 1948). Because the polarized, realigned system has pushed actual outcomes toward that same mean, bad models will tend to be more accurate than they would have been otherwise. In contrast, strong models that have not taken polarization into account in their specification will produce forecasts better suited to the earlier, higher-margin elections, and their errors will be larger than they would have been otherwise. In effect, without adaptations to post-1984 hyper-competitiveness (and poll-based models have at least partial built-in adaptations), weak models appear stronger and strong models appear weaker than they otherwise would, at least until the problem fades with the inclusion of more post-1984 elections in the estimations.
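The mechanics of that blurring can be seen in a small simulation, sketched below with assumed numbers: a pure-noise model whose predictions hover around the long-run mean vote of about 52% incurs a much smaller average error when outcomes are drawn from the compressed post-1984 distribution than when they are drawn from the earlier wide-range distribution.

```python
# Simulated sketch: a "rubbish" model whose out-of-sample forecasts are just the
# historical mean vote plus noise, scored against outcomes drawn from a wide-range
# era and a narrow-range era (standard deviations taken from the text above).
import numpy as np

rng = np.random.default_rng(1)
mean_vote = 52.0
noise_forecasts = mean_vote + rng.normal(0.0, 1.0, 100_000)   # predictions cluster at the mean

eras = {"wide era (sd 6.8)": 6.8, "narrow era (sd 3.1)": 3.1}
for label, sd in eras.items():
    outcomes = mean_vote + rng.normal(0.0, sd, 100_000)
    mae = np.mean(np.abs(noise_forecasts - outcomes))
    print(f"{label}: mean absolute forecast error = {mae:.1f} points")
```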
PRESIDENTIAL INCUMBENCY
One of the clearest contributions to voting behavior research from election forecasting is in illuminating the impact of presidential incumbency (Campbell 2000, 2008). The effect of incumbency has been reflected explicitly in Norpoth (2000), Abramowitz (1988), Fair (1988), and others, and implicitly (through other variables such as approval ratings and preference polls) in many other forecasting models.
Based on our forecasting experience, the incumbency advantage appears to be more complex than often thought. The effect of incumbency is not captured, not even remotely, by a dichotomous variable for a sitting president. It has as much to do with the tenure of the party in office as with the person in office. It also is enmeshed with the standards that voters use in retrospective voting. Voters cut some in-party candidates more slack than others. In-party nonincumbents (successor candidates) receive only partial credit or blame for their party’s past performance. For incumbents, retrospective evaluations of the record appear to depend a good bit on how long the party has occupied the White House.
As several forecasting models suggest (Abramowitz 1988; Norpoth 2000) and as table 1 attests, the presidential incumbency advantage goes largely, if not exclusively, to first party-term incumbents, that is, presidents whose party has occupied the White House for just one term (Campbell 2013a, 2013b). Since 1900, there have been 12 first party-term incumbents. Eleven of the 12 won reelection. The only one defeated (Carter) entered the election with the economy in steep decline. Barring abject failure, first party-term incumbents are virtually assured of a second term. Other in-party candidates (successor candidates in open-seat races and incumbents of parties in office for more than one term) appear to have no discernible advantage. The first party-term advantage appears to be so strong that every model should be examined to determine whether it adequately takes it into account in some way.
Table 1 Election Results for the In-Party Presidential Candidate, 1900–2012
First party-term presidents are in the enviable position of being able to credibly campaign either advocating stability if things are going well or advocating change if things are going poorly. Having been in office for just four years, these incumbents can still plausibly blame their predecessor for persisting problems (Campbell 2008). They are credited for their successes, but can evade a good deal of the blame for their failures. In effect, retrospective evaluations may be asymmetric for first party-term incumbents.
The 2012 election is a textbook example. Although a good case could have been made that the incumbent's policies had led to a weak economy or, at least, had failed to set things back on course, a majority of voters placed more of the blame on the president who had been out of office for four years than on the president who had held office for those four years. Several late-campaign polls and the exit polls indicated that a slight majority blamed former President Bush for the weak economy and only a bit more than a third blamed President Obama (Campbell 2013b, 26).
The incumbency advantage, encompassing both the partial credit or blame assigned to successor candidates and the large first party-term advantage, raises some interesting questions about electoral accountability and the interpretation of elections. Should voters hold successor candidates more accountable for the national conditions left by their party? Are the standards of responsibility for first party-term incumbents so lax that they can evade accountability in all but the most egregious cases? How might the system function differently if everyone understood that putting a new party in the White House is tantamount to an eight-year commitment?
MODEL CREDIBILITY
Among the uninitiated, election forecasting is often caricatured as a barefoot empiricist's atheoretical search for the best-fitting equation, the highest R-squared. While there may be some truth to this critique in a few cases, in general it is off base. The misperception is unfortunately fed by the all too often exclusive attention paid to a model's fit to past elections. I have no argument with judging forecasting models by their fit to past elections; that is essential. I take issue with fit being the sole criterion. Accuracy standards are important but, as Lewis-Beck (2005) observed, they are not the whole story. Many factors beyond out-of-sample goodness-of-fit statistics add to or detract from the credibility of a forecasting model. Unfortunately, these are not neatly captured by a single statistic or index, and as a result they are often neglected. These other elements should nonetheless be weighed in judging a model and its forecast as credible. It should be noted that the neglect of credibility criteria beyond model-fit statistics is by no means a problem limited to forecasting.
Beyond a model's general accuracy, five criteria are important to a model's credibility. The list augments the one originally assembled by Lewis-Beck (2005), sans his index. The first criterion is transparency. Unless we know exactly what has gone into a forecast, it should have about the same credibility as fortune-telling. The ultimate transparency is the availability of the data and the replicability of the forecast model's estimates.
The second criterion is the simplicity and logic of the predictors. Models that have straightforward measurements and sensible weights for predictors are more credible than those with complex or even Rube Goldbergesque indices.
The third criterion is model stability, the track record of an unrevised model over a series of elections. An unchanged and fairly accurate forecasting model should be considered more credible than a one-hit wonder or a frequently tweaked model. Model revisions are necessary from time to time, but consumers have a right to be a bit wary of models that are frequently "new and improved."
Relatedly, the fourth criterion is whether a model is supported by closely corresponding companion models. These are models in the same vein as the central model but differently timed or configured in their predictor variables, and they are "well behaved" in showing the expected coefficient differences. Companion models provide evidence of a model's robustness and bring additional, independently measured data to the forecast. For example, the Trial-Heat and Economy Model was estimated using successive presidential preference polls at different points in the campaign (Campbell and Wink 1990). As expected, as the forecast moved closer to Election Day, the contribution of the preference poll to the prediction increased and the contribution of the economic indicator decreased. This pattern of effects is exactly what one would expect as the state of the economy gradually becomes fully incorporated into voters' preferences.
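A simulated sketch of that companion-model check appears below. It does not use the published Trial-Heat and Economy data or estimates; it simply re-fits the same two-predictor specification with trial-heat polls of shrinking error, a stand-in for polls taken closer to Election Day, to show the expected pattern of a rising poll coefficient and a falling economy coefficient.

```python
# Simulated sketch (not the published model): re-estimating a vote = poll + economy
# regression as the trial-heat poll becomes a less noisy reading of the eventual vote.
import numpy as np

rng = np.random.default_rng(2)
n = 500                                   # large simulated sample to keep the illustration stable
econ = rng.normal(0.0, 1.0, n)            # standardized pre-election economic growth
vote = 52 + 2.5 * econ + rng.normal(0.0, 2.0, n)

for weeks_out in (12, 8, 4, 1):
    poll = vote + rng.normal(0.0, weeks_out / 2, n)     # earlier polls miss the final vote by more
    X = np.column_stack([np.ones(n), poll, econ])
    b, *_ = np.linalg.lstsq(X, vote, rcond=None)
    print(f"{weeks_out:>2} weeks out: poll coefficient {b[1]:.2f}, economy coefficient {b[2]:.2f}")
```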
Last, but not least, forecasting models are more credible if they are consistent with existing empirical explanatory findings or, at least, not inconsistent with explanatory research. Forecasting models need not be based entirely on empirical theory (they have different goals), but their credibility should suffer if they are contradicted by empirical findings. To the extent that forecasting models are supported by empirically confirmed theory, their credibility is substantially enhanced. For example, both prior and subsequent research on campaign effects (Campbell 2008; Erikson and Wlezien 2012; Lazarsfeld 1944), economic effects (Campbell 2008; Campbell, Dettrey, and Yin 2010; Holbrook 2008; Lewis-Beck 1988; Nadeau and Lewis-Beck 2001; Norpoth 2002; Vavreck 2009), and incumbency (Campbell 2008; Mayhew 2008; Weisberg 2002) corroborates the Trial-Heat and Economy Model. A forecast model well integrated into empirically supported explanatory theory at both the micro and macro levels should be accorded greater credibility than one lacking that corroboration.
OUTLOOK ON 2016
Although directed at the long-term development of forecasting, the three issues presented here have more immediate implications as well. Looking toward 2016, the hyper-competitive, realigned, and polarized partisan context as well as the election's open-seat status set the stage for another very close election, perhaps closer than some of the forecasts will indicate. In terms of model credibility, although forecast consumers cannot conduct a credibility review of each forecast, they can assess its plausibility. Beyond generating a prediction, forecasters should explain why their models work. If the explanation of a forecast seems too convoluted or contrived to be believable, it probably is.