The National Conditions and Trial-Heat model is an adaptation of the Incumbency and National Conditions model (Holbrook 2012), which was developed to account for greater prediction error in contests in which the incumbent president was not running for reelection. The idea behind the incumbency model was that the relationship between national conditions (economic evaluations and presidential approval) should be weaker in open-seat contests than in incumbent contests; the model accounted for this with an interaction term between incumbency and national conditions. Incorporating incumbency proved to be an improvement over a straight national conditions model, but it accounts for only one potential explanation for discrepancies between predictions based on national conditions and actual candidate vote shares. Beyond incumbency, any number of election-specific, idiosyncratic factors—exceptionally good or bad candidates or campaign strategies, or unanticipated events, for instance—could lead to outcomes that deviate from what national conditions alone would predict. The 2000 election stands out in this regard: in addition to running in an open-seat contest, the Democratic candidate, Vice President Al Gore, seemed determined to run away from the strong economic record of the Clinton-Gore administration and failed to appreciate that President Clinton’s still-high approval numbers could be an asset. As a result, most forecasting models—and especially those that did not use trial-heat polls—vastly overestimated Gore’s expected vote share in 2000.
But more generally, candidates in any election can over- or underperform for any number of reasons. At this writing (late July 2016), the 2016 election looks like one with the potential for one of the candidates to overperform. On the Republican side, we have a candidate (Donald Trump) with little political experience, whose nominating convention was notable for its lack of Republican Party elites and generally viewed as a bit of a bust, and who is prone to making inflammatory and controversial public statements; meanwhile, the Democratic candidate (Hillary Clinton) has trouble with the liberal wing of her party, is not well liked by the mass public (based on unfavorable evaluations from public opinion surveys), and is fresh off an FBI investigation that concluded that her treatment of sensitive State Department information was “extremely careless,” though not illegal. There are plenty of reasons to expect that candidate- and campaign-related factors could lead to one of these candidates doing better than expected based on national conditions. To the extent that this is true, or that the impact of national conditions is muted because the 2016 race is an open-seat contest, the trial-heat variable, which gauges the state of the campaign in early September, should lead to an improved prediction.
THE FORECASTING MODEL
The forecasting model uses the incumbent party candidate’s percent of the two-party popular vote as the dependent variable, and an index of national conditions and a measure of the incumbent party candidate’s performance in pre-election trial-heat polls as the independent variables. The index of national conditions is composed of presidential approval (Gallup) and aggregate satisfaction with personal finances (Survey of Consumers), both averaged over June, July, and August of election years. Each component is standardized by taking its value as a percent of the highest value in its data series before the two are averaged. The trial-heat variable is the average of the incumbent party’s percent of the expressed two-party vote in Gallup trial-heat polls taken in the first week of September (or the earliest September polls if there were none in the first week) of the election year. In years in which one of the nominating conventions was held in early September, the trial-heat polls are taken from the week after the last day of the convention.
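The index construction described above can be sketched as follows; the function name and the illustrative input values are hypothetical, not the article’s actual data series, and the series maxima would come from the full 1952–2012 Gallup and Survey of Consumers data.

```python
# Illustrative sketch of the national conditions index described above.
# The function name and all numeric values below are invented for
# illustration; they are not the article's actual series.
def national_conditions_index(approval, finances, max_approval, max_finances):
    """Each component is expressed as a percent of the highest value in
    its series, then the two standardized components are averaged."""
    approval_pct = 100 * approval / max_approval
    finances_pct = 100 * finances / max_finances
    return (approval_pct + finances_pct) / 2

# e.g., summer approval of 50 against a hypothetical series high of 74,
# and consumer satisfaction of 110 against a hypothetical high of 130:
index = national_conditions_index(50, 110, 74, 130)
```

Standardizing each component as a percent of its series maximum before averaging keeps the two series, which are measured on different scales, from dominating one another in the combined index.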
The results of the forecasting model are summarized in tables 1 and 2 and figure 1. Table 1 shows that both national conditions and trial-heat polls are significant influences on presidential election outcomes. The overall fit of the model is fairly strong, with 86% of the variance in election outcomes explained by the independent variables, and the average out-of-sample prediction error is a modest 2.19 percentage points.
Table 1 The National Conditions and Trial-Heat Model for Presidential Elections, 1952–2012

Note: The index of national conditions is composed of presidential approval (Gallup) and aggregate satisfaction with personal finances (Survey of Consumers), both averaged over June, July, and August of election years. The trial-heat variable is the average of the incumbent party’s percent of the expressed two-party vote in Gallup trial-heat polls taken in the first week of September (or the earliest September polls if none in the first week) of the election year. In years in which one of the nominating conventions was held in early September, the trial-heat polls are taken from the week after the last day of the convention.
Table 2 Pseudo-Predictions, 1992–2012

Note: Each prediction is based on data taken from 1952 to the immediately preceding election. So, for instance, the prediction for 2000 is based on data from 1952 to 1996, the 2004 prediction is based on data from 1952 to 2000, and so on.

Figure 1 Out-of-Sample “Predictions” and Actual Outcomes, 1952–2012
Figure 1 presents the relationship between the out-of-sample “predictions” and actual outcomes from 1952 to 2012. There are a couple of things to note about this figure. First, these are out-of-sample predictions: the estimate for any given year is generated by dropping that year from the sample and predicting its outcome based on the slopes generated by data from all other years. This is important because it ensures that the slopes used to produce the predictions for any given election year are not determined by data from that year. Second, there is a strong relationship between the out-of-sample predictions and actual outcomes. There are clearly some years in which the predictions are more on target than others, but, overall, the model produces estimates that mirror the actual election outcomes. Moreover, using the horizontal and vertical lines at 50% on both axes, it is clear that the out-of-sample estimates almost always predict the correct popular vote winner. In fact, the razor-thin 1960 outcome is the only one in which the out-of-sample estimate calls the wrong winner, overestimating Republican strength by about 3.5 points, though the 2000 (a close outcome) and 2012 (a close prediction) elections were also close calls on this front.
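The leave-one-out procedure just described can be sketched in a few lines; the predictor matrix `X` and vote-share vector `y` are generic placeholders, and numpy least squares stands in for whatever estimation routine was actually used.

```python
import numpy as np

def loo_predictions(X, y):
    """Leave-one-out 'out-of-sample' estimates: for each election year,
    refit the model on all other years and predict the held-out year."""
    X1 = np.column_stack([np.ones(len(y)), X])  # add an intercept column
    preds = np.empty(len(y))
    for i in range(len(y)):
        keep = np.arange(len(y)) != i           # drop the year being predicted
        beta, *_ = np.linalg.lstsq(X1[keep], y[keep], rcond=None)
        preds[i] = X1[i] @ beta
    return preds
```

Because each prediction comes from a model that never saw that year’s outcome, the errors are a more honest gauge of predictive accuracy than in-sample residuals.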
Out-of-sample estimates are an important measure of error, but they do not represent true forecasting error, because many of the estimates are determined by data that occur well after the election takes place (for instance, the out-of-sample estimate for 1952 was generated using slope estimates produced by data from 1956 to 2012). One thing we can do to get a better sense of how well the model forecasts elections is to generate pseudo-predictions for the last several elections, using estimates produced from all preceding elections in the data set. So, for example, the 2000 prediction is based on the values of the independent variables in 2000 and the slope estimates generated by data from 1952 to 1996, the 2004 prediction is based on data from 1952 to 2000, and so on. The overall size of the sample limits the ability to do this for more than just a few elections, but it is still informative.
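Under the same assumptions (generic `X`, `y`, and a vector of election years; numpy least squares as a stand-in estimator), the expanding-window pseudo-predictions might look like:

```python
import numpy as np

def pseudo_predictions(X, y, years, start=1992):
    """Predict each election from `start` onward using slopes estimated
    only from strictly earlier elections (expanding training window)."""
    X1 = np.column_stack([np.ones(len(y)), X])   # add an intercept column
    forecasts = {}
    for i, yr in enumerate(years):
        if yr < start:
            continue
        train = years < yr                        # e.g., 1952-1996 for the 2000 forecast
        beta, *_ = np.linalg.lstsq(X1[train], y[train], rcond=None)
        forecasts[yr] = float(X1[i] @ beta)
    return forecasts
```

Unlike the leave-one-out estimates, each of these predictions uses only information that would have been available before the election in question, which is why they better approximate true forecasting error.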
The pseudo-predictions (table 2) reinforce the pattern found in the out-of-sample estimates: from 1992 to 2012, the model predicted the correct popular vote winner in every election, with an average absolute error of 1.7 percentage points. The 2000 election yields the greatest overall error, but this error is much smaller than the original error from the national conditions model (10.1 points), which was based on just presidential approval and aggregated personal finances and did not include trial-heat polls (Holbrook 2001).
THE 2016 FORECAST
Average summer values of presidential approval (51.1) and consumer satisfaction (121) produce a national conditions index value of 80.5 (historically, very close to the value of 80.3 in 1988), and the early September trial-heat value stands at 51.3 (historically, closest to the value of 51.7 in 2000); together, these data generate a predicted 2016 outcome of 52.5% of the two-party vote for Hillary Clinton. Using the standard error of the forecast and the t-distribution, the estimated probability of Clinton garnering more than 50% of the vote stands at .81.
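The win-probability calculation can be sketched as follows. The standard error of the forecast is not reported in this excerpt, so the value below is a hypothetical one chosen only to illustrate the mechanics, and a normal approximation stands in for the t-distribution used in the article.

```python
from statistics import NormalDist

forecast = 52.5   # predicted Clinton share of the two-party vote (from the text)
se = 2.8          # hypothetical standard error of the forecast (not from the text)

# Probability that the realized vote share exceeds 50%, treating the
# forecast error as approximately normal with standard deviation `se`.
# The article uses the t-distribution; the normal is a close stand-in
# at the sample sizes involved.
p_win = 1 - NormalDist(mu=forecast, sigma=se).cdf(50.0)
```

With these illustrative numbers the probability comes out near .8; with the article’s actual standard error and t degrees of freedom, the reported figure is .81.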