State-Level Forecasts of U.S. Senate Elections
Published online by Cambridge University Press: 01 October 2004
© 2004 by the American Political Science Association
Election forecasting, as a science with models to be tested, got its start in political science 20 years ago (Lewis-Beck and Rice 1984; Rosenstone 1983). When the enterprise began, it was not popular. Forecasts, while entertaining, were not held to be serious research. As high-quality models with accurate forecasts were published in leading journals (see review in Lewis-Beck and Rice 1992), forecasting achieved more respect in the discipline. The 1996 presidential election was a high point. Forecasters formulated models that accurately predicted, well in advance, the presidential winner's vote share (Campbell and Garand 2000). This encouraged forecasters to ply their trade for new elections, and in 2000 scholars again met media demands to predict the presidential vote. On the front page of the Washington Post in May, one forecaster predicted a Gore victory and was quoted as saying, “It's not even going to be close” (Kaiser 2000). In the end, most of the 2000 forecasts greatly overestimated Gore's share of the vote.1
Note 1. See the March 2001 issue of PS: Political Science and Politics for more information on the forecasters' post-election analyses of what went wrong with the 2000 election models.
Forecasting models have also been developed for U.S. House elections. Tufte (1978, 112) laid the foundation for this work. These models vary slightly from presidential models, but most have at their core House seat or vote change as a function of economic factors and presidential approval (Jacobson 2001, 144). Senate forecasting models are less common (Abramowitz and Segal 1986). In the 2002 elections, model-based forecasts missed the slight Republican surge that was correctly predicted by some media pundits. Despite increasing criticism of forecasting after the 2000 and 2002 elections, our experience suggests that any call to abandon these models is premature. In early 2002, Democratic strategists asked us to build a model to forecast the U.S. Senate race in Maine. By late July, we had constructed a state-level model, applied it to recent Maine Senate elections, and released a forecast for the race. After Election Day, we compared its accuracy to that of the campaign tracking polls. We argue that the strong performance of our forecasting strategy recommends its application to other statewide elections.
2002 Congressional Forecasts
Several political scientists offered model-based forecasts for the national outcome of the 2002 U.S. House races. Prior to the contest, Jacobson (2002) circulated predictions calling for a Republican House gain of three to 16 seats. Given the history of midterm congressional losses for the president's party, his prediction seemed a long shot. Indeed, a month before the election, a group of prominent forecasters (Abramowitz, Campbell, Erikson and Bafumi, Lewis-Beck and Tien, and Tamas) all predicted the exact opposite, a Democratic seat gain in the House ranging from five to 17 seats (APSA 2002). In the end, Republicans made a net gain of six House seats in the 2002 midterms and strengthened their slim majority in the U.S. House.2
Note 2. In the pre-election House, Republicans effectively held an 11-seat majority, 223–212. After the 2002 midterms, the comparable figure was 229–206 in favor of Republicans.
Bucking the forecasts, several well-known pundits correctly predicted that Republicans would win the House (Abramowitz 2002). In October, National Journal's Charles Cook judged that more seats were safe or leaning Republican than were safe or leaning Democratic (219 v. 204). Congressional Quarterly showed Republicans leading in more districts than Democrats (223 v. 203). Stuart Rothenberg of Roll Call also saw Republicans in the lead (224 v. 211). The Iowa Electronic Markets, using its unique methods, assigned a .82 probability to the outcome of a Republican-controlled House (Iowa Electronic Markets 2002).3
Note 3. The Iowa Electronic Markets allow investors to use real money to buy “shares” of presidential candidates and congressional parties. The specific payoff or loss depends on the election results. For more information, see the IEM web site at http://www.biz.uiowa.edu/.
All in all, national-level statistical models provided less accurate congressional forecasts in 2002 than did rival forecasting methods. For example, Table 1 details one effort to forecast the outcomes of the 2002 U.S. House and Senate elections using a statistical model of election results since 1950 (Lewis-Beck and Tien 2002). The House equation in column 1 is based on Tufte's (1978) referendum model, which assumes that voters judge the president's party according to its national economic and political performance. In June 2002, disposable income growth was 2.21%, and President Bush's job approval rating was 70%. Given a midterm year, the model predicted a gain of eight seats for the Democrats, wrongly awarding them the House majority and missing the slight Republican surge in the 2002 elections.

Following tradition, few political scientists made a forecast for the 2002 Senate races. Lewis-Beck and Tien (2002) did construct a Senate forecast, but overall it did not fare too well (see column 2). Theoretically, the model is a national referendum model, in all ways similar to their House model except that it also takes into account how many Senate seats of the president's party are exposed in the election. With 20 Republican seats up for grabs, the June 2002 income growth and presidential popularity numbers yielded a forecast of a net Democratic gain of three seats, or a widening of their narrow Senate majority. In fact, the Democratic Party lost two seats and the Senate majority in the 2002 midterm elections.
Senate Forecasts: Problems and Opportunities
It is surprising that relatively few scholars have developed forecasting models for the U.S. Senate, given its greater power and prestige compared to the House. One reason for this neglect is the fact that Senate elections are more competitive (Steen 2002), and so tend to be somewhat less predictable than House races (Mann and Ornstein 1984, 43).4
Note 4. For example, the first Senate forecasting models (Hibbing and Alford 1982) explained only about one-third of the variance in seat change. House models tend to explain three-quarters or more. Later Senate models, with a theoretical structure not unlike that of the Senate model in Table 1, offered higher R-squared values but still explained less than did most House models (Abramowitz and Segal 1986; Lewis-Beck 1985; Lewis-Beck and Rice 1992, 84).
One obstacle to forecasting Senate elections, rarely explicitly recognized, is the unit of analysis problem. Senate races are statewide, not national, affairs. Voters select a candidate to represent the state, not the nation. Aggregating Senate outcomes to a national result is a fallacy of composition that commits faulty inference from the whole (the nation) to the part (the state). Put simply, patterns in the aggregated data may falsely suggest that individual voters respond to national pressures, when in fact they respond mostly to state-level pressures. We avoid this inference trap. By making the state the unit of analysis, we make constituency congruent with constituent. This allows for a more precise specification of state-level variables that influence the vote (campaign spending and challenger quality, for example).5
Note 5. One of the criticisms of forecasting is that researchers can make predictions without in-depth knowledge about the specific candidates and campaign in question. This is both a strength (in that fairly accurate forecasts require minimal data collection) and a possible weakness (in that Senate races are less predictable than other races due to their high profile and candidate-centered nature). Using the state as the unit of analysis drives scholars to dig deeper into state-specific electoral influences and the dynamics of the unfolding campaign. Our model is not exhaustive, but by measuring campaign spending and challenger quality we take one step in this direction. As state-level models are refined over time, the addition of other relevant local factors should yield even better forecasts.
In this analysis, we estimate a state-level model and then evaluate it for the accuracy of its forecasts. Early in 2002, Democratic strategists asked us to make a forecast for that year's Senate race in Maine between incumbent Susan Collins and challenger Chellie Pingree. Our charge was to make a reasonably accurate forecast by late summer, using the same quantitative models that had brought notoriety to the discipline in recent elections. We proposed a state-level forecasting model, applied it to recent Senate elections in Maine, and released the forecast at the end of July. After the 2002 elections were over, we compared its accuracy to that of independent tracking polls that followed the Collins-Pingree campaign. As a prospective forecasting tool, our model clearly beat the tracking polls. This suggests that a similar strategy could be used to make accurate forecasts for individual Senate and gubernatorial elections, or for a larger subset of races that party or media analysts expect to be relatively competitive.
The 2002 Campaign: Collins vs. Pingree
As a relative newcomer to the U.S. Senate, Republican Susan Collins did not have the standing of former Senators George Mitchell or Margaret Chase Smith. Elected in 1996 with only 49% of the vote, Collins focused on local issues in her first term and cultivated a reputation as a moderate in a state that Al Gore won in 2000.6
Note 6. Collins' interest group ratings, endorsements from liberal groups, and close affiliation with a small group of centrist Republicans from the Northeast all support this reputation.
Note 7. It is important to note that Pingree did not exit her majority leadership post in 2001 due to an electoral defeat or a change in party control. Instead, term limits forced her to retire.
During 2002, the race became a key battleground in the fight for congressional control. Given the slim party balance in the Senate, Pingree predicted that races like hers were “going to decide George W. Bush's fate” (Nichols 2001, 14). Individual and party contributions poured into both war chests. Analysts expected Collins' spending to exceed the $3 million spent by Senator Olympia Snowe's 2000 campaign (Meara 2001), which it did. The race was a rarity: it was the first all-woman Senate race in Maine since 1960 (only the fourth in Senate history) and featured two unmarried candidates. It was unclear how these issues would affect Maine voters, if at all. Campaigning as an outsider, Pingree argued that being a female challenger “raises the prospect that change is possible” (Nichols 2001, 18). We built a state-level model and released our forecast in the context of a high-profile and potentially competitive campaign.
Forecasting Senate Races with State-Level Models
Congressional forecasting models that are based on national-level time series have at their theoretical core a political economy explanation of the vote. Support for the party in the White House, V, is thought to be a function of national economic performance, E, and national political performance, P. Hence,
$$V_n = f(E, P) \tag{1}$$
where Vn is the Senate vote for the incumbent party nationally, E is national economic growth, and P is national presidential popularity. The argument is a classic one of voter-sanctioned rewards and punishments, with good performance eliciting more votes, and bad performance leading to fewer votes. This model has been tested in aggregate national time series, as in the analyses in Table 1.
Of course, it is also possible to posit such a model for a single state, rather than for the national outcome, such that
$$V_s = f(E, P) \tag{2}$$
where the variables are defined as in Equation 1, except that Vs=the Senate vote that the incumbent party receives in the state.
The state-level model in Equation 2 assumes that voters reward (or punish) the party that occupies the White House, based on national economic and political conditions. The argument is not farfetched. Voter Z simply has to reason that the party in the White House merits support (in this case, in statewide elections) because the economy is strong and/or the president is doing his job well. Bivariate evidence from Senate elections in Maine from 1954 to 2000 supports this hypothesis. The more popular the president is nationally, the more votes his party receives (Pearson's r=.48). The evidence, albeit weak, also shows that the higher the rate of national economic growth, the more votes the incumbent party receives (r=.13). Results as suggestive as these call for a multivariate analysis.
In Table 2, we use OLS regression to estimate three multivariate, state-level U.S. Senate models using Maine data from 1954 to 2000 (the time frame for available data). The National Model (column 1) shows incumbents' share of the Maine Senate vote as a function of national trends in presidential popularity (Presidential Approval Change) and economic growth (Change in Real GDP), plus the incumbent party's share of the vote in the last Senate election for the seat (Incumbent Party Previous Vote). The results are weak in that only presidential job approval is a significant predictor of the vote. On the whole, the model is unsatisfactory in that it accounts for less than 25% of the variance in incumbents' vote share over time.
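The estimation step described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: it fits an OLS regression with an intercept and three predictors by solving the normal equations in pure Python. The function names and the toy data are assumptions introduced here; the toy values are invented and are not the Maine series.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [list(map(float, a[i])) + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x


def fit_ols(rows, y):
    """Return (coefficients, r_squared); coefficients[0] is the intercept."""
    design = [[1.0] + [float(v) for v in r] for r in rows]
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(p)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, d)) for d in design]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return beta, 1.0 - ss_res / ss_tot


def predict(beta, predictors):
    """Apply fitted coefficients to one new set of predictor values."""
    return beta[0] + sum(b * v for b, v in zip(beta[1:], predictors))


# Invented toy data (NOT the Maine series): each row is
# (approval change, real GDP change, incumbent party previous vote).
toy_rows = [(5, 2, 60), (-10, 3, 55), (0, 1, 62), (8, 4, 58), (-4, 0, 50), (12, 2.5, 65)]
# The outcome is built from known coefficients so the fit can be checked.
toy_vote = [20 + 0.2 * a + 1.0 * g + 0.5 * p for a, g, p in toy_rows]

beta, r2 = fit_ols(toy_rows, toy_vote)
forecast = predict(beta, (10, 2, 55))
```

Because the toy outcome is an exact linear function of the predictors, the fit recovers the coefficients used to build it. Real election series are noisy and short, so a statistical package that also reports standard errors would be the practical choice.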

To improve forecasting accuracy, it would seem useful to introduce state-level variables that account for specific Maine candidates and campaigns. Challenger Quality deserves special attention. As shown in recent elections, strong challengers cut into an incumbent's lead more effectively than weak challengers (Green and Krasno 1988; Jacobson 1980; Squire 1992). In Maine Senate races from 1954 to 2000, high-profile challengers such as a former governor and House members were able to defeat incumbents. There are several ways to measure challenger quality. We selected a straightforward dummy variable. Challengers were judged to be of high quality (and scored 1) if they held an office with statewide visibility (like governor, U.S. House member, statewide party leader, or state officeholder). Lower-profile officials (i.e., rank-and-file state legislators, local officials) and political amateurs were judged of low quality (and scored 0).8
Note 8. In alternative specifications of the state-level models, we used a more refined measure of challenger quality (Lublin 1994), one based on a four-point scale. Inclusion of this variable reduced the overall fit of the model, so we chose to keep the more straightforward measure.
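The dummy coding of Challenger Quality can be illustrated with a short sketch. The category labels and the function name below are hypothetical, chosen here to mirror the offices listed in the text; the authors' actual coding sheet is not reproduced in this article.

```python
# Hypothetical category labels for offices with statewide visibility,
# taken from the examples named in the text.
HIGH_VISIBILITY_OFFICES = {
    "governor",
    "u.s. house member",
    "statewide party leader",
    "state officeholder",
}


def challenger_quality(prior_office):
    """Return 1 for a high-quality challenger, 0 otherwise.

    prior_office is a category label, or None for a political amateur.
    """
    if prior_office is None:
        return 0
    return 1 if prior_office.strip().lower() in HIGH_VISIBILITY_OFFICES else 0


print(challenger_quality("Governor"))                      # 1
print(challenger_quality("rank-and-file state legislator"))  # 0
print(challenger_quality(None))                            # 0
```

A binary score discards gradations among challengers, which is the trade-off the authors accept when they prefer it to the four-point scale.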
Adding Challenger Quality to the original model creates State-Level Model I (column 2). Incumbent Party Previous Vote and Presidential Approval Change are statistically significant at the .05 level, and Change in Real GDP barely misses significance at that level.9
Note 9. We continue to use national economic growth in this model for two reasons. First, none of the state-level economic measures used in early specifications were significant predictors of the Senate vote. Second, research on economic voting suggests that voters are more likely to assign responsibility for state economic conditions to governors rather than Senators, and that Senators' share of credit or blame for the national economy is filtered through presidential party (Atkeson and Partin 1995). Our results basically support these findings.
Another key state-level variable is campaign spending. While there is debate over the effectiveness of incumbent spending, scholars agree that challenger spending is a critical factor in election outcomes (Green and Krasno 1988; Jacobson 1990). To the extent that U.S. Senate challengers match incumbent spending, their statewide name recognition and share of the vote increase (Gerber 1998). In Maine Senate campaigns where campaign finance data are available (1972–2002), one half of challengers spent an amount that equaled or exceeded spending by the incumbent. To measure the impact of spending, we constructed an index called Incumbent Spending Advantage: the ratio of incumbent spending to challenger spending in the two-year campaign cycle.10
Note 10. We used the latest Federal Election Commission reports to estimate Collins' spending advantage over Pingree. As of the last available report (May 30), Collins had raised $2.6 million to Pingree's $1.9 million, which translated into an incumbent-challenger ratio of 1.37:1.
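The Incumbent Spending Advantage index is a simple ratio, and the 2002 value can be reproduced from the figures in the note above. The function name is an assumption made for this sketch; note that the 2002 inputs are funds raised as of the May 30 report, which the authors use to estimate spending.

```python
def incumbent_spending_advantage(incumbent_spending, challenger_spending):
    """Ratio of incumbent spending to challenger spending in one campaign cycle."""
    if challenger_spending <= 0:
        raise ValueError("challenger spending must be positive to form a ratio")
    return incumbent_spending / challenger_spending


# Figures reported in the text: Collins $2.6 million, Pingree $1.9 million.
ratio = incumbent_spending_advantage(2_600_000, 1_900_000)
print(round(ratio, 2))  # 1.37
```

A value of 1.0 means the challenger matched the incumbent; values above 1.0 favor the incumbent.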
State-Level Model II (column 3) contains all of the variables from State-Level Model I, but adds the campaign spending index. Because pre-1970s spending data are not available, the model is estimated for elections from 1972 to 2000. Both Challenger Quality and Incumbent Spending Advantage are highly significant (p<.01), and substantively they have the largest standardized coefficients (beta weights=–.67 and .44, respectively). The model explains 97% of the variance in the incumbent party's vote share, a big improvement over the other models. The fairly low Standard Error of the Estimate (3.7 percentage points), appropriate for gauging out-of-sample forecasts like 2002, is also a good sign. Diagnostic tests show that the results are robust despite the small number of cases.11
Note 11. We ran regression diagnostics to check for outlier or autocorrelation problems (Fox 1991). Adding the 2002 race to the model changes the coefficients and R-squared hardly at all. None of the studentized residuals exceed the critical absolute value of two. As expected, given the robust coefficients, none of the DfBetas (measuring the impact of dropping a case on the coefficients) exceed the critical value. Autocorrelation is also not a problem (Durbin-Watson statistic=2.01).
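For readers unfamiliar with the autocorrelation check mentioned in the note, the Durbin-Watson statistic can be computed directly from time-ordered residuals. The sketch below uses the standard definition (sum of squared successive differences divided by the sum of squared residuals). The residual values are invented for illustration and are not the model's residuals.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for a sequence of time-ordered residuals.

    Values near 2 suggest no first-order autocorrelation; values toward 0
    suggest positive autocorrelation, and values toward 4 negative.
    """
    if len(residuals) < 2:
        raise ValueError("need at least two residuals")
    squared_sum = sum(e * e for e in residuals)
    if squared_sum == 0:
        raise ValueError("residuals are all zero")
    diff_sum = sum((residuals[t] - residuals[t - 1]) ** 2
                   for t in range(1, len(residuals)))
    return diff_sum / squared_sum


# Invented residuals, for illustration only.
alternating = [1.0, -1.0, 1.0, -1.0]   # strong negative autocorrelation
constant = [2.0, 2.0, 2.0]             # extreme positive autocorrelation
print(durbin_watson(alternating))  # 3.0
print(durbin_watson(constant))     # 0.0
```

The reported value of 2.01 sits almost exactly at the no-autocorrelation benchmark of 2.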
Models vs. Polls: Forecasting the 2002 Maine Senate Race
On Election Day 2002, Susan Collins defeated Chellie Pingree by 17 percentage points. By some measures, however, Collins underperformed. From 1954 to 2000, 11 out of 14 Maine Senate incumbents won a larger share of the vote. Did we accurately predict the size of Collins' win? We focus on State-Level Model II because it is the most complete specification and was the basis of the late-July forecast that we released to campaign strategists. Because Pingree was a former majority leader of the Maine state senate, we judged her to be a high-quality challenger. Collins had a modest advantage in campaign funding (incumbent-to-challenger ratio=1.37). President Bush's rising approval rating, up 20 points from July 2001 to July 2002, also hurt Pingree. This contrary tug of national and local forces is captured in Model II, which forecast Collins' vote share to within about three percentage points (forecast=61.6, actual=58.5).12
Note 12. Excluding open-seat races does not substantially change the coefficients or model fit.
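The forecast error can be checked with simple arithmetic. The sketch below uses the vote shares reported in the text; the helper names are introduced here, and the margin calculation assumes a two-candidate race in which the shares sum to 100, which is consistent with a 58.5% share and a 17-point win.

```python
def forecast_error(forecast_share, actual_share):
    """Signed error in percentage points (positive means an overestimate)."""
    return forecast_share - actual_share


def two_candidate_margin(winner_share):
    """Winner's margin in points, assuming the two shares sum to 100."""
    return winner_share - (100 - winner_share)


error = forecast_error(61.6, 58.5)
actual_margin = two_candidate_margin(58.5)
implied_margin = two_candidate_margin(61.6)

print(round(error, 1))           # 3.1
print(actual_margin)             # 17.0
print(round(implied_margin, 1))  # 23.2
```

An error of about three points in vote share corresponds to an error of about six points in the margin, since each point gained by one candidate is a point lost by the other.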
Yet statistical models are not the only available method for forecasting Senate elections. During election years, pollsters conduct tracking polls to capture the ebb and flow of races. In step with “horse race” journalism, media outlets use poll results to predict winners (Patterson 1994). These predictions can be implicit (as in judging candidate viability early on) or explicit (as seen in prognostications weeks or even months before the election). How does our model's accuracy fare in comparison to the tracking polls? Because the 2002 Maine Senate race was of strategic importance, surveys were conducted throughout the year. Table 3 reports the results of independent tracking polls in 2002, from late January to the week before the election.

Were these tracking polls accurate in 2002? “Yes,” in the sense that they always gave the lead to the eventual winner. “No,” in that they consistently exaggerated Collins' eventual margin of victory. Collins won the election by 17 percentage points, but the average margin in the eight tracking polls through October 26 was 32 points. Throughout the year, the polls showed Pingree much farther behind than she ended up being on Election Day. From May through October, the crucial months of the campaign, the tracking polls expected Pingree to suffer an overwhelming defeat. Only the very last poll (conducted by Strategic Marketing, ending October 27) got the margin of victory about right, reporting Collins ahead of Pingree by 19 points.
For the 2002 Maine Senate race, which provided the better forecast—our model or the polls? In evaluating forecasting methods, Lewis-Beck (1985) stresses that the most important components are Accuracy (A) and Lead (L). The closer a point estimate is to the outcome, the more accurate it is. The earlier the forecast is made, the greater lead it has. An optimal forecast combines high accuracy with a long lead time. A summer forecast that hits the incumbent's vote share on the head would be one of extremely high quality. Using these criteria, all of the polls in Table 3 must be judged of poor quality: the first eight because they greatly exaggerated Collins' margin, the ninth and final poll because it offered almost no lead time, appearing just a week before the election. Tracking polls released so close to the election have little forecasting value to parties or contributors looking to allocate resources where they could be decisive.
Lewis-Beck (1985, 61) offers the following formula for evaluating the overall quality of a forecasting instrument:
$$Q = \frac{3A \times L}{M}$$
where Q=the quality rating of a forecasting instrument, A=accuracy, scored from low (0) to high (3), L=lead time, scored from no distance away (0) to a long distance away (3), and M=27, the maximum possible score for the numerator. This gives Q a theoretical upper bound of 1.00. Suppose that for the final poll by Strategic Marketing, A=3 (for excellent accuracy) and L=1 (for short lead time). Then Q=9/27=.33. This is a modest score, and no better than the other polls in Table 3. For example, we can evaluate the July poll, which has a longer lead time, as A=1 (for poor accuracy) and L=3 (for a long lead time). In this case, again, Q=9/27=.33, indicating a low-quality forecast.
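The quality scores above can be reproduced with a small function. It assumes the numerator is 3 × A × L, a form chosen because it matches the stated maximum of 27 and the worked values in the text; the function name and the input checks are additions made for this sketch.

```python
MAX_NUMERATOR = 27  # M in the text: the largest possible numerator


def forecast_quality(accuracy, lead):
    """Quality rating Q, assuming a numerator of 3 * A * L.

    accuracy and lead are each scored from 0 (low) to 3 (high).
    """
    for name, score in (("accuracy", accuracy), ("lead", lead)):
        if not 0 <= score <= 3:
            raise ValueError(f"{name} must be between 0 and 3")
    return 3 * accuracy * lead / MAX_NUMERATOR


final_poll = forecast_quality(accuracy=3, lead=1)  # excellent accuracy, short lead
july_poll = forecast_quality(accuracy=1, lead=3)   # poor accuracy, long lead

print(round(final_poll, 2))      # 0.33
print(round(july_poll, 2))       # 0.33
print(forecast_quality(3, 3))    # 1.0
```

Because accuracy and lead enter multiplicatively, a forecast scoring zero on either dimension earns Q=0 regardless of the other.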
Applying this formula, none of the polls from the 2002 Maine Senate race were strong forecasting instruments. Even so, there are other problems with using tracking polls to make forecasts. First, the existence of multiple polls (with diverse methods and time frames) makes prediction difficult because forecasting is a prospective activity. Which poll should be trusted: the latest, earliest, or a combination? The trade-off between accuracy and lead complicates this choice because lead time gives a forecast its interest, but accuracy gives it its power. Second, Caller ID and low response rates are making it increasingly difficult for survey researchers to predict actual turnout (Fund 2002). Third, the fact that media outlets treat polls as forecasts creates an implicit anti-challenger bias. Challengers' low name recognition early on, combined with the reality that voters are making up their minds later and later, creates a “catch-22” for challengers. To the extent that political parties and contributors use early polls to gauge their viability, challengers will be short-changed in the hunt for money and media coverage.
Now directly compare the tracking polls' performance in 2002 to that of the models in Table 2. The models have a reasonable lead time, with all measures available by late July of the election year. In terms of accuracy, State-Level Model II is best, with a three-point error for the 2002 forecast. As for lead time, this forecast was truly ex ante, having been made in late July. Compare the quality of the Model II forecast to that of the October 27 poll. For Model II, let us conservatively say that A=2 (good accuracy, although not quite as good as the last poll), L=3 (long lead time, much better than the last poll), and so Q=18/27=.67. By this rough measure, the quality of our forecast easily surpasses that of the last pre-election poll (Q=.33). Although this formula is more qualitative than quantitative, our model outperformed tracking polls as a prospective forecasting tool by virtue of its combination of lead time and accuracy.
Summary and Conclusions
The discipline of election forecasting is thriving, but its statistical modeling subfield has been struggling. The heralded inaccuracies of the 2000 presidential models were followed, with less fanfare, by weak congressional models in 2002. Improved models and methods are needed. Senate elections have received relatively little attention and offer an opportunity. Their electoral geography suggests a solution to the unit of analysis problem that underlies aggregated national time series models. Assuming available data, each state has Senate elections that can be studied in their own aggregate time series analysis. Comparatively, that analysis should be powerful for two reasons: (1) the state acts as its own statistical control, and (2) state-level models allow for more precise model specification via the introduction of uniquely local variables.
What is wrong with a pooled time series analysis? This approach is problematic for two reasons. First, there is the issue of data availability for all 50 states. Extended time series measures on campaign spending and challenger quality are not universally available from state to state in Senate and gubernatorial races. Second, not all states are equally interesting to forecasters. In many states, an incumbent victory is a foregone conclusion; in others, there looks to be a real race. Party strategists routinely target competitive races. These races merit special attention from forecasters and should not be dumped into a nationwide pool of state data.
The 2002 Maine Senate race drew heavy interest from inside and outside the state. To produce a better forecast, we drew on a standard theory of referendum voting and combined it with greater attention to state-level forces like challenger quality and campaign spending. We offered a parsimonious model of U.S. Senate elections in Maine. The model predicted the 2002 result with more accuracy and a greater lead time than nearly all of the campaign tracking polls. We see no reason why this strategy could not be reapplied to other competitive statewide races.
These results have implications for forecasting and for politics generally. They suggest that statistical modeling can be a winning methodology, especially when national models are disaggregated to the natural units of the constituency (i.e., the state in Senate races). With respect to politics, the message is for the underdog. In Maine, tracking polls consistently estimated the challenger to be much farther behind than she turned out to be. In contrast, our statistical model put her much closer to the incumbent. If early on her supporters had believed the news from the model instead of the polls, they might have been more optimistic about her chances. That belief could have generated more volunteers, more campaign money, and, ultimately, more votes.
Table 1. Congressional Forecasting Models for the 2002 Midterm Election

Table 2. National and State-Level Forecasting Models: The 2002 Maine Senate Race

Table 3. Independent Tracking Polls in the 2002 Maine Senate Race