Experimental tests of opinion in foreign crises have repeatedly shown that survey respondents punish inconsistency in presidential actions (Chaudoin 2014; Levendusky and Horowitz 2012; Potter and Baum 2014; Tomz 2007). Some have directly manipulated the cost of the hypothetical conflict so that all subjects have similar expectations (Kertzer and Brutger 2016; Trager and Vavreck 2011), but most audience cost experiments have not explored respondents’ default expectations about costs.
However, psychology research on construal level theory (CLT) demonstrates that people tend to discount the long-term consequences of decisions, and think about distal or hypothetical events more abstractly than they do immediate scenarios. In this paper, I argue that this tendency introduces a bias into survey experiments on foreign policy opinion about military intervention. This is because survey respondents reasoning about an impending military engagement are likelier to consider the costs of such an action than are those reasoning in the hypothetical environment. To provide evidence of this bias, I replicate the common “audience costs” experiment and introduce a prompt to consider casualties. I find that priming respondents to articulate their expectations about casualties in a foreign intervention cuts the estimated audience cost by more than half, reduces support for intervention in general, and substantially decreases disapproval of the hypothetical empty threat.
AUDIENCE COST THEORY MEETS CONSTRUAL LEVEL THEORY
Audience Cost Theory (ACT), founded on the theoretical work of Schelling (1960) and Fearon (1994), holds that democratic audiences care about national reputation, and so will punish leaders who back down from military threats. The theory is difficult to test, and so it has inspired innovative experimental studies. Tomz (2007) first provided experimental support for the theory's assumption that audiences punish empty threats. Further experiments on audience costs have isolated other key variables in the causal logic of ACT: urgency of the crisis (Tomz 2007); ambiguity of the threat and party reputation (Trager and Vavreck 2011); reasons for backing down (Levendusky and Horowitz 2012); and other inconsistency costs (Levy et al. 2015). There have been other similarly designed ACT survey experiments.
An issue that experimenters have not adequately addressed in these designs is that people think differently about hypothetical and distant situations than they do about those present and immediate. In psychology, this is often discussed in the context of CLT (Bar-Anan et al. 2006; Liberman and Trope 1998; Trope 2012; Trope and Liberman 2010). In more distal scenarios, people engage in “high-level construal,” considering the situation abstractly, overlooking detail and focusing on general themes and desirability of outcomes (Trope 2012). By contrast, in more immediate scenarios, people engage in “low-level construal,” with increased consideration of the specific details and feasibility of outcomes (Trope and Liberman 2000). The common example is that contemplating a vacation six months away may involve thinking about “relaxing” and “having fun,” whereas doing so a week beforehand may involve thinking about making dinner reservations and the stress of driving a rented car on foreign roads. This disparity is known to cause poor long-term planning and inaccurate predictions about one's own behavior and abilities in the more distant future (Liberman and Trope 1998).
Costs are associated with low-level construal. This is part of why people fall victim to what is known as “delay discounting” (Frederick et al. 2002; Green and Myerson 2004; Murphy et al. 2001; Odum 2011), the tendency to discount (or neglect entirely) consideration of the costs and rewards in the distant future or in hypothetical situations, giving them less weight than costs and rewards closer at hand. Along these lines, we know that people procrastinate thinking about costs (Frederick et al. 2002), do not predict their behavior well (DellaVigna and Malmendier 2006), and generally estimate value poorly (Gabaix et al. 2005).
This understanding of human psychology is relevant in the foreign policy survey experimental context because the hypothetical scenarios shown to respondents are distant, but the scenarios they are meant to reflect in the analogous political environment are often quite immediate. This is especially true of ACT experiments, which are meant to teach us something about how people reason about impending military engagement in foreign civil wars, a costly endeavor. Prior work tells us that pecuniary costs affect support for presidential belligerence (Flores-Macías and Kreps 2015; Geys 2010; Kriner et al. 2015). Moreover, casualty levels, “the most salient cost of war” (Gartner 2008, 105), have repeatedly been found to influence popular support for foreign intervention (Boettcher and Cobb 2006; Eichenberg 2005; Gelpi et al. 2009; Karol and Miguel 2007; Kriner 2006; Walsh 2015). With this in mind, the vacation example's parallel in foreign policy opinion research is the possibility that the hypothetical scenario an experimental respondent reads might induce high-level, desirability concerns (like national reputation or leadership competence and consistency) while failing to account for low-level, feasibility concerns (higher taxes, American casualties, and other wartime stressors). Past research shows that an actual, impending intervention in a foreign war would induce these low-level concerns.
Some researchers have explored the effects of cost on opinion in ACT experiments. They directly manipulate the cost of an action within the vignettes themselves, inserting a definite quantity within the mind of the respondent and measuring the effect of that number (Flores-Macías and Kreps 2015; Kriner et al. 2015; Kriner and Shen 2012; Walsh 2015). However, in many ACT experiments, costs are neither purposefully manipulated nor discussed at all, and the context of the survey vignette remains one of low information and high abstraction. As such, we would expect respondents to engage in high-level construal and be less likely to consider relevant costs (O'Donoghue and Rabin 1999).
Hence, there are good empirical reasons to expect a construal level bias in the context of the foreign policy survey experiment because it relies on low information vignettes. The scholarly work outlined above implies that people assessing real military intervention hold casualties in their minds as a relevant consideration to a greater extent than do those assessing a hypothetical scenario, which does not encourage such low-level construal. As such, testing for the presence and effect of this sort of bias in experimental estimates of audience costs is an important contribution. Moreover, knowing whether respondents primed to consider costs have different levels of disapproval for belligerence (Kertzer and Brutger 2016) or policy inconsistency (Chaudoin 2014) may reveal important differences between respondents’ approaches to real and hypothetical situations of military intervention.
RESEARCH DESIGN
I designed an experiment to establish whether respondents in survey experimental tests of foreign policy opinion discount the costs associated with hypothetical acts of intervention. The basic design was to induce low-level construal by introducing an open-ended question to consider casualties before measuring approval, and then compare the size of the replicated effect within that group to that of the control, which received no such prime prior to indicating approval. [Footnote 1] Rather than planting a specific casualty level in the vignette as some prior work has, I allowed the respondents to maintain whatever expectations they already held about foreign intervention. This better reflects what might happen in a poll in which survey respondents are asked their opinion on an impending act of intervention, such as polls of American opinion about intervention in Syria in 2013 (CNN 2013; The Economist 2013).
My treatment was a single question asking respondents how many casualties they expected in the hypothetical scenario. This prime made them consider casualties for a moment before answering the approval question. According to CLT, respondents in the control group will construe at the high level but not the low level, perhaps considering national reputation or the leader's competence, but not the costs of a real intervention, such as lost lives. However, treated respondents are more likely to consider those costs, as prior work shows they would in the “real world.”
If experimental respondents already consider the real costs of intervention in forming their responses, this inducement of low-level construal of the scenario should change nothing. However, if they neglect those costs, we should expect to see some mitigation of audience costs. Following this logic, the key hypotheses are as follows:
Cost hypothesis. Respondents primed to consider American casualties before indicating approval will
1. indicate greater disapproval of intervention,
2. indicate less disapproval of backing down from threats,
3. generate a lower estimated audience cost.
The experimental test was straightforward. In October 2015, I recruited 1,512 Amazon Mechanical Turk users for an experiment modeled after Tomz (2007). [Footnote 2] All respondents were shown the same introductory statement and one vignette from Tomz, in which a dictator invades a neighboring country to get more power and resources, the invader has a strong military, and its victory hurts the safety and economy of the United States. [Footnote 3] This vignette garnered the highest disapproval in his study and should represent the most difficult test for my hypotheses.
The respondents were assigned randomly to one of six groups in a 3×2 arrangement. [Footnote 4] They each encountered one of three presidential actions now standard among audience cost experiments: the president stays out of the conflict (stay out); threatens intervention and backs down (empty threat); or threatens and intervenes (follow through). They also encountered an attention check. Respondents were asked for their approval/disapproval following the scenario:
Do you approve, disapprove, or neither approve nor disapprove of the way the U.S. President handled the situation?
○ Approve
○ Disapprove
○ Neither approve nor disapprove
I followed up with two questions to create a seven-point ordinal measure: strongly disapprove, disapprove, lean disapprove, neutral, lean approve, approve, and strongly approve.
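The branching approval item can be collapsed into a single ordinal score along the following lines. This is only a sketch: the article does not reproduce the wording of the two follow-up questions, so the follow-up labels used here (`"strongly"`, `"somewhat"`, `"lean approve"`, `"lean disapprove"`, `"no lean"`) and the function name `seven_point` are hypothetical.

```python
# Seven-point ordinal scale, ordered from 1 (lowest) to 7 (highest approval).
SCALE = [
    "strongly disapprove", "disapprove", "lean disapprove", "neutral",
    "lean approve", "approve", "strongly approve",
]


def seven_point(first_answer, follow_up):
    """Map the initial answer plus one follow-up answer to a 1-7 score.

    The follow-up labels are assumptions made for illustration; the
    article only reports the seven resulting categories.
    """
    if first_answer == "approve":
        return 7 if follow_up == "strongly" else 6
    if first_answer == "disapprove":
        return 1 if follow_up == "strongly" else 2
    if first_answer == "neither":
        # Respondents who chose "neither" are asked which way they lean.
        return {"lean disapprove": 3, "lean approve": 5}.get(follow_up, 4)
    raise ValueError(f"unexpected first answer: {first_answer!r}")


score = seven_point("neither", "lean approve")
label = SCALE[score - 1]
```

Coding the scale so that higher values mean greater approval keeps the sign of any later group comparison easy to read.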
The treatment group received the following question before answering the approval question:
How many U.S. troops would you expect were [to be] killed when [if] the president [had] decided to stop the invasion with military force?
○ Fewer than 50
○ 50–99
○ 100–499
○ 500–1,999
○ 2,000–4,999
○ 5,000–10,000
○ More than 10,000
This prompted them to consider the cost of intervention in terms of casualties, and report their expectations, just before they indicated their level of approval.
The control group answered the approval question first, providing the measure of the dependent variable before indicating their expectations about casualties. In the control group, the casualties question serves the role of a data collection instrument; its presence has no effect on the findings discussed below.
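The assignment and question ordering described above can be sketched as follows. The article says only that assignment was random; the equal-sized cells, the seed, and the names `assign` and `question_order` are assumptions made for illustration, and the attention check is omitted.

```python
import random
from collections import Counter

ACTIONS = ("stay out", "empty threat", "follow through")
PRIMED = (False, True)  # False = control, True = casualties prime


def assign(n_respondents, seed=0):
    """Randomly assign respondents to the six cells of the 3x2 design.

    Balanced cells are an assumption; the source does not say whether
    assignment was simple or blocked.
    """
    cells = [(action, primed) for action in ACTIONS for primed in PRIMED]
    slots = [cells[i % len(cells)] for i in range(n_respondents)]
    random.Random(seed).shuffle(slots)
    return slots


def question_order(primed):
    """Primed respondents estimate casualties before rating approval."""
    if primed:
        return ["casualties", "approval"]
    return ["approval", "casualties"]


assignments = assign(1512)
cell_sizes = Counter(assignments)
```

Because both groups answer both questions, the manipulation is purely the order in which they appear.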
EXPERIMENTAL RESULTS
Figure 1 displays the differences in respondent estimates of casualties across vignettes. The stay out group expected higher casualties than the empty threat group, who expected more than the follow through group. Since casualties are not discussed or even alluded to in any of the three vignettes, one would expect pretreatment perceptions about casualties to be orthogonal to vignette assignment. However, vignette assignment clearly alters expectations, even without mentioning casualties.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20190212014050373-0683:S2052263018000222:S2052263018000222_fig1g.jpeg?pub-status=live)
Figure 1 Effect of Vignettes on Expected Casualties.
Dafoe et al. (2018) call this kind of variance in assumptions about undiscussed details in the vignettes an “information equivalence violation.” This kind of systematic variation in background assumptions can sometimes mediate experimental effects through atheoretical pathways. In addition to the construal level discrepancy identified in this study, the difference in respondent expectations about casualties identified here could also be responsible for some portion of the treatment effects found in previous experiments. [Footnote 5] Importantly, this possibility does not affect the validity of the comparisons made in this paper, which are based on assignment to a casualties prime, not to a specific scenario.
I analyzed differences in presidential approval across treatment groups in two ways: independent two-group t-tests using the seven-point measure of approval to obtain the initial comparison across the treatment groups; [Footnote 6] and the better-suited ordered logit model, with standard demographic controls. [Footnote 7]
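A two-group comparison of this kind can be sketched with the standard library alone. The article does not say which variant of the t-test was used, so the unequal-variance (Welch) form below is an assumption, and the two short score lists are made-up toy values, not data from the study.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t statistic, approximate degrees of freedom) for two samples.

    Uses Welch's unequal-variance form; whether the original analysis
    pooled variances is not stated in the text.
    """
    var_a = variance(sample_a) / len(sample_a)  # squared standard error, group A
    var_b = variance(sample_b) / len(sample_b)  # squared standard error, group B
    t_stat = (mean(sample_a) - mean(sample_b)) / sqrt(var_a + var_b)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    dof = (var_a + var_b) ** 2 / (
        var_a ** 2 / (len(sample_a) - 1) + var_b ** 2 / (len(sample_b) - 1)
    )
    return t_stat, dof


# Toy seven-point approval scores (illustrative only).
primed_scores = [3, 4, 2, 5, 4, 3]
control_scores = [2, 3, 2, 4, 3, 2]
t_stat, dof = welch_t(primed_scores, control_scores)
```

A positive statistic here means the first group reports higher average approval. Treating the seven-point scale as interval-level is the simplification that makes the ordered model the better-suited second step.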
Figure 2 shows that the casualties prime significantly increased average approval of the president in the empty threat condition, by about 0.4 points on the seven-point scale. The other two groups move in the expected direction, but the differences are not statistically significant. Experimental studies of audience costs generally hinge upon changes within the empty threat group, so this result is pivotal.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20190212014050373-0683:S2052263018000222:S2052263018000222_fig2g.jpeg?pub-status=live)
Figure 2 Difference in Approval from Casualties Prime.
The ordered logit model provides an even clearer picture. [Footnote 8] The difference for the stay out group was negligible, but the empty threat and follow through groups saw a similarly oriented effect on approval of intervention, significant at the 0.05 level. Figure 3 shows the difference caused by the casualties prime on the predicted probability of assignment to each approval category. Respondents in the empty threat group who were primed to consider casualties were 5.9% less likely to strongly disapprove, 1.9% less likely to somewhat disapprove, 3.3% more likely to somewhat approve, and 2.7% more likely to strongly approve. Conversely, follow through respondents were 3.1% more likely to strongly disapprove, 1.9% more likely to somewhat disapprove, 2.1% less likely to somewhat approve, and 4.9% less likely to strongly approve. In short, primed subjects were more critical of intervention and less critical of inconsistency, supporting the first two components of the cost hypothesis.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20190212014050373-0683:S2052263018000222:S2052263018000222_fig3g.jpeg?pub-status=live)
Figure 3 Change in Predicted Assignment with Casualties Prime.
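Changes in predicted category probabilities of the kind plotted in Figure 3 follow from the ordered logit's cumulative form, in which the probability of falling at or below category k is the logistic function of (cutpoint k minus the linear predictor). The sketch below illustrates that mechanics only: the cutpoints and the prime coefficient are hypothetical values chosen for illustration, not estimates from the article, and the demographic controls are omitted.

```python
from math import exp


def logistic(z):
    return 1.0 / (1.0 + exp(-z))


def category_probs(linear_predictor, cutpoints):
    """Probabilities of each ordered category given ascending cutpoints."""
    # Cumulative probabilities P(Y <= k), with the top category closing at 1.
    cumulative = [logistic(c - linear_predictor) for c in cutpoints] + [1.0]
    probs, previous = [], 0.0
    for value in cumulative:
        probs.append(value - previous)
        previous = value
    return probs


# Hypothetical values: six cutpoints for seven categories, and a positive
# coefficient on the prime (higher approval, as in the empty threat cell).
CUTPOINTS = [-2.0, -1.0, -0.3, 0.3, 1.0, 2.0]
PRIME_COEF = 0.3

control = category_probs(0.0, CUTPOINTS)
primed = category_probs(PRIME_COEF, CUTPOINTS)
change = [p - c for p, c in zip(primed, control)]
```

With a positive coefficient, probability mass moves out of the lowest category and into the highest; a negative coefficient reverses the pattern. The changes across all seven categories sum to zero by construction.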
Most importantly, this pattern sets up the hypothesis’ third component, because the muted disapproval of the empty threat altered the “absolute audience cost.” This was defined by Tomz as the “surge in disapproval” caused by not following through on the commitment to intervene, as compared to a commitment to stay out of the conflict altogether (Tomz 2007, 829).
Table 1 replicates Tomz's method of calculating absolute audience cost (Tomz 2007, 827), and demonstrates the difference in absolute cost caused by the casualties prime. With an absolute cost of 17 for the control group, my estimate approximates Tomz's estimate of 16. However, the casualties prime resulted in an 11-point drop in absolute audience cost, leaving it at just 6 for the treated group. Some discounting of casualties may be expected in these experiments, but that priming casualty consideration should alter the standard result so much is surprising. Prior work indicates expectations about casualties inform opinion on military engagement, so this result does indeed imply a gap between how subjects respond to real and hypothetical acts of military intervention.
Table 1 The Effect of Casualties Priming on Absolute Audience Costs
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20190212014050373-0683:S2052263018000222:S2052263018000222_tab1.gif?pub-status=live)
95% confidence interval in parentheses.
*Discrepancy due to rounding error.
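The arithmetic behind the table can be sketched as follows, reading the absolute audience cost as the percentage disapproving under the empty threat minus the percentage disapproving under stay out. The costs of 17 and 6 are the rounded figures reported in the text; the two percentages passed to the function in the last line are hypothetical inputs used only to show the subtraction.

```python
def absolute_audience_cost(pct_disapprove_empty_threat, pct_disapprove_stay_out):
    """Surge in disapproval (percentage points) from an empty threat."""
    return pct_disapprove_empty_threat - pct_disapprove_stay_out


# Rounded absolute audience costs reported in the text.
control_cost = 17
primed_cost = 6

drop = control_cost - primed_cost     # reduction in percentage points
share_removed = drop / control_cost   # fraction of the control estimate

# Hypothetical disapproval percentages, for illustration only.
example_cost = absolute_audience_cost(50.0, 33.0)
```

The drop relative to the control estimate is what underlies the statement that the prime cuts the estimated audience cost by more than half.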
DISCUSSION
Psychology research associated with CLT suggests that people tend to give less consideration to low-level concerns in abstract scenarios than they do in those closer at hand. Costs are subject to low-level construal and consideration of the feasibility of outcomes. However, abstract situations encourage primarily high-level construal and prioritizing concerns of desirability. In the survey experimental context, this pattern would seem to suggest the existence of a small but specific external validity problem: that subjects responding to the low information vignettes common in these designs would give relevant costs less consideration than they would if they were facing the analogous situation of impending American intervention in the real world.
Two findings reported herein affirm cause for this concern. First, I find an effect consistent with expectations derived from CLT and delay discounting. Respondents asked to estimate casualties support a hypothetical act of intervention to a significantly lesser degree than those who are not. Inducing them to think about casualties increases approval of backing down from commitments and decreases approval of following through on threats. This is despite the fact that they reported a relatively lower expected casualty level in the follow through scenario.
Second, the casualties prime substantially decreased estimated audience costs. This might imply that some prior experimental results on audience costs are less reflective of real-world opinion formation on foreign intervention than they are taken to be. Citizens assessing real military intervention consider casualties. Inducing them to do so in the hypothetical is shown here to strongly mute the size of the estimated audience cost the president faces. [Footnote 9]
Two methodological recommendations are implied by these results. First, to improve external validity, experimentalists embedding their treatments in vignettes would benefit from considering possible gaps in construal level between their scenario and the analogous political environment, and then design features to encourage the right level of consideration. This may take the form of a similar question to the one employed in this design, a simple suggestion in the main prompt to consider the costs and benefits, or explicit information about low-level issues embedded in the vignette.
Second, respondents have different assumptions about casualties across treatment groups, even though they are mentioned nowhere in the initial scenario. The presence of an information equivalence violation in this replication study may also suggest an internal validity threat in the basic hypothetical design. Without some evidence that the effect is carried through a theoretically outlined pathway, it is difficult to know how exactly to interpret it. Whenever possible, experimentalists should make sure to incorporate tests of causal mechanisms in their work.
Several useful avenues for future work are implied in these findings. A project aimed at fully understanding these findings might identify whether and to what extent a similar prime to consider casualties would affect opinion on a real international military engagement. Another useful direction might be to explore untested respondent assumptions about the benefits of intervention, and test whether those are construed similarly to costs. Further work might also explore the extent to which information equivalence violations are present in or responsible for common experimental findings. Overall, there is much room to expand our understanding of respondent behavior in the experimental context within the foreign policy research agenda.
SUPPLEMENTARY MATERIAL
To view supplementary material for this article, please visit https://doi.org/10.1017/XPS.2018.22