Hawkish Biases and Group Decision Making

Joshua D. Kertzer; Marcus Holmes; Brad L. LeVeck; Carly Wayne

doi:10.1017/S0020818322000017

Published online by Cambridge University Press: 11 March 2022

Abstract

How do cognitive biases relevant to foreign policy decision making aggregate in groups? Many tendencies identified in the behavioral decision-making literature—such as reactive devaluation, the intentionality bias, and risk seeking in the domain of losses—have been linked to hawkishness in foreign policy choices, potentially increasing the risk of conflict, but how these “hawkish biases” operate in the small-group contexts in which foreign policy decisions are often made is unknown. We field three large-scale group experiments to test how these biases aggregate in groups. We find that groups are just as susceptible as individuals to these canonical biases, with neither hierarchical nor horizontal group decision-making structures significantly attenuating the magnitude of bias. Moreover, diverse groups perform similarly to more homogeneous ones, exhibiting similar degrees of bias and marginally increased risk of dissension. These results suggest that at least with these types of biases, the “aggregation problem” may be less problematic for psychological theories in international relations than some critics have argued. This has important implications for understanding foreign policy decision making, the role of group processes, and the behavioral revolution in international relations.

Keywords

Political psychology; foreign policy decision making; aggregation problem; group processes; cognitive biases

Type: Research Article
Information: International Organization, Volume 76, Issue 3, Summer 2022, pp. 513-548

DOI: https://doi.org/10.1017/S0020818322000017
Copyright: Copyright © The Author(s), 2022. Published by Cambridge University Press on behalf of The IO Foundation.

The past several decades have seen a surge of interest in psychological approaches to the study of international politics.Footnote ¹ Unlike structural realist or rationalist approaches, which largely study features of the environments in which actors are embedded, psychological theories of international politics turn to the properties of actors themselves.Footnote ² A large volume of literature has thus emerged on the psychology of political elites: their operational codes, personality traits and leadership styles, and so on.Footnote ³ One of the central insights of this literature is that leaders are imbued with many of the same psychological mechanisms as ordinary citizens: they are prone to misperceptions, engage in motivated reasoning, and rely on heuristics and biases.Footnote ⁴

The presence of these biases in decision making is of particular importance. As Kahneman and Renshon note, in the context of foreign policy, nearly all of the cognitive biases uncovered by psychologists would lead political leaders to make more hawkish decisions, all else equal.Footnote ⁵ That is, these tendencies increase suspicion, hostility, and aggression toward potential adversaries, increasing the risk of political conflict and violence.Footnote ⁶ Individuals’ tendency to take risks to avoid a loss, for example, could encourage leaders to prolong wars beyond the point at which victory is achievable, engaging in risky offensives with little chance of success.Footnote ⁷ Likewise, leaders may become less willing to make concessions and more willing to risk large losses when bargaining.Footnote ⁸ The biased ways in which people assess the motives of adversaries could also increase the potential for conflict.Footnote ⁹ For instance, individuals tend to assess the intentionality of an act by its consequences, rather than by a thorough examination of the perpetrator's motives.Footnote ¹⁰ As a result, wartime actions that produce morally bad outcomes are more likely to be deemed intentional than identical actions that produce morally good outcomes.Footnote ¹¹ Yet another cognitive bias that can prolong or worsen conflict is reactive devaluation, the tendency of individuals to immediately discount or devalue proposals coming from an adversary, compared to identical proposals offered by one's own side or a third-party mediator.Footnote ¹²

Yet for all of its rich insights, this literature has wrestled with a challenge. Most of what scholars know about psychological biases in decision making comes from the study of individuals, but many foreign policy decisions are made in group contexts. Indeed, groups are often used in foreign policy decision-making settings precisely because of their (presumed) ability to counter the decision-making pathologies or shortcomings of individuals acting in isolation.Footnote ¹³ Thus the theoretical and empirical value of insights from the behavioral sciences on the pathologies of individual decision making is often criticized in the study of foreign policy for a lack of clear understanding of how preferences, information, or traits aggregate into group-level decisions, with critics typically arguing that these psychological biases should be mitigated or otherwise cancel out in group settings.Footnote ¹⁴ Even proponents of psychological approaches have noted this limitation. In an important review of prospect theory, for example, Levy notes that “Most of what we want to explain in international politics involves the actions and interactions of states … each of which is, in principle, a collective decision-making body. The concepts of loss aversion, the reflection of risk orientations, and framing were developed for individual decision making and tested on individuals, not on groups, and we cannot automatically assume that these concepts and hypotheses apply equally well at the collective level.”Footnote ¹⁵ Writing two decades later, Hafner-Burton and colleagues express a similar concern, noting that institutional structures are often designed precisely to mitigate individual psychological biases.Footnote ¹⁶

Ultimately, however, the question of how psychological biases in foreign policy aggregate in groups—and whether groups indeed attenuate these biases—remains an empirical one, as theories of aggregation provide few guarantees. For example, Arrow's famous “impossibility theorem” shows that, even if all the individuals in a group are perfectly rational and calculating, many aggregation mechanisms can still produce irrational choices.Footnote ¹⁷ Meanwhile, other theorems show that aggregation can lead to more optimal decision making. However, such improvement often requires a set of fairly restrictive assumptions. For example, Condorcet's well-known jury theorem shows that sufficiently large groups can make better decisions if each individual votes independently and makes the right choice with probability greater than 50 percent. Yet, violating any of these assumptions may actually cause groups to make worse decisions than individuals.Footnote ¹⁸ This could be particularly concerning in many foreign policy decision-making contexts, where policy is often decided by small groups of individuals who influence one another and who may be systematically biased toward the wrong decision.Footnote ¹⁹
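The dependence of the jury theorem on its assumptions can be made concrete with a short calculation. The sketch below is illustrative only and is not drawn from the article's materials: it computes the probability that a strict majority of n independent voters is correct when each is correct with probability p. The function name and the example values of n and p are assumptions chosen for illustration.

```python
from math import comb


def majority_correct_probability(n: int, p: float) -> float:
    """Probability that a strict majority of n independent voters is correct,
    when each voter is correct with probability p. n must be odd (no ties)."""
    if n % 2 == 0:
        raise ValueError("use an odd group size so ties cannot occur")
    threshold = n // 2 + 1  # smallest number of correct votes that wins
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(threshold, n + 1)
    )


# Illustrative values only: competent voters (p = 0.6) versus voters who are
# systematically biased toward the wrong choice (p = 0.4).
for n in (1, 5, 25, 101):
    competent = majority_correct_probability(n, 0.6)
    biased = majority_correct_probability(n, 0.4)
    print(f"n={n:>3}  p=0.6: {competent:.3f}  p=0.4: {biased:.3f}")
```

Under these assumptions, larger groups help when p exceeds one half and hurt when it falls below, which is the sense in which systematically biased members can make a group worse than an individual. The sketch keeps the independence assumption throughout, so it does not capture members who influence one another.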

In this piece, we offer what we believe to be the first direct experimental test of the aggregation of psychological biases in foreign policy. We field three large-scale online experiments in which nearly 4,000 participants work through a series of foreign policy scenarios, either as individuals or in one of two different types of group structures. We find that three prominent tendencies from the behavioral decision-making literature (risk taking to avoid a loss, the intentionality bias, and reactive devaluation) largely replicate in small-group contexts. We find no evidence that these tendencies are significantly reduced in group settings, and find that in some decision-making contexts they may even be exacerbated. Moreover, we find little evidence that more experienced leaders can improve group decision making or that more diverse groups are less prone to hawkish biases. These findings have important implications for how we understand the role of group processes in foreign policymaking, suggesting that groups are not a panacea for producing optimal policy decisions, and that we should not assume that the psychological tendencies that shape individual decision making do not appear in collective contexts as well.

Biases and Group Decision Making

The question of how group processes affect decision making is not a new one. Indeed, outside of international politics, there is a rich and diverse literature that has explored the ways in which group settings affect bias and judgment. In legal studies, for example, research on jury decision making explores how juror-level characteristics aggregate in shaping jury-level decisions.Footnote ²⁰ In business administration, organizational behavior research focuses on how the traits of team members have varying effects on team performance depending on the types of tasks.Footnote ²¹ In social network analysis, scholars have experimentally studied the conditions under which collective decision making outperforms individual decision making.Footnote ²² Indeed, a small cottage industry has now formed that includes interdisciplinary approaches to “small group decision making,” which investigates, among other things, individual cognitive biases and under what conditions they might be overcome (or exacerbated) in a group setting. Even nonhuman animal models might offer relevant insights. A school of fish can follow light too weak for any individual fish to follow, for example.Footnote ²³

While this diverse scholarship may offer crucial insights for the study of foreign policy, it has important limitations. Many invocations of the “aggregation problem” in political science are more philosophical than empirical, assuming ex ante that aggregation is a challenge rather than empirically testing the specific contexts in which psychological variables should or should not aggregate.Footnote ²⁴ Because of the high cost of bringing large numbers of people into the lab, many of the canonical experimental tests of aggregation in group decision making have traditionally been somewhat underpowered, testing the impact of relatively small groups.Footnote ²⁵ Thus it has been difficult to identify what aspects of group decision making causally affect outcomes. Perhaps most importantly, foreign policy decision making involves three theoretically relevant institutional structures and task properties that differentiate it from some of the main configurations frequently studied in the literature outside political science.

First, foreign policy decision making, particularly over security issues, often features ill-structured problems, where the probability distributions may be unknown.Footnote ²⁶ Actors may not know, or may disagree on, the parameters of the decision-making task; they may even disagree on the ultimate goal with respect to the decision to be made. These situations stand in contrast to much, though not all, of the small-group research and analysis of aggregation that occur in other disciplines. Investigations of cognitive biases, for example, often use well-structured problems with clear probability distributions. Alternatively, studies that investigate the “wisdom of crowds” will often use difficult, but nevertheless clearly structured, math problems.Footnote ²⁷ It therefore remains unclear how generalizable insights from clearly structured problems may be to decision making in the more amorphous context that characterizes much of international politics.

Second, foreign policy decision making often involves hierarchically structured groups, where the chain of command and the decision-making rules are known to all the actors involved. While the existing research on small group dynamics and decision making in groups takes many forms, including analysis of groups within large-scale hierarchical settings such as firms, much of the research political science has brought in has tended to focus on “flat” or horizontal groups, such as teams, and has not systematically compared the effects of hierarchical versus horizontal decision-making structures.Footnote ²⁸ Hierarchies may emerge endogenously over time as a result of specific group members’ personalities, but this is theoretically very different from ingrained hierarchies built on formal and clear roles and decision-making rules.Footnote ²⁹ It is partly because of the hierarchical nature of many foreign policy institutions that much of the foreign policy decision-making literature focuses on leaders, rather than advisers.Footnote ³⁰ Moreover, without manipulating these structural conditions it is difficult to gain analytical leverage on how hierarchy affects foreign policy decision making.

Third, the substantive focus of scholars of foreign policy decision making, including distinctive outcomes of interest, is often very different from that of small-group research in other domains. Analysts of foreign policy are often interested in explaining specific dependent variables, such as a decision to use force. These are quite different from those often studied in small-group research, such as team morale or workplace satisfaction in a business context, or performance on mathematical exercises. It may be that the specific decisions of interest, such as the use of force, engage different aggregation processes, limiting the utility of extrapolating findings from small-group research to foreign policy.

Empirical research in political science has tended to focus on how groups might improve decision making, which brings in a normative component, and has returned a mixed bag of results: factors such as group size, composition, decision-making rules, political context, and leadership can all affect the quality of the decision-making process and outcome.Footnote ³¹ For example, groupthink, the most famous psychological dynamic documented in political group decision making, whereby group members’ striving for unanimity exacerbates decision-making pathologies, is hypothesized to be a contingent phenomenon, most likely to emerge under conditions of strong social-unit cohesion and external stress.Footnote ³²

Driven by this finding, as well as subsequent research affirming the danger of group members’ striving for unanimity, many of the most prominent proposals for improving the quality of foreign policy decision making focus on constructing a diverse decision unit, led by an experienced leader who fosters healthy debate and dissent in the policymaking process.Footnote ³³ These principles guide decision-making models such as multiple advocacy, the competitive advisory system, and distributed decision making.Footnote ³⁴ Indeed, the perceived value of diversity as a tool to harness the mental power of groups and improve decision making is a hallmark of much recent scholarship.Footnote ³⁵ However, diversity is not without risk, and may also increase intragroup conflict and decision paralysis.Footnote ³⁶ Thus the benefits of diversity in improving decision making may depend on the presence of a leader who is well positioned to channel that diversity in productive directions. For example, research has suggested that a leader's experience, leadership style, predispositions, and personality can all shape their ability to harness the information-processing power of groups to improve decision making.Footnote ³⁷ However, most research in political science on group decision making has relied on small-N case studies, which limits our ability to identify how different attributes of the group setting, such as the distribution of information individuals have or the experience they bring to the table, affect the quality of decision making.

In sum, while there are impressive cognate bodies of literature on aggregation outside of political science, and rich descriptive evidence on group dynamics in policymaking settings, we do not yet have strong experimental evidence regarding the effects of groups in the complex settings that characterize foreign policy decision making, nor do we fully understand how different decision rules, group composition, and leader attributes shape these processes.

In this study we test for the effects of group decision making on the prevalence of three well-known cognitive biases that have been observed in individual decision making: risk taking to avoid a loss, the intentionality bias, and reactive devaluation.Footnote ³⁸ Each of these biases has been theorized to push political elites in a “hawkish” direction.Footnote ³⁹ In other words, all else equal, the presence of these biases may cause leaders to demonstrate a greater “propensity for suspicion, hostility, and aggression in the conduct of conflict, and for less cooperation and trust when the resolution of conflict is on the agenda” than is objectively warranted.Footnote ⁴⁰

For example, loss aversion could reduce leaders’ willingness to compromise in negotiations. Their own concessions would be viewed as “losses,” while an adversary's concessions would be viewed as “gains”—and even when these concessions are equal, the gains would feel smaller than the losses, and so compromises would likely be rejected.Footnote ⁴¹ Similarly, the intentionality bias, whereby individuals assess whether an action was intentional based on its effects, may lead to misperceptions or unfounded certainty regarding intentionality. Actions with negative consequences, or “side effects,” are more likely to be seen as intentional. Such ascriptions are relevant in a range of contexts, from security dilemma escalation to public assessments of blame in civil conflicts.Footnote ⁴² Finally, reactive devaluation—a bias whereby a proposal is automatically perceived as less valuable if offered by an adversary—has been shown to affect attitudes toward negotiations in various political conflicts, from US–Soviet interactions during the Cold War to the ongoing Israeli–Palestinian conflict.Footnote ⁴³ Together, then, these three biases have the potential to reduce the likelihood of negotiation success and trigger or prolong violent political conflict. Assessing the extent to which these individual-level biases scale to affect foreign policy decisions that are often made in group contexts is crucial for understanding how the institutional structures of foreign policymaking potentially mitigate or exacerbate the influence of these biases on international cooperation and conflict.

Research Design

The present study aims to examine the relative efficacy of groups in reducing the impact of these biases on decision making using three large-scale online group experiments conducted in Fall 2019 and Winter 2020, whose structure is summarized in Figure 1.Footnote ⁴⁴ By manipulating the group setting, this study provides causal leverage to examine how the cognitive biases of individuals aggregate in different types of group decision-making units. As with all experiments, there are important questions about external validity to keep in mind, which we discuss in detail later.

FIGURE 1. Study design

The study proceeds as follows. After completing an individual-differences and demographic battery, respondents are randomly assigned to one of three conditions. In the individual condition, 774 respondents are asked to make decisions on various foreign policy scenarios individually, taking notes as they think through their options. In the two group conditions, respondents are assigned to a group with four other survey takers, in which they participate in a group chatroom, discussing their options together before deciding on a course of action. There are two types of groups: horizontal groups, where participants are asked to try to come to a collective, unanimous decision and each participant has equal say in the process; and hierarchical groups, in which one of the five participants is randomly designated as the leader of the group, and gets to make the final choice, in consultation with the four other participants, who take on the role of adviser. In the analysis that follows, the group conditions consist of 3,213 respondents, forming 771 groups (406 horizontal, 365 hierarchical) of up to five members each. We paid an average of USD 10 per subject in respondent incentives, and all together, the effective sample size (N) of the study is 3,987.Footnote ⁴⁵

After being assigned to one of these treatments, respondents pass through three separate experimental modules using canonical experimental setups to examine the prevalence of various biases in the context of foreign policy decision-making scenarios. Respondents in the individual condition complete these modules as individuals, writing down their justifications for their decisions and making decisions themselves, whereas respondents in the group conditions complete these modules as groups, deliberating as a group before reaching decisions.Footnote ⁴⁶ An example of a group deliberation is shown in Figure 2. Respondents were generally engaged in the group deliberations; in the horizontal condition, 73 to 76 percent of group members in the analysis participated more than once in each deliberation, similar to the rate observed in the hierarchical condition (74 to 81 percent), with leaders participating more frequently than advisers—though as we show in section 4 of the online supplement, our findings are robust and do not significantly vary across different levels of group participation.

FIGURE 2. Sample group deliberation transcript

The first experimental module examines the effect of gain and loss frames on policy preferences, a canonical finding from prospect theory. Subjects are presented with a scenario in which “600 lives are at stake in a war-torn region.” Subjects are asked to choose one of two courses of action (Policy A or Policy B). Policy A will definitively lead to 400 people dying and 200 people being saved. Policy B has a probabilistic outcome, with a 1/3 probability that no one will die (all 600 will be saved) and a 2/3 probability that 600 people will die (none will be saved). The experimental treatment within this module is whether the results of each policy are presented in the domain of gains (e.g., “200 people will be saved”) versus the domain of loss (e.g., “400 people will die”). Half of the respondents in each experimental condition (individual, horizontal group, or hierarchical group) receive the “gains” treatment and half receive the “loss” treatment.Footnote ⁴⁷
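A quick check makes clear why this design isolates framing. The sketch below is an illustration rather than part of the study's instrumentation: it uses the frame wording quoted above (“200 people will be saved” or “400 people will die” under Policy A) and the stated lottery for Policy B to confirm that both policies have the same expected outcome. All variable and function names are hypothetical.

```python
from fractions import Fraction

LIVES_AT_STAKE = 600

# Policy A: certain outcome. The gain frame reads "200 people will be saved";
# the loss frame reads "400 people will die". Both describe the same result.
policy_a_saved = Fraction(200)

# Policy B: 1/3 chance that all 600 are saved, 2/3 chance that none are saved.
policy_b_lottery = [(Fraction(1, 3), 600), (Fraction(2, 3), 0)]


def expected_saved(lottery):
    """Expected number of lives saved for a list of (probability, saved) pairs."""
    return sum(prob * saved for prob, saved in lottery)


policy_b_saved = expected_saved(policy_b_lottery)
expected_deaths_a = LIVES_AT_STAKE - policy_a_saved
expected_deaths_b = LIVES_AT_STAKE - policy_b_saved

print("Expected saved:", policy_a_saved, "vs", policy_b_saved)
print("Expected deaths:", expected_deaths_a, "vs", expected_deaths_b)
```

Because the two policies are equal in expectation, a shift in choices between the gain and loss treatments reflects the frame and risk attitudes rather than a difference in expected payoffs.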

The second experimental module tests susceptibility to the intentionality bias—the degree to which assessments of intentionality are affected by the (negative) results of an event. In this module, respondents are asked to assess how likely it is that a US navy vessel sunk 100 miles off the coast of North Korea was intentionally versus accidentally targeted by the North Koreans. The randomly assigned treatment in this module is the number of casualties the sinking of this vessel has caused: none versus all 100 servicepeople on board. Half of the respondents in each experimental condition receive each treatment. This represents a more ill-structured problem than that posed by the previous experiment.

The final experimental module explores the prevalence of reactive devaluation of a trade negotiations proposal between the United States and China. Subjects view a short proposal that purports to resolve ongoing US–Chinese disputes over trade. The experimental treatment is the authorship of the text—whether the United States or China drafted the proposal. As with the first two modules, half the respondents in each experimental condition receive each treatment. Instrumentation for each of the three experiments is shown in section 1 of the online supplement.

We calculate our dependent variable differently in the three modules based on the group condition. In the individual conditions, we focus on the choice of each individual respondent. In the hierarchical conditions, we focus on the choice of each group leader. In the horizontal conditions, we primarily use a median voter rule to calculate each group's decision, but we also use two other aggregation rules (majority vote and unanimity) to test how sensitive our findings are to other means of aggregating group members’ votes. We describe these different aggregation methods in detail in section 2.1 of the online supplement.
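To make the three horizontal aggregation rules concrete, the sketch below gives generic textbook versions of each. It is an assumption-laden illustration, not the authors' procedure: the exact definitions are in section 2.1 of the online supplement, and the function names, the treatment of ties, and the choice to return `None` when a rule yields no group decision are all hypothetical.

```python
from collections import Counter
from statistics import median
from typing import Hashable, Optional, Sequence


def unanimity(votes: Sequence[Hashable]) -> Optional[Hashable]:
    """Return the shared choice if every member voted the same way, else None."""
    if votes and len(set(votes)) == 1:
        return votes[0]
    return None


def majority_vote(votes: Sequence[Hashable]) -> Optional[Hashable]:
    """Return the choice backed by more than half of the members, else None."""
    if not votes:
        return None
    choice, count = Counter(votes).most_common(1)[0]
    return choice if count > len(votes) / 2 else None


def median_voter(ratings: Sequence[float]) -> float:
    """Return the median of members' numeric ratings (e.g., a likelihood scale)."""
    return median(ratings)


# Hypothetical five-member group: choices on a binary policy question and
# ratings on a numeric scale.
choices = ["B", "B", "A", "B", "A"]
ratings = [2, 5, 6, 6, 7]
print(unanimity(choices), majority_vote(choices), median_voter(ratings))
```

In this sketch a group that splits three to two has a majority decision but no unanimous one, which illustrates how the set of groups counted as reaching a decision can shrink under more stringent rules.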

Together, these studies are useful because they allow us to examine the extent to which hawkish biases replicate in individual settings and the degree to which group discussion, along with the structure and composition of those groups, affects their prevalence, in experiments that differ from one another in a variety of ways. The existing literature lends us strong theoretical expectations in regard to the individual condition, given the canonical nature of these cognitive biases: we expect that individuals will be more risk seeking in the domain of losses than the domain of gains, will be more likely to assess an incident as intentional when its costs are higher, and will evaluate a proposal from an adversary more negatively than the same proposal from their own side.

Yet given both the novelty of our particular study and the contradictory arguments in the literature on the efficacy of groups in reducing biases, the ultimate effect of groups on these hawkish biases remains an open question. Groups could reduce the prevalence of hawkish biases, exacerbate them, or have no effect, particularly given that these hawkish biases may be deeply ingrained, or outside the realm of conscious awareness.Footnote ⁴⁸ Empirically adjudicating between these competing expectations constitutes one of the central contributions of our study.

Analysis

To test these competing expectations, we turn to each of our three experiments in sequence. For each experiment, we first look within each group condition (individual, horizontal, hierarchical) to examine the prevalence of the hawkish bias tested (susceptibility to gains/loss framing, the intentionality bias, or reactive devaluation). We then compare these differences across groups to assess the extent to which these different decision-making structures affect susceptibility to each of the tested biases. Finally, we probe the robustness of our findings, assessing the degree to which various types of leader characteristics or aspects of group diversity affect susceptibility to biases and the ability to reach a decision in the first place.

Susceptibility to Gains/Loss Framing

We begin by examining the prevalence of a canonical hawkish bias across our three group formulations: the effects of loss-versus-gains framing on individuals’ acceptance or avoidance of risky choice.

In the individual condition, our results strongly replicate the core finding of prospect theory. When choices are framed as a potential loss (e.g., of life) individuals are significantly more likely to choose the probabilistic policy; that is, they are more accepting of the risk that all 600 lives will be lost, in order to preserve the possibility of an outcome where no one dies. In contrast, those presented with a gains frame, where people may be saved, are much more risk averse, preferring the nonprobabilistic Policy A (200 people will be saved).

Do groups reduce susceptibility to this bias? Our results suggest they do not; if anything, groups may increase the effect of frames on choice. In both types of groups, groups randomly presented with loss frames are significantly more likely to prefer the probabilistic outcome than groups that were presented with a gain frame (Figure 3). Examining the magnitude of these effect sizes across decision-making structures, we find that hierarchical groups in particular are significantly more sensitive than individuals to framing effects.Footnote ⁴⁹

FIGURE 3. Prospect theory framing effects replicate in groups

Comparing the horizontal groups to individual decision makers, Figure 4 suggests that the susceptibility to gain/loss frames may depend on the specific decision rule used to assess these groups. For example, examining horizontal groups that succeeded in reaching a unanimous decision, we find results similar to those in the hierarchical condition: the group setting increases susceptibility to these framing effects (p < .005). However, if we examine the full set of horizontal groups using a less stringent decision rule, such as a majority rule (p < .09) or median voter (p < .16), we do not find evidence that horizontal groups perform significantly differently than individuals. Either way, it is clear that horizontal groups do not reduce susceptibility to prospect theory's framing effects.

FIGURE 4. Prospect theory framing effects by horizontal aggregation method
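The logic of comparing framing effects across decision-making structures can be sketched as a difference between two differences in proportions. The example below is a minimal illustration under stated assumptions, not the authors' estimator: it uses a normal approximation, treats the two conditions as independent samples, and runs on made-up counts that are NOT the study's data. All function names are hypothetical.

```python
from math import sqrt


def framing_effect(risky_loss, n_loss, risky_gain, n_gain):
    """Return (effect, standard error) for P(risky | loss) - P(risky | gain)."""
    p_loss = risky_loss / n_loss
    p_gain = risky_gain / n_gain
    effect = p_loss - p_gain
    se = sqrt(p_loss * (1 - p_loss) / n_loss + p_gain * (1 - p_gain) / n_gain)
    return effect, se


def difference_in_effects(effect_a, se_a, effect_b, se_b):
    """Return (difference, z statistic) comparing two independent framing effects."""
    diff = effect_a - effect_b
    se = sqrt(se_a**2 + se_b**2)
    return diff, diff / se


# Made-up counts for illustration only; these are NOT the study's results.
individuals = framing_effect(risky_loss=60, n_loss=100, risky_gain=30, n_gain=100)
groups = framing_effect(risky_loss=75, n_loss=100, risky_gain=25, n_gain=100)

diff, z = difference_in_effects(*groups, *individuals)
print(f"individual effect={individuals[0]:.2f}, group effect={groups[0]:.2f}")
print(f"difference={diff:.2f}, z={z:.2f}")
```

A positive difference in this sketch would indicate that groups are more sensitive to the frame than individuals, and a difference near zero would indicate that the group setting neither attenuates nor amplifies the bias.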

Intentionality Bias

Next, we examine the relative prevalence of the intentionality bias across group settings. While the prospect theory module examines a fairly well-defined decision problem where each policy choice features known probability outcomes, the intentionality bias module examines a more complex choice: how likely do you think it is that an event was caused by a purposeful attack by an adversary? In the individual condition, our results again strongly replicate the canonical intentionality bias finding. When the consequences of an event are more negative (in this case causing fatalities), individuals are significantly more likely to assess the event as an intentional provocation rather than the result of an accident or miscommunication. Group settings do little to attenuate this tendency: both horizontal and hierarchical groups are significantly more likely to assess the sinking of a US navy ship as the consequence of an intentional attack by the North Koreans when there are fatalities reported (Figure 5).

FIGURE 5. Intentionality bias replicates in groups

However, unlike the prospect theory experiment, with the intentionality bias, we find that groups have no effect on the severity of this tendency. While certain group configurations tended to make our respondents somewhat more susceptible to framing effects, in this case groups perform similarly to individuals—no better or worse.Footnote ⁵⁰

As before, horizontal groups that reach a unanimous decision do display a somewhat more pronounced bias than those assessed with less stringent decision rules (majority rule or median voter), but this difference is not statistically significant (Figure 6). Regardless of the aggregation method, both horizontal and hierarchical groups increase their assessments of intentionality in response to negative outcomes to a similar extent as individuals do.

FIGURE 6. Intentionality bias effects by horizontal aggregation method

Reactive Devaluation

Finally, we turn to reactive devaluation. Here we unexpectedly do not replicate the standard reactive devaluation result in two of the three decision-making conditions (Figure 7). Individuals are not significantly less likely to support a proposal authored by China than one authored by the United States. Hierarchical groups, where the decision is ultimately made by a single individual after group discussion, also do not prefer US-authored proposals.

FIGURE 7. Reactive devaluation experiment displays mixed results

On the one hand, this finding is surprising: the theoretical expectation is that proposals written by an adversary (e.g., China) will be automatically devalued with respect to proposals written by one's own side (the United States). However, work on reactive devaluation also suggests that there are two distinct mechanisms by which proposals are devalued: reactance processes that lead individuals to devalue that which is available compared to what is not, and reliance on source credibility as a heuristic for value.Footnote ⁵¹ Our treatment aims to test this second mechanism: American respondents might devalue a Chinese-authored proposal relative to an American-authored one because they would assume that the other country's negotiators do not have America's best interests in mind.

However, to the extent that source credibility drives reactive devaluation, reactive devaluation should be strongest when individuals are presented with ambiguous proposals that increase their reliance on source heuristics.Footnote ⁵² When the proposal is detailed and specific, subjects may be less likely to automatically devalue it because the proposal itself provides enough information to make an assessment. In our study, the proposal was quite specific and detailed, with bullet points outlining the exact compromises each side would make in the ongoing trade war. This level of detail may have attenuated reactive devaluation, making it easier for subjects to look past the purported authorship of the proposal to evaluate the actual proposal content.

Another possibility is that the conflict tested in this study—contested trade negotiations in the shadow of Trump-era trade wars—resulted in less reactive devaluation either because of the unusual domestic politics of the Trump era, or simply because the rivalry was less clear-cut than the violent, intractable conflicts in which this bias has historically been studied. In other words, Israelis may be more suspicious and distrusting of Palestinians, and Americans more distrusting of the Soviet Union or North Vietnam during the Cold War, than Americans in 2020 were of China, with whom the United States had a less directly confrontational relationship.Footnote ⁵³

However, even with the specificity of this proposal and ambiguity in the rivalry, we do observe reactive devaluation in horizontal groups, particularly those that reached unanimous decisions (Figure 8). Unanimous horizontal groups are marginally more likely than individuals (p < .06) to devalue the Chinese proposal relative to the American one. This suggests that, to the extent that reactive devaluation occurs in this context, groups, if anything, amplify this tendency.

FIGURE 8. Reactive devaluation effects by horizontal aggregation method

Extensions and Limitations

Thus far, our results suggest that two canonical biases from the judgment and decision-making literature—sensitivity to framing effects in prospect theory, and the intentionality bias—persist or become even more pronounced in group settings. And, while we fail to replicate reactive devaluation in our individual condition and hierarchical group contexts, we replicate it in horizontal groups, which is inconsistent with the claim that the hawkish biases that manifest in individual settings disappear in groups. However, there are important limitations and caveats worth discussing, many of which involve questions of external validity, and differences between inevitably stylized experiments and real-world foreign policy decision making.

First, our experiments lack many of the social dynamics of real foreign policy decision-making groups where there is social pressure, people have worked with each other before (and might again), issue linkage is possible, bureaucratic interests are present, and so on.Footnote ⁵⁴ In contrast, our respondents participate anonymously, in novel groups formed explicitly for this study, with little social pressure for cohesion or prospect of future interaction.Footnote ⁵⁵ We encourage future researchers to build on these studies by incorporating some of these features into their experimental designs to determine the impact of differing levels of social pressure on group susceptibility to bias. And yet the absence of these features likely makes our findings a more conservative test of groups’ ability to reduce bias, since the features missing from our studies are also the very features typically linked to biased information-processing and pathological group dynamics such as groupthink.Footnote ⁵⁶ In that sense, the fact that we replicate the prospect theory and intentionality bias effects across all our group conditions even without the distorting effects of social conformity pressures should increase our confidence in the pervasiveness of these tendencies.

Second, in the real world, leaders are not randomly assigned but strategically selected for particular skills, attributes, or experience. On the one hand, this is precisely why experiments are helpful: in a naturalistic setting, it would be difficult to identify the effect of group structures independently of the properties of actors in specific roles in the group. Experiments, in contrast, let us harness the power of random assignment and sidestep these concerns about endogeneity. On the other hand, this also leads to an important empirical question: are groups with certain types of leaders better able to avoid these biases?

To test this question, we take advantage of the lengthy battery of individual differences administered to respondents at the beginning of the study. Since there are many potential traits that could moderate the impacts of framing effects, intentionality bias, and reactive devaluation, we adopt a data-driven approach, using a sparse Bayesian method for variable selection. We fit a LASSOplus model regressing our dependent variable on the treatment, a vector of twenty-one individual differences (foreign policy orientations, personality traits, demographic characteristics, government experience, and so on), and interactions between these leader-level traits and the treatment using data from the hierarchical condition.Footnote ⁵⁷ This machine-learning approach thus lets us test whether certain kinds of leaders (such as those high in need for cognition, or with more experience) better help their groups avoid these biases. Crucially, none of these leader-level characteristics significantly moderate the treatments. We thus find no evidence that groups with better leaders are less likely to display these patterns. We encourage future work to build on these findings by assigning respondents with specific traits (such as narcissism) to leader and adviser roles, to test how this affects the quality of decision making.Footnote ⁵⁸
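
The logic of searching for moderators with a sparse regression can be illustrated with a minimal sketch. It is not the LASSOplus estimator used in the article: it substitutes an ordinary L1-penalized (LASSO) regression fit by coordinate descent, and it runs on simulated data in which one hypothetical trait (`trait0`) is built in as a moderator purely to show what a flagged interaction would look like. All names, the penalty value, and the simulated effect sizes are assumptions.

```python
import random


def centre(values):
    m = sum(values) / len(values)
    return [v - m for v in values]


def soft_threshold(value, penalty):
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso(columns, y, penalty, sweeps=100):
    """Plain LASSO by coordinate descent on centred columns (not LASSOplus)."""
    n = len(y)
    X = {name: centre(col) for name, col in columns.items()}
    residual = centre(y)
    scale = {name: sum(x * x for x in col) / n for name, col in X.items()}
    coefs = {name: 0.0 for name in X}
    for _ in range(sweeps):
        for name, col in X.items():
            if scale[name] == 0.0:
                continue
            # Covariance of this column with the partial residual.
            rho = sum(x * r for x, r in zip(col, residual)) / n + coefs[name] * scale[name]
            new = soft_threshold(rho, penalty) / scale[name]
            delta = new - coefs[name]
            if delta != 0.0:
                residual = [r - delta * x for r, x in zip(residual, col)]
                coefs[name] = new
    return coefs


# Simulated data (assumption): the treatment effect varies with trait0 only.
rng = random.Random(0)
n = 1500
treat = [rng.randint(0, 1) for _ in range(n)]
traits = {f"trait{k}": [rng.gauss(0, 1) for _ in range(n)] for k in range(3)}
outcome = [1.0 * t + 0.8 * t * z + rng.gauss(0, 1) for t, z in zip(treat, traits["trait0"])]

# Design: treatment, each trait, and each treatment-by-trait interaction.
design = {"treat": treat}
for name, z in traits.items():
    design[name] = z
    design[f"treat*{name}"] = [t * v for t, v in zip(treat, z)]

coefs = lasso(design, outcome, penalty=0.15)
flagged = [name for name, b in coefs.items() if name.startswith("treat*") and b != 0.0]
```

In this sketch a trait counts as a moderator only if its interaction with the treatment survives the penalty. A result like the one reported above would correspond to an empty `flagged` list.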

The question of leader traits raises a related issue. Our study was conducted on samples of ordinary citizens, rather than experienced decision makers. It is of course possible that groups composed of actual elite decision makers would behave differently, though two considerations are relevant here. One is that these three hawkish biases have previously all been identified in foreign policy elites using archival and case study evidence,Footnote ⁵⁹ so we already have reason to believe that foreign policy decision makers experience hawkish biases; the question is whether group contexts moderate the magnitude of these biases at a significantly different rate among elites than they do among members of the mass public. The other is that meta-analyses of paired experiments on elite and mass samples suggest strikingly similar responses to experimental treatments, so we should not assume that they rely on fundamentally different cognitive architectures.Footnote ⁶⁰ Ultimately, however, this is an empirical question. It is also one that elite experiments may be poorly equipped to answer, suggesting benefits for archival or mixed-method approaches. Experimental or survey-based studies on real foreign policy decision makers invariably involve smaller sample sizes—effectively made smaller still once analyzed at the group level—such that many group-level elite experiments would likely be underpowered, particularly if they use the sample of elites most directly implicated by their theory.Footnote ⁶¹

Group-Level Diversity

Yet even if leader-level traits don't seem to minimize these three biases, it is possible that group-level ones do. One of the most-studied attributes of groups hypothesized to improve decision making is diversity.Footnote ⁶² Diversity refers most broadly to “compositional differences among people” within a particular unit, such as a decision-making group.Footnote ⁶³ In a decision-making context, these compositional differences are often understood as representing the interaction of different cognitive styles. As Page has argued, in the context of problem solving for example, diversity of perspectives, interpretations, heuristics, and individual predictive models that are used to infer cause and effect all come together to “increase the number of solutions that a collection of people can find by creating different connections among the possible solutions.”Footnote ⁶⁴ Diverse groups are also thought to lead to more extensive debate, increase exposure to others’ viewpoints, introduce differences in risk preferences, and avoid group pathologies such as groupthink, where striving for uniformity may overwhelm accuracy motives.Footnote ⁶⁵ In short, “diversity trumps homogeneity.”Footnote ⁶⁶ Yet, groups that are too diverse may move too far in the other direction, to where a “polythink” dynamic prevents them from reaching consensus at all.Footnote ⁶⁷ Relatedly, in some instances diverse groups may be more prone to conflict, as social identity and categorization processes may impede the value of information and perspective pooling that leads to higher group performance.Footnote ⁶⁸

We therefore examine the potential mitigating effect of diversity on susceptibility to bias, assessing whether groups with a more diverse composition are affected less by these various hawkish biases. Rather than using Herfindahl indices, which flatten diversity onto a single dimension, we operationalize diversity in a multidimensional fashion, calculating the variance of each trait within each group, and then averaging these trait-level diversity scores within each of four categories of traits, to produce measures of four different types of diversity.
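
A minimal sketch of this variance-based operationalization follows. It assumes that traits have already been rescaled to a common 0 to 1 range so that variances are comparable, and that the population variance is used. The trait names, the two categories shown, and the function `diversity_scores` are hypothetical stand-ins rather than the article's actual measures.

```python
from statistics import mean, pvariance

# Hypothetical trait groupings; the article uses four categories
# (demographics, dispositions, experience, political attitudes).
CATEGORIES = {
    "demographic": ["age", "income"],
    "attitudes": ["ideology", "militant_assertiveness"],
}


def diversity_scores(members, categories=CATEGORIES):
    """Average the within-group variance of each trait, by trait category."""
    scores = {}
    for category, trait_names in categories.items():
        # One variance per trait, computed across the members of the group.
        variances = [pvariance([m[t] for m in members]) for t in trait_names]
        scores[category] = mean(variances)
    return scores


# A two-person group that differs demographically but not attitudinally.
group = [
    {"age": 0.0, "income": 0.0, "ideology": 0.5, "militant_assertiveness": 0.5},
    {"age": 1.0, "income": 1.0, "ideology": 0.5, "militant_assertiveness": 0.5},
]
scores = diversity_scores(group)
```

Here the group scores 0.25 on demographic diversity and 0 on attitudinal diversity, which shows how one group can be diverse on one dimension and homogeneous on another.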

We first examine diversity from a demographic perspective, in which more diverse groups are those with members with different ages, gender and racial identities, religions, and socioeconomic backgrounds. This type of descriptive diversity, in addition to being normatively valuable, has been hypothesized to improve decision making by broadening the information set and policy options reviewed and considered by a group.Footnote ⁶⁹ Second, we operationalize diversity in terms of personal dispositions: the “big five” personality characteristics, need for cognition, trait aggression, and risk orientation. This type of cognitive diversity is often studied in the organizational behavior literature, which is interested in how the variability of personality characteristics in teams affects their collective performance.Footnote ⁷⁰ Third, we turn to diversity of experience within groups, where different members of the group have varying levels of experience in leadership and small-group decision making (political or otherwise). In foreign policy decision-making contexts, diversity of experience may be particularly important, since decision units are typically a mixture of experienced bureaucrats and shorter-term political figures, themselves with varying experience in government.Footnote ⁷¹ Finally, we consider groups whose members vary in their political attitudes or orientations, including political ideology, right-wing authoritarianism, social dominance orientation, and foreign policy orientations. These types of attributes have long been theorized to play a prominent role in foreign policy beliefs and attitudes, but how the variance of these traits within a decision-making unit affects decision outcomes has been less explored.Footnote ⁷²

Regardless of how we operationalize diversity, however, we find no systematic effects of diversity on susceptibility to any of the hawkish biases we examine. Diverse groups are just as likely as more homogeneous groups, and no less likely than individuals, to exhibit these biases (Figure 9).Footnote ⁷³ It is not that diversity has no effects whatsoever: more diverse groups, particularly those with more diverse dispositions and political attitudes, are more likely to fail to reach agreement at all (Figure 10). This is particularly the case in the intentionality bias and reactive devaluation experiments, where respondents are assessing adversarial interactions with China and North Korea. Groups whose members hold different social and political attitudes are more likely to show internal dissensus and disagreement.Footnote ⁷⁴ Nonetheless, more diverse groups do not appear to be less likely to display these three tendencies.

FIGURE 9. More diverse groups are no less susceptible to these three biases

FIGURE 10. More diverse groups are more likely to experience dissension

Group Size and Modes of Interaction

Finally, there are two other considerations worth noting, which also serve as alternative interpretations of our results. One is that for the ameliorative effects of aggregation to take place, group members need to interact face to face rather than deliberate at a distance.Footnote ⁷⁵ Another is that for the ameliorative effects of aggregation to take place, groups need to be much larger; after all, foreign ministries have hundreds or thousands of individuals. While small groups might replicate individual-level biases, the “wisdom of crowds” might suggest greater rationality as groups grow in size.Footnote ⁷⁶ On the one hand, these interpretations are obviously in tension with one another, since as groups increase in size, the rate of face-to-face communication decreases. On the other, there are a number of empirical tests we can employ to speak to some of these questions directly.

First, we can exploit variation in group size in our results. The magnitude of the hawkish biases we observe does not significantly shrink with group size (see Section 3 in the online supplement), and simulation methods suggest that some might actually increase.
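
One reason a shared tendency need not wash out as groups grow can be shown with a short calculation. This is not the simulation reported in the online supplement. It is a stylized illustration that assumes a binary choice, independent members who each pick the biased option with the same probability `p`, and simple majority rule.

```python
from math import comb


def majority_probability(p, size):
    """Probability that a strict majority of `size` independent members
    picks an option that each member picks with probability p."""
    needed = size // 2 + 1
    return sum(
        comb(size, k) * p**k * (1 - p) ** (size - k)
        for k in range(needed, size + 1)
    )


# Assumed individual-level tendency: 60 percent choose the biased option.
curve = {size: majority_probability(0.6, size) for size in (1, 3, 5, 7)}
```

Under these assumptions the group-level probability rises from 0.6 for a lone individual to 0.648 for a group of three, and it keeps rising with size. Aggregation cancels idiosyncratic noise but amplifies any tendency that most members share.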

This pattern comports with archival evidence from the United States regarding leaders’ frustrations with the pathologies of large decision-making units and the perception that larger groups had more problematic tendencies than smaller ones. As a result, while there was variation from administration to administration, a number of high-profile decisions, from the Cuban Missile Crisis to the first Gulf War, often involve the president and a relatively small number of influential advisers.Footnote ⁷⁷ John F. Kennedy, for example, was disappointed by the results of relying on a large number of advisers, noting, “The advice of every member of the Executive Branch brought in to advise was unanimous—and the advice was wrong.” In response, at least partially, to these perceived failings of larger groups, Kennedy created a smaller Executive Committee, and often relied on ad hoc meetings of even smaller groups within it. Similarly, George H.W. Bush relied on ad hoc small groups of advisers when deciding whether to invade Iraq. The results from this study are likely directly applicable to these types of cases of relatively small group decision making, which have been quite common in historical US foreign policymaking.

Second, although all our respondents participated online rather than in person, if we think about face-to-face interaction in terms of the added information it conveys, we can test this informational mechanism directly by testing whether groups where respondents exchanged more information as part of their deliberations displayed weaker biases than groups where respondents communicated less.Footnote ⁷⁸ Interestingly, across all three experiments, for both horizontal and hierarchical groups, we find no evidence that the magnitude of the biases groups display significantly decreases with group participation (see Section 4 in the online supplement).Footnote ⁷⁹

One explanation may relate to behavioral modifications that are made when more information-rich environments, such as face-to-face meetings, are unavailable. When unable to communicate with visual expressive behaviors, individuals use textual proxies for visual cues, which in some cases may enhance, rather than degrade, social bonding processes.Footnote ⁸⁰ Research in social information processing theory suggests that when individuals meet for the first time, as is the case in our study, text-based communication can enhance intimacy and self-disclosure, positively affecting relationship building.Footnote ⁸¹ For example, Wheeler and Holmes argue that face-to-face interaction as a quotidian practice of international politics is a relatively recent phenomenon, which means that text-based communication was, historically, the only route to relationship building.Footnote ⁸² Particularly as global pandemics take diplomacy online, we see questions about the role of interaction modality in group decision-making as an important topic for future research.

Conclusion

In a recent review of the problem of aggregation, Gildea notes that “how psychological mechanisms, which are primarily individually embodied, may operate and exercise influence within complex group and institutional environments remains a crucial and contested question.”Footnote ⁸³ To date, such concerns have remained largely conceptual in nature, and the answer to this question has proven elusive because studying it empirically introduces a number of difficult methodological and substantive challenges. We offer a direct test of how a particular class of psychological biases aggregates in foreign policy contexts, using experiments to examine how a trio of so-called “hawkish biases” operates in groups. Our results, which suggest that the aggregation problem may be less problematic than some scholars have alleged, and that individual-level psychological biases do not necessarily cancel out in groups, may be surprising for some. If “the whole point of government is to ensure multiple voices and checks and balances so that rational decisions can, in theory, persist despite individual preferences and biases,” we may need to revisit the assumption that multiple voices lead to more rational outcomes.Footnote ⁸⁴ Our results suggest that the biases that manifest in lone voices are similarly present in group decision making.

One important theoretical implication of our findings is that we should be more comfortable envisioning individual-level biases scaling up to small groups in decision-making contexts. In an important application of prospect theory to foreign policy, McDermott applied the bias to a number of cases, focusing “on a unitary actor embodied by the president.” She notes that “prospect theory is less easily applied to the dynamics of group decision making, except to the extent that all members are assumed to share similar biases in risk propensity, although each may possess a different understanding of such crucial features as appropriate frame for discussion, applicable reference point, domain of action, and so on.”Footnote ⁸⁵ By analyzing prospect theory's applicability to groups experimentally, we are able to control many of these elements, including the domain of action and parameters for discussion, and our results suggest that such an application of individual psychology to groups may therefore not be as infeasible as some may fear. Further empirical work is required to assess how the experimental results we obtain here generalize to those in historical cases, while additional experimental work will likely be helpful in establishing how the group decision-making process operates. One such question concerns the study of reference points in groups. As Kameda and Davis ask, “What happens if a group is composed of some members who have experienced certain losses recently and others who have experienced certain gains recently?”Footnote ⁸⁶ Randomly assigning group members to treatments that condition their individual reference points may allow researchers to trace the effects of those reference points in the group decision-making process.

An additional potential implication concerns our failure to detect beneficial effects of diversity on group decision making. One reason for this may relate to the nature of the tasks we employ here: unlike the protocols used in many of the experimental tests in the wisdom-of-the-crowd literature, testing the “miracle of aggregation” using math problems or prediction tasks, none of these studies have an objective right answer. In this sense, though, they better resemble the ill-structured problems that characterize much of foreign policy decision making, suggesting that the wisdom of the crowd may be a poor analogy for many of the questions IR scholars care about—although we also examine this question directly in follow-up work, using incentivized group bargaining experiments.Footnote ⁸⁷ Future research should also focus on identifying other possible diversity mechanisms, such as those that relate to visible diversity and face-to-face interactions.Footnote ⁸⁸ In face-to-face contexts, group members will likely be more aware of diversity within their group, creating a possibility that group members’ knowledge of their group's diversity affects their problem solving.

Another interpretation may have to do with the robustness of the biases themselves. Perhaps the three cognitive biases examined in this study are particularly ubiquitous and resistant to attempts at mitigation. We have some empirical evidence on this front: we use the same LASSOplus approach we used in the leader characteristic analysis, but testing for heterogeneous treatment effects by individual-level traits in the individual condition. As before, none of these individual differences significantly moderate the treatments. Thus, one potential reason why we fail to find that diversity has mitigating effects has to do with the robustness of the regularities we study here. In other words, diversity may be beneficial in improving decision making in other crucial ways, even if it does not appreciably alter a group's susceptibility to these types of cognitive biases.Footnote ⁸⁹ Yet the fact that these “nonstandard preferences” appear to be so robust also suggests the merits of rational choice approaches incorporating these regularities into their models.Footnote ⁹⁰ In other experimental work, we build on these findings by examining how individual-specific traits relevant to foreign policy decision making—rather than these judgment and decision-making biases that appear to be fairly robust across individuals—aggregate in group decision-making contexts.

This is not to say that groups do not exhibit their own peculiarities that may lead to subrational or irrational outcomes. It may be, for example, that not only do groups not reduce the effects of cognitive biases, they introduce new dynamics that may exacerbate deviations from expected utility models. Early psychological research identified many of these tendencies. “Risky shifts,” or the tendency of individuals in groups to make riskier decisions than when polled individually, is a finding that led to a robust literature on group polarization, consistent with the findings of our prospect theory experiment.Footnote ⁹¹ Similarly, initial studies on group conformity spurred over half a century of investigating the conditions in which groups create conformity dynamics in foreign policy situations, particularly as they relate to perceived policy failures.Footnote ⁹² It may be, however, that groupthink is receiving unfair blame. As Whyte has argued, “history and the daily newspaper provide examples of policy decisions made by groups that resulted in fiascoes. The making of such decisions is frequently attributed to the groupthink phenomenon”—though it may be that “prospect polarization” instead is the culprit.Footnote ⁹³ Precisely because cognitive biases have largely been studied at the individual level, and not believed to be a group-level phenomenon, group-level theories such as groupthink have taken on a heavy explanatory burden. By relaxing the assumption that we need group-level theories to explain “nonstandard decision making,” new explanatory frameworks become available. It is also conceivable that the persistence of cognitive biases in groups exacerbates conformity dynamics by facilitating premature consensus, a possibility worthy of future research.

Finally, while our focus here is on the aggregation of biases that IR scholars have argued are particularly important in foreign policy decision making, it is worth noting that our findings are relevant for the study of collective decision making in a wide range of contexts. Prospect theory is frequently applied to a variety of questions in American and comparative politics; intentionality bias is central to questions of blame attribution in politics more generally; and reactive devaluation is tightly linked to theories of negative partisanship.Footnote ⁹⁴ These findings should therefore be of interest to scholars of collective decision making across a broad set of domestic political issues, rather than just foreign ones.

In treating aggregation as an empirical rather than conceptual question, our study also has important implications beyond the three biases studied here. While we focused on studying group decision making in the context of foreign policy, similar group processes are present in a wide range of complex institutional environments. Practice theorists, for example, have argued that diplomacy in an organization such as NATO includes micro dyadic interactions between individual diplomats, as well as collective decision making in which diplomats conform with logics of practice or habit.Footnote ⁹⁵ During NATO decision-making sessions on the proposed use of force in Libya in 2011, for example, Adler-Nissen and Pouliot report that diplomats drew on the taken-for-granted nature of the decision making, noting that “at some point you just know where the wind blows,” and that in these discussions, “the diplomatic process gradually gains a life of its own.”Footnote ⁹⁶ One of the criticisms levied at this type of approach, however, is that the mechanism by which a group comes to know which way the wind is blowing, or how diplomacy gains a life of its own, is often underspecified, making it difficult to know a priori when and what types of practices are likely to affect outcomes in any given setting.Footnote ⁹⁷ Our methodological approach offers one step toward a potential solution. By studying aggregation empirically, group experiments such as those reported here may help us better identify the ways in which group practical sense is created, providing an incremental step in building microfoundations for practice theories. Altogether, this research shows the value of treating the “aggregation problem” in foreign policy as a phenomenon that deserves to be studied empirically, rather than just assumed.

Data Availability Statement

Replication files for this article may be found at <https://doi.org/10.7910/DVN/N8GBLF>.

Supplementary Material

Supplementary material for this article is available at <https://doi.org/10.1017/S0020818322000017>.

Acknowledgments

Thanks to Henry Atkins, Riley Carney, Brad DeWees, Emily Jackson, Austin Jordan, Max Kuhelj Bugaric, Daiana Lilo, Ethan Mallove, MD Mangini, Andras Molnar, Clay Oxford, Eric Parajon, Yon Soo Park, Heather Rodenberg, Andi Zhou, and the students in GOVT 204 and the Political Psychology and International Relations research lab at William and Mary for research and programming assistance. We're also grateful to Jack Levy, John Paschkewitz, Katy Powers, Ryan Powers, Ken Schultz, Lisa Troyer, Alan van Beek, Jessica Weeks, and audiences at ISPP, APSA, Washington University in St. Louis, the University of Pennsylvania, and the University of Georgia for feedback, as well as to Rose McDermott for enormously helpful conversations at the WCFIA security cluster conference in 2018, and to Keri Lemasters, Brooke Moore, Sarah Pollack, and Amy Stockton for making the study possible.

Funding

This research was funded by the Defense Advanced Research Projects Agency (award no. W911NF1920162). We acknowledge the support of the Weatherhead Center for International Affairs and the Institute for Quantitative Social Science at Harvard University.

Footnotes

1. For a review, see Davis and McDermott 2021; Hafner-Burton et al. 2017; Kertzer and Tingley 2018; Levy 2013; Mintz 2007.

2. Holmes 2018; Kertzer 2016; Lake and Powell 1999; Landau-Wells 2018; Larson 1985; McDermott 1998; Powers 2022; Rathbun 2014; Renshon 2017; Saunders 2011; Vertzberger 1998; Waltz 1979; Yarhi-Milo 2018.

3. Etheredge 1978; George 1969; Greenstein 1969; Hermann 1980; Leites 1951.

4. Baekgaard et al. 2019; Brooks, Cunha, and Mosley 2015; Jervis 1976; D. Johnson 2020; Kertzer 2021; Kertzer, Rathbun, and Rathbun 2020; LeVeck et al. 2014; Poulsen and Aisbett 2013; Sheffer et al. 2018; Stein 1988.

5. Kahneman and Renshon 2007. See also D. Johnson 2020, 268. While our interest here is on three biases that tend to move in a hawkish direction with respect to decision making, it may be that others have a tendency to move in a dovish direction, or create misperceptions that lead to cooperation rather than the use of force. See Grynaviski 2014.

6. Kahneman and Renshon 2009. We follow Kahneman and Renshon 2007 in referring to these phenomena as “hawkish biases,” but we do not use the term in a pejorative sense, or to imply that these tendencies are inherently irrational; see, for example, Gigerenzer and Gaissmaier 2011; D. Johnson 2020. We can think of these tendencies more generally as what behavioral scientists refer to as “nonstandard” preferences, beliefs, and decision making, behavioral regularities traditionally excluded from canonical rational choice models, as in DellaVigna 2009; Hafner-Burton et al. 2017. For an application of hawkishness to international relations more generally, see Mattes and Weeks 2019.

7. Kahneman and Tversky 1979; McDermott 1998.

8. Levy 1996.

9. Jervis 1976.

10. Knobe 2003.

11. Chu, Holmes, and Traven 2021.

12. Ashmore et al. 1979; Maoz et al. 2002; Ross and Ward 1995.

13. Hart, Stern, and Sundelius 1997.

14. Powell 2017; Saunders 2017, S220.

15. Levy 1997, 102.

16. Hafner-Burton et al. 2017, S18–S21.

17. Arrow 1950.

18. Austen-Smith and Banks 1996.

19. Janis 1972.

20. Devine et al. 2001.

21. Moynihan and Peterson 2001.

22. Bernstein, Shore, and Lazer 2018.

23. Berdahl et al. 2013.

24. See, for example, Gildea 2020; D. Johnson 2015; Mercer 1995, 237–38; Powell 2017; Wendt 2004.

25. E.g., Lewin, Lippitt, and White 1939.

26. Brutger and Kertzer 2018; Voss and Post 1988.

27. LeVeck and Narang 2017.

28. Kerr and Tindale 2004; Larrick 2016; LeVeck and Narang 2017.

29. Strodtbeck, James, and Hawkins 1957.

30. Though see Ausderan 2013; Kaarbo 1998; Redd 2002; Saunders 2017; Weeks 2014.

31. Kerr, MacCoun, and Kramer 1996.

32. Hart, Stern, and Sundelius 1997; Janis 1972.

33. Esser 1998; Sunstein and Hastie 2014.

34. George 1972; R.T. Johnson 1974; Schneeweiss 2012.

35. Horowitz et al. 2019; Page 2019.

36. Mintz and Wayne 2016.

37. Hermann and Preston 1994; Horowitz and Fuhrmann 2018; Preston 2001; Saunders 2017; Schafer and Crichlow 2010.

38. Kahneman and Tversky 1979; Knobe 2003; Ross and Ward 1995.

39. Kahneman and Renshon 2007.

40. Kahneman and Renshon 2009, 79.

41. Kahneman and Tversky 2017.

42. Mitzen and Schweller 2011; Pechenkina and Argo 2020.

43. Ashmore et al. 1979; Maoz et al. 2002.

44. Respondents were a sample of adults in the United States recruited using Qualtrics. Qualtrics is a panel aggregator, so it has access to a much larger sample than any single online panel, which is necessary to produce a sufficient flow of respondents for successful synchronous group interaction.

45. These groups of five, as well as the assigned leader in hierarchical groups, stay the same across all three experimental modules. That is, group members do not change from module to module, though some groups do become smaller due to dropouts. Our analysis includes only groups with at least three members in a given experiment; in the hierarchical condition the group must also include a leader. We also manually screened the respondents for “bots,” removing from the analysis any individual (or group, in the group conditions) that displayed bot-like behavior in the chat logs. For a detailed set of attrition tests and sensitivity analyses that show the robustness of the findings, see section 2.2 in the online supplement.

46. Respondents in the group conditions deliberated using a chat platform constructed in SMARTRIQS. See Molnar 2019.

47. All members of a single group receive the same treatment. For example, the five members of a horizontal group that have been randomly grouped together would all receive only the “gains” frame.

48. D. Johnson 2020; Myers and Lamm 1976; Powell 2017.

49. When comparing across groups, we use a variety of methods to account for potential covariate imbalance between individual and group conditions. Results are substantively similar regardless. Without controls, p < .002; with a series of controls for leader-level characteristics, p < .003; and with group-level controls (demographic characteristics averaged across all group members), p < .002 (see section 2.1 of the online supplement).

50. Comparing across groups, the difference in the effect of fatalities between the individual and horizontal condition (using the median voter rule) has p < .94 without controls, p < .91 with controls. The difference in the effect of fatalities on assessments of intentionality between the individual and hierarchical condition has p < .86 without controls, p < .85 with controls at the leader level, and p < .85 with controls at the group level (see section 2.1 in the online supplement).

51. Brehm and Brehm 2013; Hovland and Weiss 1951; Ross 1993.

52. Maoz et al. 2002.

53. Ashmore et al. 1979; Maoz et al. 2002; Ross and Ward 1995.

54. Allison 1971.

55. Respondents complete multiple experimental modules in the same groups, which gives some opportunity for repeated interaction and social learning, and we do not find that the magnitude of the bias in our data decays over multiple experimental interactions. As a test of social pressure, however, the design is relatively modest.

56. Janis 1972.

57. Ratkovic and Tingley 2017.

58. Harden 2021.

59. McDermott 1998; Ross and Ward 1995; Traven 2021.

60. Kertzer 2021.

61. Kertzer and Renshon 2022.

62. Horowitz et al. 2019; Page 2007, 2019.

63. Roberson 2019, 70.

64. Page 2007, 9.

65. Janis 1972.

66. Page 2007, 10.

67. Mintz and Wayne 2016.

68. Roberson 2019.

69. Page 2019.

70. Halfhill et al. 2005.

71. Mintz, Redd, and Vedlitz 2006; Saunders 2017.

72. Hermann 2001; Larson 1994; Rathbun et al. 2016.

73. As a robustness check, we also examine the effect of gender composition in groups in particular, in both absolute terms (e.g., the number of group members who do not identify as male) and relative ones (the proportion who do not identify as male). The number of non-male group members in the horizontal condition does not appear to moderate the treatment regardless of the functional form used, and the LASSOplus results suggest that leader gender does not meaningfully affect group decisions in the hierarchical condition either.

74. The dissensus measure is the variance in the dependent variable among members of the group. Greater variance in group members' preferred decisions indicates more dissensus.
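As a rough illustration of the dissensus measure in note 74, the sketch below computes the variance of one group's responses. The note does not say whether the population or sample variance is used; the population variance is assumed here, and the function name `dissensus` and the response values are hypothetical.

```python
from statistics import pvariance


def dissensus(member_responses):
    """Variance of the dependent variable across the members of one group.

    Assumption: population variance (divide by n). The source does not
    specify which estimator is used.
    """
    if len(member_responses) < 2:
        raise ValueError("need at least two group members")
    return pvariance(member_responses)


# Hypothetical responses from two five-member groups.
unanimous = dissensus([4, 4, 4, 4, 4])  # every member prefers the same option
split = dissensus([1, 7, 1, 7, 4])      # members are divided
```

Under this assumption a unanimous group scores zero and the score rises as members' preferred decisions spread apart.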

75. Holmes 2018.

76. Surowiecki 2005.

77. For a review, see Jordan et al. 2009.

78. Holmes 2018.

79. Importantly, these tests also suggest that our replication of these biases in the group conditions is unlikely to be an artifact of group members’ not taking the study seriously.

80. Walther 1992.

81. Antheunis, Valkenburg, and Peter 2007; Tidwell and Walther 2006.

82. Wheeler and Holmes 2021.

83. Gildea 2020, 1–2.

84. D. Johnson 2015, 760.

85. McDermott 1998, 187.

86. Kameda and Davis 1990, 58.

87. Brutger and Kertzer 2018.

88. E.g., Staples and Zhao 2006.

89. And, of course, descriptive diversity can be normatively valuable regardless of any benefits it may provide for decision making.

90. Kertzer 2016; Mintz, Valentino, and Wayne 2021; Stein 2017.

91. Stoner 1961.

92. Asch 1951; Badie 2010; Janis 1972; Sherif 1935.

93. Whyte 1989, 40.

94. E.g., Brutger 2021; Malhotra and Kuo 2008; McDermott 2004; McGraw 1991; Sheffer et al. 2018.

95. McCourt 2016.

96. Adler-Nissen and Pouliot 2014.

97. Ringmar 2014, 6.

References

Adler-Nissen, Rebecca, and Pouliot, Vincent. 2014. Power in Practice: Negotiating the International Intervention in Libya. European Journal of International Relations 20 (4):889–911.

Allison, Graham. 1971. Essence of Decision. Little, Brown and Company.

Antheunis, Marjolijn L., Valkenburg, Patti M., and Peter, Jochen. 2007. Computer-Mediated Communication and Interpersonal Attraction: An Experimental Test of Two Explanatory Hypotheses. Cyberpsychology and Behavior 10 (6):831–36.

Arrow, Kenneth J. 1950. A Difficulty in the Concept of Social Welfare. Journal of Political Economy 58 (4):328–46.

Asch, Solomon E. 1951. Effects of Group Pressure upon the Modification and Distortion of Judgments. In Groups, Leadership and Men: Research in Human Relations, edited by Guetzkow, Harold, 177–90. Carnegie Press.

Ashmore, Richard D., Bird, David, Del Boca, Frances K., and Vanderet, Robert C. 1979. An Experimental Investigation of the Double Standard in the Perception of International Affairs. Political Behavior 1 (2):123–35.

Ausderan, Jacob Thomas. 2013. International Conflict and the Strategic Selection of Foreign Policy Advisors. PhD diss., Florida State University.

Austen-Smith, David, and Banks, Jeffrey S. 1996. Information Aggregation, Rationality, and the Condorcet Jury Theorem. American Political Science Review 90 (1):34–45.

Badie, Dina. 2010. Groupthink, Iraq, and the War on Terror: Explaining US Policy Shift Toward Iraq. Foreign Policy Analysis 6 (4):277–96.

Baekgaard, Martin, Christensen, Julian, Dahlmann, Casper Mondrup, Mathiasen, Asbjørn, and Petersen, Niels Bjørn Grund. 2019. The Role of Evidence in Politics: Motivated Reasoning and Persuasion Among Politicians. British Journal of Political Science 49 (3):1117–40.

Berdahl, Andrew, Torney, Colin J., Ioannou, Christos C., Faria, Jolyon J., and Couzin, Iain D. 2013. Emergent Sensing of Complex Environments by Mobile Animal Groups. Science 339 (6119):574–76.

Bernstein, Ethan, Shore, Jesse, and Lazer, David. 2018. How Intermittent Breaks in Interaction Improve Collective Intelligence. Proceedings of the National Academy of Sciences 115 (35):8734–39.

Brehm, Sharon S., and Brehm, Jack W. 2013. Psychological Reactance: A Theory of Freedom and Control. Academic Press.

Brooks, Sarah M., Cunha, Raphael, and Mosley, Layna. 2015. Categories, Creditworthiness, and Contagion: How Investors’ Shortcuts Affect Sovereign Debt Markets. International Studies Quarterly 59 (3):587–601.

Brutger, Ryan. 2021. The Power of Compromise: Proposal Power, Partisanship, and Public Support in International Bargaining. World Politics 73 (1):128–66.

Brutger, Ryan, and Kertzer, Joshua D. 2018. A Dispositional Theory of Reputation Costs. International Organization 72 (3):693–724.

Chu, Jonathan, Holmes, Marcus, and Traven, David. 2021. Intentions from Consequences: How Moral Judgments Shape Citizen Perceptions of Wartime Conduct. Journal of Experimental Political Science 8 (2):203–207.

Davis, James W., and McDermott, Rose. 2021. The Past, Present, and Future of Behavioral IR. International Organization 75 (1):147–77.

DellaVigna, Stefano. 2009. Psychology and Economics: Evidence from the Field. Journal of Economic Literature 47 (2):315–72.

Devine, Dennis J., Clayton, Laura D., Dunford, Benjamin B., Seying, Rasmy, and Pryce, Jennifer. 2001. Jury Decision Making: Forty-five Years of Empirical Research on Deliberating Groups. Psychology, Public Policy, and Law 7 (3):622–727.

Esser, James K. 1998. Alive and Well After Twenty-five Years: A Review of Groupthink Research. Organizational Behavior and Human Decision Processes 73 (2–3):116–41.

Etheredge, Lloyd S. 1978. Personality Effects on American Foreign Policy, 1898–1968: A Test of Interpersonal Generalization Theory. American Political Science Review 72 (2):434–51.

George, Alexander L. 1969. The “Operational Code”: A Neglected Approach to the Study of Political Leaders and Decision Making. International Studies Quarterly 13 (2):190–222.

George, Alexander L. 1972. The Case for Multiple Advocacy in Making Foreign Policy. American Political Science Review 66 (3):751–85.

Gigerenzer, Gerd, and Gaissmaier, Wolfgang. 2011. Heuristic Decision Making. Annual Review of Psychology 62:451–82.

Gildea, Ross James. 2020. Psychology and Aggregation in International Relations. European Journal of International Relations 26 (S1):166–83.

Greenstein, Fred I. 1969. Personality and Politics: Problems of Evidence, Inference, and Conceptualization. Markham.

Grynaviski, Eric. 2014. Constructive Illusions: Misperceiving the Origins of International Cooperation. Cornell University Press.

Hafner-Burton, Emilie M., Haggard, Stephan, Lake, David A., and Victor, David G. 2017. The Behavioral Revolution and International Relations. International Organization 71 (S1):S1–S31.

Halfhill, Terry, Sundstrom, Eric, Lahner, Jessica, Calderone, Wilma, and Nielsen, Tjai M. 2005. Group Personality Composition and Group Effectiveness: An Integrative Review of Empirical Research. Small Group Research 36 (1):83–105.

Harden, John P. 2021. All the World's a Stage: US Presidential Narcissism and International Conflict. International Studies Quarterly 65 (3):825–37.

Hart, Paul ’t, Stern, Eric, and Sundelius, Bengt. 1997. Beyond Groupthink: Political Group Dynamics and Foreign Policy-Making. University of Michigan Press.

Hermann, Margaret G. 1980. Explaining Foreign Policy Behavior Using the Personal Characteristics of Political Leaders. International Studies Quarterly 24 (1):7–46.

Hermann, Margaret G. 2001. How Decision Units Shape Foreign Policy: A Theoretical Framework. International Studies Review 3 (2):47–81.

Hermann, Margaret G., and Preston, Thomas. 1994. Presidents, Advisers, and Foreign Policy: The Effect of Leadership Style on Executive Arrangements. Political Psychology 15 (1):75–96.

Holmes, Marcus. 2018. Face-to-Face Diplomacy: Social Neuroscience and International Relations. Cambridge University Press.

Horowitz, Michael, Stewart, Brandon M., Tingley, Dustin, Bishop, Michael, Samotin, Laura Resnick, Roberts, Margaret, Chang, Welton, Mellers, Barbara, and Tetlock, Philip. 2019. What Makes Foreign Policy Teams Tick: Explaining Variation in Group Performance at Geopolitical Forecasting. Journal of Politics 81 (4):1388–404.

Horowitz, Michael C., and Fuhrmann, Matthew. 2018. Studying Leaders and Military Conflict: Conceptual Framework and Research Agenda. Journal of Conflict Resolution 62 (10):2072–86.

Hovland, Carl I., and Weiss, Walter. 1951. The Influence of Source Credibility on Communication Effectiveness. Public Opinion Quarterly 15 (4):635–50.

Janis, Irving L. 1972. Victims of Groupthink: A Psychological Study of Foreign-Policy Decisions and Fiascoes. Houghton Mifflin.

Jervis, Robert. 1976. Perception and Misperception in International Politics. Princeton University Press.

Johnson, Dominic D.P. 2015. Survival of the Disciplines: Is International Relations Fit for the New Millennium? Millennium 43 (2):749–63.

Johnson, Dominic D.P. 2020. Strategic Instincts: The Adaptive Advantages of Cognitive Biases in International Politics. Princeton University Press.

Johnson, Richard Tanner. 1974. Managing the White House: An Intimate Study of the Presidency. Harper Collins.

Jordan, Amos A., Taylor, William J. Jr., Meese, Michael J., and Nielsen, Suzanne C. 2009. American National Security. Johns Hopkins University Press.

Kaarbo, Juliet. 1998. Power Politics in Foreign Policy: The Influence of Bureaucratic Minorities. European Journal of International Relations 4 (1):67–97.

Kahneman, Daniel, and Renshon, Jonathan. 2007. Why Hawks Win. Foreign Policy 158:34–38.

Kahneman, Daniel, and Renshon, Jonathan. 2009. Hawkish Biases. In American Foreign Policy and the Politics of Fear: Threat Inflation Since 9/11, edited by Thrall, A. Trevor, and Cramer, Jane K., 79–96. Routledge.

Kahneman, Daniel, and Tversky, Amos. 1979. Prospect Theory: An Analysis of Decision Under Risk. Econometrica 47 (2):263–91.

Kahneman, Daniel, and Tversky, Amos. 2017. Conflict Resolution: A Cognitive Perspective. In Preference, Belief, and Similarity: Selected Writings of Amos Tversky, edited by Shafir, Eldar, 729–46. MIT Press.

Kameda, Tatsuya, and Davis, James H. 1990. The Function of the Reference Point in Individual and Group Risk Decision Making. Organizational Behavior and Human Decision Processes 46 (1):55–76.

Kerr, Norbert L., MacCoun, Robert J., and Kramer, Geoffrey P. 1996. Bias in Judgment: Comparing Individuals and Groups. Psychological Review 103 (4):687–719.

Kerr, Norbert, and Tindale, Scott. 2004. Group Performance and Decision Making. Annual Review of Psychology 55:623–55.

Kertzer, Joshua D. 2016. Resolve in International Politics. Princeton University Press.

Kertzer, Joshua D. 2021. Re-assessing Elite Public Gaps in Political Behavior. American Journal of Political Science (online first).

Kertzer, Joshua D., Rathbun, Brian C., and Rathbun, Nina Srinivasan. 2020. The Price of Peace: Motivated Reasoning and Costly Signaling in International Relations. International Organization 74 (1):95–118.

Kertzer, Joshua D., and Renshon, Jonathan. 2022. Experiments and Surveys on Political Elites. Annual Review of Political Science 25:1–26.

Kertzer, Joshua D., and Tingley, Dustin. 2018. Political Psychology in International Relations: Beyond the Paradigms. Annual Review of Political Science 21:319–39.

Knobe, Joshua. 2003. Intentional Action and Side Effects in Ordinary Language. Analysis 63 (3):190–94.

Lake, David A., and Powell, Robert, eds. 1999. Strategic Choice and International Relations. Princeton University Press.

Landau-Wells, Marika. 2018. Dealing with Danger: Threat Perception and Policy Preferences. PhD diss., Massachusetts Institute of Technology.

Larrick, Richard P. 2016. The Social Context of Decisions. Annual Review of Organizational Psychology and Organizational Behavior 3 (1):441–67.

Larson, Deborah Welch. 1985. Origins of Containment: A Psychological Explanation. Princeton University Press.

Larson, Deborah Welch. 1994. The Role of Belief Systems and Schemas in Foreign Policy Decision-Making. Political Psychology 15 (1):17–33.

Leites, Nathan. 1951. The Operational Code of the Politburo. McGraw-Hill.

LeVeck, Brad L., Hughes, D. Alex, Fowler, James H., Hafner-Burton, Emilie, and Victor, David G. 2014. The Role of Self-Interest in Elite Bargaining. Proceedings of the National Academy of Sciences 111 (52):18536–41.

LeVeck, Brad L., and Narang, Neil. 2017. The Democratic Peace and the Wisdom of Crowds. International Studies Quarterly 61 (4):867–80.

Levy, Jack S. 1996. Loss Aversion, Framing, and Bargaining: The Implications of Prospect Theory for International Conflict. International Political Science Review 17 (2):179–95.

Levy, Jack S. 1997. Prospect Theory, Rational Choice, and International Relations. International Studies Quarterly 41 (1):87–112.

Levy, Jack S. 2013. Psychology and Foreign Policy Decision-Making. In Oxford Handbook of Political Psychology, 2nd ed., edited by Huddy, Leonie, Sears, David O., and Levy, Jack S., 301–33. Oxford University Press.

Lewin, Kurt, Lippitt, Ronald, and White, Ralph K. 1939. Patterns of Aggressive Behavior in Experimentally Created “Social Climates.” Journal of Social Psychology 10 (2):269–99.

Malhotra, Neil, and Kuo, Alexander G. 2008. Attributing Blame: The Public's Response to Hurricane Katrina. Journal of Politics 70 (1):120–35.

Maoz, Ifat, Ward, Andrew, Katz, Michael, and Ross, Lee. 2002. Reactive Devaluation of an “Israeli” vs. “Palestinian” Peace Proposal. Journal of Conflict Resolution 46 (4):515–46.

Mattes, Michaela, and Weeks, Jessica L.P. 2019. Hawks, Doves and Peace: An Experimental Approach. American Journal of Political Science 63 (1):53–66.

McCourt, David M. 2016. Practice Theory and Relationalism as the New Constructivism. International Studies Quarterly 60 (3):475–85.

McDermott, Rose. 1998. Risk-Taking in International Politics: Prospect Theory in American Foreign Policy. University of Michigan Press.

McDermott, Rose. 2004. Prospect Theory in Political Science: Gains and Losses from the First Decade. Political Psychology 25 (2):289–312.

McGraw, Kathleen M. 1991. Managing Blame: An Experimental Test of the Effects of Political Accounts. American Political Science Review 85 (4):1133–57.

Mercer, Jonathan. 1995. Anarchy and Identity. International Organization 49 (2):229–52.

Mintz, Alex. 2007. Why Behavioral IR? International Studies Review 9 (1):157–62.

Mintz, Alex, Redd, Steven B., and Vedlitz, Arnold. 2006. Can We Generalize from Student Experiments to the Real World in Political Science, Military Affairs, and International Relations? Journal of Conflict Resolution 50 (5):757–76.

Mintz, Alex, Valentino, Nicholas, and Wayne, Carly. 2021. Beyond Rationality: Behavioral Political Science in the Twenty-first Century. Cambridge University Press.

Mintz, Alex, and Wayne, Carly. 2016. The Polythink Syndrome: US Foreign Policy Decisions on 9/11, Afghanistan, Iraq, Iran, Syria, and ISIS. Stanford University Press.

Mitzen, Jennifer, and Schweller, Randall L. 2011. Knowing the Unknown Unknowns: Misplaced Certainty and the Onset of War. Security Studies 20 (1):2–35.

Molnar, Andras. 2019. SMARTRIQS: A Simple Method Allowing Real-Time Respondent Interaction in Qualtrics Surveys. Journal of Behavioral and Experimental Finance 22:161–69.

Moynihan, Lisa M., and Peterson, Randall S. 2001. A Contingent Configuration Approach to Understanding the Role of Personality in Organizational Groups. Research in Organizational Behavior 23:327–78.

Myers, David G., and Lamm, Helmut. 1976. The Group Polarization Phenomenon. Psychological Bulletin 83 (4):602–27.

Page, Scott E. 2007. The Difference: How the Power of Diversity Creates Better Groups, Firms, Schools, and Societies. Princeton University Press.

Page, Scott E. 2019. The Diversity Bonus: How Great Teams Pay Off in the Knowledge Economy. Princeton University Press.

Pechenkina, Anna O., and Argo, Nichole. 2020. How Do Civilians Assign Blame and Praise Amidst Civil Conflict? Behavioral Sciences of Terrorism and Political Aggression 12 (4):243–67.

Poulsen, Lauge N. Skovgaard, and Aisbett, Emma. 2013. When the Claim Hits: Bilateral Investment Treaties and Bounded Rational Learning. World Politics 65 (2):273–313.

Powell, Robert. 2017. Research Bets and Behavioral IR. International Organization 71 (S1):S265–S277.

Powers, Kathleen E. 2022. Nationalisms in International Politics. Princeton University Press.

Preston, Thomas. 2001. The President and His Inner Circle: Leadership Style and the Advisory Process in Foreign Policy Making. Columbia University Press.

Rathbun, Brian C. 2014. Diplomacy's Value: Creating Security in 1920s Europe and the Contemporary Middle East. Cornell University Press.

Rathbun, Brian C., Kertzer, Joshua D., Reifler, Jason, Goren, Paul, and Scotto, Thomas J. 2016. Taking Foreign Policy Personally: Personal Values and Foreign Policy Attitudes. International Studies Quarterly 60 (1):124–37.

Ratkovic, Marc, and Tingley, Dustin. 2017. Sparse Estimation and Uncertainty with Application to Subgroup Analysis. Political Analysis 25 (1):1–40.

Redd, Steven B. 2002. The Influence of Advisers on Foreign Policy Decision Making: An Experimental Study. Journal of Conflict Resolution 46 (3):335–64.

Renshon, Jonathan. 2017. Fighting for Status: Hierarchy and Conflict in World Politics. Princeton University Press.

Ringmar, Erik. 2014. The Search for Dialogue as a Hindrance to Understanding: Practices as Inter-paradigmatic Research Program. International Theory 6 (1):1–27.

Roberson, Quinetta M. 2019. Diversity in the Workplace: A Review, Synthesis, and Future Research Agenda. Annual Review of Organizational Psychology and Organizational Behavior 6 (1):69–88.

Ross, Lee. 1993. Reactive Devaluation in Negotiation and Conflict Resolution. Stanford Center on International Conflict and Negotiation, Stanford University.

Ross, Lee, and Ward, Andrew. 1995. Psychological Barriers to Dispute Resolution. In Advances in Experimental Social Psychology, vol. 27, edited by Zanna, M.P., 255–304. Elsevier.

Saunders, Elizabeth N. 2011. Leaders at War: How Presidents Shape Military Interventions. Cornell University Press.

Saunders, Elizabeth N. 2017. No Substitute for Experience: Presidents, Advisers, and Information in Group Decision Making. International Organization 71 (S1):S219–S247.

Schafer, Mark, and Crichlow, Scott. 2010. Groupthink Versus High-Quality Decision Making in International Relations. Columbia University Press.

Schneeweiss, Christoph. 2012. Distributed Decision Making. Springer Science and Business Media.

Sheffer, Lior, Loewen, Peter John, Soroka, Stuart, Walgrave, Stefaan, and Sheafer, Tamir. 2018. Nonrepresentative Representatives: An Experimental Study of the Decision Making of Elected Politicians. American Political Science Review 112 (2):302–21.

Sherif, Muzafer. 1935. A Study of Some Social Factors in Perception. Archives of Psychology: 5–60.

Staples, D., and Zhao, Lina. 2006. The Effects of Cultural Diversity in Virtual Teams Versus Face-to-Face Teams. Group Decision and Negotiation 15 (July):389–406.

Stein, Janice Gross. 1988. Building Politics into Psychology: The Misperception of Threat. Political Psychology 9 (2):245–71.

Stein, Janice Gross. 2017. The Micro-Foundations of International Relations Theory: Psychology and Behavioral Economics. International Organization 71 (S1):S249–S263.

Stoner, James Arthur Finch. 1961. A Comparison of Individual and Group Decisions Involving Risk. Master's thesis, Massachusetts Institute of Technology.

Strodtbeck, Fred L., James, Rita M., and Hawkins, Charles. 1957. Social Status in Jury Deliberations. American Sociological Review 22 (6):713–19.

Sunstein, Cass R., and Hastie, Reid. 2014. Making Dumb Groups Smarter. Harvard Business Review 92 (12):90–98.

Surowiecki, James. 2005. The Wisdom of Crowds. Anchor.

Tidwell, Lisa Collins, and Walther, Joseph B. 2006. Computer-Mediated Communication Effects on Disclosure, Impressions, and Interpersonal Evaluations: Getting to Know One Another a Bit at a Time. Human Communication Research 28 (3):317–48.

Traven, David. 2021. Law and Sentiment in International Politics: Ethics, Emotions, and the Evolution of the Laws of War. Cambridge University Press.

Vertzberger, Yaacov Y.I. 1998. Risk Taking and Decisionmaking: Foreign Military Interventions. Stanford University Press.

Voss, James F., and Post, Timothy A. 1988. On the Solving of Ill-structured Problems. In The Nature of Expertise, edited by Chi, Michelene H., Glaser, Robert, and Farr, Marshall J., 261–85. Erlbaum.

Walther, Joseph B. 1992. Interpersonal Effects in Computer-Mediated Interaction: A Relational Perspective. Communication Research 19 (1):52–90.

Waltz, Kenneth N. 1979. Theory of International Politics. McGraw-Hill.

Weeks, Jessica L.P. 2014. Dictators at War and Peace. Cornell University Press.

Wendt, Alexander. 2004. The State as Person in International Theory. Review of International Studies 30 (2):289–316.

Wheeler, Nicholas J., and Holmes, Marcus. 2021. The Strength of Weak Bonds: Substituting Bodily Co-presence in Diplomatic Social Bonding. European Journal of International Relations 27 (3):730–52.

Whyte, Glen. 1989. Groupthink Reconsidered. Academy of Management Review 14 (1):40–56.

Yarhi-Milo, Keren. 2018. Who Fights for Reputation in International Politics? Leaders, Resolve, and the Use of Force. Princeton University Press.