Claims in psychology (and elsewhere) often rest on unarticulated, unconvincing assumptions, and we agree with much of Cesario's criticism. Yet the structure of his argument is symptomatic of the intuitive, implicit style of causal inference that contributes to careless conclusions. Outside of experiments, psychologists don't like explicit causal inference (Grosz, Rohrer, & Thoemmes, Reference Grosz, Rohrer and Thoemmes2020); the “C-Word” (Hernán & Robins, Reference Hernán and Robins2010) is avoided, even if the discussion hinges on a causal interpretation. If, in contrast, an experiment has been conducted, causal claims are accepted. But how the effect estimate from the experiment can be transferred to the rest of the world is left unarticulated, even if the discussion hinges on such transportability. This “inference by omission” can go wrong, and so psychologists compile lists of threats to validity, to which Cesario's “fatal flaws” could be appended. Unfortunately, lists of problems are not solutions.
Causal inference frameworks, such as the potential outcomes model (see Hernán & Robins, Reference Hernán and Robins2010, for a comprehensive introduction; Little & Rubin, Reference Little and Rubin2000) and graphical causal models (e.g., Pearl, Glymour, & Jewell, Reference Pearl, Glymour and Jewell2016; see also Rohrer, Reference Rohrer2018, for an introduction for psychologists), provide rigorous formalization that aids in spelling out assumptions and deriving their implications. This explicitly causal lens can improve research design, analysis, and interpretation for experiments and non-experiments alike, so let us apply it to the question of decision-maker bias.
For the case of experiments, we can simplify Cesario's list of flaws. His major concern is effect modification: The effect of group membership depends on so-called effect modifiers (e.g., disambiguating information and decision-maker features). The distribution of effect modifiers in the experimental setting differs from the distribution in the setting which we want to make statements about. If the experimental setting holds the effect modifier constant at a value that does not occur in the target setting, transport of the estimate is impossible. But, if the experimental setting includes plausible values of the effect modifier, transport becomes possible under certain licensing assumptions (Pearl & Bareinboim, Reference Pearl and Bareinboim2014). An understanding of these assumptions would help psychologists to systematically improve their studies for inferences about effects outside of the lab.
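The transport logic can be sketched numerically. The following toy calculation applies the reweighting idea behind Pearl and Bareinboim's transport formula under the strong assumption that a single variable Z captures all effect modification; the stratum labels (“ambiguous” vs. “disambiguating” information) and every number are hypothetical, chosen only to show how the same stratum-specific effects yield different overall effects in lab and target settings.

```python
# Transport by reweighting (cf. Pearl & Bareinboim, 2014), assuming Z is
# the ONLY effect modifier: target effect = sum_z E[effect | Z=z] * P*(Z=z).
# All numbers are hypothetical, for illustration only.

# Stratum-specific effects of group membership, estimated in the lab
effect_by_z = {"ambiguous": 0.30, "disambiguating": 0.05}

# Distribution of the effect modifier Z in the lab vs. the target setting
p_lab = {"ambiguous": 0.9, "disambiguating": 0.1}
p_target = {"ambiguous": 0.2, "disambiguating": 0.8}

lab_effect = sum(effect_by_z[z] * p_lab[z] for z in effect_by_z)
target_effect = sum(effect_by_z[z] * p_target[z] for z in effect_by_z)

print(f"effect in the lab setting:    {lab_effect:.3f}")
print(f"transported to target:        {target_effect:.3f}")
```

Note that if the lab had held Z fixed at a single value (say, only ambiguous vignettes), the stratum that dominates the target setting would have no estimate at all, and no reweighting could recover the target effect; this is the impossible-transport case described above.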
The effect modification issue implies that lab studies will misestimate the effects of decision-maker bias outside of the lab. But Cesario raises further concerns about the effect sizes claimed in the literature by comparing the path of interest (group membership → decision-maker bias → decision) to other paths (group membership → attributes of group members → decision). He implies that the latter explain much more variability in the final decision. We agree with Cesario that experimental studies on decision-maker bias following the standard design are not suitable to address this comparison.
This opens the door for the observational evidence relevant to the second path. Cesario states that a “recent study suggests that the different rates of exposure to police through violent crime situations greatly – if not entirely – accounts for the overall per capita disparities in being fatally shot by the police” (sect. 4.1.2, para. 3), which he uses as evidence that decision-maker bias is not to blame. Ross, Winterhalder, and McElreath (Reference Ross, Winterhalder and McElreath2021) show that this work incorrectly adjusts for crime rates. But even if it had correctly adjusted for crime, a problem remains once we formalize the implied mediation claim: race → police exposure (e.g., in violent crime situations) → being fatally shot by the police, with little or no effect mediated through other pathways (captured in the remaining “direct effect,” which would include decision-maker bias) (Fig. 1).
Figure 1. Mediational claim implicit in the notion that violent crime rates “account for” disparities in being fatally shot by the police. Police exposure is a collider variable between race and U; conditioning on it induces spurious associations between the two.
Exposure to police through violent crime situations will be affected by both race (including effects of earlier decision-maker bias) and other (potentially unobserved) factors (U). Conditioning on exposure induces collider bias, introducing spurious associations between race and U. For example, consider the possibility that Blacks are more likely to be involved with the police in general (e.g., Fryer, Reference Fryer2019) and that aggressiveness increases the chances of being confronted with the police (regardless of race). Without any actual group differences in aggressiveness, this means that – conditional on police exposure – Black individuals involved in such situations are less aggressive, which would decrease their chances of being fatally shot. Such induced confounding could hide decision-maker bias and has been discussed at great length (for summaries of the debate, see Hu, Reference Hu2021; Lundberg, Johnson, & Stewart, Reference Lundberg, Johnson and Stewart2021); it crops up for other topics as well (e.g., the gender wage gap; Hünermund, Reference Hünermund2018). Outside of the lab, individuals are not randomly allocated to situations, which makes it challenging to identify decision-maker bias in observational data.
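The collider logic can be demonstrated with a toy simulation (all quantities hypothetical, chosen only for illustration): aggressiveness is drawn from the same distribution in both groups, yet because group membership and aggressiveness both raise the probability of police exposure, the exposed subset shows a spurious negative group difference in aggressiveness.

```python
# Toy simulation of collider bias; every parameter is hypothetical.
# Aggressiveness is identically distributed in both groups, but
# conditioning on police exposure -- a collider affected by both group
# membership and aggressiveness -- induces a spurious group difference.
import random

random.seed(0)

population = []  # (group, aggressiveness) for everyone
exposed = []     # the subset that ends up in contact with the police
for _ in range(100_000):
    group = random.random() < 0.5   # group membership (True/False)
    aggr = random.gauss(0, 1)       # same distribution in both groups
    # Exposure depends on BOTH group and aggressiveness (the collider)
    if group + aggr + random.gauss(0, 1) > 0.5:
        exposed.append((group, aggr))
    population.append((group, aggr))

def group_diff(pairs):
    """Mean aggressiveness in group True minus group False."""
    g1 = [a for g, a in pairs if g]
    g0 = [a for g, a in pairs if not g]
    return sum(g1) / len(g1) - sum(g0) / len(g0)

print(f"difference in the population: {group_diff(population):+.2f}")  # near zero
print(f"difference among the exposed: {group_diff(exposed):+.2f}")     # negative
```

In the full population the group difference is essentially zero by construction; among the exposed it is clearly negative, mirroring the verbal argument that, conditional on police exposure, members of the more-exposed group are less aggressive.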
Perhaps the greatest benefit of an explicit causal inference framework is that it requires us to be more precise about the causal questions we are asking, thus enforcing conceptual consistency. Is Cesario trying to answer a forward causal question (the effect of decision-maker bias on outcomes) or a backward causal question (what causes group disparities in outcomes; Gelman & Imbens, Reference Gelman and Imbens2013)? What counterfactuals are meant to be invoked? Counterfactuals about race (“What if this person were white instead of Black”), which have been criticized for being hard or impossible to define or otherwise inadequate (Kohler-Hausmann, Reference Kohler-Hausmann2018); or counterfactuals about racism (“What if there was no decision-maker bias”), as suggested by Krieger and Smith (Reference Krieger and Smith2016)?
Clarifying these matters upfront may enable a more productive debate, as it ensures that we are not talking past each other. Causal inference frameworks do not magically guarantee value-free answers, but they force us to be precise about the questions we ask, and to be transparent about the assumptions that we are willing to make (e.g., Hu, Reference Hu2021). This rigor is all the more important for politically charged topics where the stakes are high, and where it is all too easy to fall for clear-cut (counter) narratives.
Conflict of interest
None.