Cesario's interpretation of experimental studies of bias in section 5 is correct as a standard Bayesian interpretation. Cesario rightly points out that experimental social psychologists fail to take properly into account that stereotypes may sometimes be accurate in everyday prediction. According to Cesario, acting on these stereotypes in the experimental situation might not be an “error” from a Bayesian perspective and experimental social psychologists should acknowledge this possibility.
However, social psychologists might have another, ethical, reason to label the observed decision-maker biases as “erroneous,” and Cesario misses this reason when criticizing science, technology, engineering, and mathematics (STEM) hiring studies. In his criticism, Cesario does not acknowledge that researchers are not usually interested in merely explaining group disparities per se, but also aim to account for discriminatory behavior resulting in group disparities. Accounting for discriminatory behavior provides a rationale for the experimental design and the use of the word “erroneous” in the context of studying recruitment bias. Researchers labeling biased decision-making as “erroneous” do not merely claim that biased decision-making is wrong because it violates the norms of rational statistical inference in the context of the experiment. Instead, biased decision-making can also be labeled as an “error,” because it results in illegal and morally condemnable discrimination. In this context, the words “discrimination” and “bias” are used as moralized concepts (Altman, 2020) or thick concepts (Williams, 1985) that simultaneously describe a phenomenon and express an evaluative stance toward it.
Let us consider the laboratory studies of STEM hiring in contrast to the goals of real-world hiring. In real-world hiring, the normative motivation for relying only on the relevant information provided by an applicant's resume is to prevent discrimination and to guarantee fair and equal treatment of applicants, which applicants also expect from recruiters (Gilliland, 1993). If one uses information on an applicant's membership in a salient social group as a decision-making criterion, this reasoning can be labeled as “biased, erroneous decision-making” because it violates the ethical norms of good recruitment practices. Following good recruitment practice, one judges a candidate based solely on the skills and merits of the applicant. This practice reflects the decision-making ideals of the recruiter, who wishes to closely adhere to the norms of anti-discrimination legislation (Koivunen, Ylöstalo, & Otonkorpi-Lehtoranta, 2015). These ideals are also widespread, because, for instance, gender discrimination in hiring is deemed illegal in 89% of countries (Heymann, Bose, Waisath, Raub, & McCormack, 2020).
In light of these norms, the experimental designs of STEM hiring are not mere displays of “methodological trickery,” as Cesario suggests (sect. 5, para. 5). It is not trickery to create an experimental design where “the single relevant piece of information is the qualification of the applicant as revealed by the resume; being influenced by anything other than this information is treated as biased, erroneous decision-making” (sect. 5, para. 6). The design that Cesario describes reflects the real-world decision-making goals of recruiters and legislators. When a participant in a laboratory experiment uses irrelevant non-performance-related information on group membership to evaluate and to select candidates for an open position, the participant engages in “biased,” “erroneous,” and “discriminatory” decision-making that would count as “biased,” “erroneous,” and “discriminatory” decision-making also outside the lab.
Given that Cesario's goal is to suggest a new approach for experimental social psychology that begins with an analysis of actual decisions, Cesario should also walk the talk when criticizing STEM hiring research. The lesson is that participants in a laboratory study may be non-biased in the Bayesian sense, but at the same time their decision-making can be regarded as discriminatory and erroneous in the moral sense. First of all, real-world recruiters (or college students enrolled in psychological experiments studying recruitment bias) might be Bayesian actors in the sense that they form their decisions by using “information that may be probabilistically accurate in everyday life” (sect. 5, para. 7), as Cesario puts it. Second, it is also true that the experiments studying recruitment bias are designed in such a way that the label of “erroneous behavior” is attached to situations in which participants use information within the experiment that may actually lead to more accurate decisions outside the experiment. For instance, in some contexts, knowing that an applicant belongs to a certain salient social group might lead to somewhat accurate predictions about the applicant's future job performance or ability to commit to a job (Arrow, 1973; Phelps, 1972). Knowing about an applicant's childcare responsibilities might be a factor that has real predictive value in some contexts. Nevertheless, the use of this information in a way that leads to disparate treatment of applicants in hiring decisions counts as statistical discrimination.
To conclude, it should be added that providing a deeper understanding of the motivation behind the experimental designs of STEM hiring does not show Cesario to be wrong in his main claim. One cannot naïvely assume that the social psychological experiments of categorical bias or audit studies provide causal explanations that would universally account for all real-world group disparities. What my comment puts forth is the possibility that the research methodologies of laboratory studies of recruitment and the interpretation of the results as “errors” may reflect the normative ethical values of the researchers and modern societies, because similar phenomena have occurred in other fields of science. Normative views on gender have been shown to influence how data are interpreted in anthropological studies on human evolution (Longino, 1990), and normative views of divorce have influenced the ways in which research questions and research designs are framed when studying the effects of divorce on well-being (Anderson, 2004).
It should also be noted that the purpose of my comment is entirely descriptive, and the goal is to correct and deepen Cesario's interpretation of studies of recruitment bias. I do not seek to defend the scientific soundness of the research methodologies and the ways of interpreting results by using the concepts of “bias” and “discrimination” as morally laden thick concepts. One can indeed question whether it is good scientific practice to allow values to influence science in this way.
Financial support
The study is funded by the University of Helsinki's 3-year research project “From cyborg origins of modern economics to its automated future. Towards a new philosophy of economics.”
Conflict of interest
None.