
Accuracy, bias, self-fulfilling prophecies, and scientific self-correction

Published online by Cambridge University Press:  22 March 2017

Lee Jussim*
Affiliation:
Department of Psychology, Rutgers University, Piscataway, NJ 08854. Jussim@rci.rutgers.edu; http://www.rci.rutgers.edu/~jussim/

Abstract

In my Précis of Social Perception and Social Reality (Jussim 2012, henceforth abbreviated as SPSR), I argued that the social science scholarship on social perception and interpersonal expectancies was characterized by a tripartite pattern: (1) Errors, biases, and self-fulfilling prophecies in person perception were generally weak, fragile, and fleeting; (2) Social perceptions were often quite accurate; and (3) Conclusions appearing throughout the social psychology scientific literature routinely overstated the power and pervasiveness of expectancy effects, and ignored evidence of accuracy. Most commentators concurred with the validity of these conclusions. Two, however, strongly disagreed with the conclusion that the evidence has consistently shown that stereotypes are moderately to highly accurate. Several others, while agreeing with most of the specifics, also suggested that those arguments did not necessarily apply to contexts outside of those covered in SPSR. In this response, I consider all these aspects: the limitations to the tripartite pattern, the role of politics and confirmation biases in distorting scientific conclusions, common obstructions to effective scientific self-correction, and how to limit them.

Type: Author's Response
Copyright: © Cambridge University Press 2017

R1. Introduction

When I began writing Social Perception and Social Reality (Jussim 2012; henceforth abbreviated as SPSR), my goal was to offer a corrective to a slew of manifestly false claims that pervade common conclusions about social perception. The Précis target article summarized this as the tripartite pattern: (1) Errors, biases, and self-fulfilling prophecies in person perception do occur, but are generally weak, fragile, and fleeting; (2) Social perceptions are often quite accurate; and (3) Conclusions appearing throughout the psychological literature (with educational psychology as a notable exception) were often unhinged from these data by virtue of routinely declaring expectancy effects powerful and pervasive, and consistently ignoring evidence of accuracy. I also argued that defining stereotypes as inaccurate is logically incoherent, and that, in sharp contrast to 100 years of claims to the contrary, the evidence is that stereotypes are often (not always) quite accurate. All of this was "news," not because of dramatic new data, but because the old data never justified the strong claims about the power of expectancies to create social reality. To paraphrase Winston Churchill's characterization of the early architects of appeasement, social psychology periodically stumbled on the truth but simply picked itself up and hurried along as if nothing had happened.

The evidence supporting an emphasis on error, bias, and irrationality in social perception was maintained by virtue of overstatements of what research actually found (e.g., by ignoring effect sizes), and selective citation of a small number of dramatic "wow effect" studies (Jussim et al. 2016b), many of which have proven difficult to replicate. The error and bias ("stupidism," as per Kihlstrom) perspective was also maintained by reliance on a "toolbox" of double standards, blind spots, and researcher confirmation biases that served to elevate the supposed importance of "bias" results and studies while denigrating or dismissing evidence of accuracy. Indeed, SPSR pointed out that even some of the most classic demonstrations of "bias" in social perception (e.g., C. E. Cohen 1981; Hastorf & Cantril 1954; Rosenhan 1973) actually provided more evidence of accuracy than of bias – as did many of the follow-ups to classic studies such as Rosenthal and Jacobson (1968), Snyder and Swann (1978b), and Darley and Gross (1983).

Nearly all of the commentators have accepted the fundamental conclusions of SPSR. Of the seventeen commentaries, only two have taken issue with the major conclusions (Bian & Cimpian; Terbeck), and two have disagreed, not over conclusions, but over definitions (Andrews; Church). Two others similarly do not take issue with the tripartite conclusion, but also suggest that phenomena outside the scope of SPSR might have yielded different conclusions (Kahan; Wilson & Huang).

The main conclusions were widely embraced. Trafimow & Raut's consideration of statistical effect sizes concludes that the weight of the evidence supports the tripartite conclusions. Two of the commentators (Little; Marczyk) point out that it would be bizarre if evolution did not lead us to be in touch with reality much of the time. Several commentators point to evidence outside the scope of SPSR that also often yields evidence of accuracy and rationality (Bonnefon, Hopfensitz, & De Neys [Bonnefon et al.]; Kihlstrom; Mousavi & Funder; Wagoner). And one commentary (Madison, Dutton, & Stern [Madison et al.]) endorses and expands on one of the undercurrents of SPSR – that politics and dogmas have probably distorted the scientific conclusions in this and many other areas of social science.

Does this mean psychology has finally turned a corner and is self-correcting towards fully recognizing that social perception is often largely rational and accurate, and generally only modestly subject to error and bias? Given the positive nature of most of the commentaries, one might suspect that the answer is "yes indeed." One would, however, be wrong. Before returning to that question, I first respond to the specific issues raised in the commentaries.

Even the vast majority of commentators who did not contest the general conclusions often pointed out gaps in the theorizing, and noted that other phenomena not addressed in the book often yield greater evidence of error, bias, and constructionist processes. A smaller number took issue with some of the book's key conclusions. Each of these is discussed next.

R2. Definitional differences

R2.1. Sensory perception versus social perception

Church correctly points out that the research reviewed in SPSR does not make a hard distinction between sensory perception and more global cognitive representations, such as interpretations, beliefs, judgments, et cetera. In fact, to social psychologists, social perception rarely deals with sensory perception per se, and generally deals with "perception" in this more cognitive and molar sense. When we seek answers to questions like, "How do voters perceive President Obama?" we are almost always interested in voters' beliefs, attitudes, and opinions about Obama, not their sensory perceptions. I am glad to have the opportunity to clarify that SPSR's focus is on molar social perceptions, and not sensory perceptions. However, the idea that beliefs and expectations influence sensory perceptions has even less support than it does with respect to molar social perceptions (Firestone & Scholl 2016).

R2.2. If every concept is really a “stereotype,” the term loses all meaning

Andrews' commentary advocates abandoning the distinction between stereotypes and individuating information. I disagree. If anything one person believes about another's behaviors and characteristics is a “stereotype,” then the term loses all meaning and usefulness as a theoretical construct.

Social psychologists (and I suspect many other people; see, e.g., the 14th Amendment to the U.S. Constitution and the Civil Rights Acts of the 1960s) usually consider it important to understand where beliefs about groups versus individuals originate – at least, if those beliefs influence behavior. Take, for example, beliefs about a person named Alfonso. Do they result from: (1) his being a male or Latino, or (2) the fact that he acts in his local theater group, earned a law degree from Harvard, likes skinny jeans, and vacations in the Bahamas? It does not matter much, I suspect, whether we call them stereotypes versus individuating information, or "concepts" about groups and concepts about individuals. What is important is that beliefs about Latinos are not the same thing as beliefs about Alfonso. They may have some superordinate similarities (both are beliefs), in the same way that beliefs about apples (they are usually red and good to eat) have some superficial similarity to beliefs about this apple (which is brown and rotting and not good to eat). Both apples and this apple are concepts and, in that way, have some similarity. The distinction is also important, however, because it highlights important differences between apples and this apple, and between Alfonso in particular and Latinos in general.

R3. Filling in gaps and expanding the scope

R3.1. Evolutionary explanations for (in)accuracy

Two commentators brought up the important theoretical point that evolutionary theories could explain both why there are relatively high levels of accuracy in social perception and which conditions are likely to produce low accuracy, because adaptations advancing goals other than accuracy might take precedence (Little; Marczyk). These commentators (1) correctly point out that SPSR did not draw on evolutionary perspectives; and (2) point out that doing so is likely to provide important theoretical advances in understanding the conditions under which accuracy is likely to be higher or lower. At its most basic level, it is hard to imagine any successful organism that has evolved to have completely invalid reactions to its environment.

On the other hand, evolution emphasizes adaptations that enhance an organism's ability to produce viable offspring. Thus, as both Marczyk and Little correctly point out, there are many goals that might accomplish this that have little relevance to, or which might even conflict with, accuracy (e.g., identifying and attracting fertile mates, attracting resources to support offspring, etc.). Deception is prominent in the animal world (and even occurs in the plant world; consider carnivorous plants posing as nectar-rich flowers) because there are so many ways that it could have adaptive advantages. Therefore, it also seems implausible that evolution would yield social perceptions that were perfectly accurate (von Hippel & Trivers 2011). In psychology, error management theory (Haselton & Buss 2000) was, as pointed out by Little, an early and constructive attempt to identify when evolutionary pressures were more likely to lead to accuracy versus certain specific types or patterns of errors. Indeed, social psychological theorizing will likely be enhanced and sharpened by further efforts to exploit evolutionary ideas to understand when people are likely to be accurate and when they are likely to be systematically inaccurate.

R3.2. Accuracy in other contexts

Several commentators have pointed out that, in other contexts, there is often: (1) surprising evidence of accuracy; and (2) a similar pattern of political or theoretical double standards in evaluations of the evidence. Such double standards occur when people hold research that advances their theoretical perspectives or political values to lower standards than research that opposes those perspectives or values:

Bonnefon et al., for example, point out these issues in the study of the accuracy of judgments of trust at zero acquaintance. Although the levels of accuracy are much lower than those found for interpersonal expectancies, they point out (correctly, in my view) that any accuracy on the basis of a mere photograph is quite striking. Furthermore, their analysis strongly suggests that political/advocacy goals have led to a set of logically contradictory conclusions about accuracy in perceptions of trust, in a way quite reminiscent of the double standards and logical incoherence I identified with respect to self-fulfilling prophecies and stereotypes. In short, perceptions of trustworthiness have been declared both inaccurate and self-fulfilling, and these are mutually exclusive conclusions. A belief that a target is untrustworthy can be, at one moment in time, inaccurate, and the next, self-fulfilling, such that the target becomes untrustworthy. After the target becomes untrustworthy, subsequent perceivers are not wrong for believing the target to be untrustworthy. They are accurate. Consequently, Bonnefon et al. correctly point out that perceivers' beliefs about trustworthiness cannot be generally both inaccurate and self-fulfilling.

Mousavi & Funder similarly point out that judgments are often “ecologically rational,” meaning that they are well adapted to their environments. Fast and frugal heuristics, though technically constituting “biases,” especially in laboratory studies, often lead to moderately to highly accurate judgments in much of the rest of daily life. These commentators, too, echo the political implications of accuracy, pointing out that an overweening emphasis on error and bias misses a great deal of evidence of accuracy. This is important, they argue (and I agree), in part, because in such situations, efforts to solve social problems by changing supposedly erroneous beliefs are doomed to failure when the beliefs are not particularly erroneous in the first place.

Wagoner points out that distortions of the scientific record similar to those described in SPSR have long characterized perspectives on memory. The schema concept, which is hypothetically at least neutral with respect to accuracy (as are interpersonal expectancies), has become virtually synonymous with error and bias (as have interpersonal expectancies). That many modern perspectives have ignored Bartlett's (1932) balanced view of accuracy and error in memory just as blithely as they have ignored F. H. Allport's (1955) balanced views on perception is a testament to the long reach of the distorting power of theoretical perspectives emphasizing distortion.

R4. Constructivism (both cognitive and social) lives!

R4.1. Cognitive constructivism

Kihlstrom agrees with the general thrust of SPSR but also urges us not to throw out the baby (cognitive constructivism) with the bathwater (the excessive emphasis on error and bias). Nor did I intend to do so. Kihlstrom's commentary presents a very thoughtful and balanced view of realism and constructivism, and is a great primer on how social psychology can be enriched by not dismissing ideas from either of those broad perspectives writ large.

The cognitive constructivist processes highlighted by Kihlstrom undoubtedly can and do influence memory and the types of social perceptual processes addressed in SPSR. And, surely, sometimes those effects do indeed constitute errors and biases. However, constructivism and error/bias are not synonymous. Kihlstrom does not argue that they are; but because cognitive constructivism is sometimes presumed to mean something like "perceivers making stuff up that supports their pre-existing beliefs, expectations, and values" (Kihlstrom's own emphasis on the over-reach of "stupidism" perspectives; see also commentaries by Madison et al., Mousavi & Funder, Wagoner), it is, perhaps, worth walking through why bias and constructivism are not synonymous.

“Constructive accuracy” refers to the process by which expectancy-induced “biases” can increase accuracy (Jussim Reference Jussim1991). Figure R1 presents the Reflection-Construction Model (Jussim Reference Jussim1991), within which Figure R1A depicts relations among the key variables involved in accuracy, bias, and self-fulfilling prophecy, and Figure R1B depicts constructive accuracy. The latter shows that impression accuracy (correspondence between perceivers' judgments of targets and those targets' behavior or attributes) can be quite high, even when perceivers base their judgments of individual targets exclusively on their own expectations, and are oblivious to (ignore, overlook, do not have access to) targets' actual behavior or attributes. If all three paths shown are high enough, perceiver judgments will correspond to (correlate with) target behavior or attributes, even though perceiver judgments are heavily based on their own expectations and not at all based on target behaviors or attributes. This is because, in Figure R1B, the correlation between perceiver judgments and target behavior or attributes is the multiplicative product of the three paths. For example, if all three equal .8, then impression accuracy equals .83=.51. In psychological, rather than mathematical terms, this means that, if perceivers' expectations are strongly based on highly valid information, the more they rely on those expectations when judging targets, the more accurate they will be.

Figure R1. The Reflection-Construction Model (Jussim 1991).

A: The Full Model. B: Constructive Accuracy. Even when perceivers are completely oblivious to targets' behavior or attributes, their judgments of targets will still correspond to (correlate with) targets' behavior or attributes if (1) expectations are based on background information that (2) predicts targets' behavior or attributes; and if (3) expectations influence (bias) perceiver judgments.

This is a constructive phenomenon because, in this example, the judgment is based entirely on perceiver expectations, with no direct input from targets' actual behavior or attributes. Furthermore, even if perceivers do partially base their judgments directly on targets' behavior or attributes, relying on accurate expectations can still increase accuracy further (see Jussim 1991, for details). Thus, I concur with Kihlstrom that constructive processes can and do play an important role in person perception; however, I would emphasize that, even so, such processes may, at least when those expectations are themselves based on valid information, increase rather than reduce accuracy.
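To make the multiplicative-path arithmetic concrete, the following is a minimal simulation sketch of my own (not code from SPSR or Jussim 1991), assuming standardized variables and independent residuals, with all three paths set to the .8 of the example above:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000  # simulated perceiver-target pairs

# Standardized path coefficients (the .8 example from the text)
a = 0.8  # background information -> perceiver expectation
b = 0.8  # background information -> target behavior/attributes
c = 0.8  # perceiver expectation -> perceiver judgment

background = rng.standard_normal(n)

# Expectations and target behavior each reflect the same valid background
# information, plus independent noise scaled to keep unit variance.
expectation = a * background + np.sqrt(1 - a**2) * rng.standard_normal(n)
behavior = b * background + np.sqrt(1 - b**2) * rng.standard_normal(n)

# The constructive accuracy case (Figure R1B): judgments draw ONLY on
# expectations, with no direct input from target behavior.
judgment = c * expectation + np.sqrt(1 - c**2) * rng.standard_normal(n)

# Impression accuracy approximates the product of the three paths
print(np.corrcoef(judgment, behavior)[0, 1])  # ~ .8 * .8 * .8 = .51
```

The simulated judgment–behavior correlation converges on .8 × .8 × .8 ≈ .51 even though judgments never draw directly on target behavior – constructive accuracy in numerical form.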

R4.2. Social constructionism

Two of the commentaries (Tappin, McKay, & Abrams [Tappin et al.]; Wilson & Huang) make measured appeals to not completely throw out the social constructionist baby with (what I would call) the excessively political bathwater. Both commentaries, in somewhat different ways, present defenses of social constructionist processes: Tappin et al. do so by arguing for the importance of collective action as a major influence on social reality; Wilson & Huang do so by emphasizing the role of institutions in creating social reality. I see these arguments as mutually reinforcing, so I address them both here.

This reply is not the place for a comprehensive review or critique of social constructionism, which is a single term that refers to quite a variety of perspectives. I would, however, divide social constructionism into two main veins (which are not necessarily mutually exclusive). One is primarily a political liberation perspective, with the goals of combating oppressive and hegemonic practices and discourses, in part, by revealing them, thereby advancing the interests of certain groups that the users of such terms deem unfairly victimized or exploited in some way. Few in psychology capture the politicized nature of these efforts better than Jost and Kruglanski (2002), who approvingly declared: "From this perspective, we have a professional obligation to weigh in on ideological issues, policies, and decisions" (p. 175) and, later on the same page, "The social constructionist movement emerged on the social science scene as a force for change and cultivated a leftist revolutionary spirit that posed a distinctive challenge to established scientific authority." Because this vein is a movement with primarily political goals, my view is that this sort of politicization has little place in scientific psychological theorizing.

However, a separate vein of social constructionism aspires to be bona fide social science. Not all forms of social constructionism are blatantly political or liberationist. Some, instead, seek to understand social relationships, including but not restricted to relationships of power and status, and the reciprocal influences among individuals and institutions (regardless of whose interests or advocacy agendas these understandings might advance). In this spirit, Tappin et al. are surely correct in arguing that, under some conditions, collective actions can alter the nature of intergroup relations. Nonetheless, it is also, perhaps, worth pointing out that, although they review abundant evidence that target groups are motivated by perceived slights to and injustices inflicted on their groups, little, if any, of the research they cited links laypeople's actual stereotypes to collective action. I suspect that this gap stems from social psychologists taking for granted that laypeople hold unjustified and pernicious stereotypes, so that it was (unjustifiably, in my view) deemed unnecessary to assess actual stereotypes; their derogatory nature could simply be assumed.

Space does not permit a citation-by-citation critical analysis of the work Tappin et al. have presented in support of their perspective, so perhaps one example will suffice. They cite a series of studies by Ellemers and Barreto (2009) three times in their short commentary, so they seem to consider it important to their perspective. Ellemers and Barreto (2009) showed that believing others had an insulting view of one's group (e.g., for women, someone believing that women are unintelligent) motivated collective action. But what if laypeople do not routinely believe women are unintelligent? There is evidence that people believe boys are better at math than girls, but the same studies show that people also believed girls have higher verbal skills than boys (Swim 1994). Regardless, Ellemers and Barreto (2009) did not assess anyone's sex stereotypes regarding intelligence, nor did they review research that has done so. Their findings are still interesting because they say something about how perceived intergroup insults motivate collective action, but they say nothing about the role of the actual stereotypes held by any actual people in leading to collective action. This is not meant to dismiss the perspective entirely. In fairness, it probably can be interpreted as showing that, when people hold derogatory stereotypes, if targets become aware of those stereotypes, collective action may result. Of course, history is filled with counter-examples, cases where people did hold derogatory stereotypes and little collective action resulted over vast periods of time (consider, e.g., the inferior status ascribed to women, three hundred years of slavery in the United States, and the Hindu caste system). I suspect, therefore, that the "holding insulting stereotypes–collective action" link is tenuous at best, and subject to many conditions not articulated in either Tappin et al.'s commentary or much of the underlying research. Indeed, I do not doubt the effect exists, but I suspect that, in the real world, across many situations and contexts, it fits the pattern described in SPSR: occasionally strong, but usually quite weak, fragile, and fleeting.

Wilson & Huang are correct in pointing out that my conceptual analysis “freezes” institutions at a given point in time, and then examines the rationality versus the biased nature of social perception. As they point out, institutions are not actually frozen, and are subject to both slow-moving and, occasionally, dramatic and sudden changes. My argument was never intended to be “there are no conditions under which stereotypes or social beliefs construct reality.” Indeed, SPSR is peppered with both real world examples and scientific studies showing that, sometimes, such effects can be quite powerful.

SPSR, however, did not have as its purpose identifying the nature of collective movements or the inter-relationships between institutions, demographic groups, and individuals. Instead, the purpose of the book was to review evidence regarding the extent to which individuals' beliefs about groups or other individuals were accurate, biased, or self-fulfilling. This is, as these commentaries suggest, a limitation of its scope. It is certainly an important and appropriate social science endeavor to address issues such as collective action and institutions. SPSR made no claims about such issues. It was, however, an attempted corrective to longstanding and unjustified social science claims about how individuals' beliefs relate to social reality – and, on this issue, it is, perhaps, worth noting that both Tappin et al. and Wilson & Huang presented neither argument nor evidence against the central claim that such a corrective is justified and past due.

R5. Victims of the processistic fallacy

Two commentaries aspire to refute the conclusion reached in SPSR that stereotypes have been widely found to be at least moderately accurate. Both Terbeck and Bian & Cimpian propose processes that they believe cause inaccuracy in stereotypes. Both critiques fall victim to the processistic fallacy, which was addressed in SPSR. Thus, my response to these critiques begins by quoting that text (p. 394):

The processistic fallacy involves concluding that laypeople's beliefs must be inaccurate because researchers have discovered cognitive processes that the researchers believe to be flawed.

This is a fallacy for several reasons: (1) The process may not be as flawed as the researchers believe, and its degree of "flaw" cannot be assessed without assessing the validity or success of the judgments and decisions by people who do versus do not rely on this process (something social scientists rarely do); (2) even if the process is indeed flawed, in real life, people may rely on many other less flawed processes when making judgments and decisions; and (3) in real life, social reality often intrudes upon people's erroneous beliefs—that is, it provides feedback that permits people to recognize their initial beliefs were wrong and to alter them accordingly. So, again, we cannot know how flawed the outcome is—the judgment or decision—unless we evaluate its success, accuracy, validity, etc. (which is another thing social scientists emphasizing error and bias do not often do). (Jussim 2012, p. 394)

The processistic fallacy is a form of overgeneralization. It occurs whenever researchers who demonstrate some error or bias under a very small set of (typically artificial laboratory) conditions unjustifiably assume or conclude that their findings mean that there is widespread human error under naturally occurring conditions (e.g., Cohen 1981; Funder 1987). Lab studies are often well designed to test basic processes but not to generalize results to naturally occurring conditions (Mook 1983). It is hypothetically possible to appropriately generalize widespread error under naturalistic conditions on the basis of studies revealing flawed processes in the laboratory, but only under conditions that are almost never met. One way to justify such generalizations is to discover a process so flawed that it can be definitively known to produce pervasive inaccuracy in situations that go well beyond those studied in the lab. For example, the human visual system cannot detect radio waves, so it is safe to conclude that people will be universally inaccurate in their visual assessment of the presence or absence of such waves.

Such demonstrations are few and far between in psychology. A wide range of judgmental and perceptual errors and biases found in laboratory studies have turned out to be functional outside those studies. For example, Gigerenzer and Brighton (2009) reviewed evidence showing that:

In contrast to the widely held view that less processing reduces accuracy, the study of heuristics shows that less information, computation, and time can in fact improve accuracy. We review the major progress made so far: (a) the discovery of less-is-more effects; (b) the study of the ecological rationality of heuristics, which examines in which environments a given strategy succeeds or fails, and why. (Gigerenzer & Brighton 2009, p. 107)

Such findings should give deep pause to modern researchers who, upon discovering some laboratory bias, leap to the assumption that the process undermines accuracy under naturalistic conditions. Regardless, I am not aware of any research that has documented a social perceptual process so flawed that it can be definitively known to produce inaccuracy on purely logical grounds comparable to the radio wave example above.
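The less-is-more point can be illustrated with a toy simulation of my own devising (a hedged sketch in the spirit of the unit-weighting literature Gigerenzer and Brighton review, not their code; the cue validities, noise level, and sample sizes below are arbitrary assumptions). A frugal "tallying" heuristic that simply adds up the cues, ignoring their weights, often predicts new cases as well as, or better than, regression weights fitted to a small sample:

```python
import numpy as np

rng = np.random.default_rng(1)

def compare(n_train=10, n_test=1000, n_reps=2000, noise=2.0):
    """Out-of-sample validity of fitted regression weights versus a frugal
    unit-weight (tallying) heuristic that ignores cue weights entirely."""
    true_w = np.array([1.0, 0.8, 0.6, 0.4, 0.2])  # assumed cue validities (arbitrary)
    k = len(true_w)
    ols_r, unit_r = [], []
    for _ in range(n_reps):
        X_tr = rng.standard_normal((n_train, k))
        y_tr = X_tr @ true_w + noise * rng.standard_normal(n_train)
        X_te = rng.standard_normal((n_test, k))
        y_te = X_te @ true_w + noise * rng.standard_normal(n_test)
        w_hat, *_ = np.linalg.lstsq(X_tr, y_tr, rcond=None)  # use "all the information"
        ols_r.append(np.corrcoef(X_te @ w_hat, y_te)[0, 1])
        unit_r.append(np.corrcoef(X_te.sum(axis=1), y_te)[0, 1])  # just tally the cues
    return float(np.mean(ols_r)), float(np.mean(unit_r))

ols, unit = compare()
print(f"fitted weights: r = {ols:.2f}; tallying: r = {unit:.2f}")
```

By construction, the fitted weights win within the training sample; out of sample, with only ten training cases, estimation error typically swamps the modest cost of treating all cues equally – the less-is-more pattern.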

However, even if a process definitively known to produce inaccuracy were discovered, additional conditions must also be met to generalize from lab studies of biased processes to a conclusion of widespread inaccuracy in life. It must be shown either that people are incapable of overcoming the bias or error by relying on alternative, superior, or corrective processes, or, empirically, that, across most of a widely representative array of situations, people both rely on the flawed process and rarely enlist superior or corrective processes. If a program of research engages in a sufficiently large representative sampling of situations (Brunswik 1957; Monin & Oppenheimer 2014; Westfall et al. 2015), and shows that, in most such situations, a flawed process is heavily relied upon and other more appropriate processes are rarely enlisted, inferring widespread error in real life can be justified. Few programs of research, however, meet these standards, whether in social perception or other areas of psychology (e.g., Cohen 1981; Westfall et al. 2015). This is probably because, as Wells and Windschitl (1999, p. 1115) found, among psychology faculty there was an "insensitivity to the need for stimulus sampling except when the problem is made rather obvious."

R5.1. Common flaws in the critiques

Both Terbeck's and Bian & Cimpian's commentaries have identified potentially flawed processes, and both perspectives are capable of generating testable (and falsifiable) hypotheses about potential patterns and sources of stereotype inaccuracy. As such, their perspectives are potentially constructive, generative of new research directions and valuable insights into sources and conditions of stereotype inaccuracy.

Nonetheless, neither Terbeck nor Bian & Cimpian discusses any research that meets the standards articulated above for concluding that stereotypes must be inaccurate on the basis of the supposedly flawed processes that were identified. Neither presents justification for assuming that those supposedly flawed processes inherently produce inaccurate judgments (comparable to showing that vision cannot detect radio waves). Thus, we do not know that those processes produce inaccuracy.

Furthermore, Terbeck and Bian & Cimpian did not discuss any program of research showing that those processes produce inaccuracy most of the time in a representative sample of situations. Therefore, it is not knowable from that research whether people commonly rely on those processes, even if they are truly flawed. Last, even if those processes are truly flawed and widely relied upon, neither commentary reviews any research demonstrating that people rely exclusively on the supposedly flawed processes across situations (even unrepresentative ones). They have not eliminated the possibility that there are other, more appropriate processes that people rely upon when arriving at stereotypes. Both critiques, therefore, declare stereotypes to be inaccurate on the basis of research incapable of justifying such a conclusion. Both commit the processistic fallacy: over-inferring pervasive error in real life ("stereotypes must be inaccurate") from laboratory studies of processes that are incapable of generalizing to much of real life. It is of course possible that such studies do generalize widely; but that cannot be known without empirical demonstrations that they actually do. To make these issues more concrete, the specific evidence each commentator discusses is reviewed next.

R5.2. Terbeck, on categorization, implicit prejudice, and the brain

In her commentary, Terbeck refers to research showing that: (1) Infants and primates categorize; (2) specific brain areas are associated with face recognition; and (3) drugs alter scores on the race implicit association test (IAT). This is all fine as far as it goes. Categorization is ubiquitous; thus, it passes the test for a justified generalization to real life. However, categorization is not inherently invalid in the way that visual detection of radio waves is. People are not wrong for believing that chairs usually have four legs, that Alaska is colder than Arizona, or that men are, on average, taller than women. Thus, the claim that any particular category is wrong requires evidence, which Terbeck does not provide.

Similarly, specific brain areas may well be associated with face recognition, but the very term "recognition" implies that, at least some and perhaps most of the time, people are correctly distinguishing faces from other features of the stimulus array. It certainly provides no evidence that face recognition is wrong. Finally, I have no doubt that drugs can alter IAT scores. Racial prejudice IAT scores are measures of attitudes, and individuals and societies may deem certain attitudes morally good or bad, but attitudes cannot be factually correct or incorrect. It is possible that one's reasons for disliking diet soda, the Yankees, and Fred are factually incorrect, but the attitude itself cannot be accurate or inaccurate. Thus, all three phenomena identified by Terbeck may lead to falsifiable hypotheses about sources of stereotype inaccuracy; but, absent direct data on stereotype accuracy, they do not justify concluding that stereotypes are inaccurate.

R5.3. Bian & Cimpian and generic beliefs

Bian & Cimpian's critique similarly fails to meet the standards necessary to infer widespread naturally occurring error from studies of supposedly flawed processes. Their prototypical cases of supposedly inherently erroneous generic beliefs are those such as “mosquitos carry the West Nile virus” and “ducks lay eggs” (which was the example highlighted in the title of one of the articles they cite in support of their view: Leslie et al. Reference Leslie, Khemlani and Glucksberg2011). They cite evidence that people judge such statements to be true. They argue that this renders people inaccurate because few mosquitos carry West Nile virus and not all ducks lay eggs.

Does agreeing that “mosquitos carry West Nile” mean that we can now assume that people's beliefs about mosquitos and West Nile are pervasively inaccurate? If these are absolutist beliefs (“all mosquitos carry West Nile”) then they are clearly wrong and no further evidence is needed. SPSR made exactly this point when discussing absolutist stereotypes, which, because of widespread human variation, are almost always invalid. But there is no evidence that generic beliefs are always, necessarily, or widely absolutist.

Perhaps, instead, they capture the phenomenology of distinctive or salient differences between categories. I can only get West Nile from mosquitos, not from moths, mice, or musk ox. Perhaps people agree that “mosquitos carry West Nile” not because they believe “all mosquitos carry West Nile,” but because they believe that “only mosquitos carry West Nile.” Because generic beliefs, as studied, are not inherently inaccurate, the research does not meet the first standard necessary to avoid the processistic fallacy. We cannot assume all generic beliefs are necessarily inaccurate.

It also fails the second standard (even if not inherently inaccurate, is the process empirically found to be generally invalid?). One of the articles cited by Bian & Cimpian (Leslie et al. 2011) found that participants rephrased only 18 of 100 experimenter-provided generic statements as absolutist ("universals" in Leslie et al.'s [2011] terminology). Furthermore, overwhelming majorities (over 90%) recognized that, in fact, male sheep do not produce milk, male snakes and male ducks do not lay eggs, and so on, for nearly all absolute beliefs studied. Thus, one cannot interpret agreement with the generic beliefs as evidence of widespread reliance on an invalid process. The Leslie et al. (2011) research did include a wide range of generic beliefs, so it is reasonable to conclude that their results are broadly generalizable to generic beliefs. What is generalizable, however, is that most generic beliefs do not equate to absolutist or inherently inaccurate beliefs. Of course, it is still possible that, when stereotypes are generic beliefs, they are widely inaccurate. That is another falsifiable hypothesis about which there is currently no data. Inferring that stereotypes are inaccurate from such data is unjustified.

Bian & Cimpian cite another paper by Leslie (Reference Lesliein press) in support of the claim that “more people hold the generic belief that Muslims are terrorists than hold the generic belief that Muslims are female.” But Leslie (Reference Lesliein press) provides no data whatsoever that bears on the frequency with which people hold such beliefs. Instead, she quoted headline-seeking politicians and cited a rise in hate crimes post-9/11. Such information may be interesting, but it does not address the frequency of lay beliefs about anything whatsoever.

Of course, even if the claim that more people agree that "Muslims are terrorists" than that "Muslims are women" were valid, it would not constitute evidence that stereotypes in general, or the Muslim stereotype in particular, must be inaccurate. Its status as such evidence does not hinge on researcher assumptions about what people mean when they agree with statements like "Muslims are terrorists," but on evidence assessing what people actually mean. Because research on generics fails the first two tests necessary to avoid the processistic fallacy (generics do not inherently produce inaccuracy, and they have not been empirically demonstrated to usually produce inaccuracies), one could not conclude that greater agreement with the view that "Muslims are terrorists" than with "Muslims are women" necessarily means people believe there are more Muslim terrorists than Muslim women. It may simply mean "some Muslims are terrorists," or "Muslim terrorism is more widespread than other forms of terrorism," and "being female is not an important distinguishing characteristic of Muslims." Absent data, we just do not know. The bias literature writ large (Cohen 1981; Gigerenzer & Brighton 2009; see also Mousavi & Funder's commentary and SPSR) and the stereotyping literature in particular are so riddled with invalid researcher presumptions about laypeople's beliefs that, absent hard empirical evidence about what people actually believe, researcher assumptions of inaccuracy rarely warrant credibility.

Bian & Cimpian acknowledge that statistical beliefs are far more capable of being accurate, but then go on to claim that most stereotypes are not statistical beliefs, or, at least, generically based stereotypes are more potent influences on social perceptions. They present no assessment, however, of the relative frequencies with which people's beliefs about groups are generic versus statistical, and, given Leslie et al.'s (Reference Leslie, Khemlani and Glucksberg2011) evidence that people do not usually translate generics into absolutes, it may well be that agreement with generics such as “ducks lay eggs” and “Muslims are terrorists” does not preclude the statistical understanding that fewer than half of all ducks are even capable of laying eggs or that the proportion of Muslims who are terrorists is tiny.

We can, however, consider the implications of their claim that most people's stereotypes include little or no statistical understanding of the distributions of characteristics among groups. This view leads to another falsifiable hypothesis: Laypeople would have little idea about racial/ethnic differences in high school or college graduation rates, or about the nonverbal skill differences between men and women, and would be clueless about differences in the policy positions held by Democrats and Republicans. That leads to a very simple prediction – that people's judgments of these distributions would be almost entirely unrelated to the actual distributions; correlations of stereotypes with criteria would be near zero and discrepancy scores would be high. One cannot have it both ways. If people are statistically clueless, then their beliefs should be unrelated to statistical distributions of characteristics among groups. If people's beliefs do show strong relations to statistical realities, then they cannot be statistically clueless.
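To make the prediction concrete, here is a minimal sketch (with invented numbers, purely for illustration) of the two accuracy metrics it concerns – correspondence (the correlation of beliefs with criteria) and discrepancy scores:

```python
import numpy as np

# Hypothetical perceived vs. actual percentages of some attribute across
# five groups (invented numbers for illustration only, not real data).
perceived = np.array([55.0, 40.0, 70.0, 25.0, 60.0])
actual = np.array([50.0, 45.0, 65.0, 30.0, 58.0])

# Correspondence accuracy: do beliefs track the criterion's rank order?
correlation = np.corrcoef(perceived, actual)[0, 1]

# Discrepancy accuracy: how far, on average, do beliefs miss the criterion?
mean_abs_discrepancy = np.mean(np.abs(perceived - actual))

print(f"r = {correlation:.2f}, mean |discrepancy| = {mean_abs_discrepancy:.1f} points")
# The "statistically clueless" hypothesis predicts r near zero and large
# discrepancies; the accuracy literature reviewed in SPSR reports high r's
# and modest discrepancies.
```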

We already know that the predictions generated from the "most stereotypes are generic and therefore statistically clueless" view are disconfirmed by the data summarized in SPSR (see also Jussim et al. [2015b] for an updated review of stereotype accuracy that includes additional studies). Bian & Cimpian have developed compelling descriptions of the processes that they believe should lead people to be inaccurate. In point of empirical fact, however, people have mostly been found to be relatively accurate. Disconfirmation of such predictions can occur for any of several reasons: (1) The processes identified as "causing" inaccuracy do not occur with the frequency that those offering them assume (maybe most stereotypes are not generic); (2) the processes are quite common and do cause inaccuracy, but are mitigated by other countervailing processes that increase accuracy (e.g., adjusting beliefs in response to corrective information); or (3) the processes are common, but, in real life, lead to much higher levels of accuracy than those emphasizing inaccuracy presume (see Mousavi & Funder's commentary for exactly such a point). Regardless, making declarations about levels of stereotype inaccuracy on the basis of a speculative prediction that some process causes stereotype inaccuracy, rather than on the basis of evidence that directly bears on accuracy, is a classic demonstration of the processistic fallacy.

R6. Confirmation bias and questionable interpretive practices

Kahan does not disagree with a single claim in SPSR; he does, however, urge me to consider the issues of bias and accuracy more broadly, and I do so here. Kahan correctly points out that there is an extensive literature on confirmation bias, especially in politicized judgments, that SPSR largely ignores. My goal was to evaluate the literature on social perception – how people view other people, especially individuals and groups, and especially with respect to judgments that could conceivably be assessed for their accuracy. To compare bias, self-fulfilling prophecy, and accuracy, it was necessary to focus on judgments that could be biased, self-fulfilling, or accurate. SPSR purposely excluded people's beliefs about scientific or social science facts or evidence because I do not consider them social perception in the classic sense of "how people understand specific other people or groups." SPSR also excluded moral and political beliefs because they often have no criteria for assessing accuracy. I concur with Kahan's view that confirmation biases can be quite powerful with respect to many of these excluded judgments.

Indeed, the very validity of Kahan's commentary highlights an interesting irony: Exactly the types of confirmation biases in perceptions of science highlighted by Kahan's commentary may characterize social psychological science itself. There is ample evidence that scientists' confirmation biases about research conclusions are powerful in at least many cases. Social psychological perspectives that emphasize the power of lay confirmation biases in person perception do so on the basis of a highly selective review of the evidence. Any review reaching the conclusion that the evidence shows that person perception is powerfully characterized by confirmation biases must itself be based on researcher confirmation bias, because the evidence so overwhelmingly shows that lay person perception is mostly motivated by the desire to be accurate (e.g., Devine et al. 1990; Trope & Bassok 1983). Chapters 5 and 8 addressed this issue at length. With respect to seeking information that bears on their interpersonal expectations, in general, the evidence shows that people overwhelmingly seek and prefer diagnostic, not confirmatory, information.

Kahan's perspective, however, which focuses a great deal on the role of confirmation biases in how people evaluate science, exquisitely describes the production of social psychological theories of and conclusions about person perception, and many other topics. There are other examples in SPSR which are consistent with Kahan's confirmation bias perspective applied to how psychologists reach conclusions; these include:

  • Overstated claims about the power of self-fulfilling prophecies

  • Overstated claims about expectancy- or stereotype-induced perceptual biases

  • Underestimations of the power of accuracy, especially (though not exclusively) stereotype accuracy, and/or dismissals of its "importance"

  • Decades of misinterpretations of studies such as Hastorf and Cantril (1954) and Rosenhan (1973) as demonstrating the power of bias, when, in fact, they demonstrated overwhelmingly the power of accuracy

That science sometimes goes wrong is a normal part of science. But when science goes off the rails and fails to self-correct for decades, especially when the contrary evidence is sitting in plain sight within the original published reports, something other than "pure science" may be going on. Kahan's work points to some likely possibilities. It helps explain the prevalence of questionable interpretive practices (QIPs) – narrative, conceptual, and interpretive means by which scientists can and do reach unjustified conclusions, even in the complete absence of statistical or methodological errors and flaws, and even when findings are replicable (Jussim et al. 2015a; 2016b; 2016c; 2016d). QIPs captured in SPSR include:

  • Logical incoherence: Reaching opposite or contradictory conclusions, as long as both advance one's preferred narratives, values, theory, or ideology. Simple example: Claiming there are no good criteria for assessing the accuracy of stereotypes yet accepting “known groups validity” as a reasonable way to validate new measures.

  • Phantom facts: Declaring something to be a fact without evidence. Simple example from SPSR: Declaring stereotypes to be inaccurate without evidence.

  • Blind spots: Overlooking or ignoring research that contests one's preferred perspective. Simple example from SPSR: Citing Darley and Gross's (1983) single study that they interpreted as showing that stereotypes lead to their own confirmation, and ignoring Baron et al.'s (1995) two failed replications.

  • Double standards: Subjecting the research producing conclusions one dislikes to withering criticisms, and extolling the virtues and value of research producing conclusions one likes, even when the research one dislikes is of equal or higher methodological quality. Simple example: the common claim that there are no “good” criteria for assessing accuracy, while, at the same time, extolling the power of self-fulfilling prophecies. This is a double standard because both accuracy and self-fulfilling prophecies require showing correspondence between belief and reality, so that the criteria for doing so must be identical.

Exposés of major disconnects between accumulated data and common conclusions have recently been published regarding broad areas within cognitive psychology (Firestone & Scholl 2016), social psychology (Jussim et al. 2016b), social neuroscience (Vul et al. 2009), and sociology (Martin 2016). Over a decade ago, Pinker (2002) exposed how political motivations led to invalid claims about education, parenting, crime, personality, evolution, and more.

What is going on here? Is it really possible that trained social psychologists, people with PhDs and years of experience, routinely engage in substantial confirmation bias in interpreting scientific research? Many scholarly perspectives answer this question with a clear "yes indeed" (for general reviews of scientific susceptibility to confirmation bias, see Greenwald et al. 1986; Ioannidis 2012; Lilienfeld 2010; for a review of how confirmation biases have led social psychology to specific unjustified conclusions in areas such as discrimination, stereotype threat, unconscious influences on sensory perception, and more, see Jussim et al. 2016b). "Successful" motivated reasoning driven by the goal of reaching some particular conclusion requires information, experience, and skill with logic and argumentation. People with PhDs and extensive training – especially those with training in telling "compelling narratives" (Bem 1987; Jordan & Zanna 2007) – are more able to dismiss findings they do not like and defend findings they do like in the face of challenges than are less intelligent and less well-trained laypeople. Indeed, Kahan himself (Kahan et al. 2012b) has found that views about climate change become more polarized as people's science knowledge increases (see also Haidt 2012).

An even stronger view is presented by Madison et al., who highlight scholarship on the "clever sillies" – a perspective suggesting just how extremely distorted "scholarly" conclusions can get. Much of that research suggests that social scientists who are obviously very intelligent and have extraordinary levels of knowledge and expertise express manifestly silly claims primarily to signal their intelligence (Charlton 2009; Dutton & van der Linden 2015). Because manifestly silly ideas are often presented in highfalutin and sophisticated-sounding language, they can appear rigorous and (to paraphrase Stephen Colbert) high in "scientificiness," and, therefore, can create an illusion of plausibility and validity. In the social sciences, such ideas often include the denial of evolutionary or biological bases of human psychology and behavior (see, e.g., Pinker [2002] for a broad review), the denial of stereotype accuracy, and, I would argue, attempts to stigmatize and ostracize those who point out that the data does not always advance social scientific narratives presumed to advance the interests of the oppressed (Gottfredson 2010; Pinker 2002).

I do not doubt that a desire to signal one's brilliance may indeed be one motivation underlying the clever sillies, but I do not think it is the only one, and, perhaps, not even the most important one in the social sciences. In addition to signaling intelligence, staking out positions that are logically incoherent or disconnected from scientific evidence can signal one's political allegiances, one's moral positions, and that one is on the "side" of one's colleagues fighting "the good fight" (Kahan). The extent to which scientific distortions, such as the denial of stereotype accuracy or of evolutionary influences on psychology, result from motivation to signal one's egalitarian bona fides to one's colleagues, the desire to advance one's politics, values, and morals, or other less politicized sources is an important empirical question for the burgeoning area of meta-science and scientific integrity (e.g., Ioannidis 2012; Jussim et al. 2016b; Simmons et al. 2011).

R7. The fundamental publication error: Was Planck right?

As Max Planck wrote in 1950:

A new scientific truth does not triumph by convincing its opponents and making them see the light, but rather because its opponents eventually die, and a new generation grows up that is familiar with it. (Planck 1950, p. 97)

Self-correction is often taken to be a hallmark of science. Whereas religious, political, or moral beliefs may not be subject to change based on evidence, scientific beliefs, presumably, should be subject to change when sufficient new evidence contradicts existing conclusions. For example, Jost (2011) wrote: "This is because we, as a research community, take seriously the institutionalization of methodological safeguards against experimenter effects and other forms of bias. Any research program that is driven more by ideological axe-grinding than valid insight is doomed to obscurity, because it will not stand up to empirical replication and its flaws will be obvious to scientific peers."

If only it were so. SPSR presented numerous cases where: (1) An initial high impact “wow!” study yielding some dramatic result was published; (2) Many follow-up studies revealed that the conclusions based on that “wow” study were mostly not justified; and (3) The “wow” conclusions continued to march on for decades as if the correctives were never published. SPSR documented case after case of just this pattern with respect to self-fulfilling prophecies, biases, stereotypes, and accuracy.

Many of the commentaries (Bonnefon et al.; Kihlstrom; Little; Madison et al.; Martin; Mousavi & Funder; Trafimow & Raut; Wagoner) seem to welcome SPSR as a much-needed corrective to the “stupidism” (Kihlstrom) emphasized by much of social psychology and the “clever silly” (Madison et al.) perspectives that back up such claims. Many of the rest acknowledge the validity of its main points but raise issues beyond the scope of the book (Church; Kahan; Wilson & Huang). If one is to believe the consensus of the commentaries on SPSR, one might believe that the field's emphasis on “stupidism” is in decline. Although I hope that is true, based on too much evidence from outside these commentaries, such a conclusion is premature, and not only because two of the commentaries have committed the processistic fallacy when attempting to defend claims emphasizing lay “stupidism” (Bian & Cimpian; Terbeck) regarding stereotypes.

My collaborators and I have recently updated the review of stereotype accuracy work that appears in SPSR (Jussim et al. 2015b; 2016a). More than 50 studies have been identified, almost double the number reviewed in SPSR, mainly because there has been an explosion of research on the accuracy of stereotypes about national character and political groups. The main conclusions of SPSR were reconfirmed, especially regarding the demographic stereotypes that social scientists generally seem most concerned about. Stereotype accuracy is one of the largest effects in all of social psychology, and it has been replicated in multiple independent labs. Given social psychology's current crisis of replicability, and widespread concerns about questionable research practices (e.g., Open Science Collaboration 2015; Simmons et al. 2011), one might expect that social psychologists would be shouting to the world that we have actually found a valid, independently replicable, powerful phenomenon.

But anyone who did think that could not be more wrong. Testaments to the inaccuracy of stereotypes still dominate textbooks and the broad reviews of the stereotyping literature that appear in scholarly books (see Table R1). The new generation of scholars is still being brought up to believe that "stereotypes are inaccurate," a claim many will undoubtedly take for granted as true and then promote in their own scholarship. Sometimes these claims take the form of definitions of stereotypes as inaccurate; even when stereotypes are not defined as inaccurate, they appear as declarations that stereotypes are inaccurate, exaggerated, or overgeneralized.

Table R1. Modern Claims about Stereotype (In)Accuracy

Table R1 reprinted from Jussim et al. (2015b).

R8. Conclusion: Facilitating self-correction regarding accuracy, bias, and self-fulfilling prophecies

Psychology is abuzz with an internal discussion of how it can do better. Greater transparency, preregistration, replication, and more have all come to the fore. However, most of the unjustified testaments to the power of self-fulfilling prophecies and expectancy or stereotype biases, and most of the attempts to dismiss the power or importance of accuracy, did not result primarily from failed replications, questionable statistical or methodological practices, or even lack of transparency. Instead, they were problems of interpretation and (exactly as Kahan's commentary and perspective might predict) researcher confirmation biases. Even when failed replications did get published, they were generally ignored. Effect sizes were largely ignored. Simple contextual factors (such as the number of plays in a football game, or the total number of judgments made by staff at psychiatric institutions) that could have reined in overstated claims of bias were often simply ignored, not just by the original researchers, but by decades of subsequent scientists perpetuating the erroneous testaments to bias. Attention to contextual, statistical, and methodological details was seemingly short-circuited by the ability or desire to tell compelling "wow!" stories about the power and pervasiveness of expectancy effects.

What, then, can researchers who want to present valid and nuanced descriptions of findings do to limit their vulnerability to perpetuating false claims that appear in scientific literatures? Unfortunately, psychology has no consensus on the answers to this question and is currently in the process of searching for them (e.g., Jussim et al. 2016b). Here, I focus specifically on (1) identifying general principles that may be broadly applicable, and then (2) giving examples of how they could be applied to the literatures addressed by SPSR:

  1. Resist the urge to tell compelling narratives by glossing over or ignoring contradictory findings and conclusions.

    • Stop citing Rosenthal and Jacobson (1968) as showing that teacher expectation effects are powerful or pervasive.

    • Do not assume that "story studies" (famous classics around which compelling narratives can be told) are necessarily true or replicable. Review the entire relevant literature before making claims regarding expectancies and stereotypes.

    • Avoid cherry-picking a biased sample of studies about expectancies or stereotypes (or any other topic) to make an argument.

  2. Focus on the actual results of studies, rather than researcher claims about those results.

    • One can often find evidence of substantial accuracy and rationality in studies that emphasized, or reported only, error and bias.

    • Biases and self-fulfilling prophecies may be quite modest, or contingent on moderators, even when a study's discussion touts their power and pervasiveness (a conversion sketch follows this item).
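One low-tech way to focus on actual results is to recover effect sizes from whatever bare test statistics a report provides. The following is a minimal, hypothetical sketch in Python; the conversion formulas are the standard ones for two-group designs, and the t value and degrees of freedom are invented for illustration rather than taken from any study discussed here:

```python
# Hypothetical sketch: recovering effect sizes from the bare test statistics
# that older reports often provide, so claims can be checked against the
# actual magnitude of results. The t value and df below are invented.
import math

def t_to_d(t, df):
    """Cohen's d from an independent-samples t test (equal-n approximation)."""
    return 2 * t / math.sqrt(df)

def t_to_r(t, df):
    """Point-biserial r from the same t test."""
    return math.sqrt(t**2 / (t**2 + df))

t, df = 2.1, 98  # e.g., a "significant" expectancy effect
print(f"d = {t_to_d(t, df):.2f}, r = {t_to_r(t, df):.2f}")
# A result can be statistically significant yet correspond to a small r,
# which is exactly the kind of detail that "wow" narratives gloss over.
```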

  3. Search for skeptical reviews and meta-analyses, and do not depend exclusively on reviews or meta-analyses that appear to have persuading the reader as their agenda. Avoid repeating conclusions from famous reviews without either critically examining the basis for those conclusions or, at least, searching the literature to find out whether other, perhaps less famous but more persuasive, skeptical or critical reviews or meta-analyses have reached different conclusions. Abide by the Mertonian norm of universalism: the evaluation of scientific claims hinges not at all on the status or prestige of the scientist making them, but on the quality of the evidence, logic, and argument being put forth (Merton 1942/1973).

    • For every review testifying to the power of expectancies, there are now others casting doubt on such conclusions. If one must make a point about expectancies, at minimum, one can reflect the state of the literature with statements such as:

    • “Whereas some reviews have concluded that expectancy effects are powerful and pervasive, others have concluded that such effects are weak, fragile, and fleeting.”

    • “Although stereotypes have long been presumed to be inaccurate, several reviews have concluded that, in general, stereotypes are often at least moderately accurate.”

    • “Although social constructionist phenomena undoubtedly occur and can sometimes be powerful and important, at the level of individuals interacting with other individuals, such effects are usually quite modest.”

    • “Although people undoubtedly cognitively construct their social perceptual worlds to a considerable degree, and, sometimes such constructions can be quite biased, this does not mean their constructions are always or even mostly inaccurate.”

  4. In new original studies, be excessively transparent about methods and results. Provide means, standard deviations, and correlations for all variables. When available and relevant, provide frequency distributions. When reporting regression and structural equation modeling (SEM) results, report standardized and unstandardized coefficients, along with the t and F values associated with each significance test. If all of this cannot fit in the main report, then at least provide it in supplementary materials. Report effect sizes and confidence intervals. Such transparency should be routine both when reporting new empirical studies and when reviewing empirical literatures.

    • This is especially important when making claims about the relative power of bias versus accuracy. Distorted claims about bias could have been detected decades earlier if, for example, effect sizes and contextual data (e.g., the total number of judgments) had been routinely reported. A minimal reporting sketch follows this item.
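To make the recommendation concrete, here is a minimal, hypothetical sketch of such reporting; the data are simulated and the variable names are invented, so it illustrates the kind of output recommended above rather than any actual study:

```python
# Hypothetical illustration (names and data are invented): the kind of
# descriptive and effect-size reporting recommended above, for a simple
# two-group expectancy study.
import numpy as np

rng = np.random.default_rng(42)
# Simulated achievement scores for high- vs. low-expectancy targets.
high_expect = rng.normal(loc=102, scale=15, size=120)
low_expect = rng.normal(loc=100, scale=15, size=120)

def cohens_d(a, b):
    """Standardized mean difference using the pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2)
    return (a.mean() - b.mean()) / np.sqrt(pooled_var)

d = cohens_d(high_expect, low_expect)
# Approximate 95% CI for d (large-sample normal approximation).
n1, n2 = len(high_expect), len(low_expect)
se_d = np.sqrt((n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2)))
print(f"M(high) = {high_expect.mean():.2f}, SD = {high_expect.std(ddof=1):.2f}")
print(f"M(low)  = {low_expect.mean():.2f}, SD = {low_expect.std(ddof=1):.2f}")
print(f"Cohen's d = {d:.2f}, 95% CI [{d - 1.96*se_d:.2f}, {d + 1.96*se_d:.2f}]")
```

Reporting at this level of detail costs a few lines of output but lets later readers check claims about "powerful" effects against the actual numbers.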

  5. Be careful about definitions. Researchers have great latitude in how they define constructs, but then have to own the implications of their definitions.

    • If one defines stereotypes as inaccurate or as exaggerations, then one must be willing to accept that only those beliefs about groups that have been demonstrated to be inaccurate or exaggerated among the sample one is studying can be known to be stereotypes.

    • One can avoid this problem by defining stereotypes in ways that permit them to be accurate, avoiding presumptions of inaccuracy, exaggeration, or overgeneralization. Base empirical claims about the state of the world on actual empirical evidence.

    • This may seem obvious, but researchers have been making claims about stereotype inaccuracy without evidence for decades. See Pinker (2002), who documented similar evidence-free claims regarding a range of issues, such as human malleability and the role of social factors in everything from intelligence to aggression to sex differences.

  6. Avoid the processistic fallacy.

    • Do not make claims about error, bias, or the inaccuracy of stereotypes on the basis of process studies, even ones that identify faulty laboratory processes that one speculatively presumes will cause inaccuracy in people's naturally occurring judgments. Such processes might have theoretical import (Mook 1983), and they might generate predictions regarding patterns or sources of inaccuracy. But they rarely, if ever, constitute evidence of inaccuracy.

  7. Reach conclusions about stereotype accuracy on the basis of studies reporting empirical data, rather than on the basis of sources (even "authoritative" ones such as G. W. Allport 1954/1979; see also Table R1) declaring stereotypes to be inaccurate (or exaggerated) without data.

    • Do not claim that characterizing stereotypes as possessing a "kernel of truth" constitutes some sort of acknowledgement that stereotypes are often substantially accurate. This functions as a disingenuous attempt to maintain the emphasis on inaccuracy, which can readily be seen with a "turnabout test" (Duarte et al. 2015; Tetlock 1994): Would declaring, "Psychological research has a kernel of truth," be a great testament to the validity of psychological science?

    • If stereotypes do influence judgments regarding an individual target, do not assume that this influence increases inaccuracy without testing for accuracy.

  8. Build rational judgment processes into theoretical perspectives on social perception.

    • Because of social psychology's infatuation with error and bias, almost any result, no matter how reasonable and rational, has been framed as flawed. However, such conclusions regarding lay judgments require showing that some particular perceptual result deviates from some normative model. In social psychology, this is rarely done, thereby liberating researchers to cast almost any result as irrational.

    • Social psychologists should stop casting results as irrational absent the development of a normative model of rational judgment and an assessment of the extent to which lay judgments both correspond to and deviate from that model (a sketch of such a benchmark follows this item).

    • Social psychologists studying social perception should start developing models of rational judgment processes if they wish to continue reaching conclusions about irrationality.
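As a hypothetical illustration of what such a normative benchmark might look like, the sketch below compares an invented average lay probability judgment against the posterior prescribed by Bayes' theorem in a simple base-rate problem; neither the numbers nor the scenario come from SPSR or the commentaries:

```python
# Hypothetical sketch of the recommendation above: before labeling a judgment
# "irrational," derive a normative benchmark and measure deviation from it.
# All numbers below are invented for illustration.

def bayes_posterior(prior, hit_rate, false_alarm_rate):
    """Normative P(hypothesis | positive evidence) via Bayes' rule."""
    evidence = prior * hit_rate + (1 - prior) * false_alarm_rate
    return prior * hit_rate / evidence

# A classic base-rate setup: rare condition, imperfect diagnostic cue.
normative = bayes_posterior(prior=0.02, hit_rate=0.90, false_alarm_rate=0.10)

# Suppose perceivers judge the probability to be 0.50 on average.
lay_judgment = 0.50
deviation = lay_judgment - normative

print(f"Normative posterior: {normative:.3f}")
print(f"Mean lay judgment:   {lay_judgment:.3f}")
print(f"Deviation (bias):    {deviation:+.3f}")
# Only against a benchmark like this can "irrationality" be quantified
# rather than merely asserted.
```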

  9. Be clear and consistent with respect to levels of analysis.

    • If one is discussing perceptions of groups, then accuracy refers to correspondence between beliefs about groups and what those groups are like.

    • If one is discussing perceptions of individuals, then accuracy refers to correspondence between beliefs about an individual and what that individual is like.

    • Cease confounding levels of analysis by declaring that stereotypes are inaccurate because they do not apply to every individual. The sketch following this item illustrates the distinction.
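The following hypothetical sketch (all numbers invented) illustrates the point: accuracy computed as correspondence at the group level is a different quantity from accuracy computed as correspondence at the individual level, and neither can stand in for the other:

```python
# Hypothetical sketch of the level-of-analysis distinction above: accuracy is
# a correspondence criterion, computed at the level the belief is about.
# All data are invented.
import numpy as np

# Group-level accuracy: perceived vs. actual group means on some attribute.
actual_group_means = np.array([3.1, 4.0, 2.5, 3.8, 4.4])
perceived_group_means = np.array([3.0, 4.2, 2.8, 3.5, 4.6])
group_accuracy = np.corrcoef(actual_group_means, perceived_group_means)[0, 1]

# Individual-level accuracy: perceived vs. actual attributes of individuals.
actual_individuals = np.array([2.0, 3.5, 4.1, 2.8, 3.9, 4.5])
perceived_individuals = np.array([2.4, 3.1, 4.3, 3.0, 3.6, 4.2])
individual_accuracy = np.corrcoef(actual_individuals, perceived_individuals)[0, 1]

print(f"Group-level accuracy correlation:      {group_accuracy:.2f}")
print(f"Individual-level accuracy correlation: {individual_accuracy:.2f}")
# A high group-level correlation says nothing about whether the belief fits
# any particular individual, and vice versa; the two must not be conflated.
```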

Science can tolerate errors, even a great many errors, if it also has strong, efficient, and largely successful mechanisms for self-correction. In this spirit, it is worth pointing out that none of the commentaries, not even the few that most strongly disagree with my conclusions, presents any data showing that self-fulfilling prophecies or expectancy-based biases are generally large, or that stereotypes are generally inaccurate. The strongest arguments for modifying the conclusions reached in SPSR, in my view, have come from those suggesting that the emphasis on accuracy and the de-emphasis of bias and self-fulfilling prophecy might not apply beyond the specific types of interpersonal contexts addressed in SPSR (Kahan; Kihlstrom; Tappan et al.; Wilson & Huang). Perhaps, therefore, we can agree that, even if SPSR does not spell the death knell for social or cognitive constructivism, a little scientific self-correction is overdue with respect to the topics it addressed: teacher expectations, person perception, and beliefs about groups and how those beliefs influence social perceptions.

Footnotes

1. Strictly speaking, the 14th Amendment and the Civil Rights Acts focus on behaviors (discrimination) rather than on beliefs. Failure to provide Alfonso service because the provider believes Latinos are mostly criminals is a violation of those acts; however, failure to provide Alfonso service because the provider believes Alfonso is a criminal is not. Whether the behavior is based on a stereotype or on individuating information is taken as extremely important, thereby highlighting the perceived value of the distinction in legal contexts.

References

Allport, F. H. (1955) Theories of perception and the concept of structure. Wiley.
Allport, G. W. (1954/1979) The nature of prejudice, 2nd edition. Perseus.
Aronson, E. (2011) The social animal, 11th edition. Worth.
Banaji, M. R. & Greenwald, A. G. (2013) Blindspot: Hidden biases of good people. Delacorte Press.
Baron, R. M., Albright, L. & Malloy, T. E. (1995) The effects of behavioral and social class information on social judgment. Personality and Social Psychology Bulletin 21:308–15.
Bartlett, F. C. (1932) Remembering: A study in experimental and social psychology. Cambridge University Press.
Baumeister, R. F. & Bushman, B. J. (2014) Social psychology and human nature, 3rd edition. Wadsworth.
Bem, D. (1987) Writing the empirical journal article. In: The compleat academic: A career guide, ed. Zanna, M. P. & Darley, J. M., pp. 171–201. Erlbaum.
Brown, R. (2010) Prejudice: Its social psychology, 2nd edition. Wiley-Blackwell.
Brunswik, E. (1957) Systematic and representative design of psychological experiments. University of California Press.
Charlton, B. G. (2009) Clever sillies: Why high IQ people tend to be deficient in common sense. Medical Hypotheses 73:867–70.
Cohen, C. E. (1981) Person categories and social perception: Testing some boundaries of the processing effects of prior knowledge. Journal of Personality and Social Psychology 40:441–52.
Crisp, R. J. & Turner, R. N. (2014) Essential social psychology, 3rd edition. Sage.
Darley, J. M. & Gross, P. H. (1983) A hypothesis-confirming bias in labeling effects. Journal of Personality and Social Psychology 44:20–33.
Devine, P. G., Hirt, E. R. & Gehrke, E. M. (1990) Diagnostic and confirmation strategies in trait hypothesis testing. Journal of Personality and Social Psychology 58:952–63.
Duarte, J. L., Crawford, J. T., Stern, C., Haidt, J., Jussim, L. & Tetlock, P. E. (2015) Political diversity will improve social psychological science. Behavioral and Brain Sciences 38:e130.
Ellemers, N. & Barreto, M. (2009) Collective action in modern times: How modern expressions of prejudice prevent collective action. Journal of Social Issues 65:749–68. Available at: http://doi.org/10.1111/j.1540-4560.2009.01621.x
Firestone, C. & Scholl, B. J. (2016) Cognition does not affect perception: Evaluating the evidence for "top-down" effects. Behavioral and Brain Sciences 39:e229. Available at: https://doi.org/10.1017/S0140525X15000965
Fiske, S. T. & Taylor, S. E. (2008) Social cognition: From brains to culture. McGraw-Hill.
Funder, D. C. (1987) Errors and mistakes: Evaluating the accuracy of social judgment. Psychological Bulletin 101:75–90.
Gigerenzer, G. & Brighton, H. (2009) Homo heuristicus: Why biased minds make better inferences. Topics in Cognitive Science 1:107–43.
Gottfredson, L. S. (2010) Lessons in academic freedom as lived experience. Personality and Individual Differences 49:272–80.
Greenberg, J., Schmader, T., Arndt, J. & Landau, M. (2015) Social psychology: The science of everyday life. Worth.
Greenwald, A. G., Pratkanis, A. R., Leippe, M. R. & Baumgardner, M. H. (1986) Under what conditions does theory obstruct research progress? Psychological Review 93:216–29.
Grison, S., Heatherton, T. F. & Gazzaniga, M. S. (2015) Psychology in your life. W. W. Norton.
Haidt, J. (2012) The righteous mind: Why good people are divided by politics and religion. Pantheon Books.
Haselton, M. G. & Buss, D. M. (2000) Error management theory: A new perspective on biases in cross-sex mind reading. Journal of Personality and Social Psychology 78(1):81–91.
Hastorf, A. H. & Cantril, H. (1954) They saw a game: A case study. Journal of Abnormal and Social Psychology 49(1):129–34.
Ioannidis, J. P. (2012) Why science is not necessarily self-correcting. Perspectives on Psychological Science 7:645–54.
Jordan, C. H. & Zanna, M. P. (2007) Not all experiments are created equal: On conducting and reporting persuasive experiments. In: Critical thinking in psychology, ed. Sternberg, R. J., Halpern, D. & Roediger, H. L. III, pp. 160–76. Cambridge University Press.
Jost, J. T. (2011) Ideological bias in social psychology? The Situationist. Available at: https://thesituationist.wordpress.com/2011/03/02/ideological-bias-in-social-psychology/
Jost, J. T. & Kruglanski, A. W. (2002) The estrangement of social constructionism and experimental social psychology: History of the rift and prospects for reconciliation. Personality and Social Psychology Review 6:168–87.
Jussim, L. (1991) Social perception and social reality: A reflection-construction model. Psychological Review 98:54–73.
Jussim, L. (2012) Social perception and social reality: Why accuracy dominates bias and self-fulfilling prophecy. Oxford University Press.
Jussim, L., Crawford, J. T., Anglin, S. M., Chambers, J. R., Stevens, S. T. & Cohen, F. (2016a) Stereotype accuracy: One of the largest and most replicable effects in all of social psychology. In: Handbook of prejudice, stereotyping, and discrimination, 2nd edition, ed. Nelson, T., pp. 31–63. Psychology Press.
Jussim, L., Crawford, J. T., Anglin, S. M. & Stevens, S. T. (2015a) Ideological bias in social psychological research. In: Social psychology and politics, ed. Forgas, J. P., Fiedler, K. & Crano, W., pp. 91–109. Psychology Press/Taylor & Francis.
Jussim, L., Crawford, J. T., Anglin, S. M., Stevens, S. M. & Duarte, J. L. (2016b) Interpretations and methods: Towards a more effectively self-correcting social psychology. Journal of Experimental Social Psychology 66:116–33.
Jussim, L., Crawford, J. T. & Rubinstein, R. S. (2015b) Stereotype (in)accuracy in perceptions of groups and individuals. Current Directions in Psychological Science 24:490–97.
Jussim, L., Crawford, J. T., Stevens, S. T. & Anglin, S. M. (2016c) The politics of social psychological science: Distortions in the social psychology of intergroup relations. In: Social psychology of political polarization, ed. Valdesolo, P. & Graham, J., pp. 165–96. Routledge.
Jussim, L., Crawford, J. T., Stevens, S. T., Anglin, S. M. & Duarte, J. L. (2016d) Can high moral purposes undermine scientific integrity? In: The social psychology of morality, ed. Forgas, J., Jussim, L. & van Lange, P., pp. 173–95. Taylor and Francis.
Kahan, D. M., Peters, E., Wittlin, M., Slovic, P., Ouellette, L. L., Braman, D. & Mandel, G. (2012b) The polarizing impact of science literacy and numeracy on perceived climate change risks. Nature Climate Change 2:732–35.
King, L. A. (2013) Experience psychology, 2nd edition. McGraw-Hill.
Leslie, S. J. (in press) The original sin of cognition: Fear, prejudice and generalization. The Journal of Philosophy.
Leslie, S. J., Khemlani, S. & Glucksberg, S. (2011) Do all ducks lay eggs? The generic overgeneralization effect. Journal of Memory and Language 65(1):15–31.
Lilienfeld, S. O. (2010) Can psychology become a science? Personality and Individual Differences 49:281–88.
Martin, C. (2016) How ideology has hindered sociological insight. The American Sociologist 47(1):115–30. doi: 10.1007/s12108-015-9263-z.
Merton, R. K. (1942/1973) The normative structure of science. In: The sociology of science: Theoretical and empirical investigations, ed. Merton, R. K. University of Chicago Press.
Monin, B. & Oppenheimer, D. M. (2014) The limitations of direct replications and the virtues of stimulus sampling. Social Psychology 45:299–311.
Mook, D. G. (1983) In defense of external invalidity. American Psychologist 38:379–87.
Open Science Collaboration (2015) Estimating the reproducibility of psychological science. Science 349(6251):aac4716. doi: 10.1126/science.aac4716.
Pinker, S. (2002) The blank slate. Penguin.
Planck, M. K. (1950) Scientific autobiography and other papers. Philosophical Library.
Rosenhan, D. L. (1973) On being sane in insane places. Science 179:250–58.
Rosenthal, R. & Jacobson, L. (1968) Pygmalion in the classroom: Teacher expectations and pupils' intellectual development. Holt, Rinehart, and Winston.
Schacter, D. L., Gilbert, D. T., Wegner, D. M. & Nock, M. K. (2015) Introducing psychology, 3rd edition. Worth.
Simmons, J. P., Nelson, L. D. & Simonsohn, U. (2011) False-positive psychology: Undisclosed flexibility in data collection and analysis allows presenting anything as significant. Psychological Science 22:1359–66.
Snyder, M. & Swann, W. B. Jr. (1978a) Behavioral confirmation in social interaction: From social perception to social reality. Journal of Experimental Social Psychology 14:148–62.
Swim, J. K. (1994) Perceived versus meta-analytic effect sizes: An assessment of the accuracy of gender stereotypes. Journal of Personality and Social Psychology 66:21–36.
Tetlock, P. E. (1994) Political psychology or politicized psychology: Is the road to scientific hell paved with good moral intentions? Political Psychology 15:509–29.
Trope, Y. & Bassok, M. (1983) Information-gathering strategies in hypothesis-testing. Journal of Experimental Social Psychology 19:560–76.
Von Hippel, W. & Trivers, R. (2011) The evolution and psychology of self-deception. Behavioral and Brain Sciences 34(1):1–56.
Vul, E., Harris, C., Winkielman, P. & Pashler, H. (2009) Puzzlingly high correlations in fMRI studies of emotion, personality, and social cognition. Perspectives on Psychological Science 4:274–90.
Wells, G. L. & Windschitl, P. D. (1999) Stimulus sampling and social psychological experimentation. Personality and Social Psychology Bulletin 25:1115–25.
Westfall, J., Judd, C. M. & Kenny, D. A. (2015) Replicating studies in which samples of participants respond to samples of stimuli. Perspectives on Psychological Science 10:390–99.
Whitley, B. & Kite, M. (2009) The psychology of prejudice and discrimination. Cengage Learning.
Figure R1. The Reflection-Construction Model (Jussim 1991). Panel A: the full model. Panel B (constructive accuracy): even when perceivers are completely oblivious to targets' behavior or attributes, their judgments of targets will still correspond to (correlate with) targets' behavior or attributes if (1) expectations are based on background information that (2) predicts targets' behavior or attributes, and if (3) expectations influence (bias) perceiver judgments.