
The internal validity obsession

Published online by Cambridge University Press:  13 May 2022

Gregory Mitchell
Affiliation:
University of Virginia School of Law, Charlottesville, VA 22903, USA. greg.mitchell@law.virginia.edu; https://www.law.virginia.edu/faculty/profile/pgm6u/1191856
Philip E. Tetlock
Affiliation:
Psychology Department and Wharton School, University of Pennsylvania, Solomon Labs, Philadelphia, PA 19104, USA. tetlock@wharton.upenn.edu; https://www.sas.upenn.edu/tetlock/bio

Abstract

Until social psychology devotes as much attention to construct and external validity as it does to internal validity, the field will continue to produce theories that fail to replicate in the field and cannot be used to meliorate social problems.

Type
Open Peer Commentary
Copyright
Copyright © The Author(s), 2022. Published by Cambridge University Press

The target article joins a long line of compelling critiques of social psychology methodology. We suspect the latest critique, like its predecessors, will have little effect on how social psychologists study discrimination. A design ethos of “experimental realism” that relies on engaging but manufactured social settings (Aronson & Carlsmith, 1968) makes gathering data much easier than an approach that demands fidelity to real-world contingencies. The retort to critics is that the goal is to find theories that generalize and experimental control is essential to theory development (e.g., Banaji & Crowder, 1989).

Unfortunately, social psychologists rarely examine whether their theories do, in fact, generalize; when they do, the results are not pretty (Mitchell, 2012). Nor do social psychology journals demand much evidence that experimental constructions actually measure or manipulate the hypothesized processes of interest (Chester & Lasko, 2021), which helps explain why more than 20 years after the racial-attitudes implicit association test was introduced, we still do not know what it actually measures (Schimmack, 2021). The career calculus is clear. With journals happy for authors to speculate about the real-world implications of a statistically significant correlation or mean difference observed using convenience samples under artificial conditions, why embark on the arduous task of establishing external and construct validity? Any possible confound in design will spell doom for publication, while obvious shortcomings in the sample and manipulations chosen to test what passes for a theory will merit only cursory mention in a concluding section on limitations of the study.

As long as social psychology journals exalt internal validity over all other forms of validity, we should not expect social psychology to produce any theories that can really explain, much less help meliorate, social problems. Making passage of reality checks essential to publication (e.g., requiring comparison of an online convenience sample to a sample of persons with experience in the domain of interest, or requiring that a theory be tested on archival data and not only on materials constructed for an experiment) would move the field away from exalting effects that prove to be the product of a quirky design decision that ignored key features of the situations or persons of theoretical interest. Such reality checks would serve as a form of “consistency test” of the kind that mature sciences employ (Meehl, 1978), and making reality checks a condition for publication would encourage greater care in theory development, pushing theorists to spell out boundary conditions and necessary auxiliary assumptions to narrow the range of reality checks that must be passed for the theory to survive.

We can understand why an exasperated Bayesian observer might conclude that until reality checks become a required part of theory validation within the field, the default assumption should be the best base-rate guess: neither social psychological theories nor effects will generalize. To those who worry that this default assumption would protect an oppressive status quo, we propose to locate the debate in a signal-detection framework. A false-negative error would be to dismiss a truly generalizable social psychological effect. A false-positive error would be to embrace an effect that proves to be a hot-house flower that wilts fast in the wild. We see the latter error as vastly more common today – hence our sympathy for the exasperated Bayesian. Our view is that it is better – for both the science and society – to require investigators to test the practical utility of their ideas using rigorous evaluation methods than to give politicians or consultants open-ended scientific license to invent popular or profitable interventions that they hope will work but that they never intend to subject to rigorous evaluation (see, e.g., Paluck, Porat, Clark, & Green, 2021).

Take the case of implicit bias. To our knowledge, no implicit bias training program implemented by a police department or other organization has ever been shown to have net behavioral benefits or to be justified under any cost–benefit analysis, yet countless dollars and work hours are being spent on such programs rather than other programs that might prove more effective. Certainly the belief that implicit bias explains many group disparities is widespread, and that belief may well have positive political consequences for some groups and may even reduce discrimination through increased sensitivity to its occurrence, but that belief continues to exist despite, not because of, social psychological research on the predictive (in)validity of measures of implicit bias. If the goal of social psychology is to create an ideology, rather than a science of social behavior, then it appears to have succeeded in the short term, but we suspect that success will erode its long-term credibility and its ability to provide long-term solutions to social problems.

Financial support

This research received no specific grant from any funding agency, commercial, or not-for-profit sectors.

Conflict of interest

None.

References

Aronson, E., & Carlsmith, J. M. (1968). Experimentation in social psychology. In Lindzey, G., & Aronson, E. (Eds.), The handbook of social psychology (2nd ed., Vol. 2, pp. 1–79). Reading, MA: Addison-Wesley.
Banaji, M. R., & Crowder, R. G. (1989). The bankruptcy of everyday memory. American Psychologist, 44, 1185–1193. https://doi.org/10.1037/0003-066X.44.9.1185
Chester, D. S., & Lasko, E. M. (2021). Construct validation of experimental manipulations in social psychology: Current practices and recommendations for the future. Perspectives on Psychological Science, 16, 377–395. https://doi.org/10.1177/1745691620950684
Meehl, P. E. (1978). Theoretical risks and tabular asterisks: Sir Karl, Sir Ronald, and the slow progress of soft psychology. Journal of Consulting and Clinical Psychology, 46, 806–834. https://doi.org/10.1037/0022-006X.46.4.806
Mitchell, G. (2012). Revisiting truth or triviality: The external validity of research in the psychological laboratory. Perspectives on Psychological Science, 7, 109–117.
Paluck, E. L., Porat, R., Clark, C. S., & Green, D. P. (2021). Prejudice reduction: Progress and challenges. Annual Review of Psychology, 72, 533–560. https://doi.org/10.1146/annurev-psych-071620-030619
Schimmack, U. (2021). The implicit association test: A method in search of a construct. Perspectives on Psychological Science, 16, 396–414. https://doi.org/10.1177/1745691619863798