The costs and benefits of replication studies

Nicholas A. Coles; Leonid Tiokhin; Anne M. Scheel; Peder M. Isager; Daniël Lakens

doi:10.1017/S0140525X18000596

Published online by Cambridge University Press: 27 July 2018

Nicholas A. Coles: Affiliation:
Department of Psychology, Austin Peay Building, University of Tennessee, Knoxville, TN 37996. colesn@utk.edu
Leonid Tiokhin: Affiliation:
School of Human Evolution and Social Change, Arizona State University, Tempe, AZ 85281. ltiokhin@asu.edu http://leotiokhin.com/
Anne M. Scheel: Affiliation:
Department of Industrial Engineering and Innovation Sciences, Eindhoven University of Technology, 5600 MB, Eindhoven, The Netherlands. a.m.scheel@tue.nl http://www.tue.nl/staff/a.m.scheel
Peder M. Isager: Affiliation:
Department of Industrial Engineering and Innovation Sciences, Eindhoven University of Technology, 5600 MB, Eindhoven, The Netherlands. p.isager@tue.nl http://www.tue.nl/staff/p.isager
Daniël Lakens: Affiliation:
Department of Industrial Engineering and Innovation Sciences, Eindhoven University of Technology, 5600 MB, Eindhoven, The Netherlands. D.Lakens@tue.nl http://www.tue.nl/staff/d.lakens

Abstract

The debate about whether replication studies should become mainstream is essentially driven by disagreements about their costs and benefits and the best ways to allocate limited resources. Determining when replications are worthwhile requires quantifying their expected utility. We argue that a formalized framework for such evaluations can be useful for both individual decision-making and collective discussions about replication.

Type: Open Peer Commentary
Information: Behavioral and Brain Sciences, Volume 41, 2018, e124

DOI: https://doi.org/10.1017/S0140525X18000596
Copyright: Copyright © Cambridge University Press 2018

In a summary of recent discussions about the role of direct replications in psychological science, Zwaan et al. argue that replications should be more mainstream and discuss six common objections to direct replication studies. We believe that the debate about the importance of replication research is essentially driven by disagreements about the value of replication studies and the best way to allocate limited resources. We suggest that a decision theory framework (Wald 1950) can provide a tool for researchers to (a) evaluate costs and benefits to determine when replication studies are worthwhile, and (b) specify their assumptions in quantifiable terms, facilitating more productive discussions in which the sources of disagreement about the value of replications can be identified.

The main goal of decision theory is to quantify the expected utility (the result of a cost-benefit analysis, incorporating uncertainty about the state of the world) of possible actions to make an optimal decision. To determine when a replication study is valuable enough to perform, we must compare the expected utility of a replication study against alternative options (e.g., performing a conceptual replication or pursuing novel lines of research). In this commentary, we explore some of the costs and benefits of direct replications and emphasize how different assumptions can lead to different expected-utility judgments.
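
To make the comparison concrete, the following is a minimal sketch of an expected-utility calculation, written in Python. It is not taken from the commentary: the function name `expected_utility`, the two states of the world, and every probability, payoff, and cost are hypothetical values chosen only to show the mechanics.

```python
# Illustrative sketch only: every probability, payoff, and cost below is a
# made-up number, not an estimate from the commentary.

def expected_utility(state_probs, payoffs, cost):
    """Probability-weighted payoff across states of the world, minus the cost of acting."""
    if len(state_probs) != len(payoffs):
        raise ValueError("need one payoff per state")
    if abs(sum(state_probs) - 1.0) > 1e-9:
        raise ValueError("state probabilities must sum to 1")
    return sum(p * u for p, u in zip(state_probs, payoffs)) - cost


# Two states of the world: the original effect is real, or it is not.
state_probs = [0.4, 0.6]

# Hypothetical payoffs per state for each candidate action, plus its cost.
actions = {
    "direct replication": expected_utility(state_probs, [2.0, 8.0], cost=3.0),
    "novel study": expected_utility(state_probs, [6.0, 1.0], cost=2.0),
}

# The decision rule: pick the action with the highest expected utility.
best_action = max(actions, key=actions.get)
print(best_action, actions)
```

With these assumed numbers the direct replication comes out ahead, but the ranking depends entirely on the inputs. Writing the inputs down is what lets two researchers see where their judgments differ.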

Costs and benefits of direct replications

The expected utility of replication studies depends on several factors, such as judgments about the reliability of the literature, the perceived public interest in a finding, or the judged importance of a theory. Importantly, these assessments are subjective and can lead to disagreements among researchers. Consider the concerns addressed by Zwaan et al.: Should we continue to examine highly context-dependent effects or limit ourselves to effects that are robust across contexts? Should we spend more resources on direct or conceptual replications? Are direct replications prohibitively costly in large-scale observational studies? The answer is: It depends.

Highly context-dependent effects might, as Zwaan et al. note, make it “difficult, if not impossible, for new knowledge to build on the solid ground of previous work” (sect. 5.1.1, para. 8, concern I). However, to argue against pursuing these research lines, one must make the case that such costs outweigh the expected benefits. In some research areas, such as personalized medicine, highly context-dependent effects may be deemed worthwhile to pursue. If researchers believe some (perhaps even all) effects are highly context dependent, they should be able to argue why these effects are important enough to study, even when progress is expected to be slow and costly.

Some researchers argue that even a single replication can be prohibitively costly (sect. 5.3, concern III). For example, Goldin-Meadow stated that “it's just too costly or unwieldy to generate hypotheses on one sample and test them on another when, for example, we're conducting a large field study or testing hard-to-find participants” (Goldin-Meadow 2016). However, some studies may be deemed valuable enough to justify even quite substantial investments in a replication, which can often be incorporated into the design of a research project. For instance, because it is unlikely that anyone will build a second Large Hadron Collider to replicate the studies at the European Organization for Nuclear Research (CERN), there are two detectors (ATLAS and CMS) so that independent teams can replicate each other's work. Thus, high cost is not by itself a conclusive argument against replication. Instead, one must make the case that the benefits do not justify the costs.

The expected utility of a direct replication (compared with a conceptual replication) depends on the probability that a specific theory or effect is true. If you believe that many published findings are false, then directly replicating prior work may be a cost-efficient way to prevent researchers from building on unreliable findings. If you believe that psychological theories usually make accurate predictions, then conceptual extensions may lead to more efficient knowledge gains than direct replications (sect. 5.2, concern II). An evaluation of costs might even reveal that neither direct nor conceptual replications are optimal, but that scientists should instead focus their resources on cheaper methods to increase the reliability of science (sect. 5.4, concern IV).
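
The dependence on prior belief described here can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' model: the functions `eu_direct`, `eu_conceptual`, and `preferred`, and all payoff and cost values, are hypothetical. The sketch assumes that a direct replication pays off most when the original finding is false (it stops others from building on it) and that a conceptual extension pays off most when the finding is true.

```python
# Illustrative sketch only: payoffs and costs are hypothetical values.

def eu_direct(p_true, payoff_if_true=2.0, payoff_if_false=10.0, cost=3.0):
    """Expected utility of a direct replication given P(effect is true)."""
    return p_true * payoff_if_true + (1 - p_true) * payoff_if_false - cost


def eu_conceptual(p_true, payoff_if_true=9.0, payoff_if_false=1.0, cost=3.0):
    """Expected utility of a conceptual extension given P(effect is true)."""
    return p_true * payoff_if_true + (1 - p_true) * payoff_if_false - cost


def preferred(p_true):
    """Return the option with the higher expected utility for this prior."""
    return "direct" if eu_direct(p_true) > eu_conceptual(p_true) else "conceptual"


# A skeptic and an optimist about the same literature reach different choices.
for p_true in (0.2, 0.8):
    print(p_true, preferred(p_true))
```

Under these assumed payoffs the preference switches at a prior of 9/16, so a researcher who thinks the effect is probably false favors the direct replication and one who thinks it is probably true favors the conceptual extension. Different payoffs would move the switching point, which is exactly the kind of disagreement the framework is meant to expose.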

The value of replication studies is also influenced by the anticipated interpretation of their outcomes (sect. 5.6, concern VI). If we cannot reach agreement about how to evaluate a given result, its benefit to the field may be close to zero. The outcome of a replication study should increase or decrease our belief in an effect, or raise new questions about auxiliary assumptions that can be resolved in future studies. Replications may thus have higher subjective value when consensus about the interpretation of outcomes can be reached a priori (e.g., via pre-registered adversarial collaboration).

Replication attempts may also have social costs and benefits for researchers who perform replication studies, or whose work is replicated. One strength of decision theory is that it allows us to incorporate such social components in cost-benefit analyses. For example, researchers currently seem to disagree about when, and how much, reputations should suffer when findings do not replicate (sect. 5.5, concern V). If the reputational costs of unsuccessful replications are too high, scholars may be overly reluctant to publish novel or exploratory findings. If the reputational costs are nonexistent, scholars may not exert ideal levels of rigor in their work. The social norms influencing these costs and benefits are shaped by the scientific community. Explicitly discussing those norms can help us change them in ways that incentivize direct replications when these would have high utility apart from their social consequences.

Conclusion

It is unlikely that directly replicating every study, or never directly replicating any study, is optimally efficient. A better balance would be achieved if researchers performed direct replications when the expected utility exceeded that of alternative options. Decision theory provides a useful framework to discuss the expected utility of direct replications based on a quantification of costs and benefits. A more principled approach to deciding when to perform direct replications has the potential to both help researchers optimize their behavior and facilitate a more productive discussion among researchers with different evaluations of the utility of replication studies.

References

Goldin-Meadow, S. (2016) Preregistration, replication, and nonexperimental studies. Association for Psychological Science Observer 29(8):2.

Wald, A. (1950) Statistical decision functions. Wiley.

