
What the replication reformation wrought

Published online by Cambridge University Press:  27 July 2018

Barbara A. Spellman
Affiliation:
University of Virginia School of Law, Charlottesville, VA 22903. bas6g@virginia.edu; http://content.law.virginia.edu/faculty/profile/bas6g/1211027
Daniel Kahneman
Affiliation:
Woodrow Wilson School of Public and International Affairs, Princeton University, Princeton, NJ, 08544. kahneman@princeton.edu

Abstract

Replication failures were among the triggers of a reform movement which, in a very short time, has been enormously useful in raising standards and improving methods. As a result, the massive multi-lab, multi-experiment replication projects have served their purpose and will die out. We describe other types of replications – both friendly and adversarial – that should continue to be beneficial.

Type: Open Peer Commentary
Copyright: © Cambridge University Press 2018

As two old(er) researchers who were involved early in the current science reform movement (pro-reform, to the chagrin of many of our peers), we believe that the target article barely addresses an essential point about the “replication crisis”: In a very short time, the resulting reform movement, including all of the fuss, anger, and passion that it generated, has been enormously useful in raising standards and improving methods in psychological science. Rather than believing that the field is still in crisis, some highly influential members of our community recently announced that psychology is now experiencing a “renaissance” (Nelson et al. 2018). One of us calls what has happened a civil war–like revolution (Spellman 2015), suggesting an insurrection in which one group overthrows the structures put in place by another group. But here we use the term “reformation,” suggesting that the profession has become enlightened and changed itself for the better.

The reform movement has prompted changes across the entire research and publication process. As a result, experimental results are more reliable because researchers are increasing sample sizes. Researchers are posting methods, data, and analysis plans (sometimes encouraged by journals), thus promoting more accurate replications and vetting of data integrity. Researchers are pre-registering hypotheses and journals are pre-accepting registered reports, making conclusions more credible. Also, the experimental record is more complete because of preprint services, open access journals, and the increasing publication of replication studies. We believe that the reformation's success results from actions by individuals, journals, and societies, combined with various environmental factors (e.g., technology, demographics, the cross-disciplinary recognition of the problem [Spellman 2015]) that allowed the changes to take hold now, whereas past reform movements with similar goals had failed.

Amazingly, all of this has transpired in seven plus or minus two years. The early revelations that an assortment of high-profile studies failed to replicate, and then the later various mass replications – both those in which many different labs worked on many different studies (e.g., Nosek et al. 2015) and those in which many different labs worked on the same studies (e.g., Simons et al. 2014) – provided existence proofs that non-replicable published studies were widespread in psychology. The ground-breaking gem of a paper by Simmons et al. (2011) gave our field a way to understand how this could have happened: scientists were simply following the norms as they understood them, without any evil intent. But the norms were defective.

We believe that the quality of psychological science has been improving so fast and so broadly – mainly because of the replication crisis – that replications are likely to become rarer rather than routine. The massive multi-lab, multi-experiment replication projects have served their purpose and will die out. What should happen, and indeed become mainstream, is that extensions of original research routinely include replication. The design of experiments and their execution are separable: Friendly laboratories should routinely exchange replication services in a shared effort to improve the transparency of their methods. Most replications should be friendly; adversarial replications should be collegial and regulated. How might this be done?

In one approach (Kahneman 2014), after developing but before running a study, replicators send the original authors a complete plan of the procedure. Original authors have a set time to respond with comments and suggested modifications. Replicators choose whether and how to change the protocol but must explain how and why. These exchanges should be available for reviewers and readers to use when evaluating the claims of each side (e.g., whether it was a “faithful” replication).

In a second approach, the negotiation is refereed. For example, journals that take pre-registered replications may require careful vetting of the replicator's protocol before giving it a go-ahead stamp of “true direct replication.” But journal intercession is not necessary; authors and replicators could agree to mediation by knowledgeable individuals or teams of appointed researchers.

The two proposals above, however, are limited to checking the replicability of individual studies – individual “bricks in the wall” – in the same way current reforms directly affect only the integrity of individual studies (Spellman 2015). Science involves groups – groups of studies that connect together to define and develop (or destroy) theories (i.e., to create buildings or bridges from individual bricks) and communities of scientists who can work together, or in constructive competition (note: not opposition), to hone their shared ideas. Below we suggest two ways in which communities of scientists can engage in replication and theory development.

A third approach to replication is the daisy-chain approach. A group of laboratories that share a theoretical orientation that others question could get together, with each lab offering its favorite experiment for exact replication by the next lab in the chain – with all results published together, win or lose. Even if not all of the replications are successful, such an exercise would improve the quality of communications about research methods within a field, and improve the credibility of the field as a whole.

A fourth, ambitious form of replication, called “paradigmatic replication,” has been implemented by Kathleen Vohs (2018). Vohs recognizes that massive replication attempts of a single study, particularly in an area where different researchers use different methods that change over time, are not a useful indicator of an evolving theory's robustness. In this procedure, the major proponents of a theory jointly resolve what the core elements of the theory are, and then decide what the best methods have been (or would be) to demonstrate and test its workings. A few diverse methods (e.g., varying independent or dependent measures) are devised, the protocols are pre-registered, and then multiple labs, both “believers” and “non-believers,” run the studies. Data are analyzed by a neutral third party.

Overall, we believe that the replication reform movement has already succeeded in valuable ways. Improvements of research methods are raising the credibility of results and reducing the need for replications by skeptics. We also believe that routine exchanges of “replication services” between cooperating laboratories (e.g., through StudySwap [https://osf.io/view/StudySwap/]) will further enhance the community's confidence in the clarity and completeness of methods, as well as in the stability of findings.

References

Kahneman, D. (2014) A new etiquette for replication. Social Psychology 45(4):310–11.
Nelson, L. D., Simmons, J. P. & Simonsohn, U. (2018) Psychology's renaissance. Annual Review of Psychology 69:511–34. Available at: http://doi.org/10.1146/annurev-psych-122216-011836.
Nosek, B. A., Alter, G., Banks, G. C., Borsboom, D., Bowman, S. D., Breckler, S. J., Buck, S., Chambers, C. D., Chin, G., Christensen, G., Contestabile, M., Dafoe, A., Eich, E., Freese, J., Glennerster, R., Goroff, D., Green, D. P., Hesse, B., Humphreys, M., Ishiyama, J., Karlan, D., Kraut, A., Lupia, A., Mabry, P., Madon, T. A., Malhotra, N., Mayo-Wilson, E., McNutt, M., Miguel, E., Levy Paluck, E., Simonsohn, U., Soderberg, C., Spellman, B. A., Turitto, J., VandenBos, G., Vazire, S., Wagenmakers, E. J., Wilson, R. & Yarkoni, T. (2015) Promoting an open research culture. Science 348:1422–25.
Simmons, J. P., Nelson, L. D. & Simonsohn, U. (2011) False-positive psychology: Undisclosed flexibility in data collection and analysis allows presenting anything as significant. Psychological Science 22:1359–66. Available at: http://doi.org/10.1177/0956797611417632.
Simons, D. J., Holcombe, A. O. & Spellman, B. A. (2014) An introduction to Registered Replication Reports at Perspectives on Psychological Science. Perspectives on Psychological Science 9(5):552–55.
Spellman, B. A. (2015) A short (personal) future history of Revolution 2.0. Perspectives on Psychological Science 10:886–99.
Vohs, K. D. (2018) A pre-registered depletion replication project: The paradigmatic replication approach. Presented at the symposium at the 2018 Society for Personality and Social Psychology Annual Convention, Atlanta, GA.