
The role of meta-analysis and preregistration in assessing the evidence for cleansing effects

Published online by Cambridge University Press: 18 February 2021

Robert M. Ross
Affiliation:
Department of Philosophy, Macquarie University, Macquarie Park, NSW 2109, Australia. robross46@gmail.com; https://researchers.mq.edu.au/en/persons/robert-ross
Robbie C. M. van Aert
Affiliation:
Department of Methodology & Statistics, Tilburg University, 5037 AB Tilburg, The Netherlands. r.c.m.vanaert@tilburguniversity.edu; http://www.robbievanaert.com/
Olmo R. van den Akker
Affiliation:
Department of Methodology & Statistics, Tilburg University, 5037 AB Tilburg, The Netherlands. ovdakker@gmail.com; https://www.ovdakker.com
Michiel van Elk
Affiliation:
Department of Psychology, University of Leiden, 2311 EZ Leiden, The Netherlands. m.vanelk@uva.nl; https://www.uva.nl/profiel/e/l/m.vanelk/m.vanelk.html

Abstract

Lee and Schwarz interpret meta-analytic research and replication studies as providing evidence for the robustness of cleansing effects. We argue that the currently available evidence is unconvincing because (a) publication bias and the opportunistic use of researcher degrees of freedom appear to have inflated meta-analytic effect size estimates, and (b) preregistered replications failed to find any evidence of cleansing effects.

Type
Open Peer Commentary
Copyright
Copyright © The Author(s), 2021. Published by Cambridge University Press

Lee and Schwarz (L&S) present a “theory of grounded procedures” that aims to account for empirical findings relating to cleansing and other physical actions (henceforth “cleansing effects”). In sect. 1.2, they report two forms of evidence that they argue indicate that cleansing effects are robust: (a) meta-analytic research and (b) replication studies. Although we applaud their consideration of robustness issues, we argue that they have not provided convincing evidence for the existence of cleansing effects.

L&S summarize the results of a meta-analysis of experimental studies of cleansing effects (Lee, Chen, Ma, & Hoang, 2020a), currently unpublished and with data unavailable, that estimates the overall effect size to be "in the small-to-medium range and highly significant" (sect. 1.2, para. 2). Moreover, they claim that converging evidence from fail-safe n, trim-and-fill, and normal quantile plots shows that "publication bias alone was unlikely to account for the existence of cleansing effects" (sect. 1.2, para. 2). However, we agree with Ropovik et al. (this treatment) that this conclusion is unwarranted because these bias detection methods rely on untestable assumptions and have been superseded by more sophisticated methods. In addition, we note that these methods are particularly inappropriate for assessing this literature because, as L&S note, effect sizes are "highly heterogeneous" (sect. 1.2, para. 2). Fail-safe n does not take heterogeneity in effect sizes into account at all (Iyengar & Greenhouse, 1988), whereas trim-and-fill yields misleading results when heterogeneity is present (Peters, Sutton, Jones, Abrams, & Rushton, 2007; Terrin, Schmid, Lau, & Olkin, 2003). Removing large positive effects identified in a normal quantile plot is also inappropriate because, in a heterogeneous literature, these large effects may be genuine. Consequently, we encourage Lee and colleagues to re-examine the evidence for publication bias in their upcoming meta-analysis using state-of-the-art methods such as Bayesian fill-in meta-analysis (Du, Liu, & Wang, 2017), PET-PEESE (Stanley & Doucouliagos, 2014), and p-uniform* (van Aert & van Assen, 2020).
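
To make one of the suggested re-analyses concrete, here is a minimal Python sketch of the conditional PET-PEESE estimator, under our reading of Stanley and Doucouliagos (2014). It is an illustration of the technique, not analysis code from any cited study; the arrays `effects` and `ses` are hypothetical per-study effect sizes and standard errors.

```python
import numpy as np
import statsmodels.api as sm

def pet_peese(effects, ses, alpha=0.05):
    """Conditional PET-PEESE estimate of the bias-corrected mean effect."""
    w = 1.0 / ses**2
    # PET: weighted meta-regression of effect sizes on standard errors
    pet = sm.WLS(effects, sm.add_constant(ses), weights=w).fit()
    # PEESE: the same regression on sampling variances instead
    peese = sm.WLS(effects, sm.add_constant(ses**2), weights=w).fit()
    # Use the PEESE intercept only if PET rejects a zero true effect (one-tailed)
    one_tailed = pet.pvalues[0] / 2 if pet.params[0] > 0 else 1 - pet.pvalues[0] / 2
    return peese.params[0] if one_tailed < alpha else pet.params[0]

# Hypothetical data in which smaller studies show larger effects (a bias signature)
effects = np.array([0.55, 0.48, 0.30, 0.22, 0.12])
ses = np.array([0.30, 0.25, 0.15, 0.10, 0.05])
print(pet_peese(effects, ses))
```

The intercept of either regression estimates the effect a hypothetical study with zero sampling error would show, which is what makes it a funnel-asymmetry correction.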

Another serious concern is that the p-curve analysis conducted by Ropovik et al. (this treatment) indicates that the statistically significant replication effects reported in the target article contain no evidential value and that the large proportion of p-values just below 0.05 may have been caused by the opportunistic use of researcher degrees of freedom (Simonsohn, Nelson, & Simmons, 2014a).
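
For readers unfamiliar with the logic of p-curve, the following simplified sketch implements its continuous test for right skew: significant p-values are converted to pp-values, which are uniform under the null of no true effect, and aggregated with Stouffer's method. The p-values below are placeholders, not those from the target article, and the full method of Simonsohn et al. (2014a) additionally uses a half-curve test and works from test statistics rather than reported p-values.

```python
import numpy as np
from scipy import stats

def p_curve_right_skew(p_values, alpha=0.05):
    """Stouffer test of whether significant p-values pile up near zero."""
    p = np.array([x for x in p_values if x < alpha])
    pp = p / alpha                          # uniform on (0, 1) under the null
    z = stats.norm.ppf(pp)                  # strongly negative for tiny p-values
    z_stouffer = z.sum() / np.sqrt(len(z))
    return z_stouffer, stats.norm.cdf(z_stouffer)  # small p => evidential value

# A hypothetical cluster of p-values just under .05, the pattern p-hacking
# tends to produce, yields a large (non-significant) skew p-value here:
print(p_curve_right_skew([0.041, 0.048, 0.039, 0.044, 0.032]))
```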

We argue that the evaluation of evidence for cleansing effects should focus largely on preregistered studies. Preregistration is an effective approach for restricting researcher degrees of freedom and, thus, has an important role to play in resolving the replication crisis in psychology (Lakens, 2019; Nosek, Ebersole, DeHaven, & Mellor, 2018). Among other things, a high-quality preregistration includes a specification of a target sample size that prevents optional stopping, a description of primary and secondary outcomes that prevents outcome switching, and an analysis plan that constrains the use of other researcher degrees of freedom (Bakker et al., 2020; Wicherts et al., 2016). By contrast, meta-analytic methods that aim to correct for biases necessarily rely on untestable assumptions about the processes that generate biases and the magnitudes of these biases, which means we cannot be confident that the biases have been corrected (Carter, Schönbrodt, Gervais, & Hilgard, 2019). In other words, meta-analysis is no substitute for preregistered replications (van Elk et al., 2015).

We identified 22 replication studies (that reported results) in the target article and found that only four of them (from two publications) were preregistered (Camerer et al., 2018; Johnson, Cheung, & Donnellan, 2014b; see https://osf.io/7ehr8). Notably, each of these preregistered studies had a much larger sample (N = 219, N = 132, N = 123, and N = 286) than the studies they attempted to replicate (all N = 40) (Lee & Schwarz, 2010a; Schnall, Benton, & Harvey, 2008), and none of them found any evidence for the cleansing effects reported in the original studies. In fact, in all four studies the point estimate of the effect size was very close to zero (d = −0.01, d = 0.01, r = −0.07, and r = −0.05). In addition, we have identified a large multisite replication project (N = 7,001) not cited by L&S that included a test of a cleansing effect (Klein et al., 2018). This study attempted to replicate Study 2 of Zhong and Liljenquist (2006) (N = 27) across 50 sites and found no evidence for the predicted effect (d = 0.00). This fits a general pattern in the psychology literature: preregistered replication studies fail to reproduce original findings at a much higher rate than one would expect given the large effect sizes reported in original studies (Camerer et al., 2018; Open Science Collaboration, 2015), including for effects that had been supported by meta-analyses of studies that were not preregistered (Kvarven, Strømland, & Johannesson, 2020).
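
As a rough illustration of how little these four preregistered estimates amount to when pooled, the back-of-the-envelope computation below converts the two correlations to Cohen's d (via d = 2r/√(1 − r²)) and averages the four estimates with inverse-variance weights. The variance formula assumes two equal-sized groups per study, and pooling across different paradigms is a simplification; this is an illustration, not a formal meta-analysis.

```python
import numpy as np

r_to_d = lambda r: 2 * r / np.sqrt(1 - r**2)

# The four preregistered estimates and sample sizes quoted above
d = np.array([-0.01, 0.01, r_to_d(-0.07), r_to_d(-0.05)])
n = np.array([219, 132, 123, 286])

var = 4 / n + d**2 / (2 * n)      # large-sample approximation of var(d)
w = 1 / var
d_bar = (w * d).sum() / w.sum()   # fixed-effect (inverse-variance) average
se = np.sqrt(1 / w.sum())
print(f"pooled d = {d_bar:.3f} (SE = {se:.3f})")  # near zero, as the text notes
```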

Because researcher degrees of freedom are curtailed in preregistered studies (though not entirely eliminated; see Bakker et al., 2020; Claesen, Gomes, Tuerlinckx, & Vanpaemel, 2020), we suggest that Lee and colleagues could enhance the informativeness of their upcoming meta-analysis of cleansing effects by supplementing it with a targeted meta-analysis that includes only those studies that were preregistered. Finding meta-analytic evidence for cleansing effects in preregistered studies would considerably strengthen the case that cleansing effects are robust phenomena, whereas a failure to find such evidence would be cause for concern. A meta-analysis of the money priming effect provides an instructive example of how far the two analyses can diverge (Lodder, Ong, Grasman, & Wicherts, 2019). The full meta-analysis of 246 money priming studies estimated an overall effect size of small-to-medium magnitude (g = 0.31; see Fig. 1 (top-left plot), p. 701). By contrast, the targeted meta-analysis of the 47 preregistered studies found an average effect size that was non-significant (g = 0.01; see Fig. 1 (middle-right plot), p. 701).
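
A targeted analysis of this kind could be as simple as re-running the same random-effects model on the preregistered subset. The sketch below implements the standard DerSimonian-Laird estimator; the inputs `yi`, `vi`, and `prereg` are hypothetical stand-ins for the effect sizes, sampling variances, and preregistration flags in a meta-analytic data set, not the forthcoming data of Lee and colleagues.

```python
import numpy as np

def dersimonian_laird(yi, vi):
    """Random-effects pooled estimate and its SE via the DL tau^2 estimator."""
    w = 1 / vi
    y_fe = (w * yi).sum() / w.sum()              # fixed-effect mean
    q = (w * (yi - y_fe)**2).sum()               # Cochran's Q
    c = w.sum() - (w**2).sum() / w.sum()
    tau2 = max(0.0, (q - (len(yi) - 1)) / c)     # between-study variance
    w_re = 1 / (vi + tau2)
    mu = (w_re * yi).sum() / w_re.sum()
    return mu, np.sqrt(1 / w_re.sum())

# Compare the full pool with the preregistered-only subset, e.g.:
# mu_all, se_all = dersimonian_laird(yi, vi)
# mu_pre, se_pre = dersimonian_laird(yi[prereg], vi[prereg])
```

Divergence between the two estimates, as in the money priming example, would itself be informative about bias in the non-preregistered literature.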

In summary, we have argued that a scientific assessment of the evidence for cleansing effects requires both the application of state-of-the-art publication bias methods and a meta-analysis restricted to preregistered studies. As things stand, the empirical foundation for the theory of grounded procedures is tenuous.

Financial support

Robert M. Ross is supported by the Australian Research Council, Grant Number: DP180102384. Robbie C.M. van Aert and Olmo R. van den Akker are supported by the European Research Council, Grant Number: 726361 (IMPROVE).

Conflict of interest

None.

References

Bakker, M., Veldkamp, C. L. S., van Assen, M. A. L. M., Crompvoets, E. A. V., Ong, H. H., Nosek, B. A., … Wicherts, J. M. (2020). Ensuring the quality and specificity of preregistrations. https://doi.org/10.31234/osf.io/cdgyh
Camerer, C. F., Dreber, A., Holzmeister, F., Ho, T. H., Huber, J., Johannesson, M., … Wu, H. (2018). Evaluating the replicability of social science experiments in Nature and Science between 2010 and 2015. Nature Human Behaviour, 2(9), 637–644. https://doi.org/10.1038/s41562-018-0399-z
Carter, E. C., Schönbrodt, F. D., Gervais, W. M., & Hilgard, J. (2019). Correcting for bias in psychology: A comparison of meta-analytic methods. Advances in Methods and Practices in Psychological Science, 2(2), 115–144. https://doi.org/10.1177/2515245919847196
Claesen, A., Gomes, S., Tuerlinckx, F., & Vanpaemel, W. (2020). Preregistration: Comparing dream to reality. https://doi.org/10.31234/osf.io/d8wex
Du, H., Liu, F., & Wang, L. (2017). A Bayesian "fill-in" method for correcting for publication bias in meta-analysis. Psychological Methods, 22(4), 799–817. https://doi.org/10.1037/met0000164
Iyengar, S., & Greenhouse, J. B. (1988). Selection models and the file drawer problem. Statistical Science, 3(1), 109–135.
Johnson, D. J., Cheung, F., & Donnellan, M. B. (2014b). Does cleanliness influence moral judgments? A direct replication of Schnall, Benton, and Harvey (2008). Social Psychology, 45(3), 209–215. https://doi.org/10.1027/1864-9335/a000186
Klein, R. A., Hasselman, F., Adams, B. G., Adams, R. B., Alper, S., Aveyard, M., … Nosek, B. A. (2018). Many Labs 2: Investigating variation in replicability across samples and settings. Advances in Methods and Practices in Psychological Science, 1(4), 443–490. https://doi.org/10.1177/2515245918810225
Kvarven, A., Strømland, E., & Johannesson, M. (2020). Comparing meta-analyses and preregistered multiple-laboratory replication projects. Nature Human Behaviour, 4, 423–434. https://doi.org/10.1038/s41562-019-0787-z
Lakens, D. (2019). The value of preregistration for psychological science: A conceptual analysis. Japanese Psychological Review, 62(3), 221–230.
Lee, S. W. S., Chen, K., Ma, C., & Hoang, J. (2020a). Psychological antecedents and consequences of physical cleansing: A meta-analytic review [manuscript in preparation].
Lee, S. W. S., & Schwarz, N. (2010a). Washing away postdecisional dissonance. Science, 328, 709. https://doi.org/10.1126/science.1186799
Lodder, P., Ong, H. H., Grasman, R. P. P. P., & Wicherts, J. M. (2019). A comprehensive meta-analysis of money priming. Journal of Experimental Psychology: General, 148(4), 688–712. https://doi.org/10.1037/xge0000570
Nosek, B. A., Ebersole, C. R., DeHaven, A. C., & Mellor, D. T. (2018). The preregistration revolution. Proceedings of the National Academy of Sciences, 115(11), 2600–2606. https://doi.org/10.1073/pnas.1708274114
Open Science Collaboration. (2015). Estimating the reproducibility of psychological science. Science, 349(6251), aac4716. https://doi.org/10.1126/science.aac4716
Peters, J. L., Sutton, A. J., Jones, D. R., Abrams, K. R., & Rushton, L. (2007). Performance of the trim and fill method in the presence of publication bias and between-study heterogeneity. Statistics in Medicine, 26(25), 4544–4562. https://doi.org/10.1002/sim.2889
Schnall, S., Benton, J., & Harvey, S. (2008). With a clean conscience: Cleanliness reduces the severity of moral judgments. Psychological Science, 19(12), 1219–1222. https://doi.org/10.1111/j.1467-9280.2008.02227.x
Simonsohn, U., Nelson, L. D., & Simmons, J. P. (2014a). p-Curve and effect size: Correcting for publication bias using only significant results. Perspectives on Psychological Science, 9(6), 666–681. https://doi.org/10.1177/1745691614553988
Stanley, T. D., & Doucouliagos, C. H. (2014). Meta-regression approximations to reduce publication selection bias. Research Synthesis Methods, 5(1), 60–78. https://doi.org/10.1002/jrsm.1095
Terrin, N., Schmid, C. H., Lau, J., & Olkin, I. (2003). Adjusting for publication bias in the presence of heterogeneity. Statistics in Medicine, 22(13), 2113–2126. https://doi.org/10.1002/sim.1461
van Aert, R. C. M., & van Assen, M. A. L. M. (2020). Correcting for publication bias in a meta-analysis with the p-uniform* method. https://doi.org/10.31222/osf.io/zqjr9
van Elk, M., Matzke, D., Gronau, Q. F., Guan, M., Vandekerckhove, J., & Wagenmakers, E. J. (2015). Meta-analyses are no substitute for registered replications: A skeptical perspective on religious priming. Frontiers in Psychology, 6, 1–7. https://doi.org/10.3389/fpsyg.2015.01365
Wicherts, J. M., Veldkamp, C. L. S., Augusteijn, H. E. M., Bakker, M., van Aert, R. C. M., & van Assen, M. A. L. M. (2016). Degrees of freedom in planning, running, analyzing, and reporting psychological studies: A checklist to avoid p-hacking. Frontiers in Psychology, 7, 1–12. https://doi.org/10.3389/fpsyg.2016.01832
Zhong, C. B., & Liljenquist, K. (2006). Washing away your sins: Threatened morality and physical cleansing. Science, 313(5792), 1451–1452. https://doi.org/10.1126/science.1130726