
Direct replication and clinical psychological science

Published online by Cambridge University Press: 27 July 2018

Scott O. Lilienfeld*
Affiliation:
Department of Psychology, Emory University, Atlanta, GA 30322; School of Psychological Sciences, University of Melbourne, Melbourne, VIC 3010, Australia. slilien@emory.edu; http://psychology.emory.edu/home/people/faculty/lilienfeld-scott.html

Abstract

Zwaan et al. make a compelling case for the necessity of direct replication in psychological science. I build on their arguments by underscoring the necessity of direct replication for two domains of clinical psychological science: the evaluation of psychotherapy outcome and the construct validity of psychological measures.

Type
Open Peer Commentary
Copyright © Cambridge University Press 2018

In their clearly reasoned target article, Zwaan et al. make a persuasive case that direct replication is essential for the health of psychological science. The principle of the primacy of internal validity (Cook et al. 1990) underscores the point that one must convincingly demonstrate a causal effect (internal validity) before generalizing it to similar settings, participants, measures, and the like (external validity). Some scholars appear to have overlooked the importance of this mandate. In an otherwise incisive article, my Ph.D. mentor David Lykken (1968) wrote that "Since operational replication [what Zwaan et al. term direct replication] must really be done by an independent second investigator and since constructive replication [what Zwaan et al. term conceptual replication] has greater generality, its success strongly impl[ies] that an operational replication would have succeeded also" (p. 159). Lykken, like many scholars, underestimated the myriad ways (e.g., p-hacking, file-drawering of negative results) in which conceptual replications can yield significant but spurious results (Lindsay et al. 2016). Hence, an apparently successful conceptual replication does not imply that a direct replication would have succeeded as well.

I build on Zwaan et al.'s well-reasoned arguments by extending them to a subdiscipline they did not explicitly address: clinical psychological science. Probably because recent replicability debates have been restricted largely to scholars in cognitive, social, and personality psychology (Tackett et al. 2017a), the implications of these discussions for key domains of clinical psychology, especially psychotherapy and assessment, have been insufficiently appreciated. I contend that an overemphasis on conceptual replication at the expense of direct replication can generate misleading conclusions that are potentially detrimental to clinical research and patient care.

In the psychotherapy field, attention has turned increasingly to the development and identification of empirically supported therapies (ESTs; Chambless & Ollendick 2001), which are treatments demonstrated to be efficacious for specific disorders in independently replicated trials. Their superficial differences notwithstanding, all EST taxonomies require these interventions to be manualized or at least delineated in sufficient detail to permit replication by independent researchers. Although direct replications of psychotherapy outcome studies are often impractical (Coyne 2016) given the formidable difficulties of recruiting comparable patients and ensuring comparably trained therapists, investigators can still undertake concerted efforts to ascertain whether a carefully described psychotherapy protocol that yields positive effects in one study does so in future studies. Herein lies the problem: Without an independently replicated demonstration that the original protocol generates positive effects, practitioners and researchers can interpret a successful conceptual replication of a modified protocol as evidence that the treatment is ready for routine clinical application. Such a conclusion would be premature and potentially harmful, because the original protocol has demonstrated its mettle in only a single study.

Conversely, practitioners and researchers may assume that a conceptual replication failure implies that the initial psychotherapy protocol was ineffective, but this conclusion could likewise be erroneous. Admittedly, research on the extent to which adaptations of EST protocols tend to degrade their efficacy is inconsistent (Stirman et al. 2017). Nevertheless, in certain instances, seemingly minor changes in psychotherapy protocols may produce detrimental effects. For example, studies of exposure therapy for anxiety disorders suggest that the commonplace practice of encouraging patients to engage in safety behaviors (e.g., practicing relaxation skills) during exposure often adversely affects treatment outcomes (Blakey & Abramowitz 2016). The same overarching conclusion may hold for self-help interventions. Rosen (1993) observed that even seemingly trivial changes to self-help programs can result in unanticipated changes in treatment compliance, effectiveness, or both. For example, in one study the addition of a self-reward contracting manipulation to an effective program for snake phobia decreased treatment compliance from 50% to zero (Barrera & Rosen 1977), perhaps because clients perceived the supplementary component as onerous. Consequently, failed conceptual replications can lead to the mistaken conclusion that effective treatment protocols are impractical, ineffective, or both.

In the clinical assessment field, an overemphasis on conceptual replication can contribute to what Pinto and I (Lilienfeld & Pinto 2015) termed the illusion of replication. This illusion can arise when investigators fail to delineate an explicit nomological network (Cronbach & Meehl 1955) of predictions for the construct validation of a measure, permitting them to engage in a program of ad hoc validation (Kane 2001). In such a research program, psychologists are free to hand-pick from an assortment of findings on diverse indicators to justify support for a measure's construct validity. In some cases, they may conclude that a measure has been validated for a given clinical purpose even in the absence of a single directly replicated finding.

Research on the widely used "Suicide Constellation" of the Rorschach Inkblot Test affords a potential illustration. Based on a meta-analysis of Rorschach variables, an author team concluded that the Suicide Constellation is a well-validated indicator of suicide risk (Mihura et al. 2013, p. 572). Nevertheless, this conclusion hinged on only four studies (see Wood et al. [2015] for a discussion): one on completed suicides, one on attempted suicides, one on ratings of suicidality, and one on levels of serotonin in cerebrospinal fluid (low levels of which have been tied to suicide risk [Glick 2015]). As a result, the validity of the Suicide Constellation is uncertain given that its support rests on correlations with four ostensibly interrelated, but separable, indicators, with no evidence of direct replication.

Conversely, researchers may assume that a conceptual replication failure following a seemingly minor change to a measure calls into question the initial positive finding. For example, in efforts to save time, investigators frequently administer abbreviated forms of well-established measures, such as the Minnesota Multiphasic Personality Inventory–2. Nevertheless, such short forms often exhibit psychometric properties inferior to those of their parent measures (Smith et al. 2000). Hence, failed conceptual replications using such measures do not mean that the original result was untrustworthy.

When it comes to psychological treatments and measures, generalizability cannot simply be assumed. Direct replications of initial positive results, or at least close approximations of them, are not merely a research formality. They are indispensable for drawing firm conclusions regarding the use of clinical methods.

References

Barrera, M., Jr. & Rosen, G. M. (1977) Detrimental effects of a self-reward contracting program on subjects' involvement in self-administered desensitization. Journal of Consulting and Clinical Psychology 45:1180–81.
Blakey, S. M. & Abramowitz, J. S. (2016) The effects of safety behaviors during exposure therapy for anxiety: Critical analysis from an inhibitory learning perspective. Clinical Psychology Review 49:1–15.
Chambless, D. L. & Ollendick, T. H. (2001) Empirically supported psychological interventions: Controversies and evidence. Annual Review of Psychology 52:685–716.
Cook, T. D., Campbell, D. T. & Peracchio, L. (1990) Quasi-experimentation. In: Handbook of industrial and organizational psychology, vol. 1, 2nd edition, ed. Dunnette, M. D. & Hough, L. M., pp. 491–576. Consulting Psychologists Press.
Coyne, J. C. (2016) Replication initiatives will not salvage the trustworthiness of psychology. BMC Psychology 4:28. Available at: http://doi.org/10.1186/s40359-016-0134-3.
Cronbach, L. J. & Meehl, P. E. (1955) Construct validity in psychological tests. Psychological Bulletin 52:281–302.
Glick, A. R. (2015) The role of serotonin in impulsive aggression, suicide, and homicide in adolescents and adults: A literature review. International Journal of Adolescent Medicine and Health 27:143–50.
Kane, M. T. (2001) Current concerns in validity theory. Journal of Educational Measurement 38:319–42.
Lilienfeld, S. O. & Pinto, M. D. (2015) Risky tests of etiological models in psychopathology research: The need for meta-methodology. Psychological Inquiry 26:253–58.
Lindsay, D. S., Simons, D. J. & Lilienfeld, S. O. (2016) Research preregistration 101. Association for Psychological Science Observer 29:14–17.
Lykken, D. T. (1968) Statistical significance in psychological research. Psychological Bulletin 70:151–59.
Mihura, J. L., Meyer, G. J., Dumitrascu, N. & Bombel, G. (2013) The validity of individual Rorschach variables: Systematic reviews and meta-analyses of the Comprehensive System. Psychological Bulletin 139:548–605.
Rosen, G. M. (1993) Self-help or hype? Comments on psychology's failure to advance self-care. Professional Psychology: Research and Practice 24:340–45.
Smith, G. T., McCarthy, D. M. & Anderson, K. G. (2000) On the sins of short-form development. Psychological Assessment 12:102–11.
Stirman, S. W., Gamarra, J. M., Bartlett, B. A., Calloway, A. & Gutner, C. A. (2017) Empirical examinations of modifications and adaptations to evidence-based psychotherapies: Methodologies, impact, and future directions. Clinical Psychology: Science and Practice 24:396–420.
Tackett, J. L., Lilienfeld, S. O., Patrick, C. J., Johnson, S. L., Krueger, R. F., Miller, J. D., Oltmanns, T. F. & Shrout, P. E. (2017a) It's time to broaden the replicability conversation: Thoughts for and from clinical psychological science. Perspectives on Psychological Science 12(5):742–56.
Wood, J. M., Garb, H. N., Nezworski, M. T., Lilienfeld, S. O. & Duke, M. C. (2015) A second look at the validity of widely used Rorschach indices: Comment on Mihura, Meyer, Dumitrascu, and Bombel (2013). Psychological Bulletin 141:236–49.