
What the Bayesian framework has contributed to understanding cognition: Causal learning as a case study

Published online by Cambridge University Press:  25 August 2011

Keith J. Holyoak
Affiliation:
Department of Psychology, University of California, Los Angeles, CA 90095-1563. holyoak@lifesci.ucla.edu; http://cvl.psych.ucla.edu/
Hongjing Lu
Affiliation:
Department of Psychology and Department of Statistics, University of California, Los Angeles, CA 90095-1563. hongjing@ucla.edu; http://www.reasoninglaboratory.dreamhosters.com

Abstract

The field of causal learning and reasoning (largely overlooked in the target article) provides an illuminating case study of how the modern Bayesian framework has deepened theoretical understanding, resolved long-standing controversies, and guided development of new and more principled algorithmic models. This progress was guided in large part by the systematic formulation and empirical comparison of multiple alternative Bayesian models.

Type
Open Peer Commentary
Copyright
Copyright © Cambridge University Press 2011

Jones & Love (J&L) raise the specter of Bayesian Fundamentalism sweeping through cognitive science, isolating it from algorithmic models and neuroscience, ushering in a Dark Age dominated by an unholy marriage of radical behaviorism with evolutionary “just so” stories. While we agree that a critical assessment of the Bayesian framework for cognition could be salutary, the target article suffers from a serious imbalance: long on speculation grounded in murky metaphors, short on discussion of actual applications of the Bayesian framework to modeling of cognitive processes. Our commentary aims to redress that imbalance.

The target article virtually ignores the topic of causal inference (citing only Griffiths & Tenenbaum 2009). This omission is odd, as causal inference is both a core cognitive process and one of the most prominent research areas in which modern Bayesian models have been applied. To quote a recent article by Holyoak and Cheng in Annual Review of Psychology, “The most important methodological advance in the past decade in psychological work on causal learning has been the introduction of Bayesian inference to causal inference. This began with the work of Griffiths & Tenenbaum (2005; 2009; Tenenbaum & Griffiths 2001; see also Waldmann & Martignon 1998)” (Holyoak & Cheng 2011, pp. 142–43). Here we recap how and why the Bayesian framework has had its impact.

Earlier, Pearl's (1988) concept of “causal Bayes nets” had inspired the hypothesis that people learn causal models (Waldmann & Holyoak 1992), and it had been argued that causal induction is fundamentally rational (the power PC [probabilistic contrast] theory of Cheng 1997). However, for about a quarter century, the view that people infer cause-effect relations from non-causal contingency data in a fundamentally rational fashion was pitted against a host of alternatives based either on heuristics and biases (e.g., Schustack & Sternberg 1981) or on associative learning models, most notably Rescorla and Wagner's (1972) learning rule (e.g., Shanks & Dickinson 1987). A decisive resolution of this debate proved elusive, in part because none of the competing models provided a principled account of how uncertainty influences human causal judgments (Cheng & Holyoak 1995).

J&L assert that, “Taken as a psychological theory, the Bayesian framework does not have much to say” (sect. 2.2, para. 3). In fact, the Bayesian framework says that the assessment of causal strength should be based not simply on a point estimate, as had previously been assumed, but on a probability distribution that explicitly quantifies the uncertainty associated with the estimate. It also says that causal judgments should depend jointly on prior knowledge and the likelihoods of the observed data. Griffiths and Tenenbaum (2005) made the critical contribution of showing that different likelihood functions are derived from the different assumptions about cause-effect representations postulated by the power PC theory versus associative learning theory. Both theories can be formulated within a common Bayesian framework, with each granted exactly the same basis for representing uncertainty about causal strength. Hence, a comparison of these two Bayesian models can help identify the fundamental representations underlying human causal inference.
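The contrast can be made concrete with a minimal grid-approximation sketch. This is our illustration, not the published implementations: the function name, the contingency-count format, and the binomial treatment of the data are assumptions. The key point it shows is that the power PC theory corresponds to a noisy-OR likelihood, the associative (ΔP-style) view to a linear one, and that either yields a full posterior distribution over causal strength rather than a point estimate.

```python
import numpy as np

def posterior_over_strength(data, likelihood="noisy_or", grid=201):
    """Grid-approximate posterior over the strength w1 of a candidate cause.

    data: (k1, n1, k0, n0) -- effect occurred on k1 of n1 trials with the
    candidate cause present, and on k0 of n0 trials with it absent.
    The background-cause strength w0 is marginalized out numerically.
    """
    k1, n1, k0, n0 = data
    w = np.linspace(0.0, 1.0, grid)            # strength values on [0, 1]
    w0, w1 = np.meshgrid(w, w, indexing="ij")  # background x candidate grid
    if likelihood == "noisy_or":               # power-PC parameterization
        p_c = w0 + w1 - w0 * w1                # P(effect | cause present)
    else:                                      # linear (Delta-P) parameterization,
        p_c = np.clip(w0 + w1, 0.0, 1.0)       # clipped to [0, 1] as a simplification
    p_nc = w0                                  # P(effect | cause absent)
    # Binomial likelihood of the observed contingency data
    like = (p_c**k1 * (1 - p_c)**(n1 - k1) *
            p_nc**k0 * (1 - p_nc)**(n0 - k0))
    joint = like                               # uniform (uninformative) prior
    post_w1 = joint.sum(axis=0)                # marginalize out w0
    return w, post_w1 / post_w1.sum()

# Example: effect on 7 of 10 trials with the cause, 2 of 10 without
w, post = posterior_over_strength((7, 10, 2, 10), likelihood="noisy_or")
print("posterior mean strength:", (w * post).sum())
```

Because the two likelihood functions make different predictions for the same contingency data, their posteriors can be compared directly against human strength judgments, which is exactly the model-comparison logic described above.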

A persistent complaint that J&L direct at Bayesian modeling is that, “Comparing multiple Bayesian models of the same task is rare” (target article, Abstract); “[i]t is extremely rare to find a comparison among alternative Bayesian models of the same task to determine which is most consistent with empirical data” (sect. 1, para. 6). One of J&L's concluding admonishments is that, “there are generally many Bayesian models of any task. . . . Comparison among alternative models would potentially reveal a great deal” (sect. 7, para. 2). But as the work of Griffiths and Tenenbaum (2005) exemplifies, a basis for comparison of multiple models is exactly what the Bayesian framework provided to the field of causal learning.

Lu et al. (2008b) carried the project a step further, implementing and testing a 2×2 design of Bayesian models of causal-strength learning: the two likelihood functions crossed with two priors (uninformative vs. a preference for sparse and strong causes). Model comparisons against human data established that human causal learning is better explained by the assumptions underlying the power PC theory than by those underlying associative models. The sparse-and-strong prior accounted for subtle interactions involving generative and preventive causes that could not be explained by uninformative priors.
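The effect of the prior can also be sketched on a grid. The exponential form below is a deliberate simplification of the mixture-of-exponentials sparse-and-strong prior in Lu et al. (2008b), chosen only to illustrate its qualitative character for a generative cause: weak background (w0 near 0) and strong candidate (w1 near 1). The parameter value and contingency counts are arbitrary.

```python
import numpy as np

# Grid over background strength w0 and candidate strength w1
alpha = 5.0
w = np.linspace(0.0, 1.0, 201)
w0, w1 = np.meshgrid(w, w, indexing="ij")

# Simplified sparse-and-strong prior: favors w0 near 0 and w1 near 1
prior = np.exp(-alpha * w0) * np.exp(-alpha * (1.0 - w1))
prior /= prior.sum()

# Noisy-OR (power-PC) likelihood for illustrative data:
# effect on 7/10 cause-present trials and 2/10 cause-absent trials
p_c, p_nc = w0 + w1 - w0 * w1, w0
like = p_c**7 * (1 - p_c)**3 * p_nc**2 * (1 - p_nc)**8

# Posterior under the sparse-and-strong prior vs. a uniform prior
post = prior * like
post /= post.sum()
uniform_post = like / like.sum()
print("mean w1, SS prior:     ", (w1 * post).sum())
print("mean w1, uniform prior:", (w1 * uniform_post).sum())
```

With identical data and likelihood, the two priors yield systematically different strength estimates; it is exactly such divergences, evaluated against human judgments, that allowed Lu et al. to identify which prior people appear to use.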

J&L acknowledge that, “An important argument in favor of rational over mechanistic modeling is that the proliferation of mechanistic modeling approaches over the past several decades has led to a state of disorganization” (sect. 4.1, para. 2). Perhaps no field better exemplified this state of affairs than causal learning, which had produced roughly 40 algorithmic models by a recent count (Hattori & Oaksford 2007). Almost all of these are non-normative, defined (following Perales & Shanks 2007) as not derived from a well-specified computational analysis of the goals of causal learning. Lu et al. (2008b) compared their Bayesian models to those that Perales and Shanks had tested in a large meta-analysis. The Bayesian extensions of the power PC theory (with zero or one free parameter) accounted for up to 92% of the variance, performing at least as well as the most successful non-normative model (with four free parameters), and much better than the Rescorla-Wagner model (see also Griffiths & Tenenbaum 2009).

New Bayesian models of causal learning have thus built upon and significantly extended previous proposals (e.g., the power PC theory), and have in turn been extended to entirely new areas. For example, the Bayesian power PC theory has been applied to analogical inferences based on a single example (Holyoak et al. 2010). Rather than blindly applying some single privileged Bayesian theory, researchers have systematically formulated alternative models and compared them to human data. Rather than preempting algorithmic models, the advances in Bayesian modeling have inspired new algorithmic models of sequential causal learning, addressing phenomena related to learning curves and trial order (Daw et al. 2008; Kruschke 2006; Lu et al. 2008a). Efforts are under way to link computation-level theory with algorithmic and neuroscientific models. In short, rather than monolithic Bayesian Fundamentalism, normal science holds sway. Perhaps J&L will happily (if belatedly) acknowledge the past decade of work on causal learning as a shining example of “Bayesian Enlightenment.”

References

Cheng, P. W. (1997) From covariation to causation: A causal power theory. Psychological Review 104:367–405.
Cheng, P. W. & Holyoak, K. J. (1995) Complex adaptive systems as intuitive statisticians: Causality, contingency, and prediction. In: Comparative approaches to cognitive science, ed. Roitblat, H. L. & Meyer, J.-A., pp. 271–302. MIT Press.
Daw, N. D., Courville, A. C. & Dayan, P. (2008) Semi-rational models: The case of trial order. In: The probabilistic mind: Prospects for rational models of cognition, ed. Oaksford, M. & Chater, N., pp. 431–52. Oxford University Press.
Griffiths, T. L. & Tenenbaum, J. B. (2005) Structure and strength in causal induction. Cognitive Psychology 51:354–84.
Griffiths, T. L. & Tenenbaum, J. B. (2009) Theory-based causal induction. Psychological Review 116:661–716.
Hattori, M. & Oaksford, M. (2007) Adaptive non-interventional heuristics for covariation detection in causal induction: Model comparison and rational analysis. Cognitive Science 31:765–814.
Holyoak, K. J. & Cheng, P. W. (2011) Causal learning and inference as a rational process: The new synthesis. Annual Review of Psychology 62:135–63.
Holyoak, K. J., Lee, H. S. & Lu, H. (2010) Analogical and category-based inference: A theoretical integration with Bayesian causal models. Journal of Experimental Psychology: General 139:702–27.
Kruschke, J. K. (2006) Locally Bayesian learning with applications to retrospective revaluation and highlighting. Psychological Review 113:677–99.
Lu, H., Rojas, R. R., Beckers, T. & Yuille, A. L. (2008a) Sequential causal learning in humans and rats. In: Proceedings of the Thirtieth Annual Conference of the Cognitive Science Society, ed. Love, B. C., McRae, K. & Sloutsky, V. M., pp. 195–88. Cognitive Science Society.
Lu, H., Yuille, A. L., Liljeholm, M., Cheng, P. W. & Holyoak, K. J. (2008b) Bayesian generic priors for causal learning. Psychological Review 115:955–82.
Pearl, J. (1988) Probabilistic reasoning in intelligent systems: Networks of plausible inference. Morgan Kaufmann.
Perales, J. C. & Shanks, D. R. (2007) Models of covariation-based causal judgment: A review and synthesis. Psychonomic Bulletin and Review 14:577–96.
Rescorla, R. A. & Wagner, A. R. (1972) A theory of Pavlovian conditioning: Variations in the effectiveness of reinforcement and nonreinforcement. In: Classical conditioning II: Current theory and research, ed. Black, A. H. & Prokasy, W. F., pp. 64–99. Appleton-Century-Crofts.
Schustack, M. W. & Sternberg, R. J. (1981) Evaluation of evidence in causal inference. Journal of Experimental Psychology: General 110:101–20.
Shanks, D. R. & Dickinson, A. (1987) Associative accounts of causality judgment. In: The psychology of learning and motivation, vol. 21, ed. Bower, G. H., pp. 229–61. Academic Press.
Tenenbaum, J. B. & Griffiths, T. L. (2001) Structure learning in human causal induction. In: Advances in neural information processing systems, vol. 13, ed. Leen, T. K., Dietterich, T. G. & Tresp, V., pp. 59–65. MIT Press.
Waldmann, M. R. & Holyoak, K. J. (1992) Predictive and diagnostic learning within causal models: Asymmetries in cue competition. Journal of Experimental Psychology: General 121:222–36.
Waldmann, M. R. & Martignon, L. (1998) A Bayesian network model of causal learning. In: Proceedings of the Twentieth Annual Conference of the Cognitive Science Society, ed. Gernsbacher, M. A. & Derry, S. J., pp. 1102–107. Erlbaum.