
Base-rate respect: From ecological rationality to dual processes

Published online by Cambridge University Press:  29 October 2007

Aron K. Barbey
Affiliation:
Cognitive Neuroscience Section, National Institute of Neurological Disorders and Stroke, Bethesda, MD 20892-1440, barbeya@ninds.nih.gov
Steven A. Sloman
Affiliation:
Cognitive and Linguistic Sciences, Brown University, Providence, RI 02912, Steven_Sloman@brown.edu, http://www.cog.brown.edu/~sloman/

Abstract

The phenomenon of base-rate neglect has elicited much debate. One arena of debate concerns how people make judgments under conditions of uncertainty. Another more controversial arena concerns human rationality. In this target article, we attempt to unpack the perspectives in the literature on both kinds of issues and evaluate their ability to explain existing data and their conceptual coherence. From this evaluation we conclude that the best account of the data should be framed in terms of a dual-process model of judgment, which attributes base-rate neglect to associative judgment strategies that fail to adequately represent the set structure of the problem. Base-rate neglect is reduced when problems are presented in a format that affords accurate representation in terms of nested sets of individuals.

Type
Main Articles
Copyright
Copyright © Cambridge University Press 2007

1. Introduction

Diagnosing whether a patient has a disease, predicting whether a defendant is guilty of a crime, and other everyday as well as life-changing decisions reflect, in part, the decision-maker's subjective degree of belief in uncertain events. Intuitions about probability frequently deviate dramatically from the dictates of probability theory (e.g., Gilovich et al. 2002). One form of deviation is notorious: people's tendency to neglect base-rates in favor of specific case data. A number of theorists (e.g., Brase 2002a; Cosmides & Tooby 1996; Gigerenzer & Hoffrage 1995) have argued that such neglect reveals little more than experimenters' failure to ask about uncertainty in a form that naïve respondents can understand – specifically, in the form of a question about natural frequencies. The brunt of our argument in this target article is that this perspective is far too narrow. After surveying the theoretical perspectives on the issue, we show that both data and conceptual considerations demand that judgment be understood in terms of dual processing systems: one that is responsible for systematic error and another that is capable of reasoning not just about natural frequencies, but about relations among any kind of set representation.

Base-rate neglect has been extensively studied in the context of Bayes' theorem, which provides a normative standard for updating the probability of a hypothesis in light of new evidence. Research has evaluated the extent to which intuitive probability judgment conforms to the theorem by employing a Bayesian inference task in which the respondent is presented with a word problem and has to infer the probability of a hypothesis (e.g., the presence versus absence of breast cancer) on the basis of an observation (e.g., a positive mammography). Consider the following Bayesian inference problem presented by Gigerenzer and Hoffrage (1995; adapted from Eddy 1982):

The probability of breast cancer is 1% for a woman at age forty who participates in routine screening [base-rate]. If a woman has breast cancer, the probability is 80% that she will get a positive mammography [hit-rate]. If a woman does not have breast cancer, the probability is 9.6% that she will also get a positive mammography [false-alarm rate]. A woman in this age group had a positive mammography in a routine screening. What is the probability that she actually has breast cancer? _%. (Gigerenzer & Hoffrage 1995, p. 685)

According to Bayes' theorem (see Footnote 1), the probability that the patient has breast cancer given that she has a positive mammography is 7.8%. Evidence that people's judgments on this problem accord with Bayes' theorem would be consistent with the claim that the mind embodies a calculus of probability, whereas the lack of such a correspondence would demonstrate that people's judgments can be at variance with sound probabilistic principles and, as a consequence, that people can be led to make incoherent decisions (Ramsey 1964; Savage 1954). Thus, the extent to which intuitive probability judgment conforms to the normative prescriptions of Bayes' theorem has implications for the nature of human judgment (for a review of the theoretical debate on human rationality, see Stanovich 1999). In the case of Eddy's study, fewer than 5% of the respondents generated the Bayesian solution.
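The arithmetic behind that 7.8% figure is just Bayes' theorem applied to the three quantities stated in the problem. A minimal sketch (the function and parameter names are ours):

```python
# Bayes' theorem for the mammography problem:
# P(cancer | positive) =
#   P(pos | cancer) * P(cancer)
#   / [P(pos | cancer) * P(cancer) + P(pos | no cancer) * P(no cancer)]
def posterior(base_rate, hit_rate, false_alarm_rate):
    """Posterior probability of the hypothesis given a positive observation."""
    true_positive = hit_rate * base_rate
    false_positive = false_alarm_rate * (1.0 - base_rate)
    return true_positive / (true_positive + false_positive)

p = posterior(base_rate=0.01, hit_rate=0.80, false_alarm_rate=0.096)
print(f"{p:.1%}")  # 7.8%
```

Note that the hit-rate alone (80%) – the modal answer in the studies reviewed below – overestimates the Bayesian posterior by an order of magnitude.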

Early studies evaluating Bayesian inference under single-event probabilities also showed systematic deviations from Bayes' theorem. Hammerton (1973), for example, found that only 10% of the physicians tested generated the Bayesian solution, with the median response approximating the hit-rate of the test. Similarly, Casscells et al. (1978) and Eddy (1982) found that a low proportion of respondents generated the Bayesian solution: 18% in the former and 5% in the latter, with the modal response in each study corresponding to the hit-rate of the test. All of this suggests that the mind does not normally reason in a way consistent with the laws of probability theory.

1.1. Base-rate facilitation

However, this conclusion has not been drawn universally. Eddy's (1982) problem concerned a single event, the probability that a particular woman has breast cancer. In some problems, when probabilities that refer to the chances of a single event occurring (e.g., 1%) are reformulated and presented in terms of natural frequency formats (e.g., 10 out of 1,000), people more often produce probability estimates that conform to Bayes' theorem. Consider the following mammography problem presented in a natural frequency format by Gigerenzer and Hoffrage (1995):

10 out of every 1,000 women at age forty who participate in routine screening have breast cancer [base-rate]. 8 out of every 10 women with breast cancer will get a positive mammography [hit-rate]. 95 out of every 990 women without breast cancer will also get a positive mammography [false-alarm rate]. Here is a new representative sample of women at age forty who got a positive mammography in routine screening. How many of these women do you expect to actually have breast cancer? __ out of __ . (Gigerenzer & Hoffrage 1995, p. 688)

The proportion of responses conforming to Bayes' theorem increased by a factor of about three in this case: 46% under natural frequency formats versus 16% under a single-event probability format. The observed facilitation has motivated researchers to argue that coherent probability judgment depends on representing events in the form of natural frequencies (e.g., Brase 2002a; Cosmides & Tooby 1996; Gigerenzer & Hoffrage 1995).
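Under the natural frequency format, the same answer reduces to counting: of the positive mammographies expected in the sample, how many belong to women with breast cancer? A sketch of the count (variable names ours):

```python
# Counting version of the mammography problem: of 1,000 women, 10 have
# cancer and 8 of them test positive; 95 of the 990 without cancer also
# test positive.  The base-rate enters only through the counts themselves.
cancer_and_positive = 8
no_cancer_and_positive = 95

total_positive = cancer_and_positive + no_cancer_and_positive  # 103
answer = cancer_and_positive / total_positive
print(f"{cancer_and_positive} out of {total_positive} = {answer:.1%}")
# 8 out of 103 = 7.8%
```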

Cosmides and Tooby (1996) also conducted a series of experiments that employed Bayesian inference problems that had previously elicited judgmental errors under single-event probability formats. In Experiment 1, they replicated Casscells et al. (1978), demonstrating that only 12% of their respondents produced the Bayesian answer when presented with single-event probabilities. Cosmides and Tooby then transformed the single-event probabilities into natural frequencies, resulting in a remarkably high proportion of Bayesian responses: 72% of respondents generated the Bayesian solution, supporting the authors' conclusion that Bayesian inference depends on the use of natural frequencies.

Gigerenzer (1996) explored whether physicians, who frequently assess and diagnose medical illness, would demonstrate the same pattern of judgments as clinically untrained college undergraduates. Consistent with the judgments drawn by college students (e.g., Gigerenzer & Hoffrage 1995), Gigerenzer found that the sample of 48 physicians tested generated the Bayesian solution in only 10% of cases under single-event probability formats, versus 46% under natural frequency formats. Physicians spent about 25% more time on the single-event probability problems, which suggests that they found these problems more difficult to solve than problems presented in a natural frequency format. Thus, the physicians' judgments were consistent with those of non-physicians, suggesting that formal training in medical diagnosis does not lead to more accurate Bayesian reasoning and that natural frequencies facilitate probabilistic inference across populations.

Further studies have demonstrated that the facilitatory effect of natural frequencies on Bayesian inference observed in the laboratory has the potential to improve the predictive accuracy of professionals in important real-world settings. Gigerenzer and his colleagues have shown, for example, that natural frequencies facilitate Bayesian inference in AIDS counseling (Gigerenzer et al. 1998), in the assessment of statistical information by judges (Lindsey et al. 2003), and in teaching Bayesian reasoning to college undergraduates (Kurzenhäuser & Hoffrage 2002; Sedlmeier & Gigerenzer 2001). In summary, the reviewed findings demonstrate facilitation in Bayesian inference when single-event probabilities are translated into natural frequencies, consistent with the view that coherent probability judgment depends on natural frequency representations.

1.2. Theoretical accounts

Explanations of facilitation in Bayesian inference can be grouped into five types arrayed along a continuum of cognitive control, from accounts that ascribe facilitation to processes having little to do with strategic cognitive processing to those that appeal to general-purpose reasoning procedures. The five accounts we discuss can be contrasted at the coarsest level on five dimensions (see Table 1). We do not claim that theorists have consistently made these distinctions in the past, only that these distinctions are in fact appropriate ones.

Table 1. Prerequisites for reduction of base-rate neglect according to five theoretical frameworks

Note. The prerequisites of each theory are indicated by an ‘X’.

A parallel taxonomy for theories of categorization can be found in Sloman et al. (in press). We briefly introduce the theoretical frameworks here; each is elaborated in the following sections as needed to reveal its assumptions, derive predictions, and compare and contrast the accounts.

1.2.1. Mind as Swiss army knife

Several theorists have argued that the human mind consists of a number of specialized modules (Cosmides & Tooby 1996; Gigerenzer & Selten 2001). Each module is assumed to be unavailable to conscious awareness or deliberate control (i.e., cognitively impenetrable) and able to process only a specific type of information (i.e., informationally encapsulated; see Fodor 1983). One module in particular is designed to process natural frequencies. This module is thought to have evolved because natural frequency information is what was available to our ancestors in the environment of evolutionary adaptedness. In this view, facilitation occurs because natural frequency data are processed by a computationally effective processing module.

Two arguments have been advanced in support of the ecological validity of natural frequency data. First, as natural frequency information is acquired, it can be “easily, immediately, and usefully incorporated with past frequency information via the use of natural sampling, which is the method of counting occurrences of events as they are encountered and storing the resulting knowledge base for possible use later” (Brase 2002b, p. 384). Second, information stored in a natural frequency format preserves the sample size of the reference class (e.g., 10 out of 1,000 women have breast cancer) and is arranged into subset relations (e.g., of the 10 women who have breast cancer, 8 are positively diagnosed) that indicate how many cases of the total sample fall into each subcategory (i.e., the base-rate, the hit-rate, and the false-alarm rate). Because natural frequency formats entail the sample and effect sizes, posterior probabilities consistent with Bayes' theorem can be calculated without explicitly incorporating base-rates, thereby allowing simple calculations (see Footnote 2) (Kleiter 1994). Thus, proponents of this view argue that the mind has evolved to process natural frequency formats rather than single-event probabilities, and that, in particular, it includes a cognitive module that “maps frequentist representations of prior probabilities and likelihoods onto a frequentist representation of a posterior probability in a way that satisfies the constraints of Bayes' theorem” (Cosmides & Tooby 1996, p. 60).
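The natural sampling argument can be illustrated with a hypothetical simulation (the loop, seed, counter names, and sample size are ours, not from the literature): events are tallied as they are encountered, and once the two subset counts exist, a posterior can be computed without ever consulting the base-rate explicitly.

```python
import random

# Sketch of natural sampling: tally co-occurrences as cases are
# encountered, keeping only the two subset counts the simplified
# Bayesian ratio needs.  Parameters follow the mammography problem.
random.seed(0)

positive_with_cancer = 0   # N(hypothesis and observation)
positive_no_cancer = 0     # N(not-hypothesis and observation)

for _ in range(100_000):
    has_cancer = random.random() < 0.01      # 1% base-rate in the environment
    if has_cancer:
        if random.random() < 0.80:           # 80% hit-rate
            positive_with_cancer += 1
    elif random.random() < 0.096:            # 9.6% false-alarm rate
        positive_no_cancer += 1

# The 1% base-rate never appears below: it is implicit in the tallies.
ratio_estimate = positive_with_cancer / (positive_with_cancer + positive_no_cancer)
print(f"{ratio_estimate:.3f}")  # close to the analytic 0.078
```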

Theorists who take this position uniformly motivate their hypothesis via a process of natural selection. However, the cognitive and evolutionary claims are in fact conceptually independent. The mind could consist of cognitively impenetrable and informationally encapsulated modules whether or not any or all of those modules evolved for the specific reasons offered.

1.2.2. Natural frequency algorithm

A weaker claim is that the mind includes a specific algorithm for effectively processing natural frequency information (Gigerenzer & Hoffrage 1995). Unlike the mind-as-Swiss-army-knife view, this hypothesis makes no general claim about the architecture of mind. Despite their difference in scope, however, these two theories adopt the same computational and evolutionary commitments.

Consistent with the mind-as-Swiss-army-knife view, the algorithm approach proposes that coherent probability judgment derives from a simplified form of Bayes' theorem. The proposed algorithm computes the number of cases where the hypothesis and observation co-occur, N(H and D), out of the total number of cases where the observation occurs, N(H and D) + N(not-H and D) = N(D) (Gigerenzer & Hoffrage 1995; Kleiter 1994). Because this form of Bayes' theorem expresses a simple ratio of frequencies, we refer to it as “the Ratio.”
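That the Ratio is an exact rewriting of Bayes' theorem can be checked mechanically. In this sketch (function names ours), exact rational arithmetic confirms that the two forms agree on the mammography numbers:

```python
from fractions import Fraction

def ratio_form(n_h_and_d, n_not_h_and_d):
    """The Ratio: N(H and D) / [N(H and D) + N(not-H and D)]."""
    return Fraction(n_h_and_d, n_h_and_d + n_not_h_and_d)

def probability_form(p_h, p_d_given_h, p_d_given_not_h):
    """Full Bayes: P(D|H)P(H) / [P(D|H)P(H) + P(D|not-H)P(not-H)]."""
    numerator = p_d_given_h * p_h
    return numerator / (numerator + p_d_given_not_h * (1 - p_h))

# Mammography numbers: 10/1,000 base-rate, 8/10 hit-rate, 95/990 false alarms.
counts = ratio_form(8, 95)
probs = probability_form(Fraction(10, 1000), Fraction(8, 10), Fraction(95, 990))
assert counts == probs == Fraction(8, 103)
print(counts)  # 8/103
```

The difference between the two forms is thus not mathematical but representational: the count version requires only one addition and one division.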

Following the mind-as-Swiss-army-knife view, proponents of this approach have ascribed the origin of the Bayesian ratio to evolution. Gigerenzer and Hoffrage (1995, p. 686), for example, state:

The evolutionary argument that cognitive algorithms were designed for frequency information, acquired through natural sampling, has implications for the computations an organism needs to perform when making Bayesian inferences …. Bayesian algorithms are computationally simpler when information is encoded in a frequency format rather than a standard probability format.

As a consequence, the algorithm view predicts that “Performance on frequentist problems will satisfy some of the constraints that a calculus of probability specifies, such as Bayes' rule. This would occur because some inductive reasoning mechanisms in our cognitive architecture embody aspects of a calculus of probability” (Cosmides & Tooby 1996, p. 17).

The proposed algorithm is necessarily informationally encapsulated, as it operates on a specific information format – natural frequencies; but it is not necessarily cognitively impenetrable, as no one has claimed that other cognitive processes cannot affect or cannot use the algorithm's computations. The primary motivation for the existence of this algorithm has been computational (Gigerenzer & Hoffrage 1995; Kleiter 1994). As reviewed above, the value of natural frequencies is that these formats entail the sample and effect sizes and, as a consequence, simplify the calculation of Bayes' theorem: Probability judgments are coherent with Bayesian prescriptions even without explicit consideration of base-rates.

1.2.3. Natural frequency heuristic

A claim which puts facilitation under more cognitive control is that people use heuristics to make judgments (Gigerenzer & Selten 2001; Tversky & Kahneman 1974) and that the Ratio is one such heuristic (Gigerenzer et al. 1999). According to this view, “heuristics can perform as well, or better, than algorithms that involve complex computations …. The astonishingly high accuracy of these heuristics indicates their ecological rationality; fast and frugal heuristics exploit the statistical structure of the environment, and they are adapted to this structure” (Gigerenzer 2006). Advocates of this approach motivate the proposed heuristic by pointing to the ecological validity of natural frequency formats, as Gigerenzer further states (p. 52):

To evaluate the performance of the human mind, one needs to look at its environment and, in particular, the external representation of the information. For most of the time during which the human mind evolved, information was encountered in the form of natural frequencies …

Thus, this view proposes that the mind evolved to process natural frequencies and that this evolutionary adaptation gave rise to the proposed heuristic that computes the Bayesian Ratio from natural frequencies.

1.2.4. Non-evolutionary natural frequency heuristic

Evolutionary arguments about the ecological validity of natural frequency representations provide part of the motivation for the preceding theories. In particular, proponents of the theories argue that throughout the course of human evolution natural frequencies were acquired via natural sampling (i.e., encoding event frequencies as they are encountered, and storing them in the appropriate reference class).

In contrast, the non-evolutionary natural frequency theory proposes that natural sampling is not necessarily an evolved procedure for encoding statistical regularities in the environment, but rather a useful sampling method that, one way or another, people can appreciate and use. The natural frequency representations that result from natural sampling, on this view, simplify the calculation of Bayes' theorem and, as a consequence, facilitate Bayesian inference (Kleiter 1994). Thus, this non-evolutionary view differs from the preceding accounts by resting on a purely computational argument that is independent of any commitments as to which cognitive processes have been selected for by evolution.

This theory proposes that the computational simplicity afforded by natural frequencies gives rise to a heuristic that computes the Bayesian Ratio from natural frequencies. The proposed heuristic implies a higher degree of cognitive control than the preceding modular algorithms.

1.2.5. Nested sets and dual processes

The most extreme departure from the modular view claims that facilitation is a product of general-purpose reasoning processes (Evans et al. 2000; Fox & Levav 2004; Girotto & Gonzalez 2001; Johnson-Laird et al. 1999; Kahneman & Frederick 2002; 2005; Over 2003; Reyna 1991; Sloman et al. 2003). In this view, people use two systems to reason (Evans & Over 1996; Kahneman & Frederick 2002; 2005; Reyna & Brainerd 1994; Sloman 1996a; Stanovich & West 2000), often called Systems 1 and 2. But in an effort to use more expressive labels, we will employ Sloman's terms “associative” and “rule-based.”

The dual-process model attributes responses based on associative principles like similarity or retrieval from memory to a primitive associative judgment system. It attributes responses based on more deliberative processing that involves working memory, such as the elementary set operations that respect the logic of set inclusion and facilitate Bayesian inference, to a second rule-based system. Judgmental errors produced by cognitive heuristics are generated by associative processes, whereas the induction of a representation of category instances that makes nested set relations transparent also induces use of rules about elementary set operations – operations of the sort perhaps described by Fox and Levav (2004) or Johnson-Laird et al. (1999).

According to this theory, base-rate neglect results from associative responding, and facilitation occurs when people correctly use rules to make the inference. Rule-based inference is more cognitively demanding than associative inference, and is therefore more likely to occur when participants have more time, more incentives, or more external aids to make a judgment, and are under fewer other demands at the moment of judgment. It is also more likely for people who have greater skill in employing the relevant rules. This last prediction is supported by Stanovich and West (2000), who find correlations between intelligence and use of base-rates.

Rules are effective devices for solving a problem to the extent that the problem is represented in a way compatible with the rules. For example, long division is an effective method for solving division problems, but only if numbers are represented using Arabic numerals; division with Roman numerals requires different rules. By analogy, this view proposes that natural frequencies facilitate use of base-rates because the rules people have access to and are able to use to solve the specific kind of problem studied in the base-rate neglect literature are more compatible with natural frequency formats than single-event probability formats.

Specifically, people are adept at using rules consisting of simple elementary set operations. But these operations are only applicable when problems are represented in terms of sets, as opposed to single events (Reyna 1991; Reyna & Brainerd 1995). According to this view, facilitation in Bayesian inference occurs under natural frequencies because these formats are an effective cue to the representation of the set structure underlying a Bayesian inference problem. This is the nested sets hypothesis of Tversky and Kahneman (1983). In this framework, natural frequency formats prompt the respondent to adopt an outside view by inducing a representation of category instances (e.g., 10 out of 1,000 women have breast cancer) that reveals the set structure of the problem and makes the nested set relations transparent for problem solving (see Footnote 3). We refer to this hypothesis as the nested sets theory (Ayton & Wright 1994; Evans et al. 2000; Fox & Levav 2004; Girotto & Gonzalez 2001; 2002; Johnson-Laird et al. 1999; Macchi 2000; Mellers & McGraw 1999; Reyna 1991; Sloman et al. 2003; Tversky & Kahneman 1983). Unlike the other theories, it predicts that facilitation should be observable in a variety of different tasks, not just posterior probability problems, when nested set relations are made transparent.

2. Overview of empirical and conceptual issues reviewed

We now turn to an evaluation of these five theoretical frameworks. We evaluate a range of empirical and conceptual issues that bear on the validity of these frameworks.

2.1. Review of empirical literature

The theories are evaluated with respect to the empirical predictions summarized in Table 2. The predictions of each theory derive from (1) the degree of cognitive control attributed to probability judgment (see Table 1), and (2) the proposed cognitive operations that underlie estimates of probability.

Table 2. Empirical predictions of the five theoretical frameworks

Note. The predictions of each theory are indicated by an ‘X.’

Theories that adopt a low degree of cognitive control – proposing cognitively impenetrable modules or informationally encapsulated algorithms – restrict Bayesian inference to contexts that satisfy the assumptions of the processing module or algorithm. In contrast, theories that adopt a high degree of cognitive control – appealing to a natural frequency heuristic or a domain general capacity to perform set operations – predict Bayesian inference in a wider range of contexts. The latter theories are distinguished from one another in terms of the cognitive operations they propose: The evolutionary and non-evolutionary natural frequency heuristics depend on structural features of the problem, such as question form and reference class. They imply the accurate encoding and comprehension of natural frequencies and an accurate weighting of the encoded event frequencies to calculate the Bayesian ratio. In contrast, the nested sets theory does not rely on natural frequencies and, instead, predicts facilitation in Bayesian inference, and in a range of other deductive and inductive reasoning tasks, when the set structure of the problem is made transparent, thereby promoting use of elementary set operations and inferences about the logical (i.e., extensional) properties they entail.

2.2. Information format and judgment domain

The preceding review of the literature found that natural frequency formats consistently reduced base-rate neglect relative to probability formats. However, the size of this effect varied considerably across studies (see Table 3).

Table 3. Percent correct for Bayesian inference problems reported in the literature (sample sizes in parentheses)

Note. Probability problems require that the respondent compute a conditional-event probability from data presented in a non-partitive form, whereas frequency problems include questions that prompt the respondent to evaluate the two terms of the Bayesian ratio and present data that is partitioned into these components.


Cosmides and Tooby (1996), for example, observed a 60-percentage-point difference between the proportions of Bayesian responses under natural frequencies versus single-event probabilities, whereas Gigerenzer and Hoffrage (1995) reported a difference only half that size. The wide variability in the size of the effects makes it clear that in no sense do natural frequencies eliminate base-rate neglect, though they do reduce it.

Sloman et al. (2003) conducted a series of experiments that attempted to replicate the effect sizes observed in the previous studies (e.g., Cosmides & Tooby 1996, Experiment 2, Condition 1). Although Sloman et al. found facilitation with natural frequencies, the size of the effect was smaller than that observed by Cosmides and Tooby: The percentage of Bayesian solutions generated under single-event probabilities (20%) was comparable to Cosmides and Tooby's (12%), but the percentage of Bayesian answers generated under natural frequencies was smaller (51% for Sloman et al. versus 72%). In a further replication, Sloman et al. found that only 31% of their respondents generated the Bayesian solution, a statistically non-significant advantage for natural frequencies.

Evans et al. (2000, Experiment 1) similarly found only a small effect of information format. They report 24% Bayesian solutions under single-event probabilities and 35% under natural frequencies, a difference that was not reliable.

Brase et al. (2006) examined whether methodological factors contribute to the observed variability in effect size. They identified two factors that modulate the facilitatory effect of natural frequencies in Bayesian inference: (1) the academic selectivity of the university the participants attend, and (2) whether or not the experiment offered a monetary incentive for participation. Experiments whose participants attended a top-tier national university and were paid reported a significantly higher proportion of Bayesian responses (e.g., Cosmides & Tooby 1996) than experiments whose participants attended a second-tier regional university and were not paid (e.g., Brase et al. 2006, Experiments 3 and 4). These results suggest that a higher proportion of Bayesian responses is observed in experiments that (a) select participants with a higher level of general intelligence, as indexed by the academic selectivity of the university the participant attends (Stanovich & West 1998a), and (b) increase motivation by providing a monetary incentive. The former observation is consistent with the view that Bayesian inference depends on domain-general cognitive processes to the degree that intelligence is domain-general. The latter suggests that Bayesian inference is strategic, and not supported by automatic (e.g., modularized) reasoning processes.

2.3. Question form

One methodological factor that may mediate the effect of problem format is the form of the Bayesian inference question presented to participants (Girotto & Gonzalez 2001). The Bayesian solution expresses the ratio between the size of the subset of cases in which the hypothesis and observation co-occur and the total number of observations. Thus, it follows that the respondent should be more likely to arrive at this solution when prompted to adopt an outside view by utilizing the sample of category instances presented in the problem (e.g., “Here is a new sample of patients who have obtained a positive test result in routine screening. How many of these patients do you expect to actually have the disease? __ out of __”) versus a question that presents information about category properties (e.g., “… Pierre has a positive reaction to the test …”) and prompts the respondent to adopt an inside view by considering the fact about Pierre to compute a probability estimate. As a result, the form of the question should modulate the observed facilitation.

In the preceding studies, however, information format and judgment domain were confounded with question form: Only problems that presented natural frequencies prompted use of the sample of category instances presented in the problem to compute the two terms of the Bayesian solution, whereas single-event probability problems prompted the use of category properties to compute a conditional probability.

To dissociate these factors, Girotto and Gonzalez (2001) proposed that single-event probabilities (e.g., 1%) can be represented as chances (e.g., “one chance out of 100”). Under the chance formulation of probability, the respondent can be asked either for the standard conditional probability or for values that correspond more closely to the ratio expressed by Bayes' theorem. The latter question asks the respondent to evaluate the chances that Pierre has a positive test for a particular infection, out of the total chances that Pierre has a positive test, thereby prompting consideration of the chances that Pierre, who could be anyone with a positive test in the sample, has the infection. In addition to encouraging an outside view by prompting the respondent to represent the sample of category instances presented in the problem, this question prompts the computation of the Bayesian ratio in two clearly defined steps: first calculate the overall number of chances where the conditioning event is observed, then compare this quantity to the number of chances where the conditioning event is observed in the presence of the hypothesis.

To evaluate the role of question form in Bayesian inference, Girotto and Gonzalez (2001, Study 1) conducted an experiment that manipulated question form independently of information format and judgment domain. The authors presented the following Bayesian inference scenario to 80 college undergraduates of the University of Provence, France:

A person who was tested had 4 chances out of 100 of having the infection. 3 of the 4 chances of having the infection were associated with a positive reaction to the test. 12 of the remaining 96 chances of not having the infection were also associated with a positive reaction to the test (Girotto & Gonzalez 2001, p. 253).

Half of the respondents were then asked to compute a conditional probability (i.e., “If Pierre has a positive reaction, there will be __ chance(s) out of __ that the infection is associated with his positive reaction”), whereas the remaining respondents were asked to evaluate the ratio of probabilities expressed in the Bayesian solution (i.e., “Imagine that Pierre is tested now. Out of the total 100 chances, Pierre has __ chances of having a positive reaction, __ of which will be associated with having the infection”).

Girotto and Gonzalez (2001) found that only 8% of the respondents generated the Bayesian solution when asked to compute a conditional probability, consistent with the earlier literature. But the proportion of Bayesian answers increased to 43% when the question prompted the respondent to evaluate the two terms of the Bayesian solution. The same pattern was observed with the natural frequency format problem. Only 18% of the respondents generated the Bayesian solution when asked to compute a conditional frequency, whereas this proportion increased to 58% when asked to evaluate the two terms separately. This level of performance is comparable to that observed under standard natural frequency formats (e.g., Gigerenzer & Hoffrage 1995), and supports Girotto and Gonzalez's claim that the two-step question approximates the question asked with standard natural frequency formats. In further support of Girotto and Gonzalez's predictions, there were no reliable effects of information format or judgment domain across all the reported comparisons.
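The arithmetic that both questions target can be checked with a short sketch; the variable names and the explicit two-step decomposition are ours, not part of the original materials:

```python
# Girotto & Gonzalez (2001, Study 1) chance problem, expressed as counts
# out of 100 chances (variable names are ours, for illustration only).
infected_positive = 3      # 3 of the 4 infection chances give a positive test
uninfected_positive = 12   # 12 of the 96 no-infection chances also test positive

# Two-term ("outside view") question: first total the positive chances,
# then count how many of those are associated with the infection.
positive_chances = infected_positive + uninfected_positive   # 15 out of 100
of_which_infected = infected_positive                        # 3 of those 15

# Standard conditional-probability question: the same ratio as one number.
p_infection_given_positive = of_which_infected / positive_chances  # 3/15 = 0.2
```

Both question forms thus demand the same quantity; they differ only in whether the two terms of the ratio are elicited separately or folded into a single conditional probability.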

These findings suggest that people are not predisposed against using single-event probabilities but instead appear to be highly sensitive to the form of the question: when asked to reason about category instances to compute the two terms of the Bayesian ratio, respondents were able to derive the normative solution under single-event probabilities. Facilitation in Bayesian inference under natural frequencies need not imply that the mind is designed to process these formats; it can instead be attributed to the facilitatory effect of prompting use of the sample of category instances presented in the problem to evaluate the two terms of the Bayesian ratio.

2.4. Reference class

To assess the role of problem structure in Bayesian inference, we review studies that have manipulated structural features of the problem. Girotto and Gonzalez (2001) report two experiments that systematically assess performance under different partitionings of the data: defective frequency partitions and non-partitive frequency problems. Consider the following medical diagnosis problem, which presents natural frequencies under what Girotto and Gonzalez (2001, Study 5) term a defective partition:

4 out of 100 people tested were infected. 3 of the 4 infected people had a positive reaction to the test. 84 of the 96 uninfected people did not have a positive reaction to the test. Imagine that a group of people is now tested. In a group of 100 people, one can expect __ individuals to have a positive reaction, __ of whom will have the infection.

In contrast to the standard partitioning of the data under natural frequencies, here the frequency of uninfected people who did not have a positive reaction to the test is reported, instead of the frequency of uninfected people with positive reactions. As a result, to derive the Bayesian solution, the first value must be subtracted from the total population of uninfected individuals to obtain the desired value (96 − 84 = 12), and the result can then be used to determine the proportion of infected, positive people out of the total number of people who obtain a positive test (3/15 = 0.2). Although this problem exhibits a partitive structure, Girotto and Gonzalez predicted that the defective partitioning of the data would produce a greater proportion of errors than the standard data partitioning, because the former requires an additional computation. Consistent with this prediction, only 35% of respondents generated the Bayesian solution, whereas 53% did so under the standard data partitioning. Nested set relations were more likely to facilitate Bayesian reasoning when the data were partitioned into the components that are needed to generate the solution.
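A minimal sketch of the extra computation the defective partition forces on the reasoner (variable names are ours):

```python
# Defective partition (Girotto & Gonzalez 2001, Study 5): the frequency of
# uninfected people with a positive reaction is not given directly and must
# first be recovered by subtraction. Variable names are ours.
infected_positive = 3       # 3 of the 4 infected people tested positive
uninfected_total = 96       # 96 of the 100 people were uninfected
uninfected_negative = 84    # 84 uninfected people did NOT test positive

# The additional step the defective partition requires:
uninfected_positive = uninfected_total - uninfected_negative   # 96 - 84 = 12

# Bayesian solution: infected positives out of all positives.
p = infected_positive / (infected_positive + uninfected_positive)  # 3/15 = 0.2
```

The final ratio is identical to the standard partition's; only the intermediate subtraction is added, which is apparently enough to depress performance.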

Girotto and Gonzalez (2001, Study 6) also assessed performance under natural frequency formats that were not partitioned into nested set relations (i.e., unpartitioned frequencies). As in the case of standard natural frequency format problems (e.g., Cosmides & Tooby 1996), these multiple-sample problems employed natural frequencies and prompted the respondent to compute the two terms of the Bayesian solution. Such a problem must be treated in the same way as a single-event probability problem (i.e., using the conditional probability and additivity laws) to determine the two terms of the Bayesian solution. Girotto and Gonzalez therefore predicted that performance under multiple samples would be poor, approximating the performance observed under standard probability problems. As predicted, none of the respondents generated the Bayesian solution under the multiple sample or standard single-event probability frames. Natural frequency formats facilitate Bayesian inference only when they partition the data into the components needed to draw the Bayesian solution.

Converging evidence is provided by Macchi (2000), who presented Bayesian inference problems in either a partitive or non-partitive form. Macchi found that only 3% of respondents generated the Bayesian solution when asked to evaluate the two terms of the Bayesian ratio with non-partitive frequency problems. Similarly, only 6% of the respondents generated the Bayesian solution when asked to compute a conditional probability under non-partitive probability formats (see also Sloman et al. 2003, Experiment 4). But when problems were presented in a partitive formulation and respondents were asked to evaluate the two terms of the Bayesian ratio, the proportions increased to 40% under partitive natural frequency formats, 33% under partitive single-event probabilities, and 36% under the modified partitive single-event probability problems. The findings reinforce the nested sets view that information structure is the factor determining predictive accuracy.

To further explore the contribution of information structure and question form in Bayesian inference, Sloman et al. (2003) assessed performance using a conditional chance question. In contrast to the standard conditional probability question that presents information about a particular individual (e.g., “Pierre has a positive reaction to the test”), their conditional probability question asked the respondent to evaluate “the chance that a person found to have a positive test result actually has the disease.” This question requests the probability of an unknown category instance and therefore prompts the respondent to consult the data presented in the problem to assess the probability that this person, who could be any randomly chosen person with a positive result in the sample, has the disease. In Experiment 1, Sloman et al. looked for facilitation in Bayesian inference on a partitive single-event probability problem by prompting use of the sample of category instances presented in the problem to compute a conditional probability, as the nested sets hypothesis predicts. Forty-eight percent of the 48 respondents tested generated the Bayesian solution, demonstrating that making partitive structure transparent facilitates Bayesian inference.

In summary, the reviewed findings suggest that when the data are partitioned into the components needed to arrive at the solution and participants are prompted to use the sample of category instances in the problem to compute the two terms of the Bayesian ratio, the respondent is more likely to (1) understand the question, (2) see the underlying nested set structure by partitioning the data into exhaustive subsets, and (3) select the pieces of evidence that are needed for the solution. According to the nested sets theory, accurate probability judgments derive from the ability to perform elementary set operations whose computations are facilitated by external cues (for recent developmental evidence, see Girotto & Gonzalez, in press).

2.5. Diagrammatic representations

Sloman et al. (2003, Experiment 2) explored whether Euler circles, which were employed to construct a nested set structure for standard non-partitive single-event probability problems (e.g., Cosmides & Tooby 1996), would facilitate Bayesian inference (see Fig. 1). These authors found that 48% of the 25 respondents tested generated the Bayesian solution when presented non-partitive single-event probability problems with an Euler diagram that depicted the underlying nested set relations. This finding demonstrates that the set structure of standard non-partitive single-event probability problems can be represented by Euler diagrams to produce facilitation. Supporting data can be found in Yamagishi (2003), who used diagrams to make nested set relations transparent in other inductive reasoning problems. Similar evidence is provided by Bauer and Johnson-Laird (1993) in the context of deductive reasoning.

Figure 1. A diagrammatic representation of Bayes' theorem: Euler circles (Sloman et al. 2003).

2.6. Accuracy of frequency judgments

Theories based on natural frequency representations (i.e., the mind-as-Swiss-army-knife, natural frequency algorithm, natural frequency heuristic, and non-evolutionary natural frequency heuristic theories) propose that “the mind is a frequency monitoring device” and that the cognitive algorithm that computes the Bayesian ratio encodes and processes event frequencies in naturalistic settings (Gigerenzer 1993, p. 300). The literature evaluating the encoding and retrieval of event frequencies is extensive, and includes assessments of frequency judgments in well-controlled laboratory settings based on relatively simple and distinct stimuli (e.g., letters, pairs of letters, or words), and in naturalistic settings in which respondents report the frequency of their own behaviors (e.g., the medical diagnosis of patients). Laboratory studies tend to find that frequency judgments are surprisingly accurate (for a recent review, see Zacks & Hasher 2002), whereas naturalistic studies often find systematic errors in frequency judgments (see Bradburn et al. 1987). Recent efforts have been made to integrate these findings under a unified theoretical framework (e.g., Schwartz & Sudman 1994; Schwartz & Wanke 2002; Sedlmeier & Betsch 2002).

Are frequency judgments relatively accurate under the naturalistic settings described by standard Bayesian inference problems? Bayesian inference problems tend to involve hypothetical situations that, if real, would be based on autobiographical memories encoded under naturalistic conditions, such as the standard medical diagnosis problem in which a particular set of patients is hypothetically encountered (cf. Sloman & Over 2003). Hence, the present review focuses on the accuracy of frequency judgments for the autobiographical events alluded to by standard Bayesian inference problems (see sects. 2.1, 2.2, and 2.3) to assess whether Bayesian inference depends on the accurate encoding of autobiographical events.

Gluck and Bower (1988) conducted an experiment that employed a learning paradigm to assess the accuracy of frequency judgments in medical diagnosis. The respondents in the experiment learned to diagnose a rare (25%) or a common (75%) disease on the basis of four potential symptoms exhibited by the patient (e.g., stomach cramps, discolored gums). During the learning phase, the respondents diagnosed 250 hypothetical patients and in each case were provided feedback on the accuracy of their diagnosis. After the learning phase, the respondents estimated the relative frequency of patients who had the diseases given each symptom. Gluck and Bower found that relative frequency estimates of the disease were determined by the diagnosticity of the symptom (the degree to which the respondent perceived that the symptom provided useful information in diagnosing the disease) and not by the base-rate frequencies of the disease. These findings were replicated by Estes et al. (1989, Experiment 1) and Nosofsky et al. (1992, Experiment 1).

Bradburn et al. (1987) evaluated the accuracy of autobiographical memory for event frequencies by employing a range of surveys that assessed quantitative facts, such as “During the last two weeks, on days when you drank liquor, about how many drinks did you have?” These questions require the simple recall of quantitative facts, in which the respondent “counts up how many individuals fall within each category” (Cosmides & Tooby 1996, p. 60). Recalling the frequency of drinks consumed over the last two weeks, for example, is based on counting the total number of individual drinking occasions stored in memory.

Bradburn et al. (1987) found that autobiographical memory for event frequencies exhibits systematic errors characterized by (a) the failure to recall an entire event or the loss of details associated with a particular event (e.g., Linton 1975; Wagenaar 1986), (b) the combining of similar but distinct events into a single generalized memory (e.g., Linton 1975; 1982), or (c) the inclusion of events that did not occur within the reference period specified in the question (e.g., Pillemer et al. 1986). As a result, Bradburn et al. propose that the observed frequency judgments do not reflect the accurate encoding of event frequencies, but instead entail a more complex inferential process that typically operates on the basis of incomplete, fragmentary memories that do not preserve base-rate frequencies.

These findings suggest that the observed facilitation in Bayesian inference under natural frequencies cannot be explained by an (evolved) capacity to encode natural frequencies. Apparently, people don't have that capacity.

2.7. Comprehension of formats

Advocates of the nested sets view have argued that the facilitation of Bayesian inference under natural frequencies can be fully explained via elementary set operations that deliver the same result as Bayes' theorem, without appealing to an (evolved) capacity to process natural frequencies (e.g., Johnson-Laird et al. 1999). The question therefore arises whether the ease of processing natural frequencies goes beyond the reduction in computational complexity of Bayes' theorem that they provide (Brase 2002a). To assess this issue, we review evidence that evaluates whether natural frequencies are understood more easily than single-event probabilities.

Brase (2002b) conducted a series of experiments to evaluate the relative clarity and ease of understanding of a range of statistical formats, including natural frequencies (e.g., 1 out of 10) and percentages (e.g., 10%). Brase distinguished natural frequencies that have a natural sampling structure (e.g., 1 out of 10 have the property, 9 out of 10 do not) from “simple frequencies” that refer to single numerical relations (e.g., 1 out of 10 have the property). This distinction, however, is not entirely consistent with the literature, as natural frequency theorists have often used single numerical statements for binary hypotheses to express natural frequencies (e.g., Zhu & Gigerenzer 2006). In any case, for binary hypotheses the natural sampling structure can be directly inferred from simple frequencies. If we observe, for example, that I win the weekly poker game “1 out of 10 nights,” we can infer that I lose “9 out of 10 nights” and construct a natural sampling structure that represents the size of the reference class and is arranged into subset relations. Thus, single numerical statements of this type have a natural sampling structure, and we therefore refer to Brase's “simple frequencies” as natural frequencies in the following discussion.
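For a binary hypothesis, the inference from a simple frequency to the full natural sampling partition is mechanical, as a one-line sketch shows (the function name is ours):

```python
# For a binary hypothesis, a "simple frequency" such as "1 out of 10"
# fixes the whole natural sampling partition: the complement is simply
# the remainder of the reference class. Function name is ours.
def natural_partition(successes, reference_class):
    """Return (successes, failures) implied by a simple frequency."""
    return successes, reference_class - successes

# "I win the weekly poker game 1 out of 10 nights" implies losing 9 of 10.
wins, losses = natural_partition(1, 10)
```

This is why simple frequencies over binary hypotheses carry the same information as fully partitioned natural frequencies.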

Percentages express single-event probabilities in that they are normalized to an arbitrary reference class (e.g., 100) and can refer to the likelihood of a single event (Brase 2002b; Gigerenzer & Hoffrage 1995). We therefore examine whether natural frequencies are understood more easily and have a greater impact on judgment than percentages.

To test this prediction, Brase (2002b, Experiment 1) assessed the relative clarity of statistical information presented in a natural frequency format versus a percentage format at small, intermediate, and large magnitudes. Respondents received four statements in one statistical format, each statement at a different magnitude, and rated the clarity, impressiveness, and “monetary pull” of the presented statistics on a 5-point scale. Example questions are shown in Table 4.

Table 4. Example questions presented by Brase (2002b)

Brase (2002b) found that across all statements and magnitudes both natural frequencies and percentages were rated as “Very Clear,” with average ratings of 3.98 and 3.89, respectively. These ratings were not reliably different, demonstrating that percentages are perceived to be as clear and understandable as natural frequencies. Furthermore, Brase found no reliable differences in the impressiveness ratings (from question 2) of natural frequencies and percentages at intermediate and large statistical magnitudes, suggesting that these formats are typically viewed as equally impressive. A significant difference between the formats was observed, however, at low statistical magnitudes: on average, natural frequencies were rated as “Impressive,” whereas percentages were viewed as “Fairly Impressive.” This difference in impressiveness ratings at low magnitudes did not accord with respondents' monetary pull ratings (their willingness to allocate funds to support research studying the issue at hand), which were approximately equal for the two formats across all statements and magnitudes. Hence the difference in the impressiveness ratings at low magnitudes does not denote a difference in people's willingness to act.

These data are consistent with the conclusion that percentages and natural frequency formats (a) are perceived equally clearly and are equally understandable; (b) are typically viewed as equally impressive (i.e., at intermediate and large statistical magnitudes); and (c) have the same degree of impact on behavior. Natural frequency formats do apparently increase the perceptual contrast of small differences. Overall, however, the two formats are perceived similarly, suggesting that the mind is not designed to process natural frequency formats over single-event probabilities.

2.8. Are base-rates and likelihood ratios equally weighted?

Does the facilitation of Bayesian inference under natural frequencies entail that the mind naturally incorporates this information according to Bayes' theorem, or merely that elementary set operations can be readily computed from problems that are structured in a partitive form? Natural frequencies preserve the sample size of the reference class and are arranged into subset relations that preserve the base-rates. As a result, judgments based on these formats incorporate the sample and effect sizes; the respondent need not calculate them. To assess whether the cognitive operations that underlie Bayesian inference are consistent with the application of Bayes' theorem, we review studies that evaluate how respondents derive Bayesian solutions.
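To illustrate why natural frequencies spare the reasoner the normalization steps of Bayes' theorem, the following sketch contrasts the two routes to the same posterior. The counts are our rounding of the mammography problem of Gigerenzer and Hoffrage (1995), and the variable names are ours:

```python
# Natural frequency route: raw counts, no base-rate recombination needed.
# (Counts in the style of Gigerenzer & Hoffrage's 1995 mammography problem,
# rounded to whole women out of 1,000; variable names are ours.)
disease_positive = 8       # 8 of the 10 women with the disease test positive
no_disease_positive = 95   # ~95 of the 990 women without it also test positive
freq_answer = disease_positive / (disease_positive + no_disease_positive)

# Probability route: base rate and likelihoods must be explicitly recombined.
base_rate = 0.01           # P(disease)
hit_rate = 0.8             # P(positive | disease)
false_alarm = 95 / 990     # P(positive | no disease)
bayes_answer = (base_rate * hit_rate) / (
    base_rate * hit_rate + (1 - base_rate) * false_alarm)
```

Both routes yield the same posterior (roughly 0.078), but the frequency route requires only counting and a single division, because the counts already embody the base rates.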

Griffin and Buehler (1999) employed the classic lawyer-engineer paradigm developed by Kahneman and Tversky (1973), involving personality descriptions randomly drawn from a population of either 70 engineers and 30 lawyers or 30 engineers and 70 lawyers. Participants' task in this paradigm is to predict whether the description was taken from an engineer or a lawyer (e.g., “My probability that this man is one of the engineers in this sample is __%”). Kahneman and Tversky's original findings demonstrated that respondents consistently relied upon category properties (i.e., how representative the personality description is of an engineer or a lawyer) to guide their judgment, without fully incorporating information about the population base-rates (for a review, see Koehler 1996). However, when the base-rates were presented via a counting procedure that induces a frequentist representation of each population and the respondent was asked to generate a natural frequency prediction (e.g., “I would expect that __ out of the 10 descriptions would be engineers”), base-rate usage increased (Gigerenzer et al. 1988).

To assess whether the observed increase in base-rate usage reflects the operation of a Bayesian algorithm that is designed to process natural frequencies, Griffin and Buehler (1999) evaluated whether participants derived the solution by utilizing event frequencies according to Bayes' theorem. This was accomplished by first collecting estimates of each of the components of Bayes' theorem in odds form: respondents estimated (a) the probability that the personality description was taken from the population of engineers or lawyers; (b) the degree to which the personality description was representative of these populations; and (c) the perceived population base-rates. Each of these estimates was then divided by its complement to yield the posterior odds, likelihood ratio, and prior odds, respectively. Theories based on the Bayesian ratio predict that under frequentist representations, the likelihood ratios and prior odds will be weighted equally (Griffin & Buehler 1999).
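The odds form of Bayes' theorem, which underlies the equal-weighting prediction, can be sketched as follows; the example numbers are invented for illustration:

```python
import math

# Odds form of Bayes' theorem: posterior odds = likelihood ratio x prior odds.
# Taking logs makes the rule additive, with both terms weighted equally
# (coefficients of 1), which is the equality tested by regression.
def posterior_odds(prior_odds, likelihood_ratio):
    return likelihood_ratio * prior_odds

# Invented example: prior odds 30:70 that a description is an engineer,
# and a personality sketch 4 times as likely for an engineer as for a lawyer.
prior = 30 / 70
lr = 4.0
post = posterior_odds(prior, lr)

# Equal weighting in log-odds space (both coefficients exactly 1 under Bayes):
equal_weighting = math.isclose(
    math.log(post), 1.0 * math.log(lr) + 1.0 * math.log(prior))
```

Regressing judged log posterior odds on log likelihood ratios and log prior odds therefore tests whether people behave as if both coefficients equal 1; the β values of 0.62 and 0.39 reported below depart from that prediction.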

Griffin and Buehler evaluated this prediction by conducting a regression analysis using the respondents' estimated likelihood ratios and prior odds to predict their posterior probability judgments (cf. Keren & Thijs 1996). Consistent with the observed increase in base-rate usage under frequentist representations (Gigerenzer et al. 1988), Griffin and Buehler (1999, Experiment 3b) found that the prior odds (i.e., the base-rates) were weighted more heavily than the likelihood ratios, with corresponding regression weights (β values) of 0.62 and 0.39. The failure to weight them equally violates Bayes' theorem. Although frequentist representations may enhance base-rate usage, they apparently do not induce the operation of a mental analogue of Bayes' theorem.

Further support for this conclusion is provided by Evans et al. (2002), who conducted a series of experiments demonstrating that probability judgments do not reflect equal weighting of the prior odds and likelihood ratio. Evans et al. (2002, Experiment 5) employed a paradigm that extended the classic lawyer-engineer experiments by assessing Bayesian inference under conditions where the base-rates are supplied by commonly held beliefs and only the likelihood ratios are explicitly provided. These authors found that when prior beliefs about the base-rate probabilities were rated immediately before the presentation of the problem, the prior odds (i.e., the base-rates) were weighted more heavily than the likelihood ratios, with corresponding regression weights (β values) of 0.43 and 0.19.

Additional evidence supporting this conclusion is provided by Kleiter et al. (1997), who found that participants assessing event frequencies in a medical diagnosis setting employed statistical evidence that is irrelevant to the calculation of Bayes' theorem. Kleiter et al. (1997, Experiment 1) presented a list of event frequencies to respondents, which included those that were necessary for the calculation of Bayes' theorem (e.g., Pr(D | H)) and other statistics that were irrelevant (e.g., Pr(~D)). Participants were then asked to identify the event frequencies that were needed to diagnose the probability of the disease given the symptom (i.e., the posterior probability). Of the four college faculty and 26 graduate students tested, only three people made the optimal selection by identifying only the event frequencies required to calculate Bayes' theorem.

These data suggest that the mind does not utilize a Bayesian algorithm that “maps frequentist representations of prior probabilities and likelihoods onto a frequentist representation of a posterior probability in a way that satisfies the constraints of Bayes' theorem” (Cosmides & Tooby 1996, p. 60). Importantly, the findings that the prior odds and likelihood ratio are not equally weighted according to Bayes' theorem (Evans et al. 2002; Griffin & Buehler 1999) imply that Bayesian inference does not rely on Bayesian computations per se.

Thus, the findings are inconsistent with the mind-as-Swiss-army-knife, natural frequency algorithm, natural frequency heuristic, and non-evolutionary natural frequency heuristic theories, which propose that coherent probability judgment reflects the use of the Bayesian ratio. The finding that base-rate usage increases under frequentist representations (Evans et al. 2002; Griffin & Buehler 1999) supports the proposal that the facilitation in Bayesian inference from natural frequency formats is due to the capacity of these formats to induce a representation of category instances that preserves the sample and effect sizes. Such a representation clarifies the underlying set structure of the problem and makes the relevance of base-rates more obvious, without providing an equation that generates Bayesian quantities.

2.9. Convergence with disparate data

A unique characteristic of the dual process position is that it predicts that nested sets should facilitate reasoning whenever people tend to rely on associative rather than extensional, rule-based processes; facilitation should be observed beyond the context of Bayesian probability updating. The natural frequency theories expect facilitation only in the domain of probability estimation.

In support of the nested sets position, facilitation through nested set representations has been observed in a number of studies of deductive inference. Grossen and Carnine (1990) and Monaghan and Stenning (1998) reported significant improvement in syllogistic reasoning when participants were taught using Euler circles. The effect was restricted to participants who were “learning impaired” (Grossen & Carnine 1990) or had a low GRE score (Monaghan & Stenning 1998). Presumably, those who did not show improvement did not require the Euler circles because they were already representing the nested set relations.

Newstead (1989, Experiment 2) evaluated how participants interpreted syllogisms when represented by Euler circles versus quantified statements. Newstead found that although Gricean errors of interpretation occurred when syllogisms were represented by Euler circles and by quantified statements, the proportion of conversion errors, such as converting “Some A are not B” to “Some B are not A,” was significantly reduced in the Euler circle task. For example, less than 5% of the participants generated a conversion error for “Some … not” on the Euler circle task, whereas this error occurred on 90% of the responses for quantified statements.

Griggs and Newstead (1982) tested participants on the THOG problem, a difficult deductive reasoning problem involving disjunction. They obtained a substantial amount of facilitation by making the problem structure explicit, using trees. According to the authors, the structure is normally implicit due to negation, and the tree structure facilitates performance by cuing the formation of a mental model similar to that of nested sets.

Facilitation has also been obtained by making extensional relations more salient in the domain of categorical inductive reasoning. Sloman (1998) found that people who were told that all members of a superordinate category have some property (e.g., all flowers are susceptible to thrips) did not conclude that all members of one of its subordinates inherited the property (e.g., they did not assert that this guaranteed that all roses are susceptible to thrips). This was true even for those people who believed that roses are flowers. But if the assertion that roses are flowers was included in the argument, then people did abide by the inheritance rule, assigning a probability of one to the statement about roses. Sloman argued that this occurred because induction is mediated by similarity and not by class inclusion, unless the categorical (or set) relation is made transparent within the statements composing the argument (for an alternative interpretation, see Calvillo & Revlin 2005).

Facilitation in other types of probability judgment can also be obtained by manipulating the salience and structure of set relations. Sloman et al. (2003) found that almost no one exhibited the conjunction fallacy when the options were presented as Euler circles, a representation that makes set relations explicit. Fox and Levav (2004) and Johnson-Laird et al. (1999) also improved judgments on probability problems by manipulating the set structure of the problem.

2.10. Empirical summary and conclusions

In summary, the empirical review supports five main conclusions. First, the facilitatory effect of natural frequencies on Bayesian inference varied considerably across the reviewed studies (see Table 3), potentially reflecting differences in the general intelligence and motivation of participants (Brase et al. 2006). These findings support the nested sets hypothesis to the degree that intelligence and motivation reflect the operation of domain-general and strategic – rather than automatic (i.e., modular) – cognitive processes.

Second, questions that prompt the use of category instances and divide the solution into the sets needed to compute the Bayesian ratio facilitate probability judgment. This suggests that facilitation depends on cues to the set structure of the problem rather than on an (evolved) capacity to process natural frequencies. In further support of this conclusion, partitioning the data into nested sets facilitates Bayesian inference regardless of whether natural frequencies or single-event probabilities are employed (see Table 5).

Table 5. Percent correct for Bayesian inference problems reported in the literature (sample sizes in parentheses)

Note. Studies that present questions requiring the respondent to compute a conditional-event probability are indicated by an asterisk (*). The remaining studies present questions that prompt the respondent to compute the two terms of the Bayesian solution.

Third, frequency judgments are guided by inferential strategies that reflect incomplete, fragmentary memories that do not entail the base-rates (e.g., Bradburn et al. 1987; Gluck & Bower 1988). This suggests that Bayesian inference does not derive from the accurate encoding and retrieval of natural frequencies. In addition, natural frequencies and single-event probabilities are rated similarly in their perceived clarity, understandability, and impact on the respondent's behavior (Brase 2002b), further suggesting that the mind does not embody inductive reasoning mechanisms designed to process natural frequencies.

Fourth, people (a) do not accurately weight and combine event frequencies, and (b) utilize event frequencies that are irrelevant to the calculation of Bayes' theorem (e.g., Griffin & Buehler 1999; Kleiter et al. 1997). This suggests that the cognitive operations underlying Bayesian inference do not conform to Bayes' theorem. Furthermore, base-rate usage increases under frequentist representations (e.g., Griffin & Buehler 1999), suggesting that facilitation results from the fact that natural frequencies preserve the sample and effect sizes, which highlight the set structure of the problem and make transparent what is relevant for problem solving.

Finally, nested set representations facilitate reasoning in a range of classic deductive and inductive reasoning tasks. This supports the nested sets hypothesis that the mind embodies a domain-general capacity to perform elementary set operations, and that cues to the set structure of the problem can induce these operations to facilitate reasoning in any context where people would otherwise rely on associative rather than extensional, rule-based processes.

3. Conceptual issues

This section provides a conceptual analysis that addresses (1) the plausibility of the natural frequency assumptions, and (2) whether natural frequency representations support properties that are central to human inductive reasoning competence, including reasoning about statistical independence, estimating the probability of unique events, and reasoning on the basis of similarity, analogy, association, and causality.

3.1. Plausibility of natural frequency assumptions

The natural sampling framework was established by the seminal work of Kleiter (1994), who assessed "the correspondence between the constraints of the statistical model of natural sampling on the one hand, and the constraints under which human information is acquired on the other" (p. 376). Kleiter proved that under natural sampling and other conditions (e.g., independent identical sampling), the frequencies corresponding to the base-rates are redundant and can be ignored. Thus, conditions of natural sampling can simplify the calculation of the relevant probabilities and, as a consequence, facilitate Bayesian inference (see Note 2 of the target article). Kleiter's computational argument does not appeal to evolution and was advanced with careful consideration of the assumptions upon which natural sampling is based. Kleiter noted, for example, that the natural sampling framework (a) is limited to hypotheses that are mutually exclusive and exhaustive, and (b) depends on collecting a sufficiently large sample of event frequencies to reliably estimate population parameters.

Although people may sometimes treat hypotheses as mutually exclusive (e.g., "this person is a Democrat, so they must be anti-business"), this constraint is not always satisfied: many hypotheses are nested (e.g., "she has breast cancer" vs. "she has a particular type of breast cancer") or overlapping (e.g., "this patient is anxious or depressed"). People's causal models typically provide a wealth of knowledge about classes and properties, allowing consideration of many kinds of hypotheses that do not necessarily come in mutually exclusive, exhaustive sets. As a consequence, additional principles are needed to broaden the scope of the natural sampling framework to address probability estimates drawn from hypotheses that are not mutually exclusive and exhaustive. In this sense, the nested sets theory is more general: It can represent nested and overlapping hypotheses by taking the intersection (e.g., "she has breast cancer and it is type X") and union (e.g., "the patient is anxious or depressed") of sets, respectively.
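The set operations invoked here are elementary. A toy sketch, with hypothetical patient labels not drawn from the target article, shows how nested and overlapping hypotheses reduce to subset, intersection, and union relations:

```python
# Hypothetical patient labels (p1, p2, ...) used purely for illustration.
breast_cancer = {"p1", "p2", "p3", "p4"}
cancer_type_x = {"p2", "p3"}        # nested within breast_cancer
anxious = {"p1", "p5"}
depressed = {"p5", "p6"}

# Nested hypotheses: "she has breast cancer and it is type X" is an
# intersection, and nesting means the subset relation holds.
assert cancer_type_x <= breast_cancer
print(sorted(breast_cancer & cancer_type_x))  # ['p2', 'p3']

# Overlapping hypotheses: "the patient is anxious or depressed" is a union.
print(sorted(anxious | depressed))            # ['p1', 'p5', 'p6']
```

Nothing beyond ordinary set algebra is required, which is the sense in which the nested sets representation is more general than natural sampling.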

As Kleiter further notes, inferences about hypotheses from encoded event frequencies are warranted to the extent that the sample is sufficiently large and provides a reliable estimate of the population parameters. The efficacy of the natural sampling framework therefore depends on establishing (1) the approximate number of event frequencies that are needed for a reliable estimate, (2) whether this number is relatively stable or varies across contexts, and (3) whether or not people can encode and retain the required number of events.

3.2. Representing qualitative relations

In contrast to single-event probabilities, natural frequencies preserve information about the size of the reference class and, as a consequence, do not directly indicate whether an observation and hypothesis are statistically independent. For example, probability judgments drawn from natural frequencies do not tell us that a symptom present in (a) 640 out of 800 patients with a particular disease and (b) 160 out of 200 patients without the disease is not diagnostic, because 80% have the symptom in both cases (Over 2000a; 2000b; Over & Green 2001; Sloman & Over 2003). Thus, probability estimates drawn from natural frequencies do not capture important qualitative properties.
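The non-diagnosticity in this example is easy to verify once the counts are normalized; a minimal sketch using the figures from the text:

```python
# Symptom rates from the example: 640 of 800 patients with the disease,
# 160 of 200 without it. Equal conditional rates mean the symptom carries
# no diagnostic information.
p_symptom_given_disease = 640 / 800      # 0.80
p_symptom_given_no_disease = 160 / 200   # 0.80

likelihood_ratio = p_symptom_given_disease / p_symptom_given_no_disease
print(likelihood_ratio)  # 1.0: observing the symptom leaves the odds unchanged
```

The raw counts (640 vs. 160) look very different; only after normalizing by the reference class sizes does the independence become visible.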

Furthermore, in contrast to the cited benefits of non-normalized representations (e.g., Gigerenzer & Hoffrage 1995), normalization may serve to simplify a problem. For example, is someone offering us the same proportion if he tries to pay us back with 33 out of the 47 nuts he has gathered (i.e., 70%), after we have earlier given him 17 out of the 22 nuts we gathered (i.e., 77%)? This question is trivial after normalization, as it is then transparent that 70 out of 100 is nested within 77 out of 100 (Over 2007).

3.3. Reasoning about unique events and associative processes

One objection to the claim that the encoding of natural frequencies supports Bayesian inference is that intuitive probability judgment often concerns (a) beliefs regarding single events, or (b) the assessment of hypotheses about novel or partially novel contexts, for which prior event frequencies are unavailable. For example, the estimated likelihoods of specific outcomes are often based on novel and unique one-time events, such as the likelihood that a particular constellation of political interests will lead to a coalition. Hence, Kahneman and Tversky (1996, p. 589) argue that the subjective degree of belief in hypotheses derived from single events or novel contexts "cannot be generally treated as a random sample from some reference population, and their judged probability cannot be reduced to a frequency count."

Furthermore, theories based on natural frequency representations do not allow for the widely observed role of similarity, analogy, association, and causality in human judgment (for recent reviews of the contribution of these factors, see Gilovich et al. 2002 and Sloman 2005). The nested sets hypothesis presupposes these determinants of judgment by appealing to a dual-process model of judgment (Evans & Over 1996; Sloman 1996a; Stanovich & West 2000), a move that natural frequency theorists are apparently not willing to make (Gigerenzer & Regier 1996). The dual-process model attributes responses based on associative principles, such as similarity, or responses based on retrieval from memory, such as analogy, to a primitive associative judgment system. It attributes responses based on more deliberative processing involving rule-based inference, such as the elementary set operations that respect the logic of set inclusion and facilitate Bayesian inference, to a second deliberative system. However, this second system is not limited to analyzing set relations. It can also, under the right conditions, do the kinds of structural analyses required by analogical or causal reasoning.

Within this framework, natural frequency approaches can be viewed as making claims about rule-based processes (i.e., the application of a psychologically plausible rule for calculating Bayesian probabilities), without addressing the role of associative processes in Bayesian inference. In light of the substantial literature demonstrating the role of associative processes in human judgment, Kahneman and Tversky (1996, p. 589) conclude that "there is far more to inductive reasoning and judgment under uncertainty than the retrieval of learned frequencies."

4. Summary and conclusions

The conclusions drawn from the diverse body of empirical and conceptual issues addressed by this review consistently challenge theories of Bayesian inference that depend on natural frequency representations (see Table 2). They demonstrate that coherent probability estimates are not derived according to an equational form for calculating Bayesian posterior probabilities that requires the use of such representations.

The evidence instead supports the nested sets hypothesis that judgmental errors and biases are attenuated when Bayesian inference problems are represented in a way that reveals their underlying set structure, demonstrating that the cognitive capacity to perform elementary set operations constitutes a powerful means of reducing associative influences and facilitating probability estimates that conform to Bayes' theorem. An appropriate representation can induce people to substitute reasoning by rules for reasoning by association. In particular, the review demonstrates that judgmental errors and biases were attenuated when (a) the question induced an outside view by prompting the respondent to utilize the sample of category instances presented in the problem, and (b) the sample of category instances was represented in a nested set structure that partitioned the data into the components needed to compute the Bayesian solution.

Although we disagree with the various theoretical interpretations that could be attributed to natural frequency theorists regarding the architecture of mind, we do believe that they have focused on and enlightened us about an important phenomenon. Frequency formulations are a highly efficient way to obtain drastically improved reasoning performance in some cases. Not only is this an important insight for improving and teaching reasoning, but it also focuses theorists on a deep and fundamental problem: What are the conditions that compel people to overcome their natural associative tendencies in order to reason extensionally?

ACKNOWLEDGMENTS

This work was supported by National Science Foundation Grants DGE-0536941 and DGE-0231900 to Aron K. Barbey. We are grateful to Gary Brase, Jonathan Evans, Vittorio Girotto, Philip Johnson-Laird, Gernot Kleiter, and David Over for their very helpful comments on prior drafts of this paper. Barbey would also like to thank Lawrence W. Barsalou, Sergio Chaigneau, Brian R. Cornwell, Pablo A. Escobedo, Shlomit R. Finkelstein, Carla Harenski, Corey Kallenberg, Patricia Marsteller, Robert N. McCauley, Richard Patterson, Diane Pecher, Philippe Rochat, Ava Santos, W. Kyle Simmons, Irwin Waldman, Christine D. Wilson, and Phillip Wolff for their encouragement and support while writing this paper.

Footnotes

1. The respondent's subjective degree of belief in the hypothesis (H) that the patient has breast cancer, given the observed datum (D) that she has a positive mammography (i.e., the posterior probability, Pr(H | D)), can be expressed numerically as the ratio between (a) the probability that the patient has the disease and obtains a positive mammogram (Pr(H ∩ D)), and (b) the probability that the patient obtains a positive mammogram (Pr(D)). To calculate this ratio, Bayes' theorem incorporates two axioms of mathematical probability theory: the conditional probability and additivity laws. According to the former, (a) can be expressed as the probability that the patient has the disease (i.e., the base-rate of the hypothesis) multiplied by the probability that the patient obtains a positive mammogram given that she has the disease (i.e., the hit-rate of the test): Pr(H ∩ D) = Pr(H) Pr(D | H). The additivity rule is then applied to express (b) as the probability that the patient has the disease and obtains a positive mammogram, plus the probability that the patient does not have the disease and obtains a positive mammogram: Pr(D) = Pr(H ∩ D) + Pr(~H ∩ D). The conditional probability rule can be further applied to express this latter quantity as the complement of the base-rate multiplied by the probability that the patient obtains a positive mammogram given that she does not have the disease (i.e., the false alarm rate of the test): Pr(~H ∩ D) = Pr(~H) Pr(D | ~H). Thus, according to Bayes' theorem, the probability that the patient has breast cancer, given that she has a positive mammography, equals Pr(H | D) = Pr(H ∩ D) / Pr(D) = Pr(H) Pr(D | H) / [Pr(H) Pr(D | H) + Pr(~H) Pr(D | ~H)] = (0.01)(0.80) / [(0.01)(0.80) + (0.99)(0.096)], or 7.8 per cent.
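As a check on the arithmetic, the calculation in this note can be sketched directly, using the base-rate, hit-rate, and false alarm rate given above:

```python
# Posterior via Bayes' theorem:
# Pr(H | D) = Pr(H) Pr(D | H) / [Pr(H) Pr(D | H) + Pr(~H) Pr(D | ~H)]
def posterior(base_rate, hit_rate, false_alarm_rate):
    joint_h = base_rate * hit_rate                     # Pr(H ∩ D)
    joint_not_h = (1 - base_rate) * false_alarm_rate   # Pr(~H ∩ D)
    return joint_h / (joint_h + joint_not_h)           # Pr(H ∩ D) / Pr(D)

# Base-rate 1%, hit-rate 80%, false alarm rate 9.6%, as in the note.
print(round(posterior(0.01, 0.80, 0.096), 3))  # 0.078, i.e., 7.8 per cent
```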

2. When estimated from natural frequency formats, or from formats expressing numbers of chances, posterior probabilities can be calculated in a way that does not require multiplying the probabilities by the base-rates, because these formats entail the sample and effect sizes. The following simple form can be used to calculate the probability of a hypothesis (H) given datum (D): Pr(H | D) = N(HD) / [N(HD) + N(~HD)], where N(HD) is the number of cases having the datum in the presence of the hypothesis, and N(~HD) is the number of cases having the datum in the absence of the hypothesis. This form requires that the respondent attend only to N(HD) and N(~HD), whereas estimating posteriors from percentages requires transforming percentage values into conditional probabilities by incorporating base-rates, making the calculation more complex than under natural frequency formats.
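A sketch of this shortcut, using assumed natural frequency counts consistent with the rates in Note 1 (these counts are our illustration, not data from the target article):

```python
# Simplified Bayesian form for natural frequencies:
# Pr(H | D) = N(HD) / [N(HD) + N(~HD)]
# No base-rate multiplication is needed, because the counts already
# embody the base-rates.
def posterior_from_counts(n_hd, n_not_hd):
    return n_hd / (n_hd + n_not_hd)

# Assumed counts: of 1,000 women, 10 have the disease and 8 of them test
# positive; of the 990 without it, about 95 test positive (0.096 * 990).
print(round(posterior_from_counts(8, 95), 3))  # 0.078, matching Note 1
```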

3. There may be an important relation between sensitivity to nested-set structure and the law of the excluded middle in logic. By this rule, all propositions of the form "p or not-p" hold. We apply the rule, for example, to infer that everyone either has a disease or does not have the disease. We use it again to infer that everyone either has some symptom or does not have it. Thus, the logical trees cited by natural frequency theorists are consistent with this fundamental logical rule (Over 2007).

4. Girotto and Gonzalez (2001) point out that the chance representation of probability is commonly employed in everyday situations, such as when someone says that "a tossed coin has one out of two chances of landing head up" or that "there is one out of a million chances of winning the lottery." Chances preserve information about the size of the reference class (i.e., the total population of chances). Hoffrage et al. (2002) argue that chances are just frequencies. This is false (see Girotto & Gonzalez 2002). Chances refer to the probability of a single event and are based on the total population of chances rather than on a finite sample of observations. The chances, for example, of drawing an ace from a standard deck of playing cards are "4 out of 52": There are four ways that an ace can be drawn from the deck of 52 cards. In contrast to natural frequencies, the size of the reference class represents the total population (i.e., the deck of 52 cards). We might observe, for example, that one out of 10 cards randomly drawn from the deck is an ace, but this method of "natural sampling" would not represent the chance or number of ways of drawing an ace from the full deck. Chances cannot be directly assessed by "counting occurrences of events as they are encountered and storing the resulting knowledge base for possible use later" (i.e., natural sampling; Brase 2002b, p. 384). Chances are thus distinct from natural frequencies.

5. The mind-as-Swiss-army-knife, natural frequency algorithm, and natural frequency heuristic theories do not concern the encoding of event frequencies under naturalistic settings in general, but focus only on event frequencies that have a partitive structure. Therefore, these approaches do not address the encoding of non-partitive event frequencies (e.g., the event frequency of naturally occurring independent events). Given that both kinds of frequencies exist in nature, it is unclear why only frequencies of the former, partitive type are deemed important.

6. Bayes' theorem in odds form refers to the probability in favor of a hypothesis (H) over the probability of an alternative hypothesis (~H), given observed datum (D) (i.e., the posterior odds: Pr(H | D) / Pr(~H | D)). To compute the posterior odds, Bayes' theorem incorporates two factors: the likelihood ratio and the prior odds. The likelihood ratio is a measure of whether the datum is diagnostic with respect to the hypothesis (H). If the evidence is diagnostic, then the likelihood ratio will be greater than one, demonstrating that the observed datum is more likely to occur under the hypothesis (H) than under the alternative hypothesis (~H). The prior odds are the ratio of base-rate probabilities, Pr(H) / Pr(~H). Bayes' theorem in odds form states that the product of these quantities yields the posterior odds: Pr(H | D) / Pr(~H | D) = [Pr(D | H) / Pr(D | ~H)] × [Pr(H) / Pr(~H)]. To directly estimate the relative weight of the likelihood ratio and prior odds, Bayes' theorem in odds form can be logarithmically transformed to yield log [Pr(H | D) / Pr(~H | D)] = log [Pr(D | H) / Pr(D | ~H)] + log [Pr(H) / Pr(~H)]. Under this formulation, the likelihood ratio and prior odds can be treated as independent variables in a regression analysis to assess the relative contribution of each factor to Bayesian inference.
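The odds form and its log transformation can be sketched with the mammography quantities from Note 1 (our choice of illustration; the note itself gives no numerical example):

```python
import math

# Odds form: posterior odds = likelihood ratio * prior odds.
likelihood_ratio = 0.80 / 0.096   # Pr(D | H) / Pr(D | ~H)
prior_odds = 0.01 / 0.99          # Pr(H) / Pr(~H)
posterior_odds = likelihood_ratio * prior_odds

# The log transformation makes the two contributions additive.
assert abs(math.log(posterior_odds)
           - (math.log(likelihood_ratio) + math.log(prior_odds))) < 1e-12

# Converting odds back to a probability recovers the 7.8% posterior of Note 1.
print(round(posterior_odds / (1 + posterior_odds), 3))  # 0.078
```

The additive log form is what licenses treating the two terms as separate regressors when estimating their relative weights.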

References

Ayton, P. & Wright, G. (1994) Subjective probability: What should we believe? In: Subjective probability, ed Wright, G. & Ayton, P., pp. 163–83. Wiley.Google Scholar
Bauer, M. I. & Johnson-Laird, P. N. (1993) How diagrams can improve reasoning. Psychological Science 4:372–78.CrossRefGoogle Scholar
Bradburn, N. M., Rips, L. J. & Shevell, S. K. (1987) Answering autobiographical questions: The impact of memory and inference on surveys. Science 236:157–61.CrossRefGoogle ScholarPubMed
Brase, G. L. (2002a) Ecological and evolutionary validity: Comments on Johnson-Laird, Legrenzi, Girotto, Legrenzi, & Caverni's (1999) mental-model theory of extensional reasoning. Psychological Review 109:722–28.CrossRefGoogle Scholar
Brase, G. L. (2002b) Which statistical formats facilitate what decisions? The perception and influence of different statistical information formats. Journal of Behavioral Decision Making 15:381401.CrossRefGoogle Scholar
Brase, G. L., Fiddick, L. & Harries, C. (2006) Participant recruitment methods and statistical reasoning performance. The Quarterly Journal of Experimental Psychology 59:965–76.CrossRefGoogle ScholarPubMed
Calvillo, D. P. & Revlin, R. (2005) The role of similarity in deductive categorical inference. Psychonomic Bulletin and Review 12:938–44.CrossRefGoogle ScholarPubMed
Casscells, W., Schoenberger, A. & Graboys, T. B. (1978) Interpretation by physicians of clinical laboratory results. The New England Journal of Medicine 299:9991000.CrossRefGoogle ScholarPubMed
Cosmides, L. & Tooby, J. (1996) Are humans good intuitive statisticians after all? Rethinking some conclusions from the literature on judgment under uncertainty. Cognition 58:173.CrossRefGoogle Scholar
Eddy, D. M. (1982) Probabilistic reasoning in clinical medicine: Problems and opportunities. In: Judgment under uncertainty: Heuristics and biases, ed Kahneman, D., Slovic, P. & Tversky, A., pp. 249–67. Cambridge University Press.CrossRefGoogle Scholar
Estes, W. K., Campbell, J. A., Hatsopoulos, N. & Hurwitz, J. B. (1989) Base-rate effects in category learning: A comparison of parallel network and memory storage-retrieval models. Journal of Experimental Psychology: Learning, Memory, and Cognition 15:556–71.Google ScholarPubMed
Evans, J. St. B. T., Handley, S. J., Over, D. E. & Perham, N. (2002) Background beliefs in Bayesian inference. Memory & Cognition 30:179–90.CrossRefGoogle ScholarPubMed
Evans, J. St. B. T., Handley, S. J., Perham, N., Over, D. E. & Thompson, V. A. (2000) Frequency versus probability formats in statistical word problems. Cognition 77:197213.CrossRefGoogle ScholarPubMed
Evans, J. St. B. T. & Over, D. E. (1996) Rationality and reasoning. Psychology Press.Google Scholar
Fodor, J. A. (1983) Modularity of mind. MIT Press.CrossRefGoogle Scholar
Fox, C. & Levav, J. (2004) Partition-edit-count: Naïve extensional reasoning in judgment of conditional probability. Journal of Experimental Psychology: General 133:626–42.CrossRefGoogle ScholarPubMed
Gigerenzer, G. (1993) The superego, the ego, and the id in statistical reasoning. In: A handbook for data analysis in the behavioral sciences: Methodological issues, ed. Keren, G. & Lewis, C., pp. 331–39. Erlbaum.
Gigerenzer, G. (1996) The psychology of good judgment: Frequency formats and simple algorithms. Medical Decision Making 16:273–80.
Gigerenzer, G. (2006) Ecological rationality: Center for Adaptive Behavior and Cognition summary of research area II. Retrieved October 1, 2006, from the Center for Adaptive Behavior and Cognition Web site: http://www.mpib-berlin.mpg.de/en/forschung/abc/forschungsfelder/feld2.htm
Gigerenzer, G., Hell, W. & Blank, H. (1988) Presentation and content: The use of base-rates as a continuous variable. Journal of Experimental Psychology: Human Perception and Performance 14:513–25.
Gigerenzer, G. & Hoffrage, U. (1995) How to improve Bayesian reasoning without instruction: Frequency formats. Psychological Review 102:684–704.
Gigerenzer, G., Hoffrage, U. & Ebert, A. (1998) AIDS counseling for low-risk clients. AIDS Care 10:197–211.
Gigerenzer, G. & Regier, T. P. (1996) How do we tell an association from a rule? Psychological Bulletin 119:23–26.
Gigerenzer, G. & Selten, R., eds. (2001) Bounded rationality: The adaptive toolbox. MIT Press.
Gigerenzer, G., Todd, P. & the ABC Research Group (1999) Simple heuristics that make us smart. Oxford University Press.
Gilovich, T., Griffin, D. & Kahneman, D., eds. (2002) Heuristics and biases: The psychology of intuitive judgment. Cambridge University Press.
Girotto, V. & Gonzalez, M. (2001) Solving probabilistic and statistical problems: A matter of information structure and question form. Cognition 78:247–76.
Girotto, V. & Gonzalez, M. (2002) Chances and frequencies in probabilistic reasoning: Rejoinder to Hoffrage, Gigerenzer, Krauss, and Martignon. Cognition 84:353–59.
Girotto, V. & Gonzalez, M. (in press) Children's understanding of posterior probability. Cognition. DOI:10.1016/j.cognition.2007.02.005.
Gluck, M. A. & Bower, G. H. (1988) From conditioning to category learning: An adaptive network model. Journal of Experimental Psychology: General 117:227–47.
Griffin, D. & Buehler, R. (1999) Frequency, probability, and prediction: Easy solutions to cognitive illusions? Cognitive Psychology 38:48–78.
Griggs, R. A. & Newstead, S. (1982) The role of problem structure in a deductive reasoning task. Journal of Experimental Psychology: Learning, Memory, and Cognition 8:297–307.
Grossen, B. & Carnine, D. (1990) Diagramming a logic strategy: Effects on difficult problem types and transfer. Learning Disability Quarterly 13:168–82.
Hammerton, M. (1973) A case of radical probability estimation. Journal of Experimental Psychology 101:252–54.
Hoffrage, U., Gigerenzer, G., Krauss, S. & Martignon, L. (2002) Representation facilitates reasoning: What natural frequencies are and what they are not. Cognition 84:343–52.
Johnson-Laird, P. N., Legrenzi, P., Girotto, V., Legrenzi, M. S. & Caverni, J.-P. (1999) Naïve probability: A mental model theory of extensional reasoning. Psychological Review 106:62–88.
Kahneman, D. & Frederick, S. (2002) Representativeness revisited: Attribute substitution in intuitive judgment. In: Heuristics and biases: The psychology of intuitive judgment, ed. Gilovich, T., Griffin, D. & Kahneman, D., pp. 49–81. Cambridge University Press.
Kahneman, D. & Frederick, S. (2005) A model of heuristic judgment. In: The Cambridge Handbook of Thinking and Reasoning, ed. Holyoak, K. J. & Morrison, R. G., pp. 267–93. Cambridge University Press.
Kahneman, D. & Tversky, A. (1973) On the psychology of prediction. Psychological Review 80:237–51.
Kahneman, D. & Tversky, A. (1996) On the reality of cognitive illusions. Psychological Review 103:582–91.
Keren, G. & Thijs, L. J. (1996) The base-rate controversy: Is the glass half-full or half-empty? Behavioral and Brain Sciences 19:26.
Kleiter, G. D. (1994) Natural sampling: Rationality without base-rates. In: Contributions to mathematical psychology, psychometrics, and methodology, ed. Fischer, G. H. & Laming, D., pp. 375–88. Springer-Verlag.
Kleiter, G. D., Krebs, M., Doherty, M. E., Gavaran, H., Chadwick, R. & Brake, G. B. (1997) Do subjects understand base-rates? Organizational Behavior and Human Decision Processes 72:25–61.
Koehler, J. J. (1996) The base-rate fallacy reconsidered: Descriptive, normative, and methodological challenges. Behavioral and Brain Sciences 19:1–53.
Kurzenhauser, S. & Hoffrage, U. (2002) Teaching Bayesian reasoning: An evaluation of a classroom tutorial for medical students. Medical Teacher 24:516–21.
Lindsey, S., Hertwig, R. & Gigerenzer, G. (2003) Communicating statistical DNA evidence. Jurimetrics 43:147–63.
Linton, M. (1975) Memory for real-world events. In: Explorations in cognition, ed. Norman, D. A. & Rumelhart, D. E., pp. 376–404. Freeman.
Linton, M. (1982) Transformations of memory in everyday life. In: Memory observed, ed. Neisser, U., pp. 77–91. Freeman.
Macchi, L. (2000) Partitive formulation of information in probabilistic problems: Beyond heuristics and frequency format explanations. Organizational Behavior and Human Decision Processes 82:217–36.
Mellers, B. & McGraw, A. P. (1999) How to improve Bayesian reasoning: Comments on Gigerenzer & Hoffrage (1995). Psychological Review 106:417–24.
Monaghan, P. & Stenning, K. (1998) Effects of representational modality and thinking style on learning to solve reasoning problems. In: Proceedings of the 20th Annual Meeting of the Cognitive Science Society, ed. Gernsbacher, M. A. & Derry, S. J., pp. 716–21. Erlbaum.
Newstead, S. E. (1989) Interpretational errors in syllogistic reasoning. Journal of Memory and Language 28:78–91.
Nosofsky, R. M., Kruschke, J. K. & McKinley, S. C. (1992) Combining exemplar-based category representations and connectionist learning rules. Journal of Experimental Psychology: Learning, Memory, and Cognition 18:211–33.
Over, D. E. (2000a) Ecological rationality and its heuristics. Thinking and Reasoning 6:182–92.
Over, D. E. (2000b) Ecological issues: A reply to Todd, Fiddick, & Krauss. Thinking and Reasoning 6:385–88.
Over, D. E. (2003) From massive modularity to meta-representation: The evolution of higher cognition. In: Evolution and the psychology of thinking: The debate, ed. Over, D. E., pp. 121–44. Psychology Press.
Over, D. E. (2007) Content-independent conditional inference. In: Integrating the mind: Domain general versus domain specific processes in higher cognition, ed. Roberts, M. J., pp. 83–103. Psychology Press.
Over, D. E. & Green, D. W. (2001) Contingency, causation, and adaptive inference. Psychological Review 108:682–84.
Pillemer, E. D., Rhinehart, E. D. & White, S. H. (1986) Memories of life transitions: The first year in college. Human Learning: Journal of Practical Research and Applications 5:109–23.
Ramsey, F. P. (1964) Truth and probability. In: Studies in subjective probability, ed. Kyburg, H. E. Jr., & Smokler, E., pp. 61–92. Wiley.
Reyna, V. F. (1991) Class inclusion, the conjunction fallacy, and other cognitive illusions. Developmental Review 11:317–36.
Reyna, V. F. & Brainerd, C. J. (1992) A fuzzy-trace theory of reasoning and remembering: Paradoxes, patterns, and parallelism. In: From learning processes to cognitive processes: Essays in honor of William K. Estes, ed. Healy, A., Kosslyn, S. & Shiffrin, R., pp. 235–59. Erlbaum.
Reyna, V. F. & Brainerd, C. J. (1994) The origins of probability judgment: A review of data and theories. In: Subjective probability, ed. Wright, G. & Ayton, P., pp. 239–72. Wiley.
Reyna, V. F. & Brainerd, C. J. (1995) Fuzzy-trace theory: An interim synthesis. Learning and Individual Differences 7:1–75.
Savage, L. J. (1954) The foundations of statistics. Wiley.
Schwarz, N. & Sudman, S. (1994) Autobiographical memory and the validity of retrospective reports. Springer-Verlag.
Schwarz, N. & Wanke, M. (2002) Experiential and contextual heuristics in frequency judgment: Ease of recall and response scales. In: Etc. Frequency processing and cognition, ed. Sedlmeier, P. & Betsch, T., pp. 89–108. Oxford University Press.
Sedlmeier, P. & Betsch, T., eds. (2002) Etc. Frequency processing and cognition. Oxford University Press.
Sedlmeier, P. & Gigerenzer, G. (2001) Teaching Bayesian reasoning in less than two hours. Journal of Experimental Psychology: General 130:380–400.
Sloman, S. A. (1996a) The empirical case for two systems of reasoning. Psychological Bulletin 119:3–22.
Sloman, S. A. (1998) Categorical inference is not a tree: The myth of inheritance hierarchies. Cognitive Psychology 35:1–33.
Sloman, S. A. (2005) Causal models: How we think about the world and its alternatives. Oxford University Press.
Sloman, S. A., Lombrozo, T. & Malt, B. C. (in press) Mild ontology and domain-specific categorization. In: Integrating the mind, ed. Roberts, M. J. Psychology Press.
Sloman, S. A. & Over, D. E. (2003) Probability judgment from the inside and out. In: Evolution and the psychology of thinking: The debate, ed. Over, D. E., pp. 145–70. Psychology Press.
Sloman, S. A., Over, D. E., Slovak, L. & Stibel, J. M. (2003) Frequency illusions and other fallacies. Organizational Behavior and Human Decision Processes 91:296–309.
Stanovich, K. E. (1999) Who is rational? Studies of individual differences in reasoning. Erlbaum.
Stanovich, K. E. & West, R. F. (1998a) Individual differences in rational thought. Journal of Experimental Psychology: General 127:161–88.
Stanovich, K. E. & West, R. F. (2000) Individual differences in reasoning: Implications for the rationality debate? Behavioral and Brain Sciences 23:645–726.
Tversky, A. & Kahneman, D. (1974) Judgment under uncertainty: Heuristics and biases. Science 185:1124–31.
Tversky, A. & Kahneman, D. (1983) Extensional versus intuitive reasoning: The conjunction fallacy in probability judgment. Psychological Review 90:293–315.
Wagenaar, W. A. (1986) My memory: A study of autobiographical memory over six years. Cognitive Psychology 18:225–52.
Yamagishi, K. (2003) Facilitating normative judgments of conditional probability: Frequency or nested sets? Experimental Psychology 50:97–106.
Zacks, R. T. & Hasher, L. (2002) Frequency processing: A twenty-five year perspective. In: Etc. Frequency processing and cognition, ed. Sedlmeier, P. & Betsch, T., pp. 21–36. Oxford University Press.
Table 1. Prerequisites for reduction of base-rate neglect according to the five theoretical frameworks

Table 2. Empirical predictions of the five theoretical frameworks

Table 3. Percent correct for Bayesian inference problems reported in the literature (sample sizes in parentheses)

Figure 1. A diagrammatic representation of Bayes' theorem: Euler circles (Sloman et al., 2003).

Table 4. Example questions presented by Brase (2002b)

Table 5. Percent correct for Bayesian inference problems reported in the literature (sample sizes in parentheses)